scholarly journals Natural Language Processing, Corpus Linguistics, Corpus Based Grammar Research

2010 ◽  
Vol 61 (1) ◽  
Author(s):  
Katarína Gajdošová ◽  
Agáta Karčová ◽  
Daniela Majchráková
2004 ◽  
Vol 9 (1) ◽  
pp. 53-68 ◽  
Author(s):  
Montserrat Arévalo Rodríguez ◽  
Montserrat Civit Torruella ◽  
Maria Antònia Martí

In the field of corpus linguistics, Named Entity treatment includes the recognition and classification of different types of discursive elements such as proper names, dates, times, etc. These discursive elements play an important role in different Natural Language Processing applications and techniques such as Information Retrieval, Information Extraction, translation memories, document routers, etc.


2016 ◽  
pp. 255-275 ◽  
Author(s):  
Alison L. Bailey ◽  
Anne Blackstock-Bernstein ◽  
Eve Ryan ◽  
Despina Pitsoulakis

2015 ◽  
Vol 1 (1) ◽  
Author(s):  
Patricia Murrieta-Flores ◽  
Ian Gregory

Abstract Although the use of Geographic Information Systems (GIS) has a long history in archaeology, spatial technologies have rarely been used to analyse the content of textual collections. A newly developed approach termed Geographic Text Analysis (GTA) is now allowing the semi-automated exploration of large corpora incorporating a combination of Natural Language Processing techniques, Corpus Linguistics, and GIS. In this article we explain the development of GTA, propose possible uses of this methodology in the field of archaeology, and give a summary of the challenges that emerge from this type of analysis.


ICAME Journal ◽  
2015 ◽  
Vol 39 (1) ◽  
pp. 5-24 ◽  
Author(s):  
Dawn Archer ◽  
Merja Kytö ◽  
Alistair Baron ◽  
Paul Rayson

Abstract Corpora of Early Modern English have been collected and released for research for a number of years. With large-scale digitisation activities gathering pace in the last decade, much more historical textual data is now available for research on numerous topics including historical linguistics and conceptual history. We summarise previous research which has shown that it is necessary to map historical spelling variants to modern equivalents in order to successfully apply natural language processing and corpus linguistics methods. Manual and semiautomatic methods have been devised to support this normalisation and standardisation process. We argue that it is important to develop a linguistically meaningful rationale to achieve good results from this process. In order to do so, we propose a number of guidelines for normalising corpora and show how these guidelines have been applied in the Corpus of English Dialogues.
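
To illustrate what mapping historical spelling variants to modern equivalents can look like, here is a minimal Python sketch. It is not the method of the paper or of any normalisation tool: the variant list, the name `VARIANT_MAP`, and the function `normalise` are hypothetical and chosen only for illustration. The sketch keeps the original form next to the normalised one, so the historical spelling is never lost.

```python
# Hypothetical variant list for illustration only; a real project would
# curate such a list manually or with a semi-automatic tool.
VARIANT_MAP = {"loue": "love", "vertue": "virtue", "doe": "do"}


def normalise(tokens, variant_map=VARIANT_MAP):
    """Pair each token with a modern equivalent, keeping the original form."""
    pairs = []
    for token in tokens:
        # Look the token up case-insensitively; unknown tokens pass through.
        modern = variant_map.get(token.lower(), token)
        # Carry over an initial capital so sentence-initial words stay capitalised.
        if token[:1].isupper():
            modern = modern[:1].upper() + modern[1:]
        pairs.append((token, modern))
    return pairs


if __name__ == "__main__":
    for original, modern in normalise(["Loue", "and", "vertue"]):
        print(original, "->", modern)
```

A plain lookup like this cannot decide context-dependent cases, which is one reason a linguistically motivated set of guidelines matters.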


2014 ◽  
Vol 2014 ◽  
pp. 1-10
Author(s):  
Abdulmohsen Al-Thubaity ◽  
Hend Al-Khalifa ◽  
Reem Alqifari ◽  
Manal Almazrua

Despite the accessibility of numerous online corpora, students and researchers engaged in the fields of Natural Language Processing (NLP), corpus linguistics, and language learning and teaching may encounter situations in which they need to develop their own corpora. Several commercial and free standalone corpora processing systems are available to process such corpora. In this study, we first propose a framework for the evaluation of standalone corpora processing systems and then use it to evaluate seven freely available systems. The proposed framework considers the usability, functionality, and performance of the evaluated systems while taking into consideration their suitability for Arabic corpora. While the results show that most of the evaluated systems exhibited comparable usability scores, the scores for functionality and performance were substantially different with respect to support for the Arabic language and N-gram profile generation. The results of our evaluation will help potential users of the evaluated systems to choose the system that best meets their needs. More importantly, the results will help the developers of the evaluated systems to enhance their systems and developers of new corpora processing systems by providing them with a reference framework.
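
For readers unfamiliar with the term, an N-gram profile is a frequency list of the N-word sequences in a corpus. The Python sketch below shows one simple way to build one. It is an assumption-laden illustration, not the behaviour of any of the evaluated systems: the function name `ngram_profile` is hypothetical, and tokenisation is plain whitespace splitting, which is Unicode-aware and so also splits Arabic text on spaces but does no morphological processing.

```python
from collections import Counter


def ngram_profile(text, n=2):
    """Count word n-grams in `text` using simple whitespace tokenisation."""
    tokens = text.split()
    # Zip n staggered copies of the token list to form consecutive n-grams.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


if __name__ == "__main__":
    profile = ngram_profile("a b a b c", n=2)
    for gram, count in profile.most_common():
        print(gram, count)
```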


2020 ◽  
pp. 3-17
Author(s):  
Peter Nabende

Natural Language Processing for under-resourced languages is now a mainstream research area. However, there are limited studies on Natural Language Processing applications for many indigenous East African languages. As a contribution to covering the current gap of knowledge, this paper focuses on evaluating the application of well-established machine translation methods for one heavily under-resourced indigenous East African language called Lumasaaba. Specifically, we review the most common machine translation methods in the context of Lumasaaba, including both rule-based and data-driven methods. Then we apply a state-of-the-art data-driven machine translation method to learn models for automating translation between Lumasaaba and English using a very limited data set of parallel sentences. Automatic evaluation results show that a transformer-based Neural Machine Translation model architecture leads to consistently better BLEU scores than the recurrent neural network-based models. Moreover, the automatically generated translations can be comprehended to a reasonable extent and are usually associated with the source language input.


Diabetes ◽  
2019 ◽  
Vol 68 (Supplement 1) ◽  
pp. 1243-P
Author(s):  
JIANMIN WU ◽  
FRITHA J. MORRISON ◽  
ZHENXIANG ZHAO ◽  
XUANYAO HE ◽  
MARIA SHUBINA ◽  
...  
