A NEW FORMAL DEFINITION OF LANGUAGE FOR NATURAL LANGUAGE PROCESSING

Automatic paraphrasing is an important component in many natural language processing tasks. In this article we present a new parallel corpus with paraphrase annotations. We adopt a definition of paraphrase based on word alignments and show that it yields high inter-annotator agreement. As Kappa is suited to nominal data, we employ an alternative agreement statistic which is appropriate for structured alignment tasks. We discuss how the corpus can be usefully employed in evaluating paraphrase systems automatically (e.g., by measuring precision, recall, and F1) and also in developing linguistically rich paraphrase models based on syntactic structure.
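
The automatic evaluation mentioned in this abstract (precision, recall, and F1 over paraphrase alignments) can be illustrated with a minimal sketch. It assumes that an alignment is represented as a set of (source index, target index) links; that representation, the function name `alignment_prf`, and the toy data are hypothetical and are not taken from the corpus or the article.

```python
def alignment_prf(predicted, gold):
    """Return (precision, recall, F1) for two sets of alignment links.

    Each link is a (source_index, target_index) pair; this representation
    is an assumption made for illustration only.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Toy data: four gold links, three system links, two of them correct.
gold_links = {(0, 0), (1, 2), (2, 1), (3, 3)}
system_links = {(0, 0), (1, 2), (2, 2)}

precision, recall, f1 = alignment_prf(system_links, gold_links)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Here two of the three system links are correct and two of the four gold links are recovered, so precision is 2/3, recall is 1/2, and F1 is 4/7.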

Tu1276 Improving Case Definition of Crohn's Disease and Ulcerative Colitis in Electronic Medical Records Using Natural Language Processing - a Novel Informatics Approach

Gastroenterology ◽

10.1016/s0016-5085(12)63070-4 ◽

2012 ◽

Vol 142 (5) ◽

pp. S-791 ◽

Cited By ~ 2

Author(s):

Ashwin N. Ananthakrishnan ◽

Tianxi Cai ◽

Su-Chun Cheng ◽

Pei Jun Chen ◽

Guergana Savova ◽

...

Keyword(s):

Ulcerative Colitis ◽

Natural Language Processing ◽

Crohn's Disease ◽

Natural Language ◽

Electronic Medical Records ◽

Language Processing ◽

Medical Records ◽

Case Definition ◽

Definition Of

Improving Case Definition of Crohn's Disease and Ulcerative Colitis in Electronic Medical Records Using Natural Language Processing

Inflammatory Bowel Diseases ◽

10.1097/mib.0b013e31828133fd ◽

2013 ◽

Vol 19 (7) ◽

pp. 1411-1420 ◽

Cited By ~ 79

Author(s):

Ashwin N. Ananthakrishnan ◽

Tianxi Cai ◽

Guergana Savova ◽

Su-Chun Cheng ◽

Pei Chen ◽

...

Keyword(s):

Ulcerative Colitis ◽

Natural Language Processing ◽

Natural Language ◽

Electronic Medical Records ◽

Language Processing ◽

Medical Records ◽

Case Definition ◽

Definition Of

Systematic community of Practice activities evaluation through Natural Language Processing: application to research projects

Artificial intelligence for engineering design analysis and manufacturing ◽

10.1017/s0890060419000076 ◽

2019 ◽

Vol 33 (02) ◽

pp. 160-171

Author(s):

Virginie Goepp ◽

Nada Matta ◽

Emmanuel Caillaud ◽

Françoise Feugeas

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Community Of Practice ◽

Language Processing ◽

Speech Acts ◽

Interaction Matrix ◽

Efficiency Evaluation ◽

Processing Application ◽

Set Up ◽

Definition Of

Evaluating the efficiency of a Community of Practice (CoP) is an important research issue. Indeed, knowing whether a given CoP is successful is essential to managing it better over time. The existing approaches to efficiency evaluation are difficult and time-consuming to apply to real CoPs. They require either evaluating subjective constructs, which makes the analysis unreliable, or working out a knowledge interaction matrix that is difficult to set up. However, these approaches base their evaluation on the premise that a CoP is successful if knowledge is exchanged between its members, which is the case when there are interactions between the actors involved in the CoP. Therefore, we propose to analyze these interactions through email exchanges using Natural Language Processing. Our approach is systematic and semi-automated. It requires the emails exchanged and a definition of the speech acts to be retrieved. We apply it to a real project-based CoP: the SEPOLBE research project, which involves different fields of expertise. It allows us to identify the CoP core group and to highlight learning processes between members with different backgrounds (Microbiology, Electrochemistry and Civil engineering).

CYK Parsing over Distributed Representations

Algorithms ◽

10.3390/a13100262 ◽

2020 ◽

Vol 13 (10) ◽

pp. 262

Author(s):

Fabio Massimo Zanzotto ◽

Giorgio Satta ◽

Giordano Cristini

Keyword(s):

Neural Networks ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Matrix Multiplication ◽

General Context ◽

Distributed Representations ◽

Syntactic Pattern ◽

Definition Of ◽

Context Free

Parsing is a key task in computer science, with applications in compilers, natural language processing, syntactic pattern matching, and formal language theory. With the recent development of deep learning techniques, several artificial intelligence applications, especially in natural language processing, have combined traditional parsing methods with neural networks to drive the search in the parsing space, resulting in hybrid architectures using both symbolic and distributed representations. In this article, we show that existing symbolic parsing algorithms for context-free languages can cross the border and be entirely formulated over distributed representations. To this end, we introduce a version of the traditional Cocke–Younger–Kasami (CYK) algorithm, called distributed (D)-CYK, which is entirely defined over distributed representations. D-CYK uses matrix multiplication on real number matrices of a size independent of the length of the input string. These operations are compatible with recurrent neural networks. Preliminary experiments show that D-CYK approximates the original CYK algorithm. By showing that CYK can be entirely performed on distributed representations, we open the way to the definition of recurrent layer neural networks that can process general context-free languages.
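
For orientation, the traditional symbolic CYK algorithm that this abstract takes as its starting point can be sketched as a recognizer for a grammar in Chomsky normal form. This is the classical chart-based version, not the distributed D-CYK variant described in the article; the toy grammar, the dictionary-based rule encoding, and the function name `cyk_recognize` are assumptions made for illustration.

```python
def cyk_recognize(words, lexical_rules, binary_rules, start="S"):
    """Return True if `words` is derivable from `start` under a CNF grammar.

    lexical_rules: dict mapping a word to the set of nonterminals A with A -> word
    binary_rules:  dict mapping (B, C) to the set of nonterminals A with A -> B C
    """
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] holds the nonterminals that derive words[i:j].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1] = set(lexical_rules.get(word, ()))
    # Fill the chart for increasingly long spans.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point
                for left in chart[i][k]:
                    for right in chart[k][j]:
                        chart[i][j] |= binary_rules.get((left, right), set())
    return start in chart[0][n]


# Hypothetical toy grammar in Chomsky normal form.
lexical_rules = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "sees": {"V"}}
binary_rules = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}, ("Det", "N"): {"NP"}}

accepted = cyk_recognize("the dog sees the cat".split(), lexical_rules, binary_rules)
rejected = cyk_recognize("dog the sees".split(), lexical_rules, binary_rules)
print(accepted, rejected)
```

The chart in this symbolic version grows quadratically with the input length, which is the contrast with the fixed-size matrices the abstract attributes to D-CYK.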

Variations in terminology

Terminology ◽

10.1075/term.16.1.02con ◽

2010 ◽

Vol 16 (1) ◽

pp. 30-50 ◽

Cited By ~ 12

Author(s):

Anne Condamines

Keyword(s):

Risk Management ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Language Use ◽

Knowledge Engineering ◽

Field Of Study ◽

New Approach ◽

Individual Variations ◽

Definition Of

The study of variation in terminology has come to the fore over the last fifteen years in connection with advances in textual terminology. This new approach to terminology could be a way of improving the management of risk related to language use in the workplace and of contributing to the definition of a “linguistics of the workplace”. As a theoretical field of study, linguistics has hardly found any application in the workplace. Two of its applied branches, however, Sociolinguistics and Natural Language Processing (NLP), are relevant. Both deal with lexical phenomena (i.e. terminology), with sociolinguistics taking into account very subtle inter-individual variations and NLP being more interested in stability of use. So, taking variations into account when building terminologies could be a means of considering both description and prescription, use and norm. This approach to terminology, which has been made possible by NLP and Knowledge Engineering, could be a way of meeting workplace needs concerning risk management related to language use.

Natural Language Processing and Enhanced Clinical Decision Making Radiology and VINCI

PsycEXTRA Dataset ◽

10.1037/e615572012-015 ◽

2012 ◽

Author(s):

Eliot Siegel

Keyword(s):

Decision Making ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Clinical Decision Making ◽

Clinical Decision

Natural Language Processing in the Clinical Setting

PsycEXTRA Dataset ◽

10.1037/e615572012-013 ◽

2012 ◽

Author(s):

Thomas H. Payne

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Clinical Setting

A Review and evaluation of Machine Translation methods for Lumasaaba

Journal of Digital Science ◽

10.33847/2686-8296.2.1_1 ◽

2020 ◽

pp. 3-17

Author(s):

Peter Nabende

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Machine Translation ◽

Language Processing ◽

Research Area ◽

Data Driven ◽

East African ◽

Data Set ◽

African Languages ◽

Translation Methods

Natural Language Processing for under-resourced languages is now a mainstream research area. However, there are limited studies on Natural Language Processing applications for many indigenous East African languages. As a contribution to closing this knowledge gap, this paper evaluates the application of well-established machine translation methods to one heavily under-resourced indigenous East African language, Lumasaaba. Specifically, we review the most common machine translation methods in the context of Lumasaaba, including both rule-based and data-driven methods. We then apply a state-of-the-art data-driven machine translation method to learn models for automating translation between Lumasaaba and English using a very limited data set of parallel sentences. Automatic evaluation results show that a transformer-based Neural Machine Translation model architecture leads to consistently better BLEU scores than the recurrent neural network-based models. Moreover, the automatically generated translations can be comprehended to a reasonable extent and are usually associated with the source language input.
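
The BLEU scores mentioned in this abstract are built on clipped (modified) n-gram precision. The sketch below computes only that one component for a single candidate and a single reference; it omits the brevity penalty and the geometric mean over n-gram orders that full BLEU uses, so it is not the evaluation procedure of the paper. The function names and the toy English sentences are hypothetical.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) occurring in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, reference, n):
    """Clipped n-gram precision of `candidate` against one `reference`.

    Each candidate n-gram is credited at most as many times as it
    occurs in the reference.
    """
    candidate_counts = ngram_counts(candidate, n)
    reference_counts = ngram_counts(reference, n)
    total = sum(candidate_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, reference_counts[gram])
                  for gram, count in candidate_counts.items())
    return matched / total


# Toy example: the repeated "the" is credited only once.
candidate = "the the the cat".split()
reference = "the cat sat".split()

unigram_precision = clipped_precision(candidate, reference, 1)
bigram_precision = clipped_precision(candidate, reference, 2)
print(unigram_precision, bigram_precision)
```

In the toy example, two of the four candidate unigrams and one of the three candidate bigrams are credited, giving 0.5 and 1/3 respectively.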
