Natural Language Parsing

2009 ◽  
pp. 158-175
Author(s):  
Pauli Brattico ◽  
Mikko Maatta

Automatic natural language processing captures a lion's share of the attention in open information management. In one way or another, many applications have to deal with natural language input. In this chapter the authors investigate the problem of natural language parsing from the perspective of biolinguistics. They argue that the human mind succeeds in the parsing task without the help of language-specific rules of parsing and language-specific rules of grammar. Instead, there is a universal parser incorporating a universal grammar. The main argument comes from language acquisition: children cannot learn language-specific parsing rules by rule induction, due to the complexity of unconstrained inductive learning. The authors suggest that the universal parser presents a manageable solution to the problem of automatic natural language processing when compared with parsers tailored to specific purposes. A model for a completely language-independent parser is presented, taking a recent minimalist theory as a starting point.
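The minimalist theory the chapter starts from takes Merge as the single structure-building operation: two syntactic objects combine into a new set whose label is projected by the head. The sketch below is our own illustration of that idea in plain Python, not the authors' parser; all names and the dictionary representation are assumptions made for clarity.

```python
# A toy sketch of the Merge operation from minimalist syntax.
# Merge combines two syntactic objects; the head projects its label.
# This is an illustrative model only, not the chapter's implementation.

def leaf(word, label):
    """A lexical item: a word paired with its category label."""
    return {"label": label, "parts": word}

def merge(head, complement):
    """Combine two syntactic objects into one; the head's label projects."""
    return {"label": head["label"], "parts": (head, complement)}

# Build "read books": the verb projects, so the result behaves as a VP.
v = leaf("read", "V")
n = leaf("books", "N")
vp = merge(v, n)
```

Because Merge is language-independent, the same operation builds structure for any lexicon; language variation lives in the lexical items, not in the combinatorial rule.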

2020 ◽  
Vol 4 (1) ◽  
pp. 18-43
Author(s):  
Liuqing Li ◽  
Jack Geissinger ◽  
William A. Ingram ◽  
Edward A. Fox

Natural language processing (NLP) covers a large number of topics and tasks related to data and information management, leading to a complex and challenging teaching process. Meanwhile, problem-based learning is a teaching technique specifically designed to motivate students to learn efficiently, work collaboratively, and communicate effectively. With this aim, we developed a problem-based learning course for both undergraduate and graduate students to teach NLP. We provided student teams with big data sets, basic guidelines, cloud computing resources, and other aids to help different teams in summarizing two types of big collections: Web pages related to events, and electronic theses and dissertations (ETDs). Student teams then deployed different libraries, tools, methods, and algorithms to solve the task of big data text summarization. Summarization is an ideal problem for learning NLP, since it involves all levels of linguistics as well as many of the tools and techniques used by NLP practitioners. The evaluation results showed that all teams generated coherent and readable summaries. Many summaries were of high quality and accurately described their corresponding events or ETD chapters, and the teams produced them, along with NLP pipelines, in a single semester. Further, both undergraduate and graduate students gave statistically significant positive feedback, relative to other courses in the Department of Computer Science. Accordingly, we encourage educators in the data and information management field to use our approach or similar methods in their teaching, and hope that other researchers will also use our data sets and synergistic solutions to approach the new and challenging tasks we addressed.
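The abstract does not specify which summarization methods the student teams used, but the simplest end-to-end pipeline of the kind such a course might start from is frequency-based extractive summarization: score each sentence by how frequent its words are in the document and keep the top scorers. A minimal stdlib sketch, with all function names ours:

```python
import re
from collections import Counter

def summarize(text, n=2):
    """Extractive summary: keep the n sentences with the highest
    average word frequency, preserving their original order."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    freq = Counter(re.findall(r'[a-z]+', text.lower()))

    def score(sentence):
        tokens = re.findall(r'[a-z]+', sentence.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    top = sorted(sentences, key=score, reverse=True)[:n]
    return [s for s in sentences if s in top]
```

Real pipelines for event collections or ETD chapters add cleaning, sentence segmentation tuned to the genre, and usually abstractive models, but the score-then-select skeleton is the same.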


2021 ◽  
Vol 50 (3) ◽  
pp. 27-28
Author(s):  
Immanuel Trummer

Introduction. We have seen significant advances in the state of the art in natural language processing (NLP) over the past few years [20]. These advances have been driven by new neural network architectures, in particular the Transformer model [19], as well as the successful application of transfer learning approaches to NLP [13]. Typically, training for specific NLP tasks starts from large language models that have been pre-trained on generic tasks (e.g., predicting obfuscated words in text [5]) for which large amounts of training data are available. Using such models as a starting point reduces task-specific training cost as well as the number of required training samples by orders of magnitude [7]. These advances motivate new use cases for NLP methods in the context of databases.
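The generic pre-training task mentioned above, predicting obfuscated words, amounts to hiding a fraction of the tokens in a text and training the model to recover them. The data-preparation side of that objective can be sketched in a few lines of stdlib Python; the function name, mask rate, and mask token below are illustrative assumptions, not any particular framework's API.

```python
import random

def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]", seed=0):
    """Build a (masked input, targets) pair for masked-word pre-training:
    each selected position is replaced by mask_token, and targets maps
    that position back to the original word the model must predict."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            masked.append(mask_token)
            targets[i] = tok
        else:
            masked.append(tok)
    return masked, targets
```

Because raw text supplies its own labels this way, arbitrarily large corpora can serve as training data, which is what makes the pre-train-then-fine-tune recipe so cheap per downstream task.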


2021 ◽  
Vol 10 (5) ◽  
pp. 2857-2865
Author(s):  
Moanda Diana Pholo ◽  
Yskandar Hamam ◽  
Abdel Baset Khalaf ◽  
Chunling Du

Available literature reports several lymphoma cases misdiagnosed as tuberculosis, especially in countries with a heavy TB burden. This frequent misdiagnosis occurs because the two diseases can present with similar symptoms. The present study therefore aims to analyse and explore TB as well as lymphoma case reports using natural language processing tools and to evaluate the use of machine learning to differentiate between the two diseases. As a starting point, case reports were collected for each disease using web scraping. Natural language processing tools and text clustering were then used to explore the created dataset. Finally, six machine learning algorithms were trained and tested on the collected data, which contained 765 lymphoma and 546 tuberculosis case reports. Each method was evaluated using various performance metrics. The results indicated that the multi-layer perceptron model achieved the best accuracy (93.1%), recall (91.9%) and precision score (93.7%), thus outperforming the other algorithms in correctly classifying the different case reports.
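The accuracy, precision, and recall figures reported above all derive from the binary confusion matrix of a classifier's predictions. As a reminder of how those metrics relate, here is a small stdlib sketch; the counts in the usage line are made up for illustration and are not the study's data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from binary
    confusion-matrix counts (true/false positives and negatives)."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total      # fraction of all cases classified correctly
    precision = tp / (tp + fp)        # of predicted positives, how many were right
    recall = tp / (tp + fn)           # of actual positives, how many were found
    return accuracy, precision, recall

# Illustrative counts only (not the paper's results):
acc, prec, rec = classification_metrics(tp=8, fp=2, fn=1, tn=9)
```

Reporting all three together matters here because the two classes are imbalanced (765 vs. 546 reports), so accuracy alone could hide a bias toward the majority class.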


Author(s):  
Matthew N.O. Sadiku ◽  
Yu Zhou ◽  
Sarhan M. Musa

Natural language processing (NLP) refers to the use of computer algorithms to identify key elements in everyday language and extract meaning from unstructured spoken or written communication. Healthcare is the biggest user of NLP tools. NLP tools are expected to bridge the gap between the mountain of data generated daily and the limited cognitive capacity of the human mind. This paper provides a brief introduction to the use of NLP in healthcare.
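A concrete instance of "extracting meaning from unstructured communication" in healthcare is pulling structured values out of free-text clinical notes. The toy sketch below uses regular expressions for two vital signs; the note text and patterns are invented for illustration and bear no relation to any specific clinical NLP system.

```python
import re

def extract_vitals(note):
    """Pull blood pressure and heart rate out of a free-text note.
    Toy patterns only; real clinical NLP handles far more variation."""
    vitals = {}
    bp = re.search(r'BP\s*(\d{2,3})/(\d{2,3})', note)
    if bp:
        vitals["systolic"] = int(bp.group(1))
        vitals["diastolic"] = int(bp.group(2))
    hr = re.search(r'HR\s*(\d{2,3})', note)
    if hr:
        vitals["heart_rate"] = int(hr.group(1))
    return vitals

note = "BP 120/80, HR 72 bpm. Patient denies chest pain."
vitals = extract_vitals(note)
```

Production systems replace such hand-written patterns with trained models, but the goal is the same: turn prose a clinician wrote into fields a database can index.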


2021 ◽  
Vol 54 (3) ◽  
pp. 1-41
Author(s):  
Liping Zhao ◽  
Waad Alhoshan ◽  
Alessio Ferrari ◽  
Keletso J. Letsholo ◽  
Muideen A. Ajagbe ◽  
...  

Natural Language Processing for Requirements Engineering (NLP4RE) is an area of research and development that seeks to apply natural language processing (NLP) techniques, tools, and resources to the requirements engineering (RE) process, supporting human analysts in carrying out various linguistic analysis tasks on textual requirements documents, such as detecting language issues, identifying key domain concepts, and establishing requirements traceability links. This article reports on a mapping study that surveys the landscape of NLP4RE research to provide a holistic understanding of the field. Following systematic review guidelines, the mapping study is directed by five research questions, cutting across five aspects of NLP4RE research: the state of the literature, the state of empirical research, the research focus, the state of tool development, and the usage of NLP technologies. Our main results are as follows: (i) we identify a total of 404 primary studies relevant to NLP4RE, published over the past 36 years in 170 different venues; (ii) most of these studies (67.08%) are solution proposals, assessed by a laboratory experiment or an example application, while only a small percentage (7%) are assessed in industrial settings; (iii) a large proportion of the studies (42.70%) focus on the requirements analysis phase, with quality defect detection as their central task and requirements specifications as their most commonly processed document type; (iv) 130 NLP4RE tools (i.e., RE-specific NLP tools) are extracted from these studies, but only 17 of them (13.08%) are available for download; (v) 231 different NLP technologies are also identified, comprising 140 NLP techniques, 66 NLP tools, and 25 NLP resources, but most of them, particularly the novel NLP techniques and specialized tools, are used infrequently; by contrast, the commonly used NLP technologies are traditional analysis techniques (e.g., POS tagging and tokenization), general-purpose tools (e.g., Stanford CoreNLP and GATE), and generic language lexicons (WordNet and the British National Corpus). The mapping study not only provides a collection of the NLP4RE literature but also, more importantly, establishes a structure that frames the existing literature through categorization, synthesis, and conceptualization of the main theoretical concepts and relationships spanning both RE and NLP. Our work thus produces a conceptual framework of NLP4RE. The framework is used to identify research gaps and directions, highlight technology transfer needs, and encourage more synergies between the RE community, the NLP community, and software and systems practitioners. Our results can be used as a starting point for framing future studies according to a well-defined terminology and can be expanded as new technologies and novel solutions emerge.
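Tokenization, one of the traditional techniques the survey finds most commonly used in NLP4RE, simply splits requirements text into word, number, and punctuation tokens before any deeper analysis such as POS tagging. A minimal regex-based sketch (the function and pattern are our own illustration, not any surveyed tool):

```python
import re

def tokenize(text):
    """Split text into word, number, and punctuation tokens.
    A minimal sketch of the tokenization step that precedes
    POS tagging and other linguistic analyses of requirements."""
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?|\d+|[^\w\s]", text)

requirement = "The system shall respond within 2 seconds."
tokens = tokenize(requirement)
```

Toolkits such as Stanford CoreNLP or GATE bundle much more robust tokenizers, but every NLP4RE pipeline starts from this kind of segmentation of the requirements document.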

