Morphological disambiguation of Hebrew: a case study in classifier combination

2012 ◽  
Vol 20 (1) ◽  
pp. 69-97 ◽  
Author(s):  
GENNADI LEMBERSKY ◽  
DANNY SHACHAM ◽  
SHULY WINTNER

Abstract
Morphological analysis and disambiguation are crucial stages in a variety of natural language processing applications, especially when languages with complex morphology are concerned. We present a system which disambiguates the output of a morphological analyzer for Hebrew. It consists of several simple classifiers and a module that combines them under the constraints imposed by the analyzer. We explore several approaches to classifier combination, as well as a back-off mechanism that relies on a large unannotated corpus. Our best result, around 83 percent accuracy, compares favorably with the state of the art on this task.

2021 ◽  
Vol 3 ◽  
Author(s):  
Marieke van Erp ◽  
Christian Reynolds ◽  
Diana Maynard ◽  
Alain Starke ◽  
Rebeca Ibáñez Martín ◽  
...  

In this paper, we discuss the use of natural language processing and artificial intelligence to analyze nutritional and sustainability aspects of recipes and food. We present the state of the art and some use cases, followed by a discussion of challenges. Our perspective on addressing these challenges is that, while they are typically technical in nature, they nevertheless require an interdisciplinary approach combining natural language processing and artificial intelligence with expert domain knowledge to create practical tools and comprehensive analysis for the food domain.


2021 ◽  
pp. 1-13
Author(s):  
Deguang Chen ◽  
Ziping Ma ◽  
Lin Wei ◽  
Yanbin Zhu ◽  
Jinlin Ma ◽  
...  

Text-based reading comprehension models have great research significance and market value and are one of the main directions of natural language processing. Reading comprehension models with single-span answers have recently attracted more attention and achieved significant results. In contrast, multi-span answer models for reading comprehension have been less investigated and their performance needs improvement. To address this issue, in this paper, we propose a text-based multi-span network for reading comprehension, ALBERT_SBoundary, and build a multi-span answer corpus, MultiSpan_NMU. We also conduct extensive experiments on the public multi-span corpus, MultiSpan_DROP, and our multi-span answer corpus, MultiSpan_NMU, and compare the proposed method with the state of the art. The experimental results show that our proposed method achieves F1 scores of 84.10 and 92.88 on the MultiSpan_DROP and MultiSpan_NMU datasets, respectively, while it also has fewer parameters and a shorter training time.


Author(s):  
Zixuan Ke ◽  
Vincent Ng

Despite being investigated for over 50 years, the task of automated essay scoring is far from being solved. Nevertheless, it continues to draw a lot of attention in the natural language processing community in part because of its commercial and educational values as well as the associated research challenges. This paper presents an overview of the major milestones made in automated essay scoring research since its inception.


Author(s):  
Amal Zouaq

This chapter gives an overview of the state of the art in natural language processing for ontology learning. It presents two main NLP techniques for knowledge extraction from text, namely shallow techniques and deep techniques, and explains their usefulness for each step of the ontology learning process. The chapter also argues for the value of deeper semantic analysis methods for ontology learning. In fact, there have been very few attempts to create ontologies using deep NLP. After a brief introduction to the main semantic analysis approaches, the chapter focuses on lexico-syntactic patterns based on dependency grammars and explains how these patterns can be considered a step towards deeper semantic analysis. Finally, the chapter addresses the "ontologization" task, that is, the ability to filter important concepts and relationships from the mass of extracted knowledge.


2020 ◽  
pp. 29-33
Author(s):  
Gabriel AGUILERA-GONZÁLEZ ◽  
Christian PADILLA-NAVARRO ◽  
Carlos ZARATE-TREJO ◽  
Georges KHALAF

Suicide prevention is one of the great challenges of the current era. Institutions such as the World Health Organization continue to search for all possible alternatives for early detection and timely prevention. Suicide rates have grown steadily worldwide, and Mexico, although not the country with the most suicides, is among the countries with the highest growth in recent years. At present, the use of social networks has profoundly changed the way we communicate: expressing oneself through a social network is becoming more common than expressing oneself to other people directly. Several studies, which will be presented later, show that it is possible to detect cases of depression, suicide risk, and other mental health problems from social network content. Technological tools such as Natural Language Processing have served as effective allies for the early detection of risks such as abuse and bullying, and even for detecting emotional problems. The present research carries out an in-depth analysis of the state of the art in applying Natural Language Processing to the detection of suicide risk from the analysis of Mexican Spanish texts in social networks.


2019 ◽  
Vol 53 (2) ◽  
pp. 3-10
Author(s):  
Muthu Kumar Chandrasekaran ◽  
Philipp Mayr

The 4th joint BIRNDL workshop was held at the 42nd ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019) in Paris, France. BIRNDL 2019 intended to stimulate IR researchers and digital library professionals to elaborate on new approaches in natural language processing, information retrieval, scientometrics, and recommendation techniques that can advance the state of the art in scholarly document understanding, analysis, and retrieval at scale. The workshop incorporated different paper sessions and the 5th edition of the CL-SciSumm Shared Task.


Author(s):  
Jacqueline Peng ◽  
Mengge Zhao ◽  
James Havrilla ◽  
Cong Liu ◽  
Chunhua Weng ◽  
...  

Abstract
Background: Natural language processing (NLP) tools can facilitate the extraction of biomedical concepts from unstructured free texts, such as research articles or clinical notes. The NLP software tools CLAMP, cTAKES, and MetaMap are among the most widely used tools to extract biomedical concept entities. However, their performance in extracting disease-specific terminology from literature has not been compared extensively, especially for complex neuropsychiatric disorders with a diverse set of phenotypic and clinical manifestations.
Methods: We comparatively evaluated these NLP tools using autism spectrum disorder (ASD) as a case study. We collected 827 ASD-related terms based on previous literature as the benchmark list for performance evaluation. Then, we applied CLAMP, cTAKES, and MetaMap on 544 full-text articles and 20,408 abstracts from PubMed to extract ASD-related terms. We evaluated the predictive performance using precision, recall, and F1 score.
Results: We found that CLAMP has the best performance in terms of F1 score, followed by cTAKES and then MetaMap. Our results show that CLAMP has much higher precision than cTAKES and MetaMap, while cTAKES and MetaMap have higher recall than CLAMP.
Conclusion: The analysis protocols used in this study can be applied to other neuropsychiatric or neurodevelopmental disorders that lack well-defined terminology sets to describe their phenotypic presentations.
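The evaluation described above, comparing extracted terms against a benchmark term list via precision, recall, and F1, can be sketched as follows. This is a minimal illustration, not the authors' actual protocol; the term sets are made up for demonstration.

```python
def evaluate_extraction(extracted, benchmark):
    """Return (precision, recall, f1) for a set of extracted terms
    measured against a benchmark term list."""
    extracted, benchmark = set(extracted), set(benchmark)
    true_positives = len(extracted & benchmark)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(benchmark) if benchmark else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Illustrative (invented) terms, not from the actual 827-term benchmark:
benchmark = {"repetitive behavior", "echolalia", "social deficit", "stimming"}
extracted = {"repetitive behavior", "echolalia", "hand flapping"}
p, r, f1 = evaluate_extraction(extracted, benchmark)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
# prints: precision=0.67 recall=0.50 f1=0.57
```

A tool like CLAMP scoring high precision but lower recall means most terms it extracts are in the benchmark, but it misses many benchmark terms; F1 balances the two.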


Author(s):  
Sourajit Roy ◽  
Pankaj Pathak ◽  
S. Nithya

With the advent of the 21st century, technical breakthroughs and developments took place. Natural Language Processing (NLP) is one of the promising disciplines among them, made increasingly dynamic by groundbreaking findings across computer networks. Because of the digital revolution, the amount of data generated by M2M communication across devices and platforms such as Amazon Alexa, Apple Siri, and Microsoft Cortana has increased significantly. This produces a great deal of unstructured data to be processed that does not fit standard computational models. In addition, the growing problems of language complexity, data variability, and speech ambiguity make implementing models increasingly hard. The current study provides an overview of the potential and breadth of the NLP market and its industry-wide adoption, in particular after Covid-19. It also gives a macroscopic picture of progress in natural language processing research, development, and implementation.

