Researching on Parsing

2013 ◽  
Vol 846-847 ◽  
pp. 1376-1379
Author(s):  
Li Fei Geng ◽  
Hong Lian Li

Syntactic analysis is a core technology of natural language processing and the cornerstone for further linguistic analysis. This paper first introduces the basic grammatical systems and surveys current parsing technology. It then analyzes the characteristics of probabilistic context-free grammars in depth and introduces methods for improving them. Finally, it points out the difficulties of Chinese parsing.
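As background for the discussion of probabilistic context-free grammars: a PCFG attaches a probability to each rule, and a parse tree's probability is the product of the probabilities of the rules it uses. A minimal sketch with an invented toy grammar (not taken from the paper):

```python
# Toy PCFG, invented for illustration: each rule has a probability, and
# the probabilities of all expansions of one left-hand side sum to 1.
PCFG = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("Det", "N")): 0.6,
    ("NP", ("pronoun",)): 0.4,
    ("VP", ("V", "NP")): 1.0,
    ("Det", ("the",)): 1.0,
    ("N", ("dog",)): 0.5,
    ("N", ("cat",)): 0.5,
    ("V", ("chased",)): 1.0,
    ("pronoun", ("she",)): 1.0,
}

def tree_prob(tree):
    """P(tree) = product of the probabilities of the rules it uses.
    A tree is (label, child, ...) where leaves are plain strings."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = PCFG[(label, rhs)]
    for child in children:
        if not isinstance(child, str):
            p *= tree_prob(child)
    return p

# One parse of "she chased the cat":
# uses NP -> pronoun (0.4), NP -> Det N (0.6), N -> cat (0.5),
# so its probability is 0.4 * 0.6 * 0.5 = 0.12.
TREE = ("S",
        ("NP", ("pronoun", "she")),
        ("VP", ("V", "chased"),
               ("NP", ("Det", "the"), ("N", "cat"))))
```

A parser based on such a grammar ranks the competing trees of an ambiguous sentence by this probability and returns the most probable one.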

2018 ◽  
pp. 35-38
Author(s):  
O. Hyryn

The article deals with natural language processing, namely of an English sentence. It describes the problems that might arise during the process, connected with graphic, semantic, and syntactic ambiguity. It recounts how these problems were solved before automatic syntactic analysis was applied and how such analysis methods could be helpful in developing new analysis algorithms. The analysis focuses on the issues underlying natural language processing, namely parsing: the analysis of sentences according to their structure, content, and meaning, which aims to examine the grammatical structure of the sentence, divide it into constituent components, and define the links between them.


2021 ◽  
Author(s):  
Carolinne Roque e Faria ◽  
Cinthyan Renata Sachs Camerlengo de Barb

Technology is becoming notably popular among agribusiness producers and is advancing across the agricultural sector. One of the difficulties in this context is handling data in natural language to solve problems in the field of agriculture. In order to build up dialogues and support research, the present work uses Natural Language Processing (NLP) techniques to develop an automatic and effective computer system that interacts with the user and assists in the identification of pests and diseases in soybean farming, with the knowledge stored in a database repository, providing accurate diagnoses to simplify the work of agricultural professionals and of those who deal with large amounts of information in this area. Information on 108 pests and 19 diseases that damage Brazilian soybean was collected from Brazilian bibliographic manuals with the purpose of optimizing the data and improving production. The spaCy library was used for the syntactic analysis stage of NLP, which made it possible to pre-process the texts, recognize named entities, calculate the similarity between words, and verify dependency parsing, and which also supported the development requirements of the CAROLINA tool (Robotized Agronomic Conversation in Natural Language) using the language of the agricultural domain.
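The word-similarity step mentioned above reduces, in spaCy as elsewhere, to cosine similarity between embedding vectors. A minimal pure-Python sketch with invented 3-dimensional toy vectors (real spaCy embeddings, e.g. from `pt_core_news_md`, have hundreds of dimensions):

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Invented toy vectors for three agricultural terms; the numbers are
# illustrative only, chosen so that the two pest words point the same way.
VEC = {
    "lagarta": [0.90, 0.10, 0.20],      # caterpillar (a soybean pest)
    "caterpillar": [0.85, 0.15, 0.25],
    "soja": [0.10, 0.90, 0.40],         # soybean (the crop itself)
}
```

With vectors like these, a dialogue system can match a user's wording against known pest and disease names even when the surface forms differ.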


Author(s):  
Alfonso Ortega ◽  
Emilio del Rosal ◽  
Diana Pérez ◽  
Robert Mercaş ◽  
Alexander Perekrestenko ◽  
...  

Author(s):  
John Carroll

This article introduces the concepts and techniques of natural language (NL) parsing, which signifies using a grammar to assign a syntactic analysis to a string of words, to a lattice of word hypotheses output by a speech recognizer, or to similar input. The level of detail required depends on the language processing task being performed and the particular approach to the task that is being pursued. The article further describes approaches that produce ‘shallow’ analyses. It also outlines approaches to parsing that analyse the input in terms of labelled dependencies between words. Producing hierarchical phrase structure requires grammars that have at least context-free (CF) power. CF algorithms that are widely used in parsing of NL are described in this article. To support detailed semantic interpretation, more powerful grammar formalisms are required, but these are usually parsed using extensions of CF parsing algorithms. Furthermore, the article describes unification-based parsing. Finally, it discusses three important issues that have to be tackled in real-world applications of parsing: evaluation of parser accuracy, parser efficiency, and measurement of grammar/parser coverage.
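The widely used CF algorithms the article refers to share a chart-based core. As a hedged illustration (the toy grammar is invented here, not taken from the article), a minimal CYK recognizer for a grammar in Chomsky normal form:

```python
from collections import defaultdict

def cyk(words, rules, lexicon, start="S"):
    """CYK recognition for a grammar in Chomsky normal form (CNF).
    rules: binary rules (A, B, C) meaning A -> B C.
    lexicon: word -> set of preterminal symbols."""
    n = len(words)
    table = defaultdict(set)  # table[(i, j)]: nonterminals spanning words[i:j]
    for i, w in enumerate(words):
        table[(i, i + 1)] = set(lexicon.get(w, ()))
    for span in range(2, n + 1):          # widen spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):     # try every split point
                for A, B, C in rules:
                    if B in table[(i, k)] and C in table[(k, j)]:
                        table[(i, j)].add(A)
    return start in table[(0, n)]

# Toy CNF grammar and lexicon, invented for illustration.
RULES = [("S", "NP", "VP"), ("VP", "V", "NP"), ("NP", "Det", "N")]
LEXICON = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "chased": {"V"}}
```

The cubic-time chart is the same skeleton that the extensions for more powerful formalisms build on.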


2013 ◽  
Vol 21 (2) ◽  
pp. 167-200 ◽  
Author(s):  
SEBASTIAN PADÓ ◽  
TAE-GIL NOH ◽  
ASHER STERN ◽  
RUI WANG ◽  
ROBERTO ZANOLI

A key challenge at the core of many Natural Language Processing (NLP) tasks is the ability to determine which conclusions can be inferred from a given natural language text. This problem, called the Recognition of Textual Entailment (RTE), has initiated the development of a range of algorithms, methods, and technologies. Unfortunately, research on Textual Entailment (TE), like semantics research more generally, is fragmented into studies focussing on various aspects of semantics such as world knowledge, lexical and syntactic relations, or more specialized kinds of inference. This fragmentation has problematic practical consequences. Notably, interoperability among the existing RTE systems is poor, and reuse of resources and algorithms is mostly infeasible. This also makes systematic evaluations very difficult to carry out. Finally, textual entailment presents a wide array of approaches to potential end users with little guidance on which to pick. Our contribution to this situation is the novel EXCITEMENT architecture, which was developed to enable and encourage the consolidation of methods and resources in the textual entailment area. It decomposes RTE into components with strongly typed interfaces. We specify (a) a modular linguistic analysis pipeline and (b) a decomposition of the ‘core’ RTE methods into top-level algorithms and subcomponents. We identify four major subcomponent types, including knowledge bases and alignment methods. The architecture was developed with a focus on generality, supporting all major approaches to RTE and encouraging language independence. We illustrate the feasibility of the architecture by constructing mappings of major existing systems onto the architecture. The practical implementation of this architecture forms the EXCITEMENT open platform. It is a suite of textual entailment algorithms and components which contains the three systems named above, including linguistic-analysis pipelines for three languages (English, German, and Italian), and comprises a number of linguistic resources. By addressing the problems outlined above, the platform provides a comprehensive and flexible basis for research and experimentation in textual entailment and is available as open source software under the GNU General Public License.
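The decomposition into strongly typed components can be made concrete with a small interface sketch. Every class and method name below is invented for illustration and is not the platform's actual API:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class Pair:
    text: str        # the text T
    hypothesis: str  # the hypothesis H

class AnalysisPipeline(ABC):
    """Modular linguistic-analysis stage (tokenization, parsing, ...)."""
    @abstractmethod
    def annotate(self, pair: Pair) -> Pair: ...

class AlignmentComponent(ABC):
    """One of the typed subcomponents: aligns parts of T and H."""
    @abstractmethod
    def align(self, pair: Pair) -> list: ...

class EntailmentAlgorithm(ABC):
    """Top-level 'core' RTE method behind a strongly typed interface."""
    @abstractmethod
    def decide(self, pair: Pair) -> str: ...

class WordOverlapAlgorithm(EntailmentAlgorithm):
    """Trivial baseline: decide entailment by word overlap of H with T."""
    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold

    def decide(self, pair: Pair) -> str:
        t = set(pair.text.lower().split())
        h = set(pair.hypothesis.lower().split())
        overlap = len(t & h) / len(h)
        return "ENTAILMENT" if overlap >= self.threshold else "NONENTAILMENT"
```

The point of such interfaces is that pipelines, alignment methods, and core algorithms from different systems can be swapped independently, which is exactly the interoperability the architecture aims at.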


2018 ◽  
Vol 24 (3) ◽  
pp. 393-413 ◽  
Author(s):  
STELLA FRANK ◽  
DESMOND ELLIOTT ◽  
LUCIA SPECIA

Two studies on multilingual multimodal image description provide empirical evidence towards two questions at the core of the task: (i) whether target language speakers prefer descriptions generated directly in their native language, as compared to descriptions translated from a different language; (ii) whether images improve human translation of descriptions. These results provide guidance for future work in multimodal natural language processing by first showing that on the whole, translations are not distinguished from native language descriptions, and second delineating and quantifying the information gained from the image during the human translation task.


2020 ◽  
pp. 41-45
Author(s):  
O. Hyryn

The article proceeds from the intended use of parsing for the purposes of automatic information search, question answering, logical inference, authorship verification, text authenticity verification, grammar checking, natural language synthesis and other related tasks, such as ungrammatical speech analysis, morphological class definition, anaphora resolution etc. The study covers natural language processing challenges, namely those of an English sentence. The article describes formal and linguistic problems that might arise during the process, connected with graphic, semantic, and syntactic ambiguity. It recounts how these problems were solved before automatic syntactic analysis was applied and how such analysis methods could be helpful in developing new analysis algorithms today. The analysis focuses on the issues underlying natural language processing, namely parsing: the analysis of sentences according to their structure, content and meaning, which aims to examine the grammatical structure of the sentence, divide it into constituent components and define the links between them. The analysis identifies a number of linguistic issues that will contribute to the development of an improved model of automatic syntactic analysis: lexical and grammatical synonymy and homonymy, hypo- and hyperonymy, lexical and semantic fields, anaphora resolution, ellipsis, inversion etc. The scope of natural language processing reveals obvious directions for the improvement of parsing models. Such improvement will consequently expand the scope and improve the results in areas that already employ automatic parsing. Indispensable achievements in vocabulary and morphology processing shall not be neglected while improving automatic syntactic analysis mechanisms for natural languages.
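The structural ambiguity discussed above can be made concrete by counting parses. A CYK-style dynamic program over an invented toy grammar assigns two analyses to the classic PP-attachment sentence, one attaching the prepositional phrase to the verb phrase and one to the noun phrase:

```python
from collections import defaultdict

def count_parses(words, rules, lexicon, start="S"):
    """Count distinct parse trees with a CYK-style dynamic program.
    rules: binary rules (A, B, C) meaning A -> B C (Chomsky normal form)."""
    n = len(words)
    chart = defaultdict(lambda: defaultdict(int))  # chart[(i, j)][A] = #trees
    for i, w in enumerate(words):
        for tag in lexicon.get(w, ()):
            chart[(i, i + 1)][tag] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for A, B, C in rules:
                    # every tree for B left of every tree for C yields a tree for A
                    chart[(i, j)][A] += chart[(i, k)][B] * chart[(k, j)][C]
    return chart[(0, n)][start]

# Invented toy grammar; "I" is tagged NP directly to stay in CNF.
PP_RULES = [("S", "NP", "VP"), ("VP", "V", "NP"), ("VP", "VP", "PP"),
            ("NP", "NP", "PP"), ("NP", "Det", "N"), ("PP", "P", "NP")]
PP_LEXICON = {"I": {"NP"}, "saw": {"V"}, "the": {"Det"},
              "man": {"N"}, "telescope": {"N"}, "with": {"P"}}
```

Resolving which of the counted analyses is intended is precisely where the lexical and semantic knowledge listed in the abstract comes into play.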


Author(s):  
JungHo Jeon ◽  
Xin Xu ◽  
Yuxi Zhang ◽  
Liu Yang ◽  
Hubo Cai

Construction inspection is an essential component of the quality assurance programs of state transportation agencies (STAs), and the guidelines for this process reside in lengthy textual specifications. In the current practice, engineers and inspectors must manually go through these documents to plan, conduct, and document their inspections, which is time-consuming, very subjective, inconsistent, and prone to error. A promising alternative to this manual process is the application of natural language processing (NLP) techniques (e.g., text parsing, sentence classification, and syntactic analysis) to automatically extract construction inspection requirements from textual documents and present them as straightforward check questions. This paper introduces an NLP-based method that: 1) extracts individual sentences from the construction specification; 2) preprocesses the resulting sentences; 3) applies Word2Vec and GloVe algorithms to extract vector features; 4) uses a convolutional neural network (CNN) and recurrent neural network to classify sentences; and 5) converts the requirement sentences into check questions via syntactic analysis. The overall methodology was assessed using the Indiana Department of Transportation (DOT) specification as a test case. Our results revealed that the CNN + GloVe combination led to the highest accuracy, at 91.9%, and the lowest loss, at 11.7%. To further validate its use across STAs nationwide, we applied it to the construction specification of the South Carolina DOT as a test case, and our average accuracy was 92.6%.
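Step 5, converting requirement sentences into check questions, can be sketched in a much-simplified form. The paper's method relies on full syntactic analysis; this invented regex covers only the simplest "The <subject> shall <predicate>." pattern:

```python
import re

def to_check_question(sentence: str):
    """Turn a 'shall' requirement into a yes/no check question.
    Returns None when the sentence does not match the simple pattern."""
    m = re.match(r"(?i)\s*the (.+?) shall (.+?)\.?\s*$", sentence)
    if m is None:
        return None
    subject, predicate = m.groups()
    return f"Did the {subject} {predicate}?"
```

For example, "The contractor shall compact the subgrade to 95% density." becomes "Did the contractor compact the subgrade to 95% density?"; sentences outside the pattern are left for fuller syntactic handling.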


Author(s):  
Matthew W. Crocker

Traditional approaches to natural language processing (NLP) can be considered construction-based. That is to say, they employ surface-oriented, language-specific rules, whether in the form of an Augmented Transition Network (ATN), a logic grammar, or some other grammar/parsing formalism. The problems of such approaches have always been apparent: they involve large sets of rules, often ad hoc, and their adequacy with respect to the grammar of the language is difficult to ensure.

