Recognition of spatial data from natural language texts for the purpose of visualization

2021 ◽  
Vol 12 (5-2021) ◽  
pp. 50-56
Author(s):  
Boris M. Pileckiy

This paper describes one possible implementation of the recognition of spatial data in natural language texts. The proposed approach is based on lexico-syntactic analysis of the texts, which requires special grammars and dictionaries. Spatial data are recognized for subsequent geocoding and visualization. The practical implementation of spatial data recognition is done using a freely distributed software tool. The paper also considers some applications of spatial data and presents preliminary recognition results.
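Though the paper does not name its software tool, the recognition-then-geocoding pipeline it describes can be sketched with off-the-shelf components. Below is a minimal sketch assuming spaCy's pretrained named-entity recognizer and geopy's Nominatim geocoder as illustrative stand-ins; the model name, the entity labels, and the sample sentence are assumptions, not the paper's implementation.

```python
# Hypothetical sketch: extract place names from text, then geocode them.
# spaCy's pretrained NER and geopy's Nominatim stand in for the unnamed
# tool described in the paper.
import spacy
from geopy.geocoders import Nominatim

nlp = spacy.load("en_core_web_sm")            # assumed small English pipeline
geocoder = Nominatim(user_agent="spatial-demo")

def extract_and_geocode(text):
    doc = nlp(text)
    results = []
    for ent in doc.ents:
        # GPE = geopolitical entity, LOC = location, FAC = facility
        if ent.label_ in ("GPE", "LOC", "FAC"):
            location = geocoder.geocode(ent.text)
            if location is not None:
                results.append((ent.text, location.latitude, location.longitude))
    return results

print(extract_and_geocode("The expedition left Murmansk and wintered near Arkhangelsk."))
```

The recognized coordinate pairs can then be handed to any mapping layer for the visualization step the paper describes.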

2018 ◽  
pp. 35-38
Author(s):  
O. Hyryn

The article deals with natural language processing, namely the processing of an English sentence. It describes the problems that may arise during the process, which are connected with graphic, semantic, and syntactic ambiguity. The article outlines how these problems were solved before automatic syntactic analysis was applied, and how such analysis methods can help in developing new analysis algorithms. The analysis focuses on the issues underlying natural language processing, in particular parsing: the analysis of sentences according to their structure, content, and meaning, which aims to determine the grammatical structure of a sentence, divide it into its constituent components, and define the links between them.
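As an illustration of the syntactic ambiguity the article discusses, a chart parser over a toy grammar yields two analyses of one sentence, with the prepositional phrase attaching either to the verb or to the noun. NLTK and the toy grammar are assumptions for illustration; the article names no toolkit.

```python
# A minimal sketch of syntactic ambiguity: "I saw the man with the
# telescope" has two parses under this toy grammar, because the PP
# "with the telescope" can modify either the VP or the NP.
import nltk

grammar = nltk.CFG.fromstring("""
S   -> NP VP
NP  -> Pro | Det N | NP PP
VP  -> V NP | VP PP
PP  -> P NP
Pro -> 'I'
Det -> 'the'
N   -> 'man' | 'telescope'
V   -> 'saw'
P   -> 'with'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("I saw the man with the telescope".split()):
    tree.pretty_print()   # prints both parse trees
```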


2013 ◽  
Vol 48 ◽  
pp. 1-22 ◽  
Author(s):  
M. Alabbas ◽  
A. Ramsay

Many natural language processing (NLP) applications require the computation of similarities between pairs of syntactic or semantic trees. Many researchers have used tree edit distance for this task, but this technique suffers from the drawback that it deals with single node operations only. We have extended the standard tree edit distance algorithm to deal with subtree transformation operations as well as single nodes. The extended algorithm with subtree operations, TED+ST, is more effective and flexible than the standard algorithm, especially for applications that pay attention to relations among nodes (e.g. in linguistic trees, deleting a modifier subtree should be cheaper than the sum of deleting its components individually). We describe the use of TED+ST for checking entailment between two Arabic text snippets. The preliminary results of using TED+ST were encouraging when compared with two string-based approaches and with the standard algorithm.
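For a concrete baseline, the standard single-node tree edit distance that TED+ST extends is available in the zss implementation of the Zhang-Shasha algorithm. The sketch below contrasts node-level deletion costs with the cheaper subtree deletion TED+ST allows; the trees are invented examples, and zss itself has no subtree operations.

```python
# Standard (single-node) tree edit distance via zss (Zhang-Shasha),
# shown as the baseline that TED+ST extends with subtree operations.
from zss import simple_distance, Node

# Two toy dependency-style trees; the second drops the modifier "black".
t1 = (Node("saw")
        .addkid(Node("man").addkid(Node("the")))
        .addkid(Node("dog").addkid(Node("the")).addkid(Node("black"))))
t2 = (Node("saw")
        .addkid(Node("man").addkid(Node("the")))
        .addkid(Node("dog").addkid(Node("the"))))

# Standard TED charges one deletion per node, so removing an n-node
# modifier subtree costs n; TED+ST would allow a single, cheaper
# subtree-deletion operation instead.
print(simple_distance(t1, t2))   # -> 1.0 (one node differs here)
```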


2021 ◽  
Author(s):  
Carolinne Roque e Faria ◽  
Cinthyan Renata Sachs Camerlengo de Barb

Technology is becoming increasingly popular among agribusiness producers and is advancing in all agricultural areas. One of the difficulties in this context is handling natural language data to solve problems in the field of agriculture. To build up dialogs and provide rich resources for researchers, the present work uses Natural Language Processing (NLP) techniques to develop an automatic and effective computer system that interacts with the user and assists in identifying pests and diseases in soybean farming, stored in a database repository, in order to provide accurate diagnoses and simplify the work of agricultural professionals and of those who deal with large amounts of information in this area. Information on 108 pests and 19 diseases that damage Brazilian soybean was collected from Brazilian bibliographic manuals with the purpose of optimizing the data and improving production. The spaCy library was used for the syntactic analysis of NLP, which made it possible to pre-process the texts, recognize named entities, calculate the similarity between words, and verify dependency parsing, and it also supported the development requirements of the CAROLINA tool (Robotized Agronomic Conversation in Natural Language), using language belonging to the agricultural area.
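A minimal sketch of the spaCy steps the abstract enumerates (pre-processing, named-entity recognition, word similarity, dependency parsing). The Portuguese model name and the sample sentence are assumptions; the CAROLINA tool itself is not reproduced here.

```python
# Sketch of the spaCy analyses listed in the abstract, on an invented
# Portuguese sentence about a soybean pest.
import spacy

nlp = spacy.load("pt_core_news_md")   # assumed medium model (has word vectors)

doc = nlp("A lagarta-da-soja ataca as folhas da soja no Paraná.")

for ent in doc.ents:                  # named-entity recognition
    print(ent.text, ent.label_)

for token in doc:                     # dependency parsing
    print(token.text, token.dep_, token.head.text)

# Word similarity (requires a model with vectors, hence the _md model)
soja, folha = nlp("soja folha")
print(soja.similarity(folha))
```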


Author(s):  
John Carroll

This article introduces the concepts and techniques of natural language (NL) parsing, which signifies using a grammar to assign a syntactic analysis to a string of words, a lattice of word hypotheses output by a speech recognizer, or similar input. The level of detail required depends on the language processing task being performed and the particular approach to the task that is being pursued. The article further describes approaches that produce 'shallow' analyses. It also outlines approaches to parsing that analyse the input in terms of labelled dependencies between words. Producing hierarchical phrase structure requires grammars that have at least context-free (CF) power. CF algorithms that are widely used in NL parsing are described in this article. To support detailed semantic interpretation, more powerful grammar formalisms are required, but these are usually parsed using extensions of CF parsing algorithms. Furthermore, the article describes unification-based parsing. Finally, it discusses three important issues that have to be tackled in real-world applications of parsing: evaluation of parser accuracy, parser efficiency, and measurement of grammar/parser coverage.
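As a concrete instance of the widely used CF algorithms the article surveys, here is a compact CYK recognizer over a toy grammar in Chomsky normal form; the grammar and sentence are assumptions for illustration.

```python
# CYK recognition: table[i][j] holds every nonterminal that derives
# the span words[i..j]; the sentence is grammatical iff S covers it all.
from itertools import product

binary = {("NP", "VP"): {"S"}, ("Det", "N"): {"NP"}, ("V", "NP"): {"VP"}}
lexical = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "chased": {"V"}}

def cyk(words):
    n = len(words)
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):                 # length-1 spans
        table[i][i] = set(lexical.get(w, ()))
    for span in range(2, n + 1):                  # longer spans, bottom-up
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):                 # every split point
                for a, b in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= binary.get((a, b), set())
    return "S" in table[0][n - 1]

print(cyk("the dog chased the cat".split()))     # -> True
```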


2007 ◽  
Vol 15 (3) ◽  
pp. 199-213 ◽  
Author(s):  
Arthur C. Graesser ◽  
Moongee Jeon ◽  
Yan Yan ◽  
Zhiqiang Cai

Discourse cohesion is presumably an important facilitator of comprehension when individuals read texts and hold conversations. This study investigated components of cohesion and language in different types of discourse about Newtonian physics: a textbook, textoids written by experimental psychologists, naturalistic tutorial dialogue between expert human tutors and college students, and AutoTutor tutorial dialogue between a computer tutor and students (AutoTutor is an animated pedagogical agent that helps students learn about physics by holding conversations in natural language). We analyzed the four types of discourse with Coh-Metrix, a software tool that measures discourse on different components of cohesion, language, and readability. The cohesion indices included co-reference, syntactic and semantic similarity, causal cohesion, incidence of cohesion signals (e.g., connectives, logical operators), and many other measures. Cohesion data were quite similar for the two forms of discourse in expository monologue (textbooks and textoids) and for the two types of tutorial dialogue (i.e., students interacting with human tutors and AutoTutor), but very different between the discourse of expository monologue and tutorial dialogue. Coh-Metrix was also able to detect subtle differences in the language and discourse of AutoTutor versus human tutoring.
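To make the kind of index Coh-Metrix computes concrete, the sketch below implements a deliberately crude analogue of one cohesion measure: noun overlap between adjacent sentences. spaCy is an assumed dependency, and this toy metric is far simpler than Coh-Metrix's actual co-reference indices.

```python
# Toy cohesion index: fraction of adjacent sentence pairs that share at
# least one noun lemma. A rough analogue of a co-reference cohesion score.
import spacy

nlp = spacy.load("en_core_web_sm")

def noun_overlap_cohesion(text):
    sents = list(nlp(text).sents)
    overlaps = []
    for prev, cur in zip(sents, sents[1:]):
        prev_nouns = {t.lemma_ for t in prev if t.pos_ in ("NOUN", "PROPN")}
        cur_nouns = {t.lemma_ for t in cur if t.pos_ in ("NOUN", "PROPN")}
        overlaps.append(1.0 if prev_nouns & cur_nouns else 0.0)
    return sum(overlaps) / len(overlaps) if overlaps else 0.0

text = "A force acts on the ball. The ball accelerates. Acceleration requires force."
print(noun_overlap_cohesion(text))   # 1.0: every adjacent pair shares a noun
```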


2013 ◽  
Vol 21 (2) ◽  
pp. 167-200 ◽  
Author(s):  
SEBASTIAN PADÓ ◽  
TAE-GIL NOH ◽  
ASHER STERN ◽  
RUI WANG ◽  
ROBERTO ZANOLI

A key challenge at the core of many Natural Language Processing (NLP) tasks is the ability to determine which conclusions can be inferred from a given natural language text. This problem, called the Recognition of Textual Entailment (RTE), has initiated the development of a range of algorithms, methods, and technologies. Unfortunately, research on Textual Entailment (TE), like semantics research more generally, is fragmented into studies focussing on various aspects of semantics such as world knowledge, lexical and syntactic relations, or more specialized kinds of inference. This fragmentation has problematic practical consequences. Notably, interoperability among the existing RTE systems is poor, and reuse of resources and algorithms is mostly infeasible. This also makes systematic evaluations very difficult to carry out. Finally, textual entailment presents a wide array of approaches to potential end users with little guidance on which to pick. Our contribution to this situation is the novel EXCITEMENT architecture, which was developed to enable and encourage the consolidation of methods and resources in the textual entailment area. It decomposes RTE into components with strongly typed interfaces. We specify (a) a modular linguistic analysis pipeline and (b) a decomposition of the 'core' RTE methods into top-level algorithms and subcomponents. We identify four major subcomponent types, including knowledge bases and alignment methods. The architecture was developed with a focus on generality, supporting all major approaches to RTE and encouraging language independence. We illustrate the feasibility of the architecture by constructing mappings of major existing systems onto the architecture. The practical implementation of this architecture forms the EXCITEMENT open platform. It is a suite of textual entailment algorithms and components which contains the three systems named above, including linguistic-analysis pipelines for three languages (English, German, and Italian), and comprises a number of linguistic resources. By addressing the problems outlined above, the platform provides a comprehensive and flexible basis for research and experimentation in textual entailment and is available as open source software under the GNU General Public License.
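The decomposition into strongly typed component interfaces can be sketched as follows. The class and method names are illustrative assumptions; the actual EXCITEMENT platform is implemented in Java with its own interface definitions.

```python
# Illustrative sketch of an RTE architecture decomposed into components
# with typed interfaces, in the spirit of EXCITEMENT (names invented).
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class TEPair:
    text: str          # the text T
    hypothesis: str    # the hypothesis H to be checked against T

@dataclass
class TEDecision:
    entailment: bool
    confidence: float

class LinguisticAnnotator(ABC):
    """One stage of the modular analysis pipeline (tokenizer, parser, NER...)."""
    @abstractmethod
    def annotate(self, pair: TEPair) -> TEPair: ...

class KnowledgeBase(ABC):
    """Subcomponent type: lexical/syntactic entailment rules for a term."""
    @abstractmethod
    def rules_for(self, term: str) -> list[str]: ...

class EntailmentAlgorithm(ABC):
    """Top-level 'core' RTE method, composed from subcomponents."""
    @abstractmethod
    def decide(self, pair: TEPair) -> TEDecision: ...
```

Because each component type has a fixed interface, pipelines, knowledge bases, and core algorithms can be swapped independently, which is the interoperability the abstract argues for.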


2003 ◽  
Vol 2 (2) ◽  
pp. 126-139 ◽  
Author(s):  
Sanjay Rana ◽  
Jason Dykes

Animated sequences of raster images that represent continuously varying surfaces, such as a temporal series of an evolving landform or an attribute series of socio-economic variation, are often used in an attempt to gain insight from ordered sequences of raster spatial data. Despite their aesthetic appeal and condensed nature, such representations are limited in terms of their suitability for prompting ideas and offering insight due to their poor information delivery and the lack of the levels of interactivity that are required to support visualization. Cartographic techniques aim to assist users of geographic information through processes of abstraction, by selecting, simplifying, smoothing and exaggerating when representing an underlying spatial data set graphically. Here we suggest a number of transformations and abstractions that take advantage of these techniques in a specific context: that of addressing the limitations associated with using animated raster surfaces for visualization, and we propose them within a framework that can be used to inform practice. The five techniques proposed are spatial and attribute smoothing, temporal interpolation, transformation of the surfaces into a network of morphometric features, the use of a graphic lag or fading, and the employment of techniques for conditional interactivity that are appropriate for visualization. These efforts allow us to generate graphical environments that support visualization when using animated sequences of images representing continuous surfaces, and are analogous to traditional cartographic techniques, namely smoothing and exaggeration, simplification, enhancement and the various issues of design. By developing a framework for considering cartography in support of visualization from this particular type of data and phenomenon, we aim to highlight the utility of a generically cartographic approach to information visualization. A number of particular techniques originating from computer science and conventional cartography are used in an application of the framework. A suitably interactive software tool is offered for evaluation, to establish the results of applying the framework and to demonstrate ways in which we may augment the visualization of dynamic raster surfaces through animation and, more generally, aim to offer opportunity for insight through cartographic design.
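Two of the five proposed techniques, spatial smoothing and temporal interpolation, can be sketched directly over a raster series. NumPy/SciPy and the parameter choices are assumptions; the paper's own interactive tool is not reproduced.

```python
# Sketch of two of the five techniques on a raster time series:
# Gaussian spatial smoothing of each frame, then linear temporal
# interpolation to insert in-between frames for smoother animation.
import numpy as np
from scipy.ndimage import gaussian_filter

def smooth_frames(frames, sigma=2.0):
    """Spatial smoothing of each frame (cartographic 'smoothing')."""
    return [gaussian_filter(f, sigma=sigma) for f in frames]

def interpolate_frames(frames, steps=4):
    """Insert `steps` linearly interpolated frames between each pair."""
    out = []
    for a, b in zip(frames, frames[1:]):
        for t in np.linspace(0.0, 1.0, steps, endpoint=False):
            out.append((1.0 - t) * a + t * b)
    out.append(frames[-1])
    return out

frames = [np.random.rand(64, 64) for _ in range(3)]   # stand-in raster series
animation = interpolate_frames(smooth_frames(frames))
print(len(animation))   # 3 input frames -> 9 animation frames
```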

