Multi-word term variation

Author(s):  
Melania Cabezas-García ◽  
Santiago Chambó

Abstract Complex nominals (CNs) are frequently found in specialized discourse in all languages, since they are a productive method of creating terms by combining existing lexical units. In Spanish, a conceptual combination may often be rendered with a prepositional CN (PCN) or an equivalent adjectival CN (ACN), e.g., demanda de electricidad vs. demanda eléctrica [electricity demand]. Adjectives in ACNs – usually derived from nouns – are known as ‘relational adjectives’ because they encode semantic relations with other concepts. With recent exceptions, research has focused on the underlying semantic relations in CNs. In natural language processing, several works have dealt with the automatic detection of relational adjectives in Romance and Germanic languages. However, to our knowledge, there are no discourse studies of these CNs aimed at establishing recommendations for writers. This study analyzed the co-text of equivalent PCNs and ACNs to identify the factors governing the use of each form. EcoLexicon ES, a corpus of specialized Spanish environmental texts, was used to extract 6 relational adjectives and, subsequently, a set of 12 pairs of equivalent CNs. Their behavior in co-text was analyzed by querying EcoLexicon ES and a general language corpus with 20 expressions in CQP-syntax. Our results showed that the immediate linguistic co-text determined the preference for a particular structure. Based on these findings, we provide writing guidelines to assist in the production of CNs.
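To make the corpus-query step in the abstract above more concrete, here is a minimal Python sketch that builds CQP-style query strings for a PCN and its ACN counterpart. The helper names `pcn_query` and `acn_query` are hypothetical, and the attribute names `lemma` and `word` are assumptions about the corpus annotation; the 20 expressions actually used in the study are not given in the abstract and are not reproduced here.

```python
def pcn_query(head: str, modifier_noun: str, prep: str = "de") -> str:
    """Build a CQP-style query for a prepositional complex nominal,
    e.g. 'demanda de electricidad'. Attribute names are assumed."""
    return f'[lemma="{head}"] [word="{prep}"] [lemma="{modifier_noun}"]'


def acn_query(head: str, adjective_lemma: str) -> str:
    """Build a CQP-style query for an adjectival complex nominal,
    e.g. 'demanda eléctrica' (adjective given as a lemma)."""
    return f'[lemma="{head}"] [lemma="{adjective_lemma}"]'


if __name__ == "__main__":
    print(pcn_query("demanda", "electricidad"))
    print(acn_query("demanda", "eléctrico"))
```

Querying by lemma rather than by word form is one possible design choice: it would let a single expression match singular and plural variants, provided the corpus is lemmatized.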

2020 ◽  
Author(s):  
Vadim V. Korolev ◽  
Artem Mitrofanov ◽  
Kirill Karpov ◽  
Valery Tkachenko

The main advantage of modern natural language processing methods is their ability to turn an amorphous, human-readable task into a strict mathematical form. This makes it possible to extract chemical data and insights from articles and to find new semantic relations. We propose a universal engine for processing chemical and biological texts. We successfully tested it on various use cases and applied it to the search for a therapeutic agent for COVID-19 by analyzing the PubMed archive.


Terminology ◽  
2007 ◽  
Vol 13 (2) ◽  
pp. 201-223 ◽  
Author(s):  
Jeanne Eugenie Dancette

Understanding the semantic relations between terms in specialized texts is of critical importance in translation and terminology, and more generally in learning from texts. Our research highlights the advantages of formalizing them in order to build hierarchies and sets of horizontal conceptual relations (i.e. process-oriented relations) for knowledge acquisition. This paper discusses a method for extracting domain-specific semantic relations in specialized texts. Some texts are more appropriate than others in this regard. ‘Knowledge-rich’ texts such as encyclopaedias and textbooks are considered good materials because of the density and richness of their thematic information. For this reason, we used the encyclopaedic articles of the Dictionnaire analytique de la distribution/Analytical Dictionary of Retailing. We retrieved over 3000 terms semantically related to all 350 headwords of the Dictionary, and grouped them into 28 classes of relations (paradigmatic, i.e. generic, specific, agent, goal, instrument, recipient, location, etc., and also syntagmatic, such as related verbs and adjectives). This paper discusses in particular the generic, agent and property relations and examines the linguistic markers that permit their retrieval.


2018 ◽  
Vol 25 (6) ◽  
pp. 726-733
Author(s):  
Maria S. Karyaeva ◽  
Pavel I. Braslavski ◽  
Valery A. Sokolov

The ability to identify semantic relations between words has made the word2vec model widely used in NLP tasks. The idea of word2vec is based on a simple rule: two words are more similar if they occur in similar contexts. Each word can be represented as a vector, so words whose vectors lie closest together can be interpreted as similar. This makes it possible to establish semantic relations (synonymy, hypernymy, hyponymy and others) by automatic extraction. Extracting semantic relations by hand is considered a time-consuming and biased task that requires the help of experts. Unfortunately, the word2vec model provides an associative list of words that does not consist of related words only. In this paper, we show some additional criteria that may help to solve this problem. Observations and experiments with well-known characteristics, such as word frequency and position in the associative list, might be useful for improving the extraction of semantic relations for the Russian language using word embeddings. In the experiments, we use a word2vec model trained on Flibusta, with word pairs from Wiktionary serving as examples of semantic relationships. Semantically related words are applicable to thesauri, ontologies and intelligent systems for natural language processing.
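The idea that nearby vectors indicate similar words can be illustrated with cosine similarity. The sketch below is not the authors' model: the three-dimensional vectors are invented toy values, and `cosine` and `nearest` are hypothetical helper names. It only shows how an "associative list" can be produced by ranking neighbours.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(word, vectors, k=2):
    """Return the k words most similar to `word`, best first."""
    target = vectors[word]
    scored = [(other, cosine(target, vec))
              for other, vec in vectors.items() if other != word]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors, invented for illustration only.
toy_vectors = {
    "river": [0.9, 0.1, 0.0],
    "stream": [0.8, 0.2, 0.1],
    "bank": [0.5, 0.5, 0.2],
    "invoice": [0.0, 0.9, 0.4],
}

if __name__ == "__main__":
    for neighbour, score in nearest("river", toy_vectors):
        print(f"{neighbour}\t{score:.3f}")
```

The position of a word in this ranked list corresponds to the "position in an associative list" criterion mentioned in the abstract; a frequency filter could be applied to the same list as a further step.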


Author(s):  
Kaan Ant ◽  
Ugur Sogukpinar ◽  
Mehmet Fatif Amasyali

The use of databases containing semantic relationships between words is becoming increasingly widespread as a way to make natural language processing more effective. In contrast to the bag-of-words approach, the proposed semantic spaces give the distances between words, but they do not express relation types. This study shows how semantic spaces can be used to find the type of relationship and compares this approach with the template method. According to results obtained on a very large scale, semantic spaces are more successful for the is_a and opposite relations, while the template approach is more successful for the relation types at_location, made_of and non relational.
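For readers unfamiliar with the template method mentioned above, the following sketch shows the general idea with regular expressions. The English patterns and the name `extract_relations` are invented for illustration; the templates and language used in the study are not given in the abstract.

```python
import re

# Hypothetical surface templates, one per relation type named in the abstract.
TEMPLATES = {
    "is_a": re.compile(r"\b(\w+) is a (?:kind|type) of (\w+)"),
    "made_of": re.compile(r"\b(\w+) is made of (\w+)"),
    "at_location": re.compile(r"\b(\w+) is (?:located|found) in (\w+)"),
}


def extract_relations(text):
    """Return (term, relation, term) triples matched by any template."""
    found = []
    lowered = text.lower()
    for relation, pattern in TEMPLATES.items():
        for match in pattern.finditer(lowered):
            found.append((match.group(1), relation, match.group(2)))
    return found


if __name__ == "__main__":
    sample = "A sparrow is a kind of bird. The table is made of wood."
    for triple in extract_relations(sample):
        print(triple)
```

Templates of this kind name the relation type directly, which a distance in a semantic space does not; this is the contrast the abstract draws between the two approaches.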


2016 ◽  
Vol 6 (3) ◽  
pp. 258
Author(s):  
Gabriela Mariel Zunino

To promote the practical application of psycholinguistic data in educational fields, and expecting that this transfer will advance both pedagogy and research in experimental psycholinguistics, we present two experiments that analyse the production of semantic relations in discourse, especially the causality/countercausality dimension. We found that the pattern of causal advantage holds consistently across subjects with different levels of formal education, so it could be a suitable scaffold for developing other aspects of discourse comprehension and production. We compare our results with previous findings about discourse comprehension and interpret the data in the framework of educational processes. Using empirical evidence about language processing in educational fields makes it possible not only to review specific issues such as the characteristics of teaching materials, but also to improve the educational process in a comprehensive way, so that different approaches can be adapted to populations with different characteristics.


10.2196/20443 ◽  
2020 ◽  
Vol 22 (7) ◽  
pp. e20443
Author(s):  
Xiaoying Li ◽  
Xin Lin ◽  
Huiling Ren ◽  
Jinjing Guo

Background Licensed drugs may cause unexpected adverse reactions in patients, resulting in morbidity, risk of mortality, therapy disruptions, and prolonged hospital stays. Officially approved drug package inserts list the adverse reactions identified from randomized controlled clinical trials with high evidence levels and worldwide postmarketing surveillance. Formal representation of the adverse drug reaction (ADR) enclosed in semistructured package inserts will enable deep recognition of side effects and rational drug use, substantially reduce morbidity, and decrease societal costs. Objective This paper aims to present an ontological organization of traceable ADR information extracted from licensed package inserts. In addition, it will provide machine-understandable knowledge for bioinformatics analysis, semantic retrieval, and intelligent clinical applications. Methods Based on the essential content of package inserts, a generic ADR ontology model is proposed from two dimensions (and nine subdimensions), covering the ADR information and medication instructions. This is followed by a customized natural language processing method programmed with Python to retrieve the relevant information enclosed in package inserts. After the biocuration and identification of retrieved data from the package insert, an ADR ontology is automatically built for further bioinformatic analysis. Results We collected 165 package inserts of quinolone drugs from the National Medical Products Administration and other drug databases in China, and built a specialized ADR ontology containing 2879 classes and 15,711 semantic relations. For each quinolone drug, the reported ADR information and medication instructions have been logically represented and formally organized in an ADR ontology. To demonstrate its usage, the source data were further bioinformatically analyzed. For example, the number of drug-ADR triples and major ADRs associated with each active ingredient were recorded. 
The 10 ADRs most frequently observed among quinolones were identified and categorized based on the 18 categories defined in the proposal. The occurrence frequency, severity, and ADR mitigation method explicitly stated in package inserts were also analyzed, as well as the top 5 specific populations with contraindications for quinolone drugs. Conclusions Ontological representation and organization using officially approved information from drug package inserts enables the identification and bioinformatic analysis of adverse reactions caused by a specific drug with regard to predefined ADR ontology classes and semantic relations. The resulting ontology-based ADR knowledge source classifies drug-specific adverse reactions, and supports a better understanding of ADRs and safer prescription of medications.
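The counts described in this abstract (drug-ADR triples per active ingredient, most frequent ADRs) can be sketched as simple aggregations over subject-predicate-object triples. Everything below is an assumption made for illustration: the predicate names, the placeholder drugs `drug_A` and `drug_B`, and the reactions are not taken from the ontology described in the paper.

```python
from collections import Counter

# Placeholder triples; names are invented, not data from the study.
triples = [
    ("drug_A", "has_adverse_reaction", "nausea"),
    ("drug_A", "has_adverse_reaction", "headache"),
    ("drug_B", "has_adverse_reaction", "nausea"),
    ("drug_B", "has_contraindication", "pregnancy"),
]


def count_adr_triples(triples, predicate="has_adverse_reaction"):
    """Number of drug-ADR triples per drug (the triple's subject)."""
    return Counter(subj for subj, pred, _ in triples if pred == predicate)


def most_frequent_adrs(triples, n=10, predicate="has_adverse_reaction"):
    """The n adverse reactions reported for the largest number of triples."""
    counts = Counter(obj for _, pred, obj in triples if pred == predicate)
    return counts.most_common(n)


if __name__ == "__main__":
    print(count_adr_triples(triples))
    print(most_frequent_adrs(triples, n=1))
```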


2015 ◽  
Vol 21 (5) ◽  
pp. 661-664
Author(s):  
ZORNITSA KOZAREVA ◽  
VIVI NASTASE ◽  
RADA MIHALCEA

Graph structures naturally model connections. In natural language processing (NLP) connections are ubiquitous, on anything between small and web scale. We find them between words – as grammatical, collocation or semantic relations – contributing to the overall meaning, and maintaining the cohesive structure of the text and the discourse unity. We find them between concepts in ontologies or other knowledge repositories – since the early ages of artificial intelligence, associative or semantic networks have been proposed and used as knowledge stores, because they naturally capture the language units and relations between them, and allow for a variety of inference and reasoning processes, simulating some of the functionalities of the human mind. We find them between complete texts or web pages, and between entities in a social network, where they model relations at the web scale. Beyond the more often encountered ‘regular’ graphs, hypergraphs have also appeared in our field to model relations between more than two units.
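As a small illustration of the semantic networks and hypergraphs mentioned above, the sketch below stores is_a links as an adjacency mapping, follows them to infer indirect hypernyms, and represents a relation over more than two units as a set. The concepts, the `ancestors` and `hyperedges_with` helpers, and the hyperedge are invented examples, not material from the editorial.

```python
from collections import deque

# Directed is_a edges of a toy semantic network (invented example).
is_a = {
    "sparrow": ["bird"],
    "bird": ["animal"],
    "animal": [],
}

# A hyperedge links more than two units at once (invented example).
hyperedges = [frozenset({"give", "giver", "gift", "recipient"})]


def ancestors(node, graph):
    """All concepts reachable from `node` via is_a, nearest first."""
    seen = []
    queue = deque(graph.get(node, []))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.append(current)
            queue.extend(graph.get(current, []))
    return seen


def hyperedges_with(unit, edges):
    """Hyperedges in which `unit` participates."""
    return [edge for edge in edges if unit in edge]


if __name__ == "__main__":
    print(ancestors("sparrow", is_a))
    print(hyperedges_with("gift", hyperedges))
```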


2018 ◽  
Vol 46 (2) ◽  
pp. 120-126 ◽  
Author(s):  
Shutian Ma ◽  
Yingyi Zhang ◽  
Chengzhi Zhang

Purpose The purpose of this paper is to classify Chinese word semantic relations into synonyms, antonyms, hyponyms and meronyms. Design/methodology/approach Four simple methods are applied: ontology-based, dictionary-based, pattern-based and morpho-syntactic. The authors use search engines to build lexical and semantic resources for the dictionary-based and pattern-based methods. To improve classification performance with more external resources, they also classify the given word pairs in Chinese and in English at the same time by using machine translation. Findings Experimental results show that the approach achieved an average F1 score of 50.87 per cent, an average accuracy of 70.36 per cent and an average recall of 40.05 per cent over all classification tasks. Synonym and antonym classification achieved high accuracy, i.e. above 90 per cent. Moreover, the dictionary-based and pattern-based approaches work effectively on the final data set. Originality/value For many natural language processing (NLP) tasks, such as information extraction and knowledge graph generation, distinguishing word semantic relations can help to improve system performance. Currently, common methods for this task rely on large corpora for training or on dictionaries and thesauri for inference, and are limited by the lack of free access to data and the difficulty of keeping lexical resources up to date. This paper builds a preliminary system for classifying Chinese word semantic relations by seeking new ways to obtain external resources efficiently.
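The scores in the Findings are averages over several classification tasks. As a reminder of how such figures are typically computed, here is a minimal sketch of the standard precision, recall and F1 definitions, plus a per-task (macro) average of F1. The function names and the counts in the example are hypothetical and do not reproduce the paper's data, and the sketch makes no claim about how the paper's "accuracy" figure was derived.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard definitions from true/false positives and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def macro_f1(task_counts):
    """Average of per-task F1 scores; each item is a (tp, fp, fn) tuple."""
    scores = [precision_recall_f1(*counts)[2] for counts in task_counts]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical counts for two tasks, for illustration only.
    tasks = [(8, 2, 8), (9, 1, 1)]
    for counts in tasks:
        print(precision_recall_f1(*counts))
    print(macro_f1(tasks))
```

Note that an average of per-task F1 scores is generally not equal to the F1 computed from averaged precision and recall, so reported averages need not satisfy the F1 formula exactly.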


2019 ◽  
Vol 5 (5) ◽  
pp. 212-215
Author(s):  
Abeer AlArfaj

Semantic relation extraction is an important component of ontologies that can support many applications, e.g. text mining, question answering, and information extraction. However, extracting semantic relations between concepts is not trivial and is one of the main challenges in the field of natural language processing (NLP). The Arabic language has complex morphological, grammatical, and semantic aspects, since it is a highly inflectional and derivational language, which makes the task even more challenging. In this paper, we present a review of the state of the art for relation extraction from texts, addressing the progress and difficulties in this field. We discuss several aspects related to this task, considering both taxonomic and non-taxonomic relation extraction methods. The majority of relation extraction approaches combine statistical and linguistic techniques to extract semantic relations from text. We also give special attention to the state of the work on relation extraction from Arabic texts, which needs further progress.

