DEXTER: A workbench for automatic term extraction with specialized corpora

2017 ◽  
Vol 24 (2) ◽  
pp. 163-198 ◽  
Author(s):  
Carlos Periñán-Pascual

Abstract: Automatic term extraction has become a priority area of research within corpus processing. Despite the extensive literature in this field, there are still some outstanding issues that should be dealt with during the construction of term extractors, particularly those oriented to support research in terminology and terminography. In this regard, this article describes the design and development of DEXTER, an online workbench for the extraction of simple and complex terms from domain-specific corpora in English, French, Italian and Spanish. In this framework, three issues contribute to placing the most important terms in the foreground. First, unlike the elaborate morphosyntactic patterns proposed by most previous research, shallow lexical filters have been constructed to discard term candidates. Second, a large number of common stopwords are automatically detected by means of a method that relies on the IATE database together with the frequency distribution of the domain-specific corpus and a general corpus. Third, the term-ranking metric, which is grounded on the notions of salience, relevance and cohesion, is guided by the IATE database to display an adequate distribution of terms.
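As a rough illustration of the second point, the sketch below flags words whose relative frequency in a domain-specific corpus is not markedly higher than in a general corpus. It is a minimal sketch under stated assumptions and not DEXTER's actual method: the function names, the `max_ratio` threshold and the toy corpora are hypothetical, and the IATE lookup mentioned in the abstract is omitted.

```python
from collections import Counter


def relative_freqs(tokens):
    """Map each word to its share of all tokens in the corpus."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def detect_stopwords(domain_tokens, general_tokens, max_ratio=1.5):
    """Return domain words that are about as common in general language.

    Assumption: a word is treated as a stopword when its domain/general
    relative-frequency ratio is at most `max_ratio` (hypothetical threshold).
    """
    domain = relative_freqs(domain_tokens)
    general = relative_freqs(general_tokens)
    stopwords = set()
    for word, freq in domain.items():
        general_freq = general.get(word)
        if general_freq is None:
            # Unseen in the general corpus: keep it as a term candidate.
            continue
        if freq / general_freq <= max_ratio:
            stopwords.add(word)
    return stopwords


# Toy corpora, invented for illustration only.
domain = "the valve of the pump and the valve seal".split()
general = "the cat and the dog of the house and the garden".split()
print(sorted(detect_stopwords(domain, general)))
```

With these toy corpora, function words shared by both corpora are flagged, while words seen only in the domain corpus remain as candidates.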

Terminology ◽  
2015 ◽  
Vol 21 (2) ◽  
pp. 151-179 ◽  
Author(s):  
Carlos Periñán-Pascual

The corpus-based identification of those lexical units which serve to describe a given specialized domain usually becomes a complex task, where an analysis oriented to the frequency of words and the likelihood of lexical associations is often ineffective. The goal of this article is to demonstrate that a user-adjustable composite metric such as SRC can accommodate to the diversity of domain-specific glossaries to be constructed from small- and medium-sized specialized corpora of non-structured texts. Unlike for most of the research in automatic term extraction, where single metrics are usually combined indiscriminately to produce the best results, SRC is grounded on the theoretical principles of salience, relevance and cohesion, which have been rationally implemented in the three components of this metric.


Terminology ◽  
2018 ◽  
Vol 24 (1) ◽  
pp. 122-147
Author(s):  
Mercè Vàzquez ◽  
Antoni Oliver

Abstract: The identification of reliable terms from domain-specific corpora using computational methods is a task that has to be validated manually by specialists, which is a highly time-consuming activity. To reduce this effort and improve term candidate selection, we implemented the Token Slot Recognition method, a filtering method based on terminological tokens which is used to rank extracted term candidates from domain-specific corpora. This paper presents the implementation of the term candidates filtering method we developed in linguistic and statistical approaches applied for automatic term extraction using several domain-specific corpora in different languages. We observed that the filtering method outperforms term candidate selection by ranking a higher number of terms at the top of the term candidate list than raw frequency, and for statistical term extraction the improvement is between 15% and 25% both in precision and recall. Our analyses further revealed a reduction in the number of term candidates to be validated manually by specialists. In conclusion, the number of term candidates extracted automatically from domain-specific corpora has been reduced significantly using the Token Slot Recognition filtering method, so term candidates can be easily and quickly validated by specialists.
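The comparison between two rankings of the same candidates can be made concrete with precision and recall over the top k entries of each list. The sketch below shows only that standard evaluation, not the Token Slot Recognition method itself; the function name, the gold-standard set and both candidate lists are hypothetical examples.

```python
def precision_recall_at_k(ranked, gold, k):
    """Precision and recall of the first k ranked candidates against a gold set."""
    top = ranked[:k]
    hits = sum(1 for candidate in top if candidate in gold)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical validated terms and two hypothetical rankings of the same candidates.
gold = {"term extraction", "stopword", "corpus"}
by_frequency = ["the case", "corpus", "this paper", "stopword", "term extraction"]
by_filter = ["corpus", "term extraction", "this paper", "stopword", "the case"]

for name, ranking in (("raw frequency", by_frequency), ("filtered", by_filter)):
    precision, recall = precision_recall_at_k(ranking, gold, k=2)
    print(f"{name}: precision@2={precision:.2f} recall@2={recall:.2f}")
```

A ranking that places more validated terms near the top scores higher on both measures for small k, which is the sense in which fewer candidates need manual validation.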


2020 ◽  
Vol 93 (4) ◽  
pp. 161-169
Author(s):  
I. I. Chesnitskiy

The article presents an analysis of the state and problems of implementing the socio-economic rights of population of the Khabarovsk territory as a priority area for reducing poverty. Attention of the authorities was drawn to the situation of poverty in a number of northern municipalities, where the population is experiencing difficulties in realizing their socio-economic rights due to the lack of jobs. Concern was expressed about the socio-economic rights of persons released from the places of deprivation of liberty. The Commissioner for human rights in the Khabarovsk territory, taking into account the study of situation in the region, sets out his vision for solving the problem of reducing poverty in the Khabarovsk territory and makes proposals that, in his opinion, can be used by the regional state authorities to achieve the indicators set by the President of the Russian Federation.


Terminology ◽  
2014 ◽  
Vol 20 (2) ◽  
pp. 151-170 ◽  
Author(s):  
Katia Peruzzo

The paper examines the possible usage of event templates derived from Frame-Based Terminology (Faber et al. 2005, 2006, 2007) as an aid to the extraction and management of legal terminology embedded in the multi-level legal system of the European Union. The method proposed here, which combines semi-automatic term extraction and a simplified event template containing six categories, is applied to an English corpus of EU texts focusing on victims of crime and their rights. Such a combination allows for the extraction of category-relevant terminological units and additional information, which can then be used for populating a terminological knowledge base organised on the basis of the same event template, but which also employs additional classification criteria to account for the multidimensionality encountered in the corpus.


2020 ◽  
Vol 33 (10) ◽  
pp. 649
Author(s):  
Filipa Martins Silva ◽  
Ana Filipa Lopes ◽  
Vânia Carneiro ◽  
Álvaro Campelo

Introduction: Scientific evidence regarding children’s understanding of the concept of death is scarce. This has recently been pointed out by the International Children’s Palliative Care Network as a priority area of research. In particular, the avoidance of emotion in this area of research is an important shortcoming. This study aims to develop an in-depth view of the emotional dimension of the child’s understanding of death, also seeking to relate it to the cognitive dimension. Material and Methods: We interviewed children (three to six years old) using a book illustrating a hypothetical scenario in which a child faced the death of a relative. We asked questions to assess the cognitive subconcepts of death and the emotional dimension (what the child would feel and what parents should say). Results: Of the 54 participants, the majority said that the child would feel sad (n = 46, 85%) and that parents should inform her/him (n = 47, 87%); these responses did not vary significantly with age. The cognitive understanding of the concept of death in children who reported sadness was significantly higher. Discussion: Even the youngest children feel death, and it is not possible to disconnect cognitive and emotional understanding. Additionally, children should be informed in order to foster a proper and multidimensional elaboration of death. Conclusion: This study provides valuable information to health professionals and other interested adults about the way preschoolers position themselves in relation to death.

