Automatic Term Extraction for Sentiment Classification of Dynamically Updated Text Collections into Three Classes

Author(s):  
Yuliya Rubtsova
Information ◽  
2018 ◽  
Vol 9 (8) ◽  
pp. 184 ◽  

The research identifies and substantiates the problem of quality deterioration in the sentiment classification of text collections that are identical in composition and characteristics but staggered over time. It is shown that the quality of sentiment classification can drop by up to 15% in terms of the F-measure over a year and a half. This paper presents three approaches to improving sentiment classification of continuously updated text collections in Russian: using a term weighting scheme with linear computational complexity, adding lexicons of emotional vocabulary to the feature space, and using distributed word representations. The methods are compared in experiments on sufficiently representative text collections, and it is shown which method is most applicable in which cases. The suggested approaches are shown to reduce the deterioration of sentiment classification results for collections staggered over time.
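The second approach, adding lexicons of emotional vocabulary to the feature space, might be sketched as follows. This is only an illustration under stated assumptions: the function name `lexicon_features`, the feature names, and the toy English lexicons are hypothetical and do not come from the paper, which works with Russian texts and does not specify its feature layout in the abstract.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def lexicon_features(
    tokens: Iterable[str],
    positive: Set[str],
    negative: Set[str],
) -> Dict[str, float]:
    """Build bag-of-words features and append two lexicon-count features.

    The lexicon counts do not depend on which individual words appear in the
    training vocabulary, which is one plausible reason such features could
    stay useful as the collection's vocabulary drifts over time.
    """
    counts = Counter(tok.lower() for tok in tokens)

    # Plain bag-of-words part of the feature space.
    features = {f"bow={word}": float(n) for word, n in counts.items()}

    # Hypothetical extra dimensions: hits in each sentiment lexicon.
    features["lex_pos"] = float(sum(n for w, n in counts.items() if w in positive))
    features["lex_neg"] = float(sum(n for w, n in counts.items() if w in negative))
    return features


# Toy lexicons and message, assumed purely for illustration.
POSITIVE = {"great", "good"}
NEGATIVE = {"awful", "bad"}
feats = lexicon_features("Great phone but awful awful battery".split(), POSITIVE, NEGATIVE)
```

The resulting dictionary can be passed to any classifier that accepts sparse feature maps; how the paper combines these features with its weighting scheme is not stated in the abstract.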


2021 ◽  
Vol 19 (2) ◽  
pp. 5-16
Author(s):  
E. P. Bruches ◽  
T. V. Batura

We propose a method for extracting scientific terms from Russian texts based on weakly supervised learning. This approach does not require a large amount of hand-labeled data. To implement the method, we collected a list of terms in a semi-automatic way and then annotated the texts of scientific articles with these terms. We used these texts to train a model, and then used the model's predictions on another part of the text collection to extend the training set. A second model was trained on both text collections: the one annotated with the dictionary and the one annotated automatically by the first model. The results show that additional data, even when annotated automatically, improves the quality of scientific term extraction.
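The dictionary annotation step described above could look roughly like the sketch below. It assumes a BIO tagging scheme, lowercase matching, and greedy longest-match lookup; none of these details are given in the abstract, and the function name `annotate_with_dictionary` and the toy English term list are hypothetical.

```python
from typing import List, Set, Tuple


def annotate_with_dictionary(
    tokens: List[str],
    terms: Set[Tuple[str, ...]],
) -> List[str]:
    """Tag tokens with BIO labels by greedy longest match against a term list."""
    max_len = max((len(term) for term in terms), default=0)
    lowered = [tok.lower() for tok in tokens]
    tags = ["O"] * len(tokens)

    i = 0
    while i < len(tokens):
        match_len = 0
        # Try the longest candidate span first so multiword terms win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(lowered[i:i + n]) in terms:
                match_len = n
                break
        if match_len:
            tags[i] = "B-TERM"
            for j in range(i + 1, i + match_len):
                tags[j] = "I-TERM"
            i += match_len
        else:
            i += 1
    return tags


# Toy term list and sentence, assumed purely for illustration.
TERMS = {("neural", "network"), ("corpus",)}
tokens = "We train a neural network on a corpus".split()
tags = annotate_with_dictionary(tokens, TERMS)
```

Texts tagged this way would serve as the weakly labeled training set for the first model; that model's predictions on a further part of the collection would then be added to the training data for the second model.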


2018 ◽  
Vol 4 (26) ◽  
pp. 5534-5538
Author(s):  
Semra AKTAŞ POLAT

Terminology ◽  
2014 ◽  
Vol 20 (2) ◽  
pp. 151-170 ◽  
Author(s):  
Katia Peruzzo

The paper examines the possible usage of event templates derived from Frame-Based Terminology (Faber et al. 2005, 2006, 2007) as an aid to the extraction and management of legal terminology embedded in the multi-level legal system of the European Union. The method proposed here, which combines semi-automatic term extraction and a simplified event template containing six categories, is applied to an English corpus of EU texts focusing on victims of crime and their rights. Such a combination allows for the extraction of category-relevant terminological units and additional information, which can then be used for populating a terminological knowledge base organised on the basis of the same event template, but which also employs additional classification criteria to account for the multidimensionality encountered in the corpus.


2021 ◽  
Author(s):  
Yongxue Shan ◽  
Zhaoqian Zhong ◽  
Chao Che ◽  
Bo Jin ◽  
Xiaopeng Wei
