enrich target
Recently Published Documents



2021
Author(s): Eliseu Guimarães, Daniela Vianna, Aline Paes, Alexandre Plastino

Sentiment analysis in tweets is a research field of great importance, mainly due to the popularity of Twitter. However, collecting and annotating tweets is an expensive and time-consuming task, so some domains have only a limited set of labeled data. A promising strategy to handle this issue is to leverage labeled domains rich in data to select instances that enrich target datasets. This paper proposes different strategies for selecting instances from a set of labeled source datasets in order to improve on the performance of classifiers trained only with the target dataset. The strategies vary in the similarity metric used and in the number of instances selected. The results show that the size of the training set plays an essential role in the predictive capacity of the classifier. Furthermore, the results highlight the importance of taking diversity criteria into account when selecting the instances.
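
The general idea of similarity-based instance selection can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not say which similarity metrics or text representations the paper uses, so the bag-of-words vectors, the cosine similarity, the aggregated target vector, and the names `bow`, `cosine`, and `select_instances` are all assumptions made for illustration, and the toy tweets are invented.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words counts for one tweet (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_instances(source, target, k):
    """Return the k labeled source instances most similar to the target set.

    source: list of (text, label) pairs from data-rich domains
    target: list of (text, label) pairs from the data-poor domain
    """
    # Summarize the target dataset as one aggregated word-count vector.
    centroid = Counter()
    for text, _ in target:
        centroid.update(bow(text))
    # Rank source instances by similarity to that summary, highest first.
    ranked = sorted(
        source,
        key=lambda pair: cosine(bow(pair[0]), centroid),
        reverse=True,
    )
    return ranked[:k]


# Toy data, invented for illustration only.
target = [("battery life is great", "pos"), ("screen broke again", "neg")]
source = [
    ("great battery on this phone", "pos"),
    ("the movie plot was dull", "neg"),
    ("my screen cracked", "neg"),
]

# The enriched training set is the target data plus the selected source data.
enriched = target + select_instances(source, target, k=2)
```

Here `k` plays the role of the number of selected instances that the abstract mentions as a variable. The sketch ranks purely by similarity and applies no diversity criterion, which the abstract reports as important; a selection step that also penalizes near-duplicate picks would be needed to reflect that finding.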

