Distant Domain Adaptation for Text Classification

Author(s):  
Zhenlong Zhu ◽  
Yuhua Li ◽  
Ruixuan Li ◽  
Xiwu Gu

Author(s):  
Chengcheng Han ◽  
Zeqiu Fan ◽  
Dongxiang Zhang ◽  
Minghui Qiu ◽  
Ming Gao ◽  
...  

Author(s):  
Rui Xia ◽  
Zhenchun Pan ◽  
Feng Xu

Domain adaptation is an important problem in natural language processing (NLP) because of the distributional difference between the labeled source domain and the target domain. In this paper, we study the domain adaptation problem from the instance weighting perspective. By using the density ratio as the instance weight, traditional instance weighting approaches can potentially correct the sample selection bias in domain adaptation. However, researchers have often failed to achieve good performance when applying instance weighting to domain adaptation in NLP, and many negative results have been reported in the literature. In this work, we conduct an in-depth study of the causes of this failure and find that previous work focused only on reducing the sample selection bias while ignoring another important factor in domain adaptation: sample selection variance. On this basis, we propose a new instance weighting framework that trades off these two factors in instance weight learning. We evaluate our approach on two cross-domain text classification tasks and compare it with eight instance weighting methods. The results demonstrate the advantages of our approach in domain adaptation performance, optimization efficiency, and parameter stability.
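
The tension between bias and variance in instance weighting can be illustrated with a small sketch. This is not the framework proposed in the abstract above, whose details are not given here. It is a toy setup that assumes one-dimensional Gaussian source and target distributions with known densities, and it swaps in a simple, commonly used device: raising the density ratio to an exponent `alpha` between 0 and 1. The names `gaussian_pdf`, `tempered_weights`, `effective_sample_size`, and `alpha` are all hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def tempered_weights(source_x, alpha, src=(0.0, 1.0), tgt=(1.0, 1.0)):
    """Density-ratio weights p_target(x) / p_source(x), flattened by alpha.

    alpha = 1.0 gives the full density ratio (most bias correction).
    alpha = 0.0 gives uniform weights (no correction, lowest variance).
    The Gaussian parameters are assumptions made for this toy example.
    """
    ratio = gaussian_pdf(source_x, *tgt) / gaussian_pdf(source_x, *src)
    weights = ratio ** alpha
    return weights / weights.mean()  # normalize so the weights average to 1

def effective_sample_size(weights):
    """Kish effective sample size: shrinks as the weights become more uneven."""
    return weights.sum() ** 2 / (weights ** 2).sum()

rng = np.random.default_rng(0)
source_x = rng.normal(0.0, 1.0, size=2000)  # samples drawn from the source domain

ess = {}
weighted_mean = {}
for alpha in (0.0, 0.5, 1.0):
    w = tempered_weights(source_x, alpha)
    ess[alpha] = effective_sample_size(w)
    # Weighted estimate of the target-domain mean from source samples only.
    weighted_mean[alpha] = np.average(source_x, weights=w)
    print(f"alpha={alpha:.1f}  ESS={ess[alpha]:.0f}  weighted mean={weighted_mean[alpha]:.3f}")
```

In this sketch, a larger `alpha` moves the weighted mean toward the target distribution (less bias) but lowers the effective sample size (more variance), which is the kind of trade-off the abstract argues earlier instance weighting work overlooked.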


2015 ◽  
Author(s):  
Raghuraman Gopalan ◽  
Ruonan Li ◽  
Vishal M. Patel ◽  
Rama Chellappa

Author(s):  
Padmavathi .S ◽  
M. Chidambaram

Text classification has become increasingly important for managing and organizing text data, owing to the tremendous growth of online information. It assigns documents to a fixed number of predefined categories. There are two main approaches to text classification: rule-based and machine learning-based. In the rule-based approach, documents are classified according to manually defined rules. In the machine learning-based approach, the classification rules, or classifier, are learned automatically from example documents; this approach offers higher recall and faster processing. This paper presents an investigation of text classification using different machine learning techniques.
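
The contrast between the two approaches can be sketched in a few lines. The example below is only an illustration, not a method from the paper: the keyword rules, the four training sentences, and the function names (`rule_based_classify`, `train_naive_bayes`, `nb_classify`) are all hypothetical, and a small multinomial Naive Bayes model with add-one smoothing stands in for "a machine learning technique".

```python
import math
from collections import Counter, defaultdict

def rule_based_classify(text):
    """Rule-based approach: hand-written keyword rules (hypothetical)."""
    words = set(text.lower().split())
    if words & {"goal", "match", "team"}:
        return "sports"
    if words & {"stock", "market", "shares"}:
        return "finance"
    return "unknown"  # no rule fired

def train_naive_bayes(examples):
    """Machine learning approach: learn word statistics from labeled examples."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab

def nb_classify(model, text):
    """Pick the label with the highest log-probability (add-one smoothing)."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in text.lower().split():
            if token in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Tiny made-up training set, for illustration only.
TRAIN = [
    ("the team won the match", "sports"),
    ("a late goal decided the game", "sports"),
    ("the stock market fell sharply", "finance"),
    ("investors sold shares after the report", "finance"),
]
model = train_naive_bayes(TRAIN)

query = "investors sold after the report"
print("rule-based:", rule_based_classify(query))
print("learned:   ", nb_classify(model, query))
```

The query contains none of the hand-picked keywords, so the rules leave it unclassified, while the learned model can still use words it saw in the training examples. This is one way a learned classifier can reach documents that a fixed rule set misses.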

