Research on multi-feature fusion text classification model based on self-attention mechanism

2020, Vol 1693, pp. 012071
Author(s): Xiaoxia Luo, XuanHao Wang
2020, Vol 309, pp. 03015
Author(s): Wenbin Liu, Bojian Wen, Shang Gao, Jiesheng Zheng, Yinlong Zheng

Text classification is a common application in natural language processing. We propose a multi-label text classification model based on ELMo and an attention mechanism. It addresses two problems in sentiment classification of power-supply-related text: the text follows no grammar or writing conventions, and the sentiment-related information is dispersed throughout the text. First, we use pre-trained word embedding vectors to extract features from text collected from the Internet. Second, the extracted deep features are weighted by the attention mechanism. Finally, an improved ELMo model, in which the LSTM module is replaced with a GRU module, is used to represent the text, and the text is classified. Experimental results on Kaggle’s toxic comment classification data set show that the sentiment classification accuracy reaches 98%.
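The attention weighting step in this abstract can be illustrated with a minimal sketch. It assumes simple dot-product scoring against a single query vector. The abstract does not specify the scoring function, so `attention_pool`, the toy token vectors, and the query are hypothetical and are not the authors' implementation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(token_vectors, query):
    """Score each token vector against `query` (dot product, an assumption),
    normalize the scores with softmax, and return (weights, weighted sum)."""
    scores = [sum(t * q for t, q in zip(vec, query)) for vec in token_vectors]
    weights = softmax(scores)
    dim = len(token_vectors[0])
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, token_vectors))
        for d in range(dim)
    ]
    return weights, pooled

# Toy 2-dimensional feature vectors for three tokens (hypothetical values).
tokens = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
query = [1.0, 1.0]  # stands in for a learned query vector
weights, pooled = attention_pool(tokens, query)
```

The token with the largest score receives the largest weight, so sentiment-bearing tokens scattered through a text can dominate the pooled representation regardless of their position.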


2021, Vol 2021, pp. 1-7
Author(s): Yi Liu, Yue Zhang, Haidong Hu, Xiaodong Liu, Lun Zhang, ...

With the rise and rapid development of short video sharing websites, the number of short videos on the Internet has been growing explosively. Organizing and classifying short videos is the basis for using them effectively, and it is a problem faced by all major short video platforms. Because short video content categories are complex and the accompanying extended text is rich, this paper applies methods from the text classification field to the short video classification problem. Compared with the traditional approach of classifying and understanding short video key frames, this method has lower computational cost, gives more accurate classification results, and is easier to apply. This paper proposes a text classification model based on an attention mechanism over multiple text embeddings of short video extended text. The experiment first uses the pre-trained language model ALBERT to extract sentence-level vectors and then uses the attention mechanism to learn a weight factor for each kind of extended text in short video classification. The research also applies Google’s unsupervised data augmentation (UDA) method, creatively combining it with a Chinese knowledge graph to realize TF-IDF word replacement. During training, we introduced a large amount of unlabeled data, which significantly improved classification accuracy. A final series of experiments compares the method with existing short video title classification methods, classification methods based on video key frames, and hybrid methods, and shows that the proposed method is more accurate and robust on the test set.
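The TF-IDF word replacement mentioned in this abstract can be sketched roughly as follows. The idea, as commonly described for UDA, is to keep informative (high TF-IDF) words and replace uninformative (low TF-IDF) ones to create augmented copies of a text. The abstract does not describe how the Chinese knowledge graph supplies replacements, so the `candidates` list, the threshold, and the function names below are hypothetical stand-ins.

```python
import math
import random
from collections import Counter

def tfidf_scores(docs):
    """Return one dict per document mapping token -> TF-IDF score."""
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            tok: (cnt / len(doc)) * math.log(n_docs / doc_freq[tok])
            for tok, cnt in counts.items()
        })
    return scores

def replace_low_tfidf(doc, scores, candidates, threshold, rng):
    """Replace tokens scoring at or below `threshold`; keep the rest."""
    return [
        rng.choice(candidates) if scores[tok] <= threshold else tok
        for tok in doc
    ]

# Toy tokenized titles (hypothetical data).
docs = [
    ["the", "cat", "video", "is", "funny"],
    ["the", "cooking", "video", "is", "short"],
    ["the", "football", "match", "is", "live"],
]
scores = tfidf_scores(docs)
rng = random.Random(0)             # seeded for reproducibility
candidates = ["a", "this", "that"]  # stand-in for knowledge-graph suggestions
augmented = replace_low_tfidf(docs[0], scores[0], candidates,
                              threshold=0.0, rng=rng)
```

In this toy corpus, "the" and "is" appear in every title and score zero, so only they are replaced, while category-bearing words such as "cat" and "funny" are preserved in the augmented copy.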

