Research on multi-feature fusion text classification model based on self-attention mechanism

Text classification is a common application of natural language processing. We propose a multi-label text classification model based on ELMo and an attention mechanism. It addresses a difficulty in sentiment classification of power-supply-related text: the text follows no grammar or writing conventions, and sentiment-related information is dispersed throughout it. First, we use pre-trained word embedding vectors to extract features from text collected from the Internet. Second, the extracted deep features are weighted by the attention mechanism. Finally, an improved ELMo model, in which the LSTM module is replaced with a GRU module, is used to represent the text and classify it. Experimental results on Kaggle’s toxic comment classification data set show that the accuracy of sentiment classification is as high as 98%.
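The attention weighting step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dot-product scoring, the `query` vector, and the toy feature values are assumptions chosen for illustration, and the ELMo/GRU encoder that would produce the features is omitted.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, query):
    """Weight per-token feature vectors by their similarity to a query.

    features: list of token vectors (hypothetical encoder outputs).
    query:    vector of the same dimension (assumed to be learned in practice).
    Returns the weighted sum of the features and the attention weights.
    """
    # Dot-product score between each token vector and the query.
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return pooled, weights


# Toy example: three tokens with 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
pooled, weights = attention_pool(features, query)
```

In this toy input the third token has the largest dot product with the query, so it receives the largest weight and dominates the pooled representation that a classifier would consume.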


Hybrid Chinese text classification model based on pretraining model

Journal of Physics Conference Series ◽ 10.1088/1742-6596/1961/1/012002 ◽ 2021 ◽ Vol 1961 (1) ◽ pp. 012002

Author(s): Xing Zhaoye ◽ Liu Xiaoqun ◽ Sun Peijie

Keyword(s): Chinese Text ◽ Text Classification ◽ Classification Model ◽ Chinese Text Classification ◽ Model Based


A New Multitasking Malware Classification Model Based on Feature Fusion

2018 2nd IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC) ◽ 10.1109/imcec.2018.8469663 ◽ 2018

Author(s): Wei Shi ◽ Xin Zhou ◽ Jianmin Pang ◽ Guanghui Liang ◽ Haoran Gu

Keyword(s): Feature Fusion ◽ Classification Model ◽ Malware Classification ◽ Model Based


A Complaint Text Classification Model Based on Character-Level Convolutional Network

2018 IEEE 9th International Conference on Software Engineering and Service Science (ICSESS) ◽ 10.1109/icsess.2018.8663873 ◽ 2018 ◽ Cited By ~ 1

Author(s): Xuesong Tong ◽ Bin Wu ◽ Shuyang Wang ◽ Jinna Lv

Keyword(s): Text Classification ◽ Classification Model ◽ Convolutional Network ◽ Model Based


Research of Text Classification Model Based on Latent Semantic Analysis and Improved HS-SVM

2010 2nd International Workshop on Intelligent Systems and Applications ◽ 10.1109/iwisa.2010.5473702 ◽ 2010

Author(s): Yu-feng Zhang ◽ Chao He

Keyword(s): Text Classification ◽ Latent Semantic Analysis ◽ Semantic Analysis ◽ Classification Model ◽ Model Based


An Extended Text Combination Classification Model for Short Video Based on Albert

Journal of Sensors ◽ 10.1155/2021/8013337 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-7

Author(s): Yi Liu ◽ Yue Zhang ◽ Haidong Hu ◽ Xiaodong Liu ◽ Lun Zhang ◽ ...

Keyword(s): Text Classification ◽ Rapid Development ◽ Computational Cost ◽ Attention Mechanism ◽ Classification Model ◽ Classification Methods ◽ Video Classification ◽ Short Video ◽ Text Information ◽ Key Frames

With the rise and rapid development of short video sharing websites, the number of short videos on the Internet has been growing explosively. Organizing and classifying short videos is the basis for using them effectively, and it is a problem faced by all major short video platforms. Given that short video content categories are complex and the accompanying extended text is rich, this paper applies methods from the text classification field to the short video classification problem. Compared with the traditional approach of classifying and understanding short video key frames, this method has a lower computational cost, gives more accurate classification results, and is easier to apply. This paper proposes a text classification model for short video extended text based on multi-text embedding and the attention mechanism. The experiment first uses the pre-trained language model ALBERT to extract sentence-level vectors and then uses the attention mechanism to learn weight factors for the text information in the various short video extensions. This research also applies Google’s unsupervised data augmentation (UDA) method to unlabeled data, combines it with a Chinese knowledge graph, and implements TF-IDF word replacement. During training, we introduced a large amount of unlabeled data, which significantly improved classification accuracy. The final series of experiments compares the proposed method with existing short video title classification methods, classification methods based on video key frames, and hybrid methods, and shows that it is more accurate and robust on the test set.
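The TF-IDF word replacement mentioned in the abstract can be sketched as follows. This is an illustrative approximation, not the paper's code: the smoothed IDF formula, the `threshold` value, the toy corpus, and the `replacements` table (a stand-in for candidates that might come from a knowledge graph) are all assumptions. The idea is that tokens with low TF-IDF scores carry little class information, so they can be swapped to create augmented training text while high-scoring tokens are kept.

```python
import math
from collections import Counter


def tfidf_scores(doc, corpus):
    """Return a TF-IDF score per distinct token in doc (smoothed IDF)."""
    term_counts = Counter(doc)
    n_docs = len(corpus)
    scores = {}
    for token, count in term_counts.items():
        doc_freq = sum(1 for other in corpus if token in other)
        idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0
        scores[token] = (count / len(doc)) * idf
    return scores


def replace_low_tfidf(doc, corpus, replacements, threshold):
    """Replace tokens scoring below threshold; keep informative tokens."""
    scores = tfidf_scores(doc, corpus)
    return [
        replacements.get(token, token) if scores[token] < threshold else token
        for token in doc
    ]


# Toy tokenized corpus (hypothetical short video titles).
corpus = [
    ["the", "cat", "video"],
    ["the", "dog", "video"],
    ["the", "cooking", "tutorial"],
]
# Hypothetical replacement candidates.
replacements = {"the": "a", "cat": "kitten"}

augmented = replace_low_tfidf(corpus[0], corpus, replacements, threshold=0.4)
```

Here only "the" falls below the threshold, because it appears in every document; "cat" has a replacement candidate but is kept because its score marks it as informative.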
