A Federated Learning Based Chinese Text Classification Model with Parameter Factorization Weighting

Author(s): Huan Wang, Zerong Zeng, Ruifang Liu, Sheng Gao

2018, Vol. 10 (11), pp. 113
Author(s): Yue Li, Xutao Wang, Pengjian Xu

Text classification is important in natural language processing, as the massive volume of text information, which carries considerable value, needs to be sorted into different categories for further use. To better classify text, our paper builds a deep learning model that achieves better classification results on Chinese text than other researchers' models. After comparing different methods, long short-term memory (LSTM) and convolutional neural network (CNN) approaches were selected as the deep learning methods for classifying Chinese text. LSTM is a special kind of recurrent neural network (RNN) that can process serialized information through its recurrent structure, while CNN has proven its ability to extract features, originally from visual imagery. Therefore, two layers of LSTM and one layer of CNN were integrated into our new model, BLSTM-C (BLSTM stands for bi-directional long short-term memory, and C stands for CNN). The BLSTM is responsible for producing a sequence output based on past and future contexts, which is then fed into the convolutional layer to extract features. In our experiments, the proposed BLSTM-C model was evaluated in several ways. The results show that the model exhibits remarkable performance in text classification, especially on Chinese texts.
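The abstract describes a bi-directional LSTM whose sequence output is passed to a single convolutional layer for feature extraction. Below is a minimal sketch of that idea in PyTorch; the embedding size, hidden size, filter count, kernel size, and number of classes are hypothetical placeholders, not the configuration reported by the authors.

```python
# A minimal sketch of the BLSTM-C idea, assuming a PyTorch implementation
# with hypothetical hyper-parameters (not the paper's exact configuration).
import torch
import torch.nn as nn

class BLSTMC(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128,
                 num_filters=100, kernel_size=3, num_classes=10):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Two stacked bi-directional LSTM layers produce a sequence output
        # conditioned on both past and future contexts.
        self.blstm = nn.LSTM(embed_dim, hidden_dim, num_layers=2,
                             batch_first=True, bidirectional=True)
        # One convolutional layer extracts local n-gram features from the
        # BLSTM output sequence.
        self.conv = nn.Conv1d(2 * hidden_dim, num_filters, kernel_size)
        self.fc = nn.Linear(num_filters, num_classes)

    def forward(self, token_ids):                 # (batch, seq_len)
        x = self.embedding(token_ids)             # (batch, seq_len, embed_dim)
        x, _ = self.blstm(x)                      # (batch, seq_len, 2*hidden_dim)
        x = self.conv(x.transpose(1, 2))          # (batch, num_filters, L')
        x = torch.relu(x).max(dim=2).values       # global max pooling over time
        return self.fc(x)                         # (batch, num_classes)

# Dummy usage: 4 texts of 50 token ids from a hypothetical 5000-token vocabulary.
model = BLSTMC(vocab_size=5000)
logits = model(torch.randint(0, 5000, (4, 50)))
```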


Author(s): Hanqing Tao, Shiwei Tong, Hongke Zhao, Tong Xu, Binbin Jin, ...

In recent years, Chinese text classification has attracted increasing research attention. However, most existing techniques, which are aimed specifically at English material, may lose effectiveness on this task because of the large differences between Chinese and English. In fact, as a special kind of hieroglyphic writing, Chinese characters and radicals are semantically useful but remain unexplored in the task of text classification. To that end, in this paper we first analyze the motivation for using multi-granularity features to represent a Chinese text by inspecting the characteristics of radicals, characters and words. To better represent Chinese text and then perform Chinese text classification, we propose a novel Radical-aware Attention-based Four-Granularity (RAFG) model that takes full advantage of Chinese characters, words, character-level radicals, and word-level radicals simultaneously. Specifically, RAFG applies a serialized BLSTM structure, which is context-aware and able to capture long-range information, to model the character-sharing property of Chinese and the sequential characteristics of texts. Further, we design an attention mechanism to enhance the effects of radicals and thus model the radical-sharing property when integrating granularities. Finally, we conduct extensive experiments, whose results not only show the superiority of our model but also validate the effectiveness of radicals in the task of Chinese text classification.
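The abstract combines a context-aware BLSTM over the text with an attention mechanism that weights radical features when integrating granularities. The sketch below illustrates that general idea in PyTorch under simplifying assumptions: it covers only two of the four granularities (characters and character-level radicals), and the class and layer names and all hyper-parameters are hypothetical rather than the authors' architecture.

```python
# A simplified sketch of radical-aware attention over a BLSTM text encoding;
# an assumption-laden illustration, not the RAFG model itself.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RadicalAttentionClassifier(nn.Module):
    def __init__(self, char_vocab, radical_vocab, embed_dim=64,
                 hidden_dim=64, num_classes=10):
        super().__init__()
        self.char_embed = nn.Embedding(char_vocab, embed_dim)
        self.radical_embed = nn.Embedding(radical_vocab, embed_dim)
        # Context-aware BLSTM over the character sequence.
        self.blstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                             bidirectional=True)
        # Attention scores each radical against the sentence representation,
        # loosely modelling the radical-sharing property during integration.
        self.attn = nn.Linear(2 * hidden_dim + embed_dim, 1)
        self.fc = nn.Linear(2 * hidden_dim + embed_dim, num_classes)

    def forward(self, chars, radicals):            # both: (batch, seq_len)
        h, _ = self.blstm(self.char_embed(chars))  # (batch, seq_len, 2*hidden)
        sent = h.mean(dim=1)                       # sentence summary vector
        r = self.radical_embed(radicals)           # (batch, seq_len, embed_dim)
        query = sent.unsqueeze(1).expand(-1, r.size(1), -1)
        scores = self.attn(torch.cat([query, r], dim=-1)).squeeze(-1)
        weights = F.softmax(scores, dim=1)         # attention over radicals
        radical_ctx = (weights.unsqueeze(-1) * r).sum(dim=1)
        return self.fc(torch.cat([sent, radical_ctx], dim=-1))

# Dummy usage: 4 texts of 20 characters with aligned character-level radicals.
model = RadicalAttentionClassifier(char_vocab=5000, radical_vocab=300)
logits = model(torch.randint(0, 5000, (4, 20)), torch.randint(0, 300, (4, 20)))
```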

