Bi-Level Masked Multi-scale CNN-RNN Networks for Short Text Representation

Short text classification is an important foundation for natural language processing (NLP) tasks. Though, the text classification based on deep language models (DLMs) has made a significant headway, in practical applications however, some texts are ambiguous and hard to classify in multi-class classification especially, for short texts whose context length is limited. The mainstream method improves the distinction of ambiguous text by adding context information. However, these methods rely only the text representation, and ignore that the categories overlap and are not completely independent of each other. In this paper, we establish a new general method to solve the problem of ambiguous text classification by introducing label embedding to represent each category, which makes measurable difference between the categories. Further, a new compositional loss function is proposed to train the model, which makes the text representation closer to the ground-truth label and farther away from others. Finally, a constraint is obtained by calculating the similarity between the text representation and label embedding. Errors caused by ambiguous text can be corrected by adding constraints to the output layer of the model. We apply the method to three classical models and conduct experiments on six public datasets. Experiments show that our method can effectively improve the classification accuracy of the ambiguous texts. In addition, combining our method with BERT, we obtain the state-of-the-art results on the CNT dataset.

Download Full-text

Strategies for Short Text Representation in the Word Vector Space

2018 7th Brazilian Conference on Intelligent Systems (BRACIS) ◽

10.1109/bracis.2018.00053 ◽

2018 ◽

Author(s):

Marcelo Pita ◽

Gisele L. Pappa

Keyword(s):

Vector Space ◽

Text Representation ◽

Short Text

Download Full-text

Cognitive Aspects-Based Short Text Representation with Named Entity, Concept and Knowledge

Applied Sciences ◽

10.3390/app10144893 ◽

2020 ◽

Vol 10 (14) ◽

pp. 4893 ◽

Cited By ~ 1

Author(s):

Wenfeng Hou ◽

Qing Liu ◽

Longbing Cao

Keyword(s):

Text Classification ◽

Semantic Representation ◽

Semantic Features ◽

Text Representation ◽

Short Text ◽

Individual Level ◽

Named Entity ◽

Concept Knowledge ◽

Representation Model ◽

Multi Level

Short text is widely seen in applications including Internet of Things (IoT). The appropriate representation and classification of short text could be severely disrupted by the sparsity and shortness of short text. One important solution is to enrich short text representation by involving cognitive aspects of text, including semantic concept, knowledge, and category. In this paper, we propose a named Entity-based Concept Knowledge-Aware (ECKA) representation model which incorporates semantic information into short text representation. ECKA is a multi-level short text semantic representation model, which extracts the semantic features from the word, entity, concept and knowledge levels by CNN, respectively. Since word, entity, concept and knowledge entity in the same short text have different cognitive informativeness for short text classification, attention networks are formed to capture these category-related attentive representations from the multi-level textual features, respectively. The final multi-level semantic representations are formed by concatenating all of these individual-level representations, which are used for text classification. Experiments on three tasks demonstrate our method significantly outperforms the state-of-the-art methods.

Download Full-text

Prediction of Perceived Utility of Consumer Online Reviews Based on LSTM Neural Network

Mobile Information Systems ◽

10.1155/2021/5482662 ◽

2021 ◽

Vol 2021 ◽

pp. 1-7

Author(s):

Hu Wang ◽

Tianbao Liang ◽

Yanxia Cheng

Keyword(s):

Neural Network ◽

Subjective Evaluation ◽

Perceived Value ◽

Online Reviews ◽

Classification Model ◽

Text Representation ◽

Labor Costs ◽

Data Set ◽

Short Text ◽

Perceived Utility

Perceived value is the customer’s subjective understanding of the value they obtain and is their subjective evaluation of the product or service they enjoy. This value is deducted from the cost of the product or service. In order to understand and predict the specific cognition of consumers on the value of products or services and distinguish it from the objective value of products or services in the general sense, this paper uses the in-depth learning method based on LSTM to establish a model to predict the perceived benefits of consumers. It is a challenging task to analyze the emotion of consumers or recognize the perceived value of consumers from various texts of online trading platforms. This paper proposes a new short-text representation method based on bidirectional LSTM. This method is very effective for forecasting research. In addition, we also use the attention mechanism to learn the specific emotional vocabulary. Short-text representation can be used for emotion classification and emotion intensity prediction. This paper evaluates the proposed classification model and regression data set. Compared with the baseline of the corresponding data set, the contrast of the results was 93%. The research shows that using deep neural network to predict the perceived utility of consumer comments can reduce the intervention of artificial features and labor costs and help predict the perceived utility of products to consumers.

Download Full-text

Short-text representation using diffusion wavelets

Proceedings of the 23rd International Conference on World Wide Web - WWW '14 Companion ◽

10.1145/2567948.2577345 ◽

2014 ◽

Author(s):

Vidit Jain ◽

Jay Mahadeokar

Keyword(s):

Text Representation ◽

Short Text

Download Full-text

Short Text Representation Model Construction Method Based on Novel Semantic Aggregation Technology

Communications in Computer and Information Science - Cyberspace Data and Intelligence, and Cyber-Living, Syndrome, and Health ◽

10.1007/978-981-15-1922-2_7 ◽

2019 ◽

pp. 107-118

Author(s):

Dong Yi ◽

Zhai Jia ◽

Li Xin ◽

Chen Feng

Keyword(s):

Construction Method ◽

Model Construction ◽

Text Representation ◽

Short Text ◽

Representation Model ◽

Aggregation Technology

Download Full-text

A Method of Short Text Representation Based on the Feature Probability Embedded Vector

Sensors ◽

10.3390/s19173728 ◽

2019 ◽

Vol 19 (17) ◽

pp. 3728 ◽

Cited By ~ 5

Author(s):

Zhou ◽

Wang ◽

Sun ◽

Sun

Keyword(s):

Language Processing ◽

Text Categorization ◽

Topic Model ◽

Main Idea ◽

Feature Weighting ◽

Word Embedding ◽

Text Representation ◽

Short Text ◽

Representation Method ◽

Feature Probability

Text representation is one of the key tasks in the field of natural language processing (NLP). Traditional feature extraction and weighting methods often use the bag-of-words (BoW) model, which may lead to a lack of semantic information as well as the problems of high dimensionality and high sparsity. At present, to solve these problems, a popular idea is to utilize deep learning methods. In this paper, feature weighting, word embedding, and topic models are combined to propose an unsupervised text representation method named the feature, probability, and word embedding method. The main idea is to use the word embedding technology Word2Vec to obtain the word vector, and then combine this with the feature weighted TF-IDF and the topic model LDA. Compared with traditional feature engineering, the proposed method not only increases the expressive ability of the vector space model, but also reduces the dimensions of the document vector. Besides this, it can be used to solve the problems of the insufficient information, high dimensions, and high sparsity of BoW. We use the proposed method for the task of text categorization and verify the validity of the method.

Download Full-text

Improving short-text representation in convolutional networks by dependency parsing

Knowledge and Information Systems ◽

10.1007/s10115-018-1312-9 ◽

2018 ◽

Vol 61 (1) ◽

pp. 463-484

Author(s):

Siheng Zhang ◽

Wensheng Zhang ◽

Jinghao Niu

Keyword(s):

Text Representation ◽

Dependency Parsing ◽

Short Text ◽

Convolutional Networks

Download Full-text

Long Text QA Matching Model Based on BiGRU–DAttention–DSSM

Mathematics ◽

10.3390/math9101129 ◽

2021 ◽

Vol 9 (10) ◽

pp. 1129

Author(s):

Shihong Chen ◽

Tianjiao Xu

Keyword(s):

Neural Network ◽

Deep Learning ◽

Language Processing ◽

Matching Task ◽

Semantic Model ◽

Text Representation ◽

Short Text ◽

Questions And Answers ◽

Text Features ◽

Text Matching

QA matching is a very important task in natural language processing, but current research on text matching focuses more on short text matching rather than long text matching. Compared with short text matching, long text matching is rich in information, but distracting information is frequent. This paper extracted question-and-answer pairs about psychological counseling to research long text QA-matching technology based on deep learning. We adjusted DSSM (Deep Structured Semantic Model) to make it suitable for the QA-matching task. Moreover, for better extraction of long text features, we also improved DSSM by enriching the text representation layer, using a bidirectional neural network and attention mechanism. The experimental results show that BiGRU–Dattention–DSSM performs better at matching questions and answers.

Download Full-text

Fuzzy Bag-of-Topics Model for Short Text Representation

Neural Information Processing - Lecture Notes in Computer Science ◽

10.1007/978-3-030-04221-9_42 ◽

2018 ◽

pp. 473-482 ◽

Cited By ~ 1

Author(s):

Hao Jia ◽

Qing Li

Keyword(s):

Text Representation ◽

Short Text

Download Full-text