Cognitive Aspects-Based Short Text Representation with Named Entity, Concept and Knowledge

2020 ◽  
Vol 10 (14) ◽  
pp. 4893 ◽  
Author(s):  
Wenfeng Hou ◽  
Qing Liu ◽  
Longbing Cao

Short text is common in applications such as the Internet of Things (IoT). Its sparsity and brevity can severely hamper appropriate representation and classification. One important solution is to enrich short text representation with cognitive aspects of the text, including semantic concept, knowledge, and category. In this paper, we propose a named Entity-based Concept Knowledge-Aware (ECKA) representation model, which incorporates semantic information into short text representation. ECKA is a multi-level short text semantic representation model that uses a CNN to extract semantic features at each of the word, entity, concept and knowledge levels. Since the words, entities, concepts and knowledge entities in the same short text differ in how informative they are for short text classification, an attention network at each level captures a category-related attentive representation from that level's textual features. The final multi-level semantic representation is formed by concatenating these individual-level representations and is used for text classification. Experiments on three tasks demonstrate that our method significantly outperforms the state-of-the-art methods.
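The attention-then-concatenate step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the per-level feature vectors stand in for CNN outputs, and the names `attention_pool`, `multi_level_representation`, and the per-level query vectors are assumptions introduced here for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, query):
    """Pool a list of feature vectors into one vector.

    Each feature vector is scored by its dot product with `query`
    (a hypothetical category-related vector), the scores are turned
    into weights with softmax, and the weighted sum is returned.
    """
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


def multi_level_representation(levels, queries):
    """Concatenate the attention-pooled vector of every level.

    `levels` maps a level name to its feature vectors (assumed here to
    be the output of a per-level CNN); `queries` maps the same names to
    one query vector each.
    """
    representation = []
    for name in ("word", "entity", "concept", "knowledge"):
        representation.extend(attention_pool(levels[name], queries[name]))
    return representation


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0]]
    levels = {name: feats for name in ("word", "entity", "concept", "knowledge")}
    queries = {name: [0.0, 0.0] for name in levels}
    print(multi_level_representation(levels, queries))
```

With an all-zero query every feature receives the same weight, so each level reduces to a plain average. A learned query would shift weight toward the features most relevant to the category.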


Author(s):  
Ming Hao ◽  
Weijing Wang ◽  
Fang Zhou

Short text classification is an important foundation for natural language processing (NLP) tasks. Although text classification based on deep language models (DLMs) has made significant headway, in practical applications some texts are ambiguous and hard to classify in multi-class settings, especially short texts whose context length is limited. Mainstream methods improve the distinction of ambiguous text by adding context information. However, these methods rely only on the text representation and ignore the fact that categories overlap and are not completely independent of each other. In this paper, we establish a new general method for ambiguous text classification by introducing a label embedding to represent each category, which makes the difference between categories measurable. Further, a new compositional loss function is proposed to train the model, which pulls the text representation closer to the ground-truth label and pushes it farther away from the others. Finally, a constraint is obtained by calculating the similarity between the text representation and the label embeddings. Errors caused by ambiguous text can be corrected by adding this constraint to the output layer of the model. We apply the method to three classical models and conduct experiments on six public datasets. Experiments show that our method can effectively improve the classification accuracy of ambiguous texts. In addition, combining our method with BERT, we obtain state-of-the-art results on the CNT dataset.
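The idea of a similarity constraint on the output layer, together with a loss that pulls the text representation toward its own label and away from the others, might be sketched as follows. The abstract does not give the actual formulas, so the cosine similarity, the additive form of the constraint, the loss terms, and the parameters `alpha` and `beta` are all assumptions made here for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def constrained_scores(logits, text_rep, label_embeddings, alpha=1.0):
    """Add a similarity constraint to the classifier's output scores.

    Assumed form: each class logit is shifted by `alpha` times the
    cosine similarity between the text representation and that
    class's label embedding.
    """
    sims = [cosine(text_rep, emb) for emb in label_embeddings]
    return [z + alpha * s for z, s in zip(logits, sims)]


def compositional_loss(logits, text_rep, label_embeddings, target, beta=1.0):
    """Cross-entropy plus a pull/push term on label similarities.

    Assumed form: `pull` penalizes low similarity to the true label,
    `push` penalizes positive similarity to every other label.
    """
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    cross_entropy = log_z - logits[target]
    sims = [cosine(text_rep, emb) for emb in label_embeddings]
    others = [s for i, s in enumerate(sims) if i != target]
    pull = 1.0 - sims[target]
    push = sum(max(0.0, s) for s in others) / len(others) if others else 0.0
    return cross_entropy + beta * (pull + push)


if __name__ == "__main__":
    logits = [1.0, 1.0]                      # an ambiguous prediction
    text_rep = [1.0, 0.0]
    label_embeddings = [[1.0, 0.0], [0.0, 1.0]]
    print(constrained_scores(logits, text_rep, label_embeddings))
    print(compositional_loss(logits, text_rep, label_embeddings, target=0))
```

In this toy case the two raw logits are tied, and the similarity term breaks the tie in favor of the class whose label embedding lies closest to the text representation.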



2020 ◽  
Vol 50 (8) ◽  
pp. 2339-2351 ◽  
Author(s):  
Tianshi Wang ◽  
Li Liu ◽  
Naiwen Liu ◽  
Huaxiang Zhang ◽  
Long Zhang ◽  
...  


2013 ◽  
Vol 22 ◽  
pp. 78-86 ◽  
Author(s):  
Lili Yang ◽  
Chunping Li ◽  
Qiang Ding ◽  
Li Li


2013 ◽  
Vol 415 ◽  
pp. 396-401
Author(s):  
Wei Dong Huang ◽  
Yi Zi Wang ◽  
Yin Mao Liu

The efficiency and accuracy of professional retrieval are closely related to the semantic representation model used in the professional field. Representing the retrieval requirement accurately is very important for guaranteeing retrieval accuracy. For the professional emergency field, a query reformulation model for clarifying and improving the user's retrieval requirement is studied. A latent semantic word space of the emergency field is constructed, semantic features are clustered using SOM, and a CI index for determining the optimal cluster boundary is proposed. On this basis, a new query reformulation method is designed. The effectiveness of the model is verified by experiment.








