Text Representation and Compression

Integration of Inference-Validating Knowledge and Text Representation

PsycEXTRA Dataset ◽

10.1037/e665412011-491 ◽

1992 ◽

Author(s):

Michael Halldorson ◽

Murray Singer

Keyword(s):

Text Representation

Download Full-text

Geographic information. Well-known text representation of coordinate reference systems

10.3403/30277525 ◽

2015 ◽

Keyword(s):

Geographic Information ◽

Text Representation ◽

Reference Systems

Download Full-text

Structure-Augmented Text Representation Learning for Efficient Knowledge Graph Completion

Proceedings of the Web Conference 2021 ◽

10.1145/3442381.3450043 ◽

2021 ◽

Author(s):

Bo Wang ◽

Tao Shen ◽

Guodong Long ◽

Tianyi Zhou ◽

Ying Wang ◽

...

Keyword(s):

Representation Learning ◽

Knowledge Graph ◽

Text Representation

Download Full-text

Text representation model of scientific papers based on fusing multi-viewpoint information and its quality assessment

Scientometrics ◽

10.1007/s11192-021-04028-4 ◽

2021 ◽

Author(s):

Yonghe Lu ◽

Jiayi Luo ◽

Ying Xiao ◽

Hou Zhu

Keyword(s):

Quality Assessment ◽

Text Representation ◽

Representation Model ◽

Scientific Papers

Download Full-text

Cancer Research Trend Analysis Based on Fusion Feature Representation

Entropy ◽

10.3390/e23030338 ◽

2021 ◽

Vol 23 (3) ◽

pp. 338

Author(s):

Jingqiao Wu ◽

Xiaoyue Feng ◽

Renchu Guan ◽

Yanchun Liang

Keyword(s):

Cancer Research ◽

Trend Analysis ◽

Research Trends ◽

Feature Representation ◽

Research Trend ◽

Good Representation ◽

Text Representation ◽

Classical Text ◽

Representation Model ◽

Text Feature

Machine learning models can automatically discover biomedical research trends and promote the dissemination of information and knowledge. Text feature representation is a critical and challenging task in natural language processing. Most methods of text feature representation are based on word representation. A good representation can capture semantic and structural information. In this paper, two fusion algorithms are proposed, namely, the Tr-W2v and Ti-W2v algorithms. They are based on the classical text feature representation model and consider the importance of words. The results show that the effectiveness of the two fusion text representation models is better than the classical text representation model, and the results based on the Tr-W2v algorithm are the best. Furthermore, based on the Tr-W2v algorithm, trend analyses of cancer research are conducted, including correlation analysis, keyword trend analysis, and improved keyword trend analysis. The discovery of the research trends and the evolution of hotspots for cancers can help doctors and biological researchers collect information and provide guidance for further research.

Download Full-text

Text representation and classification based on bi-gram alphabet

Journal of King Saud University - Computer and Information Sciences ◽

10.1016/j.jksuci.2019.01.005 ◽

2019 ◽

Cited By ~ 2

Author(s):

Fatma Elghannam

Keyword(s):

Text Representation

Download Full-text

Automatic text representation, classification and labeling in European law

Proceedings of the 8th international conference on Artificial intelligence and law - ICAIL '01 ◽

10.1145/383535.383544 ◽

2001 ◽

Cited By ~ 15

Author(s):

Erich Schweighofer ◽

Andreas Rauber ◽

Michael Dittenbach

Keyword(s):

Text Representation ◽

European Law ◽

Classification And Labeling ◽

Automatic Text

Download Full-text

Design of Text Categorization System Based on SVM

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.532-533.1191 ◽

2012 ◽

Vol 532-533 ◽

pp. 1191-1195 ◽

Cited By ~ 1

Author(s):

Zhen Yan Liu ◽

Wei Ping Wang ◽

Yong Wang

Keyword(s):

Feature Extraction ◽

Feature Selection ◽

Text Categorization ◽

Feature Selection Method ◽

Extraction Methods ◽

Support Vector ◽

Text Representation ◽

Text Feature ◽

Categorization System ◽

Classifier Training

This paper introduces the design of a text categorization system based on Support Vector Machine (SVM). It analyzes the high dimensional characteristic of text data, the reason why SVM is suitable for text categorization. According to system data flow this system is constructed. This system consists of three subsystems which are text representation, classifier training and text classification. The core of this system is the classifier training, but text representation directly influences the currency of classifier and the performance of the system. Text feature vector space can be built by different kinds of feature selection and feature extraction methods. No research can indicate which one is the best method, so many feature selection and feature extraction methods are all developed in this system. For a specific classification task every feature selection method and every feature extraction method will be tested, and then a set of the best methods will be adopted.

Download Full-text

Repetition Effects From Paraphrased Text: Evidence for an Integrated Representation Model of Text Representation

Discourse Processes ◽

10.1207/s15326950dp2901_4 ◽

2000 ◽

Vol 29 (1) ◽

pp. 61-81 ◽

Cited By ~ 25

Author(s):

Gary E. Raney ◽

David J. Therriault ◽

Scott R. B. Minkoff

Keyword(s):

Text Representation ◽

Repetition Effects ◽

Representation Model

Download Full-text

Latent Topic Model for Indexing Arabic Documents

International Journal of Information Retrieval Research ◽

10.4018/ijirr.2014010102 ◽

2014 ◽

Vol 4 (1) ◽

pp. 29-45 ◽

Cited By ~ 3

Author(s):

Rami Ayadi ◽

Mohsen Maraoui ◽

Mounir Zrigui

Keyword(s):

Topic Model ◽

Inflectional Morphology ◽

Arabic Text ◽

Text Representation ◽

Text Documents ◽

Latent Topic ◽

Latent Topics ◽

F Measure

In this paper, the authors present latent topic model to index and represent the Arabic text documents reflecting more semantics. Text representation in a language with high inflectional morphology such as Arabic is not a trivial task and requires some special treatments. The authors describe our approach for analyzing and preprocessing Arabic text then we describe the stemming process. Finally, the latent model (LDA) is adapted to extract Arabic latent topics, the authors extracted significant topics of all texts, each theme is described by a particular distribution of descriptors then each text is represented on the vectors of these topics. The experiment of classification is conducted on in house corpus; latent topics are learned with LDA for different topic numbers K (25, 50, 75, and 100) then the authors compare this result with classification in the full words space. The results show that performances, in terms of precision, recall and f-measure, of classification in the reduced topics space outperform classification in full words space and when using LSI reduction.

Download Full-text