Parallel Dynamic Topic Modeling via Evolving Topic Adjustment and Term Weighting Scheme

In this paper, through an analysis of the structure of web news texts, we propose an improved term weighting measure for hot topic detection and a topic weighting scheme for hot topic ranking. Experimental comparison shows that the method is effective and that the resulting ranking of hot topics is closer to reality.


A Supervised Term Weighting Scheme for Multi-class Text Categorization

Intelligent Computing Methodologies - Lecture Notes in Computer Science ◽

10.1007/978-3-319-63315-2_38 ◽

2017 ◽

pp. 436-447 ◽

Cited By ~ 1

Author(s):

Yiwei Gu ◽

Xiaodong Gu

Keyword(s):

Text Categorization ◽

Weighting Scheme ◽

Term Weighting


Design Consideration for Improved Term Weighting Scheme for Pornographic Web sites

Pattern Analysis, Intelligent Security and the Internet of Things - Advances in Intelligent Systems and Computing ◽

10.1007/978-3-319-17398-6_25 ◽

2015 ◽

pp. 275-285

Author(s):

Hafsah Salam ◽

Mohd Aizaini Maarof ◽

Anazida Zainal

Keyword(s):

Web Sites ◽

Weighting Scheme ◽

Design Consideration ◽

Term Weighting


Customized term weighting scheme for document classification

2008 International Conference on Computer and Communication Engineering ◽

10.1109/iccce.2008.4580615 ◽

2008 ◽

Cited By ~ 2

Author(s):

C.M.X. Benjamin ◽

W.L. Woon ◽

K.S.D. Wong

Keyword(s):

Weighting Scheme ◽

Term Weighting


A Part-Of-Speech term weighting scheme for biomedical information retrieval

Journal of Biomedical Informatics ◽

10.1016/j.jbi.2016.08.026 ◽

2016 ◽

Vol 63 ◽

pp. 379-389 ◽

Cited By ~ 22

Author(s):

Yanshan Wang ◽

Stephen Wu ◽

Dingcheng Li ◽

Saeed Mehrabi ◽

Hongfang Liu

Keyword(s):

Information Retrieval ◽

Weighting Scheme ◽

Term Weighting ◽

Part Of Speech ◽

Biomedical Information Retrieval


A novel term weighting scheme for text classification: TF-MONO

Journal of Informetrics ◽

10.1016/j.joi.2020.101076 ◽

2020 ◽

Vol 14 (4) ◽

pp. 101076 ◽

Cited By ~ 1

Author(s):

Turgut Dogan ◽

Alper Kursat Uysal

Keyword(s):

Text Classification ◽

Weighting Scheme ◽

Term Weighting


Selective word encoding for effective text representation

Turkish Journal of Electrical Engineering & Computer Sciences ◽

10.3906/elk-1805-138 ◽

2019 ◽

pp. 1028-1040

Author(s):

SAVAŞ ÖZKAN ◽

AKIN ÖZKAN

Keyword(s):

Loss Function ◽

High Performance ◽

Class Imbalance ◽

Semantic Content ◽

Weighting Scheme ◽

Compact Representation ◽

Text Representation ◽

Class Imbalance Problem ◽

Term Weighting ◽

Word Level

Determining the category of a text document from its semantic content is well motivated in the literature and has been studied extensively across applications. A compact representation of the text is a fundamental step toward precise results, and much work has concentrated on improving it. In particular, techniques that aggregate word-level representations are the mainstream approach to the problem. In this paper, we tackle text representation to achieve high performance in different text classification tasks. We present three contributions. First, to encode the word-level representations of each text, we adapt a trainable orderless aggregation algorithm that transforms word vectors into a more discriminative text-level representation. Second, we propose a term-weighting scheme that computes the relative importance of words from context, based on their connection with the problem, in an end-to-end learning manner. Third, we present a weighted loss function to mitigate the class-imbalance problem between categories. To evaluate performance, we collect two datasets that are available on the web: Turkish parliament records (written speeches of four major political parties, with 30731/7683 train/test documents) and newspaper articles (daily articles by columnists, with 16000/3200 train/test documents). The proposed method significantly improves on the baseline techniques (VLAD and Fisher Vector) and achieves true prediction accuracies of 0.823 and 0.878 for party membership and article category estimation, respectively. These results indicate that the proposed contributions (the trainable word-encoding model, the trainable term-weighting scheme, and the weighted loss function) significantly outperform the baselines.
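
The abstract above mentions a weighted loss function for class imbalance but does not give its form. As a rough illustration only, the sketch below shows one common variant: cross-entropy with inverse-frequency class weights. The weighting formula and the function names `inverse_frequency_weights` and `weighted_cross_entropy` are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed heuristic: weight(c) = n_samples / (n_classes * count(c)).

    Rare classes receive weights above 1, frequent classes below 1.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class), normalised by the total weight.

    probs  : list of dicts mapping class label -> predicted probability
    labels : list of true class labels, aligned with probs
    weights: dict mapping class label -> weight
    """
    weighted_sum = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    total_weight = sum(weights[y] for y in labels)
    return weighted_sum / total_weight


# Hypothetical toy data: three documents of class "a", one of class "b".
labels = ["a", "a", "a", "b"]
weights = inverse_frequency_weights(labels)
probs = [{"a": 0.5, "b": 0.5} for _ in labels]
loss = weighted_cross_entropy(probs, labels, weights)
```

With this weighting, a mistake on the minority class contributes more to the loss than a mistake on the majority class, which is the general idea behind weighted losses for imbalanced categories.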
