A Novel Multilingual Text Categorization System using Latent Semantic Indexing

2011 ◽

Vol 181-182 ◽

pp. 830-835

Author(s):

Min Song Li

Keyword(s):

Support Vector Machine ◽

Text Categorization ◽

Latent Semantic Indexing ◽

Classification Performance ◽

Compact Representation ◽

Support Vector ◽

Semantic Features ◽

Semantic Indexing ◽

Feature Extraction Method ◽

Feature Subspace

Latent Semantic Indexing (LSI) is an effective feature extraction method that captures the underlying latent semantic structure between words in documents. However, using LSI to select the feature subspace is probably not the most appropriate choice for text categorization, because it orders the extracted features by their variance, not by their classification power. We propose a method based on the support vector machine to extract features and select a latent semantic subspace suited to classification. Experimental results indicate that the method improves classification performance while yielding a more compact representation.
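
The contrast between ordering latent features by variance and ordering them by classification power can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the selection rule, so ranking dimensions by the absolute weight of a linear SVM is an assumption, as are the toy matrix, the hand-written subgradient trainer, and all function names (`lsi_project`, `train_linear_svm`, `rank_latent_features`).

```python
import numpy as np

def lsi_project(X, k):
    """Project documents (rows of X) onto the top-k latent dimensions via SVD."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * s[:k], s[:k]

def train_linear_svm(Z, y, lam=0.01, lr=0.1, epochs=200):
    """Batch subgradient descent on the L2-regularised hinge loss (y in {-1, +1})."""
    w, b = np.zeros(Z.shape[1]), 0.0
    for _ in range(epochs):
        active = y * (Z @ w + b) < 1          # documents violating the margin
        grad_w = lam * w - (y[active, None] * Z[active]).sum(axis=0) / len(y)
        grad_b = -y[active].sum() / len(y)
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b

def rank_latent_features(Z, y):
    """Order latent dimensions by |SVM weight| (assumed criterion), not by singular value."""
    Z_std = Z / (Z.std(axis=0) + 1e-12)       # put all dimensions on one scale
    w, _ = train_linear_svm(Z_std, y)
    return np.argsort(-np.abs(w))

# Hypothetical document-term counts: terms 0-1 form a frequent topic unrelated
# to the label; term 2 marks class +1 and term 3 marks class -1.
X = np.array([[6, 6, 1, 0],
              [0, 0, 1, 0],
              [6, 6, 0, 1],
              [0, 0, 0, 1]], dtype=float)
y = np.array([1, 1, -1, -1])

Z, singular_values = lsi_project(X, k=3)
order = rank_latent_features(Z, y)
print("singular values:", np.round(singular_values, 3))
print("dimensions ranked by SVM weight:", order)
```

On this toy matrix the dimension that separates the two classes is only the second largest by singular value, yet it is ranked first by SVM weight, so keeping a single SVM-selected dimension would retain the class information that a variance-ordered cut at one dimension would discard.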

Ensemble multi-label text categorization based on rotation forest and latent semantic indexing

Expert Systems with Applications ◽

10.1016/j.eswa.2016.03.041 ◽

2016 ◽

Vol 57 ◽

pp. 1-11 ◽

Cited By ~ 28

Author(s):

Haytham Elghazel ◽

Alex Aussem ◽

Ouadie Gharroudi ◽

Wafa Saadaoui

Keyword(s):

Text Categorization ◽

Latent Semantic Indexing ◽

Semantic Indexing ◽

Rotation Forest

Class Selection Based Iterative Supervised Latent Semantic Indexing for Text Categorization

2009 International Conference on Information Engineering and Computer Science ◽

10.1109/iciecs.2009.5366100 ◽

2009 ◽

Author(s):

Ming-Bo Wang ◽

Cheng-Lin Liu

Keyword(s):

Text Categorization ◽

Latent Semantic Indexing ◽

Semantic Indexing

Review on Semantic Text Categorization

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.644-650.2323 ◽

2014 ◽

Vol 644-650 ◽

pp. 2323-2328

Author(s):

Gui Xian Xu ◽

Chun Jie Li ◽

Yong Ji Li ◽

Yue Ma ◽

Xiao Lan Ma ◽

...

Keyword(s):

Text Classification ◽

Text Categorization ◽

Latent Semantic Indexing ◽

Semantic Indexing ◽

Classification Methods ◽

History Of ◽

Similarity Computation

Text classification has been a hot research topic in recent years. This paper reviews the history of text classification, summarizes some common classification methods, and mainly introduces semantics-based classification methods. In particular, it elaborates on text classification based on ontology, on similarity computation, and on latent semantic indexing.

A two-stage feature selection method for text categorization by using category correlation degree and latent semantic indexing

Journal of Shanghai Jiaotong University (Science) ◽

10.1007/s12204-015-1586-y ◽

2015 ◽

Vol 20 (1) ◽

pp. 44-50 ◽

Cited By ~ 5

Author(s):

Fei Wang ◽

Cai-hong Li ◽

Jing-shan Wang ◽

Jiao Xu ◽

Lian Li

Keyword(s):

Feature Selection ◽

Text Categorization ◽

Feature Selection Method ◽

Latent Semantic Indexing ◽

Selection Method ◽

Semantic Indexing ◽

Two Stage ◽

Correlation Degree

Support vector machine for customized email filtering based on improving latent semantic indexing

2005 International Conference on Machine Learning and Cybernetics ◽

10.1109/icmlc.2005.1527599 ◽

2005 ◽

Cited By ~ 1

Author(s):

Qing Yang ◽

Fang-Min Li

Keyword(s):

Support Vector Machine ◽

Latent Semantic Indexing ◽

Support Vector ◽

Semantic Indexing ◽

Email Filtering

Analyzing Large-Scale Proteomics Projects with Latent Semantic Indexing

Journal of Proteome Research ◽

10.1021/pr070461k ◽

2008 ◽

Vol 7 (1) ◽

pp. 182-191 ◽

Cited By ~ 33

Author(s):

Sebastian Klie ◽

Lennart Martens ◽

Juan Antonio Vizcaíno ◽

Richard Côté ◽

Phil Jones ◽

...

Keyword(s):

Large Scale ◽

Latent Semantic Indexing ◽

Semantic Indexing

Design of Text Categorization System Based on SVM

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.532-533.1191 ◽

2012 ◽

Vol 532-533 ◽

pp. 1191-1195 ◽

Cited By ~ 1

Author(s):

Zhen Yan Liu ◽

Wei Ping Wang ◽

Yong Wang

Keyword(s):

Feature Extraction ◽

Feature Selection ◽

Text Categorization ◽

Feature Selection Method ◽

Extraction Methods ◽

Support Vector ◽

Text Representation ◽

Text Feature ◽

Categorization System ◽

Classifier Training

This paper introduces the design of a text categorization system based on the Support Vector Machine (SVM). It analyzes the high-dimensional character of text data and the reasons why SVM is suitable for text categorization. The system is constructed according to its data flow and consists of three subsystems: text representation, classifier training, and text classification. The core of the system is classifier training, but text representation directly influences the accuracy of the classifier and the performance of the system. The text feature vector space can be built with different kinds of feature selection and feature extraction methods. No research has established which method is best, so many feature selection and feature extraction methods are implemented in this system. For a specific classification task, every feature selection method and every feature extraction method is tested, and the set of best-performing methods is then adopted.
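
The three-subsystem data flow and the "test every method, adopt the best" step might look roughly like the sketch below. It is an illustration under stated assumptions, not the system described in the paper: the bag-of-words representation, the two candidate methods (document-frequency selection and LSI projection), the held-out accuracy criterion, the hand-written SVM trainer, the toy documents, and every function name are hypothetical.

```python
import numpy as np

def represent(docs, vocab):
    """Subsystem 1, text representation: bag-of-words counts over a fixed vocabulary."""
    index = {term: j for j, term in enumerate(vocab)}
    X = np.zeros((len(docs), len(vocab)))
    for i, doc in enumerate(docs):
        for token in doc.lower().split():
            if token in index:
                X[i, index[token]] += 1
    return X

def train_svm(X, y, lam=0.01, lr=0.1, epochs=300):
    """Subsystem 2, classifier training: linear SVM by hinge-loss subgradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        active = y * (X @ w + b) < 1          # documents violating the margin
        grad_w = lam * w - (y[active, None] * X[active]).sum(axis=0) / len(y)
        grad_b = -y[active].sum() / len(y)
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b

def classify(model, X):
    """Subsystem 3, text classification: sign of the linear decision function."""
    w, b = model
    return np.where(X @ w + b >= 0, 1, -1)

def fit_df_selection(X, k):
    """Feature selection candidate: keep the k terms with the highest document frequency."""
    keep = np.argsort(-(X > 0).sum(axis=0), kind="stable")[:k]
    return lambda M: M[:, keep]

def fit_lsi(X, k):
    """Feature extraction candidate: project onto the top-k right singular vectors."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return lambda M: M @ Vt[:k].T

def adopt_best(candidates, X_tr, y_tr, X_va, y_va):
    """Test every candidate on held-out documents and return the most accurate one."""
    scores = {}
    for name, fit in candidates.items():
        transform = fit(X_tr)                 # fitted on training data only
        model = train_svm(transform(X_tr), y_tr)
        scores[name] = float((classify(model, transform(X_va)) == y_va).mean())
    return max(scores, key=scores.get), scores

# Hypothetical two-class task: +1 = sports, -1 = finance.
vocab = ["the", "goal", "match", "stock", "market"]
train_docs = ["the goal the match", "the match goal goal",
              "the stock market", "the market the stock stock"]
valid_docs = ["goal match", "the stock"]
y_train, y_valid = np.array([1, 1, -1, -1]), np.array([1, -1])

candidates = {"df_top3": lambda X: fit_df_selection(X, 3),
              "lsi_2": lambda X: fit_lsi(X, 2)}
best, scores = adopt_best(candidates, represent(train_docs, vocab), y_train,
                          represent(valid_docs, vocab), y_valid)
print("adopted method:", best, "| validation accuracy per method:", scores)
```

Keeping each candidate behind the same "fit on training data, return a transform" interface is one way to let new selection or extraction methods be added without touching the training and classification subsystems.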

Adaptive label-driven scaling for latent semantic indexing

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '08 ◽

10.1145/1390334.1390525 ◽

2008 ◽

Cited By ~ 1

Author(s):

Xiaojun Quan ◽

Enhong Chen ◽

Qiming Luo ◽

Hui Xiong

Keyword(s):

Latent Semantic Indexing ◽

Semantic Indexing

Biomedical Document Clustering Based on Accelerated Symbiotic Organisms Search Algorithm

International Journal of Swarm Intelligence Research ◽

10.4018/ijsir.2021100109 ◽

2021 ◽

Vol 12 (4) ◽

pp. 169-185

Author(s):

Saida Ishak Boushaki ◽

Omar Bendjeghaba ◽

Nadjet Kamel

Keyword(s):

Clustering Algorithm ◽

Search Algorithm ◽

Clustering Algorithms ◽

Document Clustering ◽

Latent Semantic Indexing ◽

Research Area ◽

Semantic Indexing ◽

Local Optima ◽

Symbiotic Organisms Search ◽

Symbiotic Organisms

Clustering is an important unsupervised analysis technique for big data mining. It finds application in several domains, including biomedical documents of the MEDLINE database. Document clustering algorithms based on metaheuristics are an active research area. However, these algorithms tend to get trapped in local optima, require many parameters to be tuned, and need the documents to be indexed by a high-dimensional matrix using the traditional vector space model. To overcome these limitations, this paper proposes a new parameter-free document clustering algorithm (ASOS-LSI). It is based on the recent symbiotic organisms search (SOS) metaheuristic and enhanced by an acceleration technique. Furthermore, the documents are represented by semantic indexing based on the well-known latent semantic indexing (LSI). Experiments on well-known biomedical document datasets show the significant superiority of ASOS-LSI over five well-known algorithms in terms of compactness, F-measure, purity, misclassified documents, entropy, and runtime.

A Method Based on Support Vector Machine for Feature Selection of Latent Semantic Features
