English and Taiwanese text categorization using N-gram based on Vector Space Model

2010 International Symposium On Information Theory & Its Applications ◽

10.1109/isita.2010.5649453 ◽

2010 ◽

Author(s):

Makoto Suzuki ◽

Naohide Yamagishi ◽

Yi-Ching Tsai ◽

Takashi Ishida ◽

Masayuki Goto

Keyword(s):

Vector Space ◽

Text Categorization ◽

Vector Space Model ◽

Space Model ◽

A Comprehensive Comparative Study Using Vector Space Model with K-Nearest Neighbor on Text Categorization Data

Asian Journal of Information Management ◽

10.3923/ajim.2008.14.22 ◽

2007 ◽

Vol 2 (1) ◽

pp. 14-22 ◽

Author(s):

Wa'el Musa Hadi ◽

Fadi Thabtah ◽

Salahideen Mousa ◽

Samer Al Hawari ◽

Ghassan Kanaan ◽

...

Keyword(s):

Comparative Study ◽

Vector Space ◽

Text Categorization ◽

Nearest Neighbor ◽

Vector Space Model ◽

K Nearest Neighbor ◽

A Study on Analysis of Malicious Codes Similarity Using N-Gram and Vector Space Model

2011 International Conference on Information Science and Applications ◽

10.1109/icisa.2011.5772331 ◽

2011 ◽

Author(s):

Donghwi Lee ◽

Won Hyung Park ◽

K J Kim

Keyword(s):

Vector Space ◽

Vector Space Model ◽

Space Model ◽

Malicious Codes ◽

The integration of a newly defined N-gram concept and vector space model for documents ranking

International Journal of Business Intelligence and Data Mining ◽

10.1504/ijbidm.2017.10007893 ◽

2017 ◽

Vol 1 (1) ◽

pp. 1

Author(s):

Mostafa Salama ◽

Wafaa Salah

Keyword(s):

Vector Space ◽

Vector Space Model ◽

Space Model ◽

A Text Categorization Method Based on SVM and Improved K-Means

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.427-429.2449 ◽

2013 ◽

Vol 427-429 ◽

pp. 2449-2453

Author(s):

Rong Ze Xia ◽

Yan Jia ◽

Hu Li

Keyword(s):

Support Vector Machine ◽

Vector Space ◽

High Performance ◽

Supervised Classification ◽

Text Categorization ◽

Clustering Algorithm ◽

Vector Space Model ◽

Classification Method ◽

Support Vector ◽

Traditional supervised classification methods such as the support vector machine (SVM) can achieve high performance in text categorization. However, the samples must first be hand-labeled before classification, which is a time-consuming task. Unsupervised methods such as k-means can also be used for text categorization, but traditional k-means is easily affected by a few isolated observations. In this paper, we propose a new text categorization method. First, we improve the traditional k-means clustering algorithm. The improved k-means is used to cluster the vectors in our vector space model. We then use the SVM to categorize the vectors preprocessed by the improved k-means. The experiments show that our algorithm outperforms the traditional SVM text categorization method.

A Concept Vector Space Model for Semantic Kernels

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213009000123 ◽

2009 ◽

Vol 18 (02) ◽

pp. 239-272 ◽

Author(s):

Sujeevan Aseervatham

Keyword(s):

Vector Space ◽

Language Processing ◽

Text Categorization ◽

Semantic Analysis ◽

Similarity Measures ◽

Vector Space Model ◽

Inner Product ◽

Support Vector ◽

Linear Kernel ◽

Kernels are widely used in Natural Language Processing as similarity measures within inner-product based learning methods like the Support Vector Machine. The Vector Space Model (VSM) is extensively used for the spatial representation of the documents. However, it is purely a statistical representation. In this paper, we present a Concept Vector Space Model (CVSM) representation which uses linguistic prior knowledge to capture the meanings of the documents. We also propose a linear kernel and a latent kernel for this space. The linear kernel takes advantage of the linguistic concepts whereas the latent kernel combines statistical and linguistic concepts. Indeed, the latter kernel uses latent concepts extracted by the Latent Semantic Analysis (LSA) in the CVSM. The kernels were evaluated on a text categorization task in the biomedical domain. The Ohsumed corpus, well known for being difficult to categorize, was used. The results have shown that the CVSM improves performance compared to the VSM.

An improved classification method for the common OLE file by N-gram analysis and vector space model

IET Conference on Wireless, Mobile and Sensor Networks 2007 (CCWMSN07) ◽

10.1049/cp:20070315 ◽

2007 ◽

Author(s):

Hong-Rong Yang ◽

Ming Xu ◽

Ning Zheng

Keyword(s):

Vector Space ◽

Vector Space Model ◽

Classification Method ◽

Space Model ◽

On a new model for automatic text categorization based on Vector Space Model

2010 IEEE International Conference on Systems, Man and Cybernetics ◽

10.1109/icsmc.2010.5642259 ◽

2010 ◽

Author(s):

Makoto Suzuki ◽

Naohide Yamagishi ◽

Takashi Ishida ◽

Masayuki Goto ◽

Shigeichi Hirasawa

Keyword(s):

Vector Space ◽

Text Categorization ◽

Vector Space Model ◽

New Model ◽

Space Model ◽

A new similarity measure for automatic text categorization based on vector space model

Proceedings of the Second International Conference on Advanced Wireless Information, Data, and Communication Technologies - AWICT 2017 ◽

10.1145/3231830.3231833 ◽

2017 ◽

Author(s):

Said Bahassine ◽

Abdellah Madani ◽

Mohamed Kissi

Keyword(s):

Vector Space ◽

Similarity Measure ◽

Text Categorization ◽

Vector Space Model ◽

Space Model ◽

An open-source framework for ExpFinder integrating N-gram vector space model and μCO-HITS

Software Impacts ◽

10.1016/j.simpa.2021.100069 ◽

2021 ◽

Vol 8 ◽

pp. 100069

Author(s):

Hung Du ◽

Yong-Bin Kang

Keyword(s):

Vector Space ◽

Open Source ◽

Vector Space Model ◽

Space Model ◽

Open Source Framework ◽

Extended Vector Space Model with Semantic Relatedness on Java Archive Search Engine

Jurnal Teknik Informatika dan Sistem Informasi ◽

10.28932/jutisi.v1i2.372 ◽

2015 ◽

Vol 1 (2) ◽

Author(s):

Oscar Karnalim

Keyword(s):

Vector Space ◽

Search Engine ◽

Vector Space Model ◽

Semantic Relatedness ◽
