An Intelligent framework for E-Recruitment System Based on Text Categorization and Semantic Analysis


An Application of Latent Semantic Analysis for Text Categorization

International Journal of Computers Communications & Control ◽ 10.15837/ijccc.2015.3.1923 ◽ 2015 ◽ Vol 10 (3) ◽ pp. 357 ◽ Cited By ~ 6

Author(s): Gang Kou ◽ Yi Peng

Keyword(s): Latent Semantic Analysis ◽ Text Categorization ◽ Semantic Analysis

Local Latent Semantic Analysis Based on Support Vector Machine for Imbalanced Text Categorization

Communications in Computer and Information Science - Applied Informatics and Communication ◽ 10.1007/978-3-642-23235-0_42 ◽ 2011 ◽ pp. 321-329 ◽ Cited By ~ 1

Author(s): Yuan Wan ◽ Hengqing Tong ◽ Yanfang Deng

Keyword(s): Support Vector Machine ◽ Latent Semantic Analysis ◽ Text Categorization ◽ Semantic Analysis ◽ Support Vector

A CONCEPT VECTOR SPACE MODEL FOR SEMANTIC KERNELS

International Journal of Artificial Intelligence Tools ◽ 10.1142/s0218213009000123 ◽ 2009 ◽ Vol 18 (02) ◽ pp. 239-272 ◽ Cited By ~ 1

Author(s): SUJEEVAN ASEERVATHAM

Keyword(s): Vector Space ◽ Language Processing ◽ Text Categorization ◽ Semantic Analysis ◽ Similarity Measures ◽ Vector Space Model ◽ Inner Product ◽ Support Vector ◽ Linear Kernel ◽ Space Model

Kernels are widely used in Natural Language Processing as similarity measures within inner-product based learning methods like the Support Vector Machine. The Vector Space Model (VSM) is extensively used for the spatial representation of the documents. However, it is purely a statistical representation. In this paper, we present a Concept Vector Space Model (CVSM) representation which uses linguistic prior knowledge to capture the meanings of the documents. We also propose a linear kernel and a latent kernel for this space. The linear kernel takes advantage of the linguistic concepts whereas the latent kernel combines statistical and linguistic concepts. Indeed, the latter kernel uses latent concepts extracted by the Latent Semantic Analysis (LSA) in the CVSM. The kernels were evaluated on a text categorization task in the biomedical domain. The Ohsumed corpus, well known for being difficult to categorize, was used. The results have shown that the CVSM improves performance compared to the VSM.
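
The difference between a plain term-based inner product, a concept-based linear kernel, and a latent kernel can be illustrated with a small sketch. This is not the CVSM definition from the paper: the term-to-concept map `C`, the toy vocabulary, and the function names are all hypothetical, and the latent kernel is approximated here by projecting concept vectors onto the top singular vectors of a concept-by-document matrix.

```python
import numpy as np

# Hypothetical term -> concept map (rows: concepts, columns: terms).
# In practice this would come from a linguistic resource; values are made up.
terms = ["heart", "cardiac", "tumor", "neoplasm"]
concepts = ["HEART", "TUMOR"]
C = np.array([
    [1.0, 1.0, 0.0, 0.0],  # HEART covers "heart" and "cardiac"
    [0.0, 0.0, 1.0, 1.0],  # TUMOR covers "tumor" and "neoplasm"
])

def to_concepts(term_vec):
    """Map a term-count vector into the concept space."""
    return C @ term_vec

def linear_kernel(d1, d2):
    """Inner product of two documents in concept space."""
    return float(to_concepts(d1) @ to_concepts(d2))

def latent_projection(docs, k):
    """Top-k left singular vectors of the concept-by-document matrix (LSA step)."""
    M = np.stack([to_concepts(d) for d in docs], axis=1)
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :k]

def latent_kernel(d1, d2, U_k):
    """Inner product after projecting concept vectors onto the latent axes."""
    return float((U_k.T @ to_concepts(d1)) @ (U_k.T @ to_concepts(d2)))

# Three one-word documents over the toy vocabulary.
d_heart = np.array([1.0, 0.0, 0.0, 0.0])
d_cardiac = np.array([0.0, 1.0, 0.0, 0.0])
d_tumor = np.array([0.0, 0.0, 1.0, 0.0])

U_1 = latent_projection([d_heart, d_cardiac, d_tumor], k=1)
print("term-space inner product:", float(d_heart @ d_cardiac))
print("concept linear kernel:   ", linear_kernel(d_heart, d_cardiac))
print("latent kernel (k=1):     ", latent_kernel(d_heart, d_cardiac, U_1))
```

In this toy setting the two synonymous documents share no terms, so their term-space inner product is zero, while both concept-based kernels treat them as similar.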

Fast text categorization using concise semantic analysis

Pattern Recognition Letters ◽ 10.1016/j.patrec.2010.11.001 ◽ 2011 ◽ Vol 32 (3) ◽ pp. 441-448 ◽ Cited By ~ 26

Author(s): Zhixing Li ◽ Zhongyang Xiong ◽ Yufang Zhang ◽ Chunyong Liu ◽ Kuan Li

Keyword(s): Text Categorization ◽ Semantic Analysis

Latent semantic analysis for text categorization using neural network

Knowledge-Based Systems ◽ 10.1016/j.knosys.2008.03.045 ◽ 2008 ◽ Vol 21 (8) ◽ pp. 900-904 ◽ Cited By ~ 70

Author(s): Bo Yu ◽ Zong-ben Xu ◽ Cheng-hua Li

Keyword(s): Neural Network ◽ Latent Semantic Analysis ◽ Text Categorization ◽ Semantic Analysis

Text categorization based on combination of modified back propagation neural network and latent semantic analysis

Neural Computing and Applications ◽ 10.1007/s00521-008-0193-3 ◽ 2008 ◽ Vol 18 (8) ◽ pp. 875-881 ◽ Cited By ~ 24

Author(s): Wei Wang ◽ Bo Yu

Keyword(s): Neural Network ◽ Latent Semantic Analysis ◽ Text Categorization ◽ Semantic Analysis ◽ Back Propagation ◽ Back Propagation Neural Network

Wikipedia-based Semantic Interpretation for Natural Language Processing

Journal of Artificial Intelligence Research ◽ 10.1613/jair.2669 ◽ 2009 ◽ Vol 34 ◽ pp. 443-498 ◽ Cited By ~ 137

Author(s): E. Gabrilovich ◽ S. Markovitch

Keyword(s): Natural Language ◽ Language Processing ◽ Text Categorization ◽ Semantic Analysis ◽ Dimensional Space ◽ Semantic Relatedness ◽ Knowledge Bases ◽ Semantic Interpretation ◽ World Knowledge ◽ Fine Grained

Adequate representation of natural language semantics requires access to vast amounts of common sense and domain-specific world knowledge. Prior work in the field was based on purely statistical techniques that did not make use of background knowledge, on limited lexicographic knowledge bases such as WordNet, or on huge manual efforts such as the CYC project. Here we propose a novel method, called Explicit Semantic Analysis (ESA), for fine-grained semantic interpretation of unrestricted natural language texts. Our method represents meaning in a high-dimensional space of concepts derived from Wikipedia, the largest encyclopedia in existence. We explicitly represent the meaning of any text in terms of Wikipedia-based concepts. We evaluate the effectiveness of our method on text categorization and on computing the degree of semantic relatedness between fragments of natural language text. Using ESA results in significant improvements over the previous state of the art in both tasks. Importantly, due to the use of natural concepts, the ESA model is easy to explain to human users.
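
The idea of representing a text as a weighted vector over named concepts, and comparing two texts by the cosine of their concept vectors, can be sketched as follows. This is a toy illustration only: the `INVERTED_INDEX` table, its weights, and the function names are hypothetical stand-ins for a word-to-concept index that would in practice be built from Wikipedia articles.

```python
import math
from collections import defaultdict

# Hypothetical word -> {concept: weight} index. The weights are invented
# for illustration; a real index would be derived from a large corpus.
INVERTED_INDEX = {
    "bank": {"Bank": 0.9, "River": 0.3},
    "loan": {"Bank": 0.8},
    "money": {"Bank": 0.7},
    "river": {"River": 0.9},
    "water": {"River": 0.6},
}

def interpret(text):
    """Sum the concept vectors of all known words in the text."""
    vec = defaultdict(float)
    for word in text.lower().split():
        for concept, weight in INVERTED_INDEX.get(word, {}).items():
            vec[concept] += weight
    return dict(vec)

def relatedness(text_a, text_b):
    """Cosine similarity between the concept vectors of two texts."""
    va, vb = interpret(text_a), interpret(text_b)
    dot = sum(w * vb.get(c, 0.0) for c, w in va.items())
    norm_a = math.sqrt(sum(w * w for w in va.values()))
    norm_b = math.sqrt(sum(w * w for w in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # no known words in at least one text
    return dot / (norm_a * norm_b)

print(interpret("loan money"))
print(relatedness("loan money", "bank"))
print(relatedness("loan money", "river water"))
```

Because every dimension is a named concept, the vector returned by `interpret` can be read directly, which is the explainability property the abstract highlights.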

Concise semantic analysis based text categorization using modified hybrid union feature selection approach

2018 4th International Conference on Recent Advances in Information Technology (RAIT) ◽ 10.1109/rait.2018.8389057 ◽ 2018 ◽ Cited By ~ 2

Author(s): Amol P. Bhopale ◽ Sowmya Kamath S. ◽ Ashish Tiwari

Keyword(s): Feature Selection ◽ Text Categorization ◽ Semantic Analysis ◽ Selection Approach ◽ Feature Selection Approach

Local and Global Latent Semantic Analysis for Text Categorization

Information Retrieval and Management ◽ 10.4018/978-1-5225-5191-1.ch060 ◽ 2018 ◽ pp. 1360-1374 ◽ Cited By ~ 1

Author(s): Khadoudja Ghanem

Keyword(s): Information Retrieval ◽ High Precision ◽ Latent Semantic Analysis ◽ Classification Accuracy ◽ Text Categorization ◽ Semantic Analysis ◽ Experimental Results ◽ New Method ◽ Semantic Approach ◽ Document Categorization

In this paper, the authors propose a semantic approach to document categorization. The idea is to create a semantic index (a representative term vector) for each category by performing a local Latent Semantic Analysis (LSA) followed by a clustering process. A second LSA (global LSA) is then applied to a term-class matrix to retrieve the class most similar to the query (the document to classify), in the same way that LSA is used in Information Retrieval to retrieve the documents most similar to a query. The proposed system is evaluated on the popular 20 Newsgroups corpus. The results show the effectiveness of the method compared with the classic KNN and SVM classifiers and with methods presented in the literature: the new method achieves high precision and recall, and classification accuracy is significantly improved.
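
The global step, retrieving the most similar class for a query document from a term-class matrix, can be sketched with a truncated SVD. This is a minimal illustration under stated assumptions, not the authors' implementation: the term-class matrix `A` is taken as given (the local LSA and clustering that would build it are omitted), and the vocabulary, counts, and function names are invented for the example.

```python
import numpy as np

# Toy term-class matrix: rows are terms, columns are classes.
# All values are made up for illustration.
terms = ["goal", "match", "team", "cpu", "disk", "kernel"]
classes = ["sport", "computing"]
A = np.array([
    [3.0, 0.0],  # goal
    [2.0, 0.0],  # match
    [2.0, 1.0],  # team
    [0.0, 3.0],  # cpu
    [0.0, 2.0],  # disk
    [0.0, 2.0],  # kernel
])

def fit_lsa(matrix, k):
    """Truncated SVD: matrix is approximated by U_k @ diag(s_k) @ Vt_k."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def fold_in(query_counts, U_k, s_k):
    """Project a term-count vector into the latent space: q^T U_k S_k^{-1}."""
    return (query_counts @ U_k) / s_k

def classify(query_counts, U_k, s_k, Vt_k):
    """Return the index of the class with the highest cosine to the query."""
    q_hat = fold_in(query_counts, U_k, s_k)
    class_vecs = Vt_k.T  # one latent vector per class
    norms = np.linalg.norm(class_vecs, axis=1) * np.linalg.norm(q_hat)
    sims = (class_vecs @ q_hat) / (norms + 1e-12)
    return int(np.argmax(sims)), sims

U_k, s_k, Vt_k = fit_lsa(A, k=2)
query = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 0.0])  # a document containing "cpu disk"
label, sims = classify(query, U_k, s_k, Vt_k)
print(classes[label], sims)
```

With only two classes, `k=2` keeps the full rank of the toy matrix; on a real corpus `k` would be chosen much smaller than the vocabulary size and tuned empirically.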
