Concept-Based Information Retrieval Using Explicit Semantic Analysis

Ofer Egozi; Shaul Markovitch; Evgeniy Gabrilovich

doi:10.1145/1961209.1961211

Semantic Relatedness Estimation using the Layout Information of Wikipedia Articles

International Journal of Cognitive Informatics and Natural Intelligence ◽

10.4018/ijcini.2013040103 ◽

2013 ◽

Vol 7 (2) ◽

pp. 30-48 ◽

Cited By ~ 2

Author(s):

Patrick Chan ◽

Yoshinori Hijikata ◽

Toshiya Kuramochi ◽

Shogo Nishida

Keyword(s):

Information Retrieval ◽

Word Frequency ◽

Language Processing ◽

Semantic Analysis ◽

State Of The Art ◽

Empirical Evaluation ◽

Low Frequency ◽

Semantic Relatedness ◽

Conventional Approach ◽

Explicit Semantic Analysis

Computing the semantic relatedness between two words or phrases is an important problem in fields such as information retrieval and natural language processing. Explicit Semantic Analysis (ESA), a state-of-the-art approach to solve the problem uses word frequency to estimate relevance. Therefore, the relevance of words with low frequency cannot always be well estimated. To improve the relevance estimate of low-frequency words and concepts, the authors apply regression to word frequency, its location in an article, and its text style to calculate the relevance. The relevance value is subsequently used to compute semantic relatedness. Empirical evaluation shows that, for low-frequency words, the authors’ method achieves better estimate of semantic relatedness over ESA. Furthermore, when all words of the dataset are considered, the combination of the authors’ proposed method and the conventional approach outperforms the conventional approach alone.

Download Full-text

Designing a Chat-Bot for College Information using Information Retrieval and Automatic Text Summarization Techniques

Current Chinese Computer Science ◽

10.2174/2665997201999201022191540 ◽

2020 ◽

Vol 01 ◽

Author(s):

Radha Guha

Keyword(s):

Information Retrieval ◽

Language Processing ◽

Latent Dirichlet Allocation ◽

Semantic Analysis ◽

Text Summarization ◽

The Internet ◽

Specific Domain ◽

User Query ◽

College Information ◽

Chat Bot

Background:: In the era of information overload it is very difficult for a human reader to make sense of the vast information available in the internet quickly. Even for a specific domain like college or university website it may be difficult for a user to browse through all the links to get the relevant answers quickly. Objective:: In this scenario, design of a chat-bot which can answer questions related to college information and compare between colleges will be very useful and novel. Methods:: In this paper a novel conversational interface chat-bot application with information retrieval and text summariza-tion skill is designed and implemented. Firstly this chat-bot has a simple dialog skill when it can understand the user query intent, it responds from the stored collection of answers. Secondly for unknown queries, this chat-bot can search the internet and then perform text summarization using advanced techniques of natural language processing (NLP) and text mining (TM). Results:: The advancement of NLP capability of information retrieval and text summarization using machine learning tech-niques of Latent Semantic Analysis(LSI), Latent Dirichlet Allocation (LDA), Word2Vec, Global Vector (GloVe) and Tex-tRank are reviewed and compared in this paper first before implementing them for the chat-bot design. This chat-bot im-proves user experience tremendously by getting answers to specific queries concisely which takes less time than to read the entire document. Students, parents and faculty can get the answers for variety of information like admission criteria, fees, course offerings, notice board, attendance, grades, placements, faculty profile, research papers and patents etc. more effi-ciently. Conclusion:: The purpose of this paper was to follow the advancement in NLP technologies and implement them in a novel application.

Download Full-text

Cross-Language Source Code Plagiarism Detection using Explicit Semantic Analysis and Scored Greedy String Tilling

Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020 ◽

10.1145/3383583.3398594 ◽

2020 ◽

Cited By ~ 1

Author(s):

Tomáš Foltýnek ◽

Richard Všianský ◽

Norman Meuschke ◽

Dita Dlabolová ◽

Bela Gipp

Keyword(s):

Semantic Analysis ◽

Source Code ◽

Plagiarism Detection ◽

Cross Language ◽

Explicit Semantic Analysis

Download Full-text

Contribution to Semantic Analysis of Arabic Language

Advances in Artificial Intelligence ◽

10.1155/2012/620461 ◽

2012 ◽

Vol 2012 ◽

pp. 1-8 ◽

Cited By ~ 6

Author(s):

Anis Zouaghi ◽

Mounir Zrigui ◽

Georges Antoniadis ◽

Laroussi Merhbene

Keyword(s):

Information Retrieval ◽

Semantic Analysis ◽

String Matching ◽

Ambiguous Word ◽

Arabic Language ◽

New Approach ◽

Matching Algorithm ◽

Ambiguous Words ◽

Lesk Algorithm ◽

Context Of Use

We propose a new approach for determining the adequate sense of Arabic words. For that, we propose an algorithm based on information retrieval measures to identify the context of use that is the closest to the sentence containing the word to be disambiguated. The contexts of use represent a set of sentences that indicates a particular sense of the ambiguous word. These contexts are generated using the words that define the senses of the ambiguous words, the exact string-matching algorithm, and the corpus. We use the measures employed in the domain of information retrieval, Harman, Croft, and Okapi combined to the Lesk algorithm, to assign the correct sense of those proposed.

Download Full-text

Concept-based document models using explicit semantic analysis

2012 IEEE International Conference on Granular Computing ◽

10.1109/grc.2012.6468647 ◽

2012 ◽

Author(s):

Jing Luo ◽

Bo Meng ◽

Xinhui Tu ◽

Maofu Liu

Keyword(s):

Semantic Analysis ◽

Document Models ◽

Explicit Semantic Analysis

Download Full-text

Kernel latent semantic analysis using an information retrieval based kernel

Proceeding of the 18th ACM conference on Information and knowledge management - CIKM '09 ◽

10.1145/1645953.1646214 ◽

2009 ◽

Cited By ~ 1

Author(s):

Laurence A.F. Park ◽

Kotagiri Ramamohanarao

Keyword(s):

Information Retrieval ◽

Latent Semantic Analysis ◽

Semantic Analysis

Download Full-text

Information Retrieval by Semantic Analysis and Visualization of the Concept Space of D-Lib® Magazine

D-Lib Magazine ◽

10.1045/october2002-zhang ◽

2002 ◽

Vol 8 (10) ◽

Cited By ~ 8

Author(s):

Junliang Zhang ◽

Javed Mostafa ◽

Himansu Tripathy

Keyword(s):

Information Retrieval ◽

Semantic Analysis ◽

Concept Space

Download Full-text

BUILD KNOWLEDGE GRAPH FROM HETEROGENEOUS DOCUMENTS

Journal of Science and Technology - IUH ◽

10.46242/jst-iuh.v47i05.761 ◽

2021 ◽

Vol 47 (05) ◽

Author(s):

NGUYỄN CHÍ HIẾU

Keyword(s):

Information Retrieval ◽

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Question Answering ◽

Semantic Analysis ◽

Knowledge Graph ◽

Question Answering Systems ◽

Knowledge Graphs

Knowledge Graphs are applied in many fields such as search engines, semantic analysis, and question answering in recent years. However, there are many obstacles for building knowledge graphs as methodologies, data and tools. This paper introduces a novel methodology to build knowledge graph from heterogeneous documents. We use the methodologies of Natural Language Processing and deep learning to build this graph. The knowledge graph can use in Question answering systems and Information retrieval especially in Computing domain

Download Full-text

ALBIS

Sociotechnical Enterprise Information Systems Design and Integration ◽

10.4018/978-1-4666-3664-4.ch012 ◽

2013 ◽

pp. 188-206

Author(s):

Lerina Aversano ◽

Carmine Grasso ◽

Maria Tortorella

Keyword(s):

Information Retrieval ◽

Software Maintenance ◽

Business Processes ◽

Semantic Analysis ◽

Software Systems ◽

Complex Task ◽

Process Performance ◽

Definition Of ◽

Support Software

The evaluation of the alignment level existing between a business process and the supporting software systems is a critical concern for an organization, as the higher the alignment level is, the better the process performance is. Monitoring the alignment implies the characterization of all the items it involves and definition of measures for evaluating it. This is a complex task, and the availability of automatic tools for supporting evaluation and evolution activities may be precious. This chapter presents the ALBIS Environment (Aligning Business Processes and Information Systems), designed to support software maintenance tasks. In particular, the proposed environment allows the modeling and tracing between business and software entities and the measurement of their alignment degree. An information retrieval approach is embedded in ALBIS based on two processing phases including syntactic and semantic analysis. The usefulness of the environment is discussed through two case studies.

Download Full-text

Local and Global Latent Semantic Analysis for Text Categorization

International Journal of Information Retrieval Research ◽

10.4018/ijirr.2014070101 ◽

2014 ◽

Vol 4 (3) ◽

pp. 1-13

Author(s):

Khadoudja Ghanem

Keyword(s):

Information Retrieval ◽

High Precision ◽

Latent Semantic Analysis ◽

Classification Accuracy ◽

Text Categorization ◽

Semantic Analysis ◽

Experimental Results ◽

Semantic Approach ◽

Document Categorization ◽

Second Use

In this paper the authors propose a semantic approach to document categorization. The idea is to create for each category a semantic index (representative term vector) by performing a local Latent Semantic Analysis (LSA) followed by a clustering process. A second use of LSA (Global LSA) is adopted on a term-Class matrix in order to retrieve the class which is the most similar to the query (document to classify) in the same way where the LSA is used to retrieve documents which are the most similar to a query in Information Retrieval. The proposed system is evaluated on a popular dataset which is 20 Newsgroup corpus. Obtained results show the effectiveness of the method compared with those obtained with the classic KNN and SVM classifiers as well as with methods presented in the literature. Experimental results show that the new method has high precision and recall rates and classification accuracy is significantly improved.

Download Full-text