LIS4: Lesk Inspired Sense Specific Semantic Similarity using WordNet

Journal of Information & Knowledge Management ◽

10.1142/s0219649221500064 ◽

2021 ◽

pp. 2150006

Author(s):

Saravanakumar Kandasamy ◽

Aswani Kumar Cherukuri

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Semantic Similarity ◽

Language Processing ◽

Gold Standard ◽

Question Answering ◽

Knowledge Based ◽

Benchmark Datasets ◽

Processing Information

Quantifying semantic similarity between concepts is an essential part of domains such as Natural Language Processing, Information Retrieval, and Question Answering, where it helps in understanding texts and the relationships among them. Over the last few decades, many measures have been proposed that incorporate various corpus-based and knowledge-based resources. WordNet and Wikipedia are two such knowledge-based resources. WordNet's contribution to these domains is enormous because of the richness with which it defines a word and all of its relationships with other words. In this paper, we propose an approach to quantifying the similarity between concepts that exploits the synsets and gloss definitions of different concepts in WordNet. Our method considers the gloss definitions, the contextual words that help define a word, the synsets of those contextual words, and the confidence of occurrence of a word in another word's definition when calculating similarity. Evaluation on different gold standard benchmark datasets shows the efficiency of our system in comparison with existing taxonomical and definitional measures.
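The gloss-based idea described in this abstract can be illustrated with a minimal sketch. This is not the LIS4 algorithm: it is a plain Lesk-style word overlap between two definitions. The glosses, the stopword list, and the names `TOY_GLOSSES`, `content_words`, and `gloss_overlap` are all assumptions made for illustration, standing in for real WordNet data.

```python
# Toy Lesk-style gloss overlap. NOT the LIS4 method from the paper.
# The glosses are hand-written stand-ins for WordNet definitions (assumption).
STOPWORDS = {"a", "an", "the", "of", "or", "and", "to", "in",
             "that", "for", "by", "on", "with", "is", "used"}

TOY_GLOSSES = {
    "car": "a motor vehicle with four wheels used for transporting people on roads",
    "bus": "a large motor vehicle used for transporting many people on roads",
    "banana": "a long curved yellow fruit that grows on a tropical plant",
}


def content_words(gloss):
    """Return the set of non-stopword tokens in a gloss."""
    return {w for w in gloss.lower().split() if w not in STOPWORDS}


def gloss_overlap(word_a, word_b, glosses=TOY_GLOSSES):
    """Jaccard overlap between the content words of two glosses."""
    a = content_words(glosses[word_a])
    b = content_words(glosses[word_b])
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    print(gloss_overlap("car", "bus"))     # shared definition words
    print(gloss_overlap("car", "banana"))  # no shared definition words
```

With these toy glosses, "car" and "bus" share five of nine distinct content words, while "car" and "banana" share none. The method in the abstract additionally uses the synsets of contextual words and an occurrence confidence, which this sketch does not model.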

Report on the 4th Joint Workshop on Bibliometric-Enhanced Information Retrieval and Natural Language Processing for Digital Libraries at SIGIR 2019

ACM SIGIR Forum ◽

10.1145/3458553.3458554 ◽

2019 ◽

Vol 53 (2) ◽

pp. 3-10

Author(s):

Muthu Kumar Chandrasekaran ◽

Philipp Mayr

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Research And Development ◽

Language Processing ◽

Digital Libraries ◽

State Of The Art ◽

Shared Task ◽

Processing Information ◽

Joint Workshop

The 4th joint BIRNDL workshop was held at the 42nd ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019) in Paris, France. BIRNDL 2019 intended to stimulate IR researchers and digital library professionals to elaborate on new approaches in natural language processing, information retrieval, scientometrics, and recommendation techniques that can advance the state-of-the-art in scholarly document understanding, analysis, and retrieval at scale. The workshop incorporated different paper sessions and the 5th edition of the CL-SciSumm Shared Task.

Applying Natural Language Processing, Information Retrieval and Machine Learning to Decision Support in Medical Coordination in an Emergency Medicine Context

2015 IEEE 28th International Symposium on Computer-Based Medical Systems ◽

10.1109/cbms.2015.82 ◽

2015 ◽

Cited By ~ 2

Author(s):

Juliana Tarossi Pollettini ◽

Hugo Cesar Pessotti ◽

Antonio Pazin Filho ◽

Evandro Eduardo Seron Ruiz ◽

Mario Sergio Adolfi Junior

Keyword(s):

Machine Learning ◽

Emergency Medicine ◽

Information Retrieval ◽

Natural Language Processing ◽

Decision Support ◽

Natural Language ◽

Language Processing ◽

Processing Information

BUILD KNOWLEDGE GRAPH FROM HETEROGENEOUS DOCUMENTS

Journal of Science and Technology - IUH ◽

10.46242/jst-iuh.v47i05.761 ◽

2021 ◽

Vol 47 (05) ◽

Author(s):

NGUYỄN CHÍ HIẾU

Keyword(s):

Information Retrieval ◽

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Question Answering ◽

Semantic Analysis ◽

Knowledge Graph ◽

Question Answering Systems ◽

Knowledge Graphs

In recent years, knowledge graphs have been applied in many fields, such as search engines, semantic analysis, and question answering. However, building knowledge graphs faces many obstacles, including methodologies, data, and tools. This paper introduces a novel methodology for building a knowledge graph from heterogeneous documents. We use methods from Natural Language Processing and deep learning to build this graph. The knowledge graph can be used in question answering systems and information retrieval, especially in the computing domain.

A Composite Natural Language Processing and Information Retrieval Approach to Question Answering Using a Structured Knowledge Base

International Journal of Semantic Computing ◽

10.1142/s1793351x17400141 ◽

2017 ◽

Vol 11 (03) ◽

pp. 345-371

Author(s):

Avani Chandurkar ◽

Ajay Bansal

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Knowledge Base ◽

Language Processing ◽

Question Answering ◽

Automated System ◽

Free Form ◽

Question Answering System ◽

Novel Approach

Since the inception of the World Wide Web, the amount of data on the Internet has become tremendous, which makes navigating it quite difficult for the user. As users struggle to navigate this wealth of information, the need for an automated system that can extract the required information becomes urgent. This paper presents a Question Answering system to ease the process of information retrieval. Question Answering systems have been around for quite some time and are a sub-field of information retrieval and natural language processing. The task of any Question Answering system is to seek an answer to a free-form factual question. The difficulty of pinpointing and verifying the precise answer makes question answering more challenging than the simple information retrieval done by search engines. The research objective of this paper is to develop a novel approach to Question Answering based on a composition of conventional approaches from Information Retrieval (IR) and Natural Language Processing (NLP). The focus is on using a structured and annotated knowledge base instead of an unstructured one. The knowledge base used here is DBpedia, and the final system is evaluated on the Text REtrieval Conference (TREC) 2004 questions dataset.

Knowledge-based sentence semantic similarity: algebraical properties

Progress in Artificial Intelligence ◽

10.1007/s13748-021-00248-0 ◽

2021 ◽

Author(s):

Mourad Oussalah ◽

Muhidin Mohamed

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Semantic Similarity ◽

Language Processing ◽

Similarity Measures ◽

Canonical Extension ◽

Similarity Score ◽

Semantic Similarity Measure ◽

Sentence Similarity

Determining the extent to which two text snippets are semantically equivalent is a well-researched topic in the areas of natural language processing, information retrieval and text summarization. Sentence-to-sentence similarity scoring is extensively used in both generic and query-based summarization of documents as a significance or similarity indicator. Nevertheless, most of these applications use the semantic similarity measure only as a tool, without paying attention to the inherent properties of such tools, which ultimately restrict the scope and technical soundness of the underlying applications. This paper aims to help fill this gap. It investigates three popular WordNet hierarchical semantic similarity measures, namely path-length, Wu and Palmer, and Leacock and Chodorow, in terms of both algebraical and intuitive properties, highlighting their inherent limitations and theoretical constraints. We have especially examined properties related to the range and scope of the semantic similarity score, incremental monotonicity evolution, and monotonicity with respect to the hyponymy/hypernymy relationship, as well as a set of interactive properties. The extension from word semantic similarity to sentence similarity has also been investigated using a pairwise canonical extension. Properties of the resulting sentence-to-sentence similarity are examined and scrutinized. Next, to overcome the inherent limitations of WordNet semantic similarity in accounting for various part-of-speech word categories, a WordNet “All word-To-Noun conversion” that makes use of the Categorial Variation Database (CatVar) is put forward and evaluated on a publicly available dataset, with a comparison against some state-of-the-art methods. The findings demonstrate the feasibility of the proposal and open up new opportunities in information retrieval and natural language processing tasks.
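The three hierarchical measures named in this abstract can be sketched on a tiny hand-made taxonomy. This is an illustrative sketch, not the paper's code: the `PARENT` tree is invented, and the conventions (root depth of 1, path length counted in edges, and the Leacock and Chodorow form -log((edges + 1) / (2 * max depth))) are assumptions that follow one common formulation. Other sources count nodes or depth differently.

```python
import math

# Hypothetical toy is-a taxonomy (child -> parent); not real WordNet data.
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}


def ancestors(node):
    """Chain from the node up to the root, node first."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def depth(node):
    """Depth counted in nodes, so the root has depth 1 (assumed convention)."""
    return len(ancestors(node))


def lcs(a, b):
    """Lowest common subsumer: the deepest ancestor shared by a and b."""
    shared = set(ancestors(b))
    for node in ancestors(a):
        if node in shared:
            return node
    raise ValueError("nodes do not share a root")


def path_edges(a, b):
    """Number of edges on the shortest is-a path between a and b."""
    c = lcs(a, b)
    return (depth(a) - depth(c)) + (depth(b) - depth(c))


MAX_DEPTH = max(depth(n) for n in PARENT)


def path_sim(a, b):
    """Path-length similarity: 1 / (1 + edges); always in (0, 1]."""
    return 1.0 / (1 + path_edges(a, b))


def wup_sim(a, b):
    """Wu and Palmer: 2 * depth(lcs) / (depth(a) + depth(b)); in (0, 1]."""
    return 2.0 * depth(lcs(a, b)) / (depth(a) + depth(b))


def lch_sim(a, b):
    """Leacock and Chodorow: -log((edges + 1) / (2 * max depth))."""
    return -math.log((path_edges(a, b) + 1) / (2.0 * MAX_DEPTH))


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "car"), ("dog", "dog")]:
        print(pair, path_sim(*pair), wup_sim(*pair), lch_sim(*pair))
```

On this toy tree, `dog` and `cat` score 1/3 (path), 2/3 (Wu and Palmer), and log 2 (Leacock and Chodorow). The identical pair `dog`, `dog` scores 1.0 on the first two measures but about 1.79 on the third, a small example of the range differences the abstract refers to.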

A Brief Survey of Question Answering Systems

International Journal of Artificial Intelligence & Applications ◽

10.5121/ijaia.2021.12501 ◽

2021 ◽

Vol 12 (5) ◽

pp. 01-07

Author(s):

Michael Caballero

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Question Answering ◽

Open Domain ◽

Knowledge Based ◽

Current State ◽

Introductory Overview ◽

Building Systems ◽

Question Answering Systems

Question Answering (QA) is a subfield of Natural Language Processing (NLP) and computer science focused on building systems that automatically answer questions from humans in natural language. This survey summarizes the history and current state of the field and is intended as an introductory overview of QA systems. After discussing QA history, this paper summarizes the different approaches to the architecture of QA systems -- whether they are closed or open-domain and whether they are text-based, knowledge-based, or hybrid systems. Lastly, some common datasets in this field are introduced and different evaluation metrics are discussed.

Automated Identification of Semantic Similarity between Concepts of Textual Business Rules

International Journal of Intelligent Engineering and Systems ◽

10.22266/ijies2021.0228.15 ◽

2021 ◽

Vol 14 (1) ◽

pp. 147-156

Author(s):

Abdellatif Haj ◽

Youssef Balouki ◽

Taoufiq Gadi ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Semantic Similarity ◽

Language Processing ◽

Business Rules ◽

Automated Identification ◽

Standard Format ◽

Knowledge Based ◽

Special Case

Business Rules (BR) are usually written by different stakeholders, which makes them prone to containing different designations for the same concept. Such a problem can be the source of poorly orchestrated behaviors. However, the identification of synonyms is manual or entirely neglected in most approaches dealing with natural language Business Rules. In this paper, we present an automated approach to identifying semantic similarity between terms in textual BR using Natural Language Processing and a knowledge-based algorithm refined with heuristics. Our method is unique in that it also identifies abbreviations/expansions (as a special case of synonymy), which is not possible using a dictionary. The results are then saved in a standard format (SBVR) for reusability. Our approach was applied to more than 160 BR statements divided into three cases, with an accuracy between 69% and 87%, which suggests that it is an indispensable enhancement for other methods dealing with textual BR.
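The abstract does not describe how abbreviations are matched to their expansions. As one possible illustration only, the sketch below checks whether a short term is the initialism of a multi-word phrase. The function name `is_initialism` and the matching rule are hypothetical and are not taken from the paper.

```python
def is_initialism(abbrev, phrase):
    """Return True if abbrev equals the first letters of the words in phrase.

    Hypothetical heuristic for illustration; the paper's own rules are not
    given in the abstract.
    """
    words = phrase.split()
    if len(words) < 2:
        return False
    initials = "".join(word[0] for word in words)
    return abbrev.upper() == initials.upper()


if __name__ == "__main__":
    print(is_initialism("BR", "Business Rules"))
    print(is_initialism("BR", "Business Process"))
```

A rule this simple would miss expansions that contain function words or that abbreviate by truncation, which is presumably where additional heuristics would be needed.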
