Knowledge bases and description logics applications to natural language texts analysis

2020 ◽  
pp. 259-269
Author(s):  
H.I. Hoherchak

The article describes ways of applying knowledge bases to the analysis of natural language texts and to solving several of their processing tasks. The basic problems of natural language processing that underlie semantic analysis are considered: tokenization, parts-of-speech tagging, dependency parsing, and coreference resolution. The basic concepts of knowledge base theory are presented, and an approach to populating knowledge bases based on the Universal Dependencies framework and coreference resolution is proposed. Examples of practical applications of knowledge bases populated from natural language texts are given, including checking constructed syntactic and semantic models for consistency and question answering.
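As a rough illustration of populating a knowledge base from Universal Dependencies parses, the sketch below extracts (subject, predicate, object) triples from parsed sentences. It assumes the Stanza library as the UD parser (the article's own toolchain is not specified) and omits the coreference resolution step the article also relies on.

```python
# Minimal sketch: populate a triple store from Universal Dependencies parses.
# Assumes Stanza (stanfordnlp.github.io/stanza); run stanza.download("en") once.
import stanza

nlp = stanza.Pipeline("en", processors="tokenize,pos,lemma,depparse")

def extract_triples(text):
    """Collect (subject, predicate, object) triples from UD relations."""
    triples = []
    for sent in nlp(text).sentences:
        for word in sent.words:
            if word.deprel == "nsubj":            # subject of a predicate
                pred = sent.words[word.head - 1]  # head word (ids are 1-indexed)
                objs = [w for w in sent.words
                        if w.head == pred.id and w.deprel in ("obj", "iobj")]
                for obj in objs:
                    triples.append((word.lemma, pred.lemma, obj.lemma))
    return triples

print(extract_triples("The parser builds a dependency tree."))
# e.g. [('parser', 'build', 'tree')]
```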

2021 ◽  
Vol 47 (05) ◽  
Author(s):  
NGUYỄN CHÍ HIẾU

Knowledge graphs have been applied in many fields in recent years, such as search engines, semantic analysis, and question answering. However, building knowledge graphs still faces many obstacles in terms of methodologies, data, and tools. This paper introduces a novel methodology for building a knowledge graph from heterogeneous documents, using natural language processing and deep learning techniques. The resulting knowledge graph can be used in question answering systems and information retrieval, especially in the computing domain.
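The abstract does not detail the paper's deep-learning pipeline, but the entity-extraction step of building a graph from raw documents can be sketched as follows; spaCy and networkx are assumptions for illustration, not the authors' tools.

```python
# Illustrative sketch of building a small knowledge graph from documents:
# extract named entities and link those that co-occur in a sentence.
import spacy
import networkx as nx

nlp = spacy.load("en_core_web_sm")  # requires: python -m spacy download en_core_web_sm

def build_graph(documents):
    """Link entities that co-occur in a sentence with a generic edge."""
    graph = nx.MultiDiGraph()
    for doc in nlp.pipe(documents):
        for sent in doc.sents:
            ents = list(sent.ents)
            for i, head in enumerate(ents):
                for tail in ents[i + 1:]:
                    graph.add_edge(head.text, tail.text,
                                   relation="co-occurs", sentence=sent.text)
    return graph

g = build_graph(["Alan Turing worked at the University of Manchester."])
print(list(g.edges(data=True)))
```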


Author(s):  
Rahul Sharan Renu ◽  
Gregory Mocko

The objective of this research is to investigate the requirements and performance of parts-of-speech tagging of assembly work instructions. Natural Language Processing of assembly work instructions is required to perform data mining with the objective of knowledge reuse. Assembly work instructions are key process engineering elements that allow for predictable assembly quality of products and predictable assembly lead times. Authoring of assembly work instructions is a subjective process. It has been observed that most assembly work instructions are not grammatically complete sentences. It is hypothesized that this can lead to false parts-of-speech tagging (by Natural Language Processing tools). To test this hypothesis, two parts-of-speech taggers are used to tag 500 assembly work instructions (obtained from the automotive industry). The first parts-of-speech tagger is obtained from the Natural Language Toolkit (nltk.org) and the second from the Stanford Natural Language Processing Group (nlp.stanford.edu). For each of these taggers, two experiments are conducted. In the first experiment, the assembly work instructions are input to each tagger in raw form. In the second experiment, the assembly work instructions are preprocessed to make them grammatically complete and then input to the tagger. It is found that the Stanford tagger with the preprocessed assembly work instructions produces the fewest false parts-of-speech tags.
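The two experimental conditions can be reproduced in miniature with NLTK's tagger. The preprocessing rule shown (prepending an implied subject to make the imperative grammatically complete) is an assumption for illustration; the paper's actual preprocessing and the Stanford tagger comparison are not reproduced here.

```python
# Sketch of the raw vs. preprocessed tagging experiments using NLTK (nltk.org).
import nltk
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

instruction = "Torque bolt to 25 Nm"  # imperative, not a full sentence

# Experiment 1: tag the raw instruction.
raw_tags = nltk.pos_tag(nltk.word_tokenize(instruction))

# Experiment 2: preprocess into a grammatically complete sentence first.
complete = "You should " + instruction.lower() + "."
pre_tags = nltk.pos_tag(nltk.word_tokenize(complete))

print(raw_tags)  # 'Torque' is often mis-tagged as a noun in raw form
print(pre_tags)  # the verb reading is more likely in full-sentence context
```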


Author(s):  
Kiran Raj R

Today, everyone has a personal device to access the web, and every user tries to reach the knowledge they require through the internet. Most of this knowledge is stored in databases, and a user with limited database knowledge will have difficulty accessing it. Hence, there is a need for a system that permits users to access the knowledge within a database. The proposed method is to develop a system that takes a natural language question as input and produces an SQL query, which is then used to access the database and retrieve the information with ease. Tokenization, parts-of-speech tagging, lemmatization, parsing, and mapping are the steps involved in the process. The proposed project gives a view of using Natural Language Processing (NLP) to map an English query, via regular expressions, to SQL.
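The final mapping step can be sketched with a small regex-to-template table. The table and column names below (students, name, marks) are hypothetical; the paper's actual grammar and schema are not given in the abstract.

```python
# Minimal sketch of the regex-based NL-to-SQL mapping step.
import re

PATTERNS = [
    # "show/get/list <column> of <value>" -> SELECT <column> ... WHERE name = <value>
    (re.compile(r"(?:show|get|list)\s+(\w+)\s+of\s+(\w+)", re.I),
     "SELECT {0} FROM students WHERE name = '{1}';"),
    # "how many students" -> COUNT(*)
    (re.compile(r"how many students", re.I),
     "SELECT COUNT(*) FROM students;"),
]

def to_sql(question):
    """Return the first matching SQL template, or None if nothing matches."""
    for pattern, template in PATTERNS:
        match = pattern.search(question)
        if match:
            return template.format(*match.groups())
    return None

print(to_sql("Show marks of Alice"))  # SELECT marks FROM students WHERE name = 'Alice';
print(to_sql("How many students?"))   # SELECT COUNT(*) FROM students;
```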


Author(s):  
Ștefania-Eliza Berghia ◽  
Bogdan Pahomi ◽  
Daniel Volovici

Abstract In recent years, there has been increasing interest in the field of natural language processing. Determining which syntactic function is right for a specific word is an important task in this field, useful for a variety of applications such as text understanding, automatic translation, question-answering applications, and even e-learning systems. In the Romanian language this task is even harder because of the complexity of the grammar. The present paper falls within the field of “Natural Language Processing”, but it also blends in other concepts such as “Gamification”, “Social Choice Theory”, and “Wisdom of the Crowd”. There are two main purposes for developing the application in this paper: a) to give students support through which they can deepen their knowledge of the syntactic functions of the parts of speech, knowledge accumulated during teaching hours at school; and b) to collect data about how students make their choices and how they know which grammar role is correct for a specific word, these data being essential for replicating the learning process.


2018 ◽  
Vol 18 (1) ◽  
pp. 93-94
Author(s):  
Kiril Simov ◽  
Petya Osenova

Abstract With the availability of large language data online, cross-linked lexical resources (such as BabelNet, Predicate Matrix, and UBY), and semantically annotated corpora (SemCor, OntoNotes, etc.), more and more applications in Natural Language Processing (NLP) have started to exploit various semantic models. These semantic models have been created on the basis of LSA, clustering, word embeddings, deep learning, neural networks, etc., as well as abstract logical forms such as Minimal Recursion Semantics (MRS) and Abstract Meaning Representation (AMR). Additionally, the Linguistic Linked Open Data Cloud (LLOD Cloud) has been initiated, which interlinks linguistic data to improve NLP tasks; this cloud has been expanding enormously over the last four to five years. It includes corpora, lexicons, thesauri, and knowledge bases of various kinds, organized around appropriate ontologies such as LEMON. The semantic models behind the data organization, as well as the representation of the semantic resources themselves, are a challenge to the NLP community. NLP applications that rely extensively on the models discussed above include Machine Translation, Information Extraction, Question Answering, Text Simplification, etc.


Information ◽  
2021 ◽  
Vol 12 (11) ◽  
pp. 452
Author(s):  
Ammar Arbaaeen ◽  
Asadullah Shah

Within the space of question answering (QA) systems, the most critical module for improving overall performance is question analysis. Extracting the lexical semantics of a Natural Language (NL) question presents challenges at the syntactic and semantic levels for most QA systems, owing to the mismatch between the words posed by a user and the terms stored in the knowledge bases. Many studies have achieved encouraging results in lexical semantic resolution on the topic of word sense disambiguation (WSD), and several other works consider these challenges in the context of QA applications. Additionally, few scholars have examined the role of WSD in returning potential answers corresponding to particular questions. However, natural language processing (NLP) still faces several challenges in determining the precise meaning of various ambiguities. Therefore, the motivation of this work is to propose a novel knowledge-based sense disambiguation (KSD) method for resolving the lexical ambiguity of questions posed to QA systems. The major contribution is the proposed method, which incorporates multiple knowledge sources, namely the question’s metadata (date/GPS), context knowledge, and a domain ontology, into a shallow NLP pipeline. The proposed KSD method is developed into a tool for a mobile QA application that aims to determine the intended meaning of questions expressed by pilgrims. The experimental results reveal that the method achieves accuracy comparable to or better than the baselines in the pilgrimage domain.
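For context, the classic knowledge-based WSD baseline is the Lesk algorithm, shown below via NLTK. The paper's KSD method additionally folds in question metadata (date/GPS) and a domain ontology, which this baseline sketch does not model.

```python
# Baseline knowledge-based WSD with the Lesk algorithm (WordNet-backed).
import nltk
from nltk.wsd import lesk
from nltk.tokenize import word_tokenize

nltk.download("wordnet", quiet=True)
nltk.download("punkt", quiet=True)

question = "Where can I exchange money near the mosque?"
sense = lesk(word_tokenize(question), "exchange", pos="v")
print(sense, "-", sense.definition() if sense else "no sense found")
```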


2021 ◽  
Vol 26 (jai2021.26(2)) ◽  
pp. 88-95
Author(s):  
Hlybovets A ◽  
Tsaruk A

Within the framework of this paper, an analysis of question-answering software systems and their basic architectures has been carried out. With the development of machine learning technologies, the creation of natural language processing (NLP) engines, and the rising popularity of virtual personal assistants that use speech synthesis (text-to-speech), there is a growing need to develop question-answering systems that can provide personalized answers to users' questions. All major cloud providers offer frameworks for building question-answering systems, but personalized dialogue remains a problem. Personalization is very important: it places additional demands on a question-answering system’s ability to take user information into account while processing questions. Traditionally, a question-answering system (QAS) is developed as an application that contains a knowledge base, a user interface that provides the user with answers to questions, and a means of interaction with an expert. In this article, the authors analyze modern approaches to architecture development and build a system from building blocks that already exist on the market. The main criteria for the NLP modules were: support for the Ukrainian language, natural language understanding, automatic recognition of entities (attributes), the ability to construct a dialogue flow, the quality and completeness of documentation, API capabilities and integration with external systems, and support for integrating external knowledge bases. Following this analysis, the article proposes a detailed architecture for a question-answering subsystem with elements of self-learning in the Ukrainian language, together with a detailed description of the system's main semantic components (architecture components).
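The building-block composition the article advocates can be sketched schematically as below. All class names and interfaces here are hypothetical illustrations of the pattern, not the authors' actual components.

```python
# Schematic sketch: a QAS composed from interchangeable building blocks,
# with a personalization hook applied between NLU and knowledge-base lookup.
from dataclasses import dataclass
from typing import Protocol

@dataclass
class Intent:
    name: str
    entities: dict

class NLUModule(Protocol):
    def parse(self, question: str) -> Intent: ...

class KnowledgeBase(Protocol):
    def lookup(self, intent: Intent) -> str: ...

class QASystem:
    """Pipeline: NLU -> personalization -> knowledge-base lookup."""
    def __init__(self, nlu: NLUModule, kb: KnowledgeBase, profile: dict):
        self.nlu, self.kb, self.profile = nlu, kb, profile

    def answer(self, question: str) -> str:
        intent = self.nlu.parse(question)
        # Inject user-profile information (e.g. locale) into the query.
        intent.entities.setdefault("locale", self.profile.get("locale", "uk"))
        return self.kb.lookup(intent)
```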


2019 ◽  
Vol 9 (1) ◽  
pp. 88-106
Author(s):  
Irphan Ali ◽  
Divakar Yadav ◽  
Ashok Kumar Sharma

A question answering system aims to provide a correct and quick answer to a user's query from a knowledge base. Given the growth of digital information on the web, effective information retrieval systems are needed more than ever. Most recent question answering systems consult knowledge bases to answer a question, after parsing and transforming natural language queries into knowledge base-executable forms. In this article, the authors propose a semantic web-based approach to question answering that uses natural language processing to analyze and understand the user query. It employs a “Total Answer Relevance Score” to measure the relevance of each answer returned by the system. The results obtained are quite promising. The real-time performance of the system has been evaluated on answers extracted from the knowledge base.
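The abstract names a “Total Answer Relevance Score” but does not define it; the term-overlap formula below is purely a hypothetical stand-in to show how such a score might rank candidate answers.

```python
# Hypothetical illustration of ranking candidate answers by query-term overlap.
import string

def _terms(text: str) -> set:
    """Lower-cased word set with surrounding punctuation stripped."""
    return {w.strip(string.punctuation) for w in text.lower().split()}

def relevance_score(query: str, answer: str) -> float:
    """Fraction of query terms covered by the answer (0..1)."""
    q, a = _terms(query), _terms(answer)
    return len(q & a) / len(q) if q else 0.0

answers = ["Paris is the capital of France.", "France borders Spain."]
query = "What is the capital of France?"
ranked = sorted(answers, key=lambda a: relevance_score(query, a), reverse=True)
print(ranked[0])  # "Paris is the capital of France."
```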


2009 ◽  
Vol 34 ◽  
pp. 443-498 ◽  
Author(s):  
E. Gabrilovich ◽  
S. Markovitch

Adequate representation of natural language semantics requires access to vast amounts of common sense and domain-specific world knowledge. Prior work in the field was based on purely statistical techniques that did not make use of background knowledge, on limited lexicographic knowledge bases such as WordNet, or on huge manual efforts such as the CYC project. Here we propose a novel method, called Explicit Semantic Analysis (ESA), for fine-grained semantic interpretation of unrestricted natural language texts. Our method represents meaning in a high-dimensional space of concepts derived from Wikipedia, the largest encyclopedia in existence. We explicitly represent the meaning of any text in terms of Wikipedia-based concepts. We evaluate the effectiveness of our method on text categorization and on computing the degree of semantic relatedness between fragments of natural language text. Using ESA results in significant improvements over the previous state of the art in both tasks. Importantly, due to the use of natural concepts, the ESA model is easy to explain to human users.
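The core ESA idea, representing a text as a vector of similarities to Wikipedia concepts and comparing texts in that concept space, can be sketched in a few lines. The three toy “articles” below stand in for Wikipedia, and sklearn is an assumption for illustration, not the authors' implementation.

```python
# Minimal sketch of Explicit Semantic Analysis over a toy concept inventory.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

concepts = {  # toy stand-ins for Wikipedia articles
    "Bank (finance)": "bank money loan deposit interest account",
    "River": "river water bank shore stream flow",
    "Computer": "computer program software hardware code",
}

vectorizer = TfidfVectorizer()
concept_matrix = vectorizer.fit_transform(concepts.values())

def esa_vector(text):
    """Interpretation vector: one weight per concept (Wikipedia article)."""
    return cosine_similarity(vectorizer.transform([text]), concept_matrix)[0]

def relatedness(text_a, text_b):
    """Semantic relatedness = cosine similarity in concept space."""
    return cosine_similarity([esa_vector(text_a)], [esa_vector(text_b)])[0][0]

print(relatedness("money deposit", "loan interest"))  # high: both map to Bank
print(relatedness("money deposit", "software code"))  # low: different concepts
```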


2020 ◽  
Vol 34 (05) ◽  
pp. 9346-9353
Author(s):  
Bingcong Xue ◽  
Sen Hu ◽  
Lei Zou ◽  
Jiashu Cheng

Paraphrase, i.e., differing textual realizations of the same meaning, has proven useful for many natural language processing (NLP) applications. Collecting paraphrases for predicates in knowledge bases (KBs) is key to comprehending the RDF triples in KBs. Existing works have published paraphrase datasets automatically extracted from large corpora, but these datasets either contain too many redundant pairs or do not cover enough predicates, shortcomings that cannot be fixed by computers alone and require human input. This paper presents a full process for collecting large-scale, high-quality paraphrase dictionaries for predicates in knowledge bases, which takes advantage of existing datasets and combines machine mining with crowdsourcing. The resulting dataset comprises 2284 distinct predicates in DBpedia and 31130 paraphrase pairs in total, and its quality is a great leap over previous works. It is then demonstrated that such paraphrase dictionaries can greatly help natural language processing tasks such as question answering and language generation. The dictionary is published for further research.
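A predicate paraphrase dictionary of this kind is typically used to map question phrases onto KB predicates, as sketched below. The entries are invented examples in the style of DBpedia predicates, not taken from the authors' published dataset.

```python
# Illustration: match a question phrase to a KB predicate via paraphrases.
PARAPHRASES = {
    "dbo:birthPlace": ["born in", "place of birth", "native of"],
    "dbo:author":     ["written by", "author of", "penned by"],
}

def match_predicate(question: str):
    """Return the KB predicate whose paraphrase appears in the question."""
    q = question.lower()
    for predicate, phrases in PARAPHRASES.items():
        if any(phrase in q for phrase in phrases):
            return predicate
    return None

print(match_predicate("Which city was Marie Curie born in?"))  # dbo:birthPlace
```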

