Functional Partitioning of Ontologies for Natural Language Query Completion in Question Answering Systems

Query completion systems are well studied in the context of information retrieval systems that handle keyword queries. However, Natural Language Interface to Databases (NLIDB) systems that focus on syntactically correct and semantically complete queries to obtain high precision answers require a fundamentally different approach to the query completion problem as opposed to IR systems. To the best of our knowledge, we are first to focus on the problem of query completion for NLIDB systems. In particular, we introduce a novel concept of functional partitioning of an ontology and then design algorithms to intelligently use the components obtained from functional partitioning to extend a state-of-the-art NLIDB system to produce accurate and semantically meaningful query completions in the absence of query logs. We test the proposed query completion framework on multiple benchmark datasets and demonstrate the efficacy of our technique empirically.

Download Full-text

Single-shot Semantic Matching Network for Moment Localization in Videos

ACM Transactions on Multimedia Computing Communications and Applications ◽

10.1145/3441577 ◽

2021 ◽

Vol 17 (3) ◽

pp. 1-14

Author(s):

Xinfang Liu ◽

Xiushan Nie ◽

Junya Teng ◽

Li Lian ◽

Yilong Yin

Keyword(s):

Natural Language ◽

Fixed Number ◽

Single Shot ◽

Semantic Matching ◽

Matching Network ◽

Attention Model ◽

Semantic Relationships ◽

Natural Language Query ◽

Benchmark Datasets ◽

Short Memory

Moment localization in videos using natural language refers to finding the most relevant segment from videos given a natural language query. Most of the existing methods require video segment candidates for further matching with the query, which leads to extra computational costs, and they may also not locate the relevant moments under any length evaluated. To address these issues, we present a lightweight single-shot semantic matching network (SSMN) to avoid the complex computations required to match the query and the segment candidates, and the proposed SSMN can locate moments of any length theoretically. Using the proposed SSMN, video features are first uniformly sampled to a fixed number, while the query sentence features are generated and enhanced by GloVe, long-term short memory (LSTM), and soft-attention modules. Subsequently, the video features and sentence features are fed to an enhanced cross-modal attention model to mine the semantic relationships between vision and language. Finally, a score predictor and a location predictor are designed to locate the start and stop indexes of the query moment. We evaluate the proposed method on two benchmark datasets and the experimental results demonstrate that SSMN outperforms state-of-the-art methods in both precision and efficiency.

Download Full-text

Composing Questions through Conceptual Authoring

Computational Linguistics ◽

10.1162/coli.2007.33.1.105 ◽

2007 ◽

Vol 33 (1) ◽

pp. 105-133 ◽

Cited By ~ 26

Author(s):

Catalina Hallett ◽

Donia Scott ◽

Richard Power

Keyword(s):

Natural Language ◽

Question Answering ◽

Free Text ◽

Risk Averse ◽

Proof Of Concept ◽

Concept System ◽

Complex Queries ◽

Extensive Training ◽

Question Answering Systems ◽

Medical Histories

This article describes a method for composing fluent and complex natural language questions, while avoiding the standard pitfalls of free text queries. The method, based on Conceptual Authoring, is targeted at question-answering systems where reliability and transparency are critical, and where users cannot be expected to undergo extensive training in question composition. This scenario is found in most corporate domains, especially in applications that are risk-averse. We present a proof-of-concept system we have developed: a question-answering interface to a large repository of medical histories in the area of cancer. We show that the method allows users to successfully and reliably compose complex queries with minimal training.

Download Full-text

A terminological simplification transformation for natural language question-answering systems

10.3115/981131.981164 ◽

1986 ◽

Cited By ~ 4

Author(s):

David G. Stallard

Keyword(s):

Natural Language ◽

Question Answering ◽

Question Answering Systems ◽

Natural Language Question ◽

Language Question

Download Full-text

BUILD KNOWLEDGE GRAPH FROM HETEROGENEOUS DOCUMENTS

Journal of Science and Technology - IUH ◽

10.46242/jst-iuh.v47i05.761 ◽

2021 ◽

Vol 47 (05) ◽

Author(s):

NGUYỄN CHÍ HIẾU

Keyword(s):

Information Retrieval ◽

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Question Answering ◽

Semantic Analysis ◽

Knowledge Graph ◽

Question Answering Systems ◽

Knowledge Graphs

Knowledge Graphs are applied in many fields such as search engines, semantic analysis, and question answering in recent years. However, there are many obstacles for building knowledge graphs as methodologies, data and tools. This paper introduces a novel methodology to build knowledge graph from heterogeneous documents. We use the methodologies of Natural Language Processing and deep learning to build this graph. The knowledge graph can use in Question answering systems and Information retrieval especially in Computing domain

Download Full-text

LIS4: Lesk Inspired Sense Specific Semantic Similarity using WordNet

Journal of Information & Knowledge Management ◽

10.1142/s0219649221500064 ◽

2021 ◽

pp. 2150006

Author(s):

Saravanakumar Kandasamy ◽

Aswani Kumar Cherukuri

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Semantic Similarity ◽

Language Processing ◽

Gold Standard ◽

Question Answering ◽

Knowledge Based ◽

Benchmark Datasets ◽

Processing Information

Semantic similarity quantification between concepts is one of the inevitable parts in domains like Natural Language Processing, Information Retrieval, Question Answering, etc. to understand the text and their relationships better. Last few decades, many measures have been proposed by incorporating various corpus-based and knowledge-based resources. WordNet and Wikipedia are two of the Knowledge-based resources. The contribution of WordNet in the above said domain is enormous due to its richness in defining a word and all of its relationship with others. In this paper, we proposed an approach to quantify the similarity between concepts that exploits the synsets and the gloss definitions of different concepts using WordNet. Our method considers the gloss definitions, contextual words that are helping in defining a word, synsets of contextual word and the confidence of occurrence of a word in other word’s definition for calculating the similarity. The evaluation based on different gold standard benchmark datasets shows the efficiency of our system in comparison with other existing taxonomical and definitional measures.

Download Full-text

A Review on Question Generation from Natural Language Text

ACM Transactions on Information Systems ◽

10.1145/3468889 ◽

2022 ◽

Vol 40 (1) ◽

pp. 1-43

Author(s):

Ruqing Zhang ◽

Jiafeng Guo ◽

Lu Chen ◽

Yixing Fan ◽

Xueqi Cheng

Keyword(s):

Natural Language ◽

Question Answering ◽

Data Augmentation ◽

Text Structure ◽

Current Status ◽

Question Generation ◽

Natural Language Text ◽

Question Answering Systems ◽

The Right ◽

Language Text

Question generation is an important yet challenging problem in Artificial Intelligence (AI), which aims to generate natural and relevant questions from various input formats, e.g., natural language text, structure database, knowledge base, and image. In this article, we focus on question generation from natural language text, which has received tremendous interest in recent years due to the widespread applications such as data augmentation for question answering systems. During the past decades, many different question generation models have been proposed, from traditional rule-based methods to advanced neural network-based methods. Since there have been a large variety of research works proposed, we believe it is the right time to summarize the current status, learn from existing methodologies, and gain some insights for future development. In contrast to existing reviews, in this survey, we try to provide a more comprehensive taxonomy of question generation tasks from three different perspectives, i.e., the types of the input context text, the target answer, and the generated question. We take a deep look into existing models from different dimensions to analyze their underlying ideas, major design principles, and training strategies We compare these models through benchmark tasks to obtain an empirical understanding of the existing techniques. Moreover, we discuss what is missing in the current literature and what are the promising and desired future directions.

Download Full-text

An Empirical Study of Content Understanding in Conversational Question Answering

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6257 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7578-7585

Author(s):

Ting-Rui Chiang ◽

Hao-Tong Ye ◽

Yun-Nung Chen

Keyword(s):

Natural Language Processing ◽

Empirical Study ◽

Language Processing ◽

Question Answering ◽

Source Code ◽

Content Understanding ◽

Question Answering Systems ◽

Benchmark Datasets ◽

Context Free ◽

Answering Questions

With a lot of work about context-free question answering systems, there is an emerging trend of conversational question answering models in the natural language processing field. Thanks to the recently collected datasets, including QuAC and CoQA, there has been more work on conversational question answering, and recent work has achieved competitive performance on both datasets. However, to best of our knowledge, two important questions for conversational comprehension research have not been well studied: 1) How well can the benchmark dataset reflect models' content understanding? 2) Do the models well utilize the conversation content when answering questions? To investigate these questions, we design different training settings, testing settings, as well as an attack to verify the models' capability of content understanding on QuAC and CoQA. The experimental results indicate some potential hazards in the benchmark datasets, QuAC and CoQA, for conversational comprehension research. Our analysis also sheds light on both what models may learn and how datasets may bias the models. With deep investigation of the task, it is believed that this work can benefit the future progress of conversation comprehension. The source code is available at https://github.com/MiuLab/CQA-Study.

Download Full-text

Super Agent Chatbot “3S” Sebagai Media Informasi Menggunakan Metoda Natural Language Processing(NLP)

JURNAL TEKNOLOGI DAN OPEN SOURCE ◽

10.36378/jtos.v2i1.144 ◽

2019 ◽

Vol 2 (1) ◽

pp. 53-64

Author(s):

Herwin H Herwin

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Web Site ◽

Question Answering ◽

Question Answering Systems ◽

Portal Website

STMIK Amik Riau memiliki portal pada website http://www.sar.ac.id difungsikan sebagai media penyebaran informasi bagi sivitas akademika dan stakeholder. Rerata pengunjung setiap hari dalam 3 bulan terakhir adalah 150 kunjungan, namun terjadi peningkatan pada saat penerimaan mahasiswa di setiap tahun akademik. Hal ini mengindikasikan terjadinya peningkatan minat masyarakat untuk mengetahui informasi STMIK Amik Riau. Sayangnya, sampai saat ini pemanfaatan portal web site masih satu arah, dari STMIK Amik Riau ke stakeholder dan masyarakat, tidak terjadi sebaliknya. Komunikasi stakeholder dengan PT sehubungan dengan muatan yang ada di dalam portal menggunakan media sosial dan tidak terintegrasi dengan web. Begitu juga dengan masukan, koreksi, tanggapan, maupun komunikasi lain menggunakan media sosial. Sampai saat ini, masyarakat yang mengunjungi portal website baik masyarakat luas, maupun stakeholder tidak dapat dideteksi waktu berkunjung sehingga tidak dapat disapa dengan filosofi “3S”, padahal masyarakat luas yang telah berkunjung merupakan pasar potensial untuk di edukasi. Masyarakat yang berkunjung ke portal website, dengan sopan di sapa oleh sistem, kemudian dilanjutkan dengan komunikasi langsung, tersedia mesin yang siap memberikan salam dan melayani setiap pertanyaan yang diajukan oleh pengunjung. Penelitian ini bertujuan membuat chatbot yang mampu berkomunikasi dengan pengunjung website. Chatbot yang telah dibuat diberi nama STMIK Amik Riau Intelligence Virtual Information disingkat SILVI. Chatbot dibuat berdasarkan Question Answering Systems (QAS), bekerja dengan algoritma kemiripan antara dua teks. Penelitian ini menghasilkan aplikasi yang siap digunakan, diberi nama SILVI, mampu berkomunikasi dengan pengunjung website. Chatbot mengoptimalkan komunikasi seolah tidak menyadari, tetap menganggap lawan bicara adalah pegawai yang tepat dalam tugas pokok dan fungsi.

Download Full-text

MQALD: Evaluating the impact of modifiers in question answering over knowledge graphs

Semantic Web ◽

10.3233/sw-210440 ◽

2021 ◽

pp. 1-17

Author(s):

Lucia Siciliani ◽

Pierpaolo Basile ◽

Pasquale Lops ◽

Giovanni Semeraro

Keyword(s):

Natural Language ◽

Question Answering ◽

Query Language ◽

State Of The Art ◽

Translation Process ◽

Specific Data ◽

Data Query ◽

Question Answering Systems ◽

Knowledge Graphs ◽

The Impact

Question Answering (QA) over Knowledge Graphs (KG) aims to develop a system that is capable of answering users’ questions using the information coming from one or multiple Knowledge Graphs, like DBpedia, Wikidata, and so on. Question Answering systems need to translate the user’s question, written using natural language, into a query formulated through a specific data query language that is compliant with the underlying KG. This translation process is already non-trivial when trying to answer simple questions that involve a single triple pattern. It becomes even more troublesome when trying to cope with questions that require modifiers in the final query, i.e., aggregate functions, query forms, and so on. The attention over this last aspect is growing but has never been thoroughly addressed by the existing literature. Starting from the latest advances in this field, we want to further step in this direction. This work aims to provide a publicly available dataset designed for evaluating the performance of a QA system in translating articulated questions into a specific data query language. This dataset has also been used to evaluate three QA systems available at the state of the art.

Download Full-text

Building Graph for Events and Time in Natural Language Text

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.c8419.019320 ◽

2020 ◽

Vol 9 (3) ◽

pp. 581-586

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Information Extraction ◽

Language Processing ◽

Question Answering ◽

Relation Extraction ◽

Event Extraction ◽

Event Time ◽

Time Graph ◽

Question Answering Systems

Events and time are two major key terms in natural language processing due to the various event-oriented tasks these are become an essential terms in information extraction. In natural language processing and information extraction or retrieval event and time leads to several applications like text summaries, documents summaries, and question answering systems. In this paper, we present events-time graph as a new way of construction for event-time based information from text. In this event-time graph nodes are events, whereas edges represent the temporal and co-reference relations between events. In many of the previous researches of natural language processing mainly individually focused on extraction tasks and in domain-specific way but in this work we present extraction and representation of the relationship between events- time by representing with event time graph construction. Our overall system construction is in three-step process that performs event extraction, time extraction, and representing relation extraction. Each step is at a performance level comparable with the state of the art. We present Event extraction on MUC data corpus annotated with events mentions on which we train and evaluate our model. Next, we present time extraction the model of times tested for several news articles from Wikipedia corpus. Next is to represent event time relation by representation by next constructing event time graphs. Finally, we evaluate the overall quality of event graphs with the evaluation metrics and conclude the observations of the entire work

Download Full-text