Natural language interface to video database

2001 ◽  
Vol 7 (1) ◽  
pp. 1-27 ◽  
Author(s):  
T. R. GAYATRI ◽  
S. RAMAN

In this paper, we discuss a natural language interface to a database of structured textual descriptions in the form of annotations of video objects. The interface maps the natural language query input onto the annotation structures. The language processing is done in three phases: deriving expectations and implications from the input words; disambiguating noun implications and slot-filling prepositional expectations; and finally, disambiguating verbal expectations. The system has been tested with different types of user inputs, including ill-formed sentences, and studied for erroneous inputs and for different types of portability issues.
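
As a rough illustration of the slot-filling phase only (the lexicon entries, slot names and query below are hypothetical, not the authors' implementation), a head noun in the query can be modelled as raising expectations that later prepositional phrases fill:

    # Hypothetical sketch: expectation-driven slot filling for a video-annotation query.
    # Lexicon entries, slot names and the query are illustrative, not from the paper.
    LEXICON = {
        "scene": {"expects": {"with": "object", "in": "location"}},
        "clip":  {"expects": {"of": "object", "by": "author"}},
    }

    def fill_slots(tokens):
        """Walk the query left to right, filling prepositional expectations."""
        frame, expects = {}, {}
        for i, word in enumerate(tokens):
            if word in LEXICON:                 # head noun raises expectations
                frame["head"] = word
                expects = LEXICON[word]["expects"]
            elif word in expects:               # preposition selects a slot
                slot = expects[word]
                frame[slot] = " ".join(tokens[i + 1:i + 2])  # naive filler: next token
        return frame

    print(fill_slots("scene with ball in park".split()))
    # -> {'head': 'scene', 'object': 'ball', 'location': 'park'}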

Author(s):  
Maitri Patel and Dr Hemant D Vasava

Data, information and knowledge are central to this rapidly moving and growing world: almost any kind of information can now be found on the Internet. This is very useful, including for the academic world, but alongside it plagiarism is widely practised. Plagiarism degrades the originality of work, and fraudulently using someone's original work without acknowledging them is becoming common. Sometimes teachers or professors cannot identify the plagiarised material, so higher-education systems nowadays use different types of comparison tools. Our idea is to match a number of different documents, such as student assignments, against each other to find out whether students copied each other's work, and also to compare an ideal answer sheet for a particular subject examination with the students' test sheets. In both cases the documents are compared and ranked on the basis of their similarity; both approaches are of one kind, namely document comparison. Many methods for identifying plagiarism are already in use, so we can compare them and develop them further where needed.
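
A minimal sketch of this kind of similarity-based ranking, assuming a simple bag-of-words cosine measure (one common way to compare documents, not necessarily the authors' method; the answer texts are made up):

    # Rank student submissions by similarity to an ideal answer sheet.
    import math
    from collections import Counter

    def cosine(a, b):
        """Cosine similarity between two bag-of-words vectors."""
        va, vb = Counter(a.lower().split()), Counter(b.lower().split())
        dot = sum(va[w] * vb[w] for w in va)
        norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
        return dot / norm if norm else 0.0

    answers = {"student_A": "the cell membrane controls transport",
               "student_B": "transport is controlled by the cell membrane",
               "student_C": "plants convert light into chemical energy"}
    ideal = "the cell membrane controls transport of molecules"

    for name, text in sorted(answers.items(), key=lambda kv: -cosine(ideal, kv[1])):
        print(name, round(cosine(ideal, text), 2))

The same pairwise measure can also be applied between the submissions themselves to flag pairs of students whose work is suspiciously similar.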


2013 ◽  
Vol 765-767 ◽  
pp. 1684-1688
Author(s):  
Xiao Na Ma ◽  
Bo Wei Li ◽  
Lin Wang

In order to allow users who have not studied a computer-related major to find useful information in a manner familiar to them, searching interactively using Chinese natural language has become a hot research topic. However, because of the complexity of Chinese, building a formal model of the language remains a difficult research problem. In this paper, by analysing the types of query sentences expressed in SQL over a specific database, we put forward the rules and context-free grammars of a restricted Chinese that resolve the comprehension difficulties in natural language processing. Aiming at this comprehension processing, we give a detailed description of the algorithm that translates a restricted-Chinese natural language query into SQL.
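
A minimal sketch of rule-based translation from a restricted query pattern to SQL (the pattern, table and column names are hypothetical, and the restricted language is shown in English rather than Chinese for readability; the paper's grammar is richer than a single regular-expression rule):

    # Map one restricted sentence pattern onto an SQL template.
    import re

    RULES = [
        # "find <column> of students whose <column> is <value>"
        (re.compile(r"find (\w+) of students whose (\w+) is (\w+)", re.IGNORECASE),
         "SELECT {0} FROM students WHERE {1} = '{2}'"),
    ]

    def to_sql(query):
        for pattern, template in RULES:
            m = pattern.match(query)
            if m:
                return template.format(*m.groups())
        return None

    print(to_sql("Find name of students whose grade is A"))
    # -> SELECT name FROM students WHERE grade = 'A'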


2004 ◽  
Vol 9 (1) ◽  
pp. 53-68 ◽  
Author(s):  
Montserrat Arévalo Rodríguez ◽  
Montserrat Civit Torruella ◽  
Maria Antònia Martí

In the field of corpus linguistics, Named Entity treatment includes the recognition and classification of different types of discursive elements such as proper names, dates, times, etc. These discursive elements play an important role in different Natural Language Processing applications and techniques such as Information Retrieval, Information Extraction, translation memories, document routers, etc.
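
A toy illustration of recognising and classifying a few of these discursive elements with surface patterns (the rules and the example sentence are illustrative only, not the authors' system):

    # Pattern-based recognition and classification of a few named-entity types.
    import re

    PATTERNS = {
        "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
        "TIME": re.compile(r"\b\d{1,2}:\d{2}\b"),
        "PROPER_NAME": re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b"),
    }

    def tag_entities(text):
        return [(label, m.group()) for label, rx in PATTERNS.items() for m in rx.finditer(text)]

    print(tag_entities("Maria Lopez arrived on 12/05/2003 at 09:30."))
    # -> [('DATE', '12/05/2003'), ('TIME', '09:30'), ('PROPER_NAME', 'Maria Lopez')]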


2019 ◽  
Author(s):  
Yuda Munarko ◽  
Dewan M. Sarwar ◽  
Koray Atalag ◽  
David P. Nickerson

Motivation: Semantic annotation is a crucial step to assure reusability and reproducibility of biosimulation models in biology and physiology. For this purpose, the COmputational Modeling in BIology NEtwork (COMBINE) community recommends the use of the Resource Description Framework (RDF). The RDF implementation provides the flexibility of model entity searching (e.g. flux of sodium across the apical plasma membrane) by utilising SPARQL. However, the rigidity and complexity of SPARQL syntax, together with the nature of a semantic annotation, which is not merely a simple triple but forms a tree-like structure, can cause difficulty. Therefore, an interface to convert a natural language query to SPARQL is beneficial.

Results: We propose NLIMED, a natural language query to SPARQL interface to retrieve model entities from biosimulation models. Our interface can be applied to various repositories utilising RDF, such as the PMR and BioModels. We evaluate our interface by collecting RDF from the biosimulation models coded using CellML in the PMR. First, we extract RDF as a tree structure and then store each subtree of a model entity as a modified triple of model entity name, path, and ontology class in the RDF Graph Index. We also extract the ontology classes' textual metadata from BioPortal and CellML and manage it in the Text Feature Index. With the Text Feature Index, we annotate phrases produced by the NLQ Parser (Stanford parser or NLTK parser) into ontology classes. Finally, the detected ontology classes are composed into SPARQL by incorporating the RDF Graph Index. Our annotator is far more powerful than the available service provided by BioPortal, with an F-measure of 0.756, and our SPARQL composer can find all possible SPARQL in the collection based on the annotation results. We have already implemented our interface in the Epithelial Modelling Platform tool.

Availability: https://github.com/napakalas/NLIMED
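
A simplified sketch of the pipeline described above: query phrases are annotated with ontology classes via a text index, and the stored graph paths are then assembled into SPARQL. The index contents, property paths and SPARQL skeleton below are hypothetical simplifications, not NLIMED's actual data structures:

    # Annotate phrases to ontology classes, then compose a SPARQL query.
    TEXT_FEATURE_INDEX = {          # term -> candidate ontology classes
        "sodium": ["CHEBI:29101"],
        "apical plasma membrane": ["GO:0016324"],
    }
    RDF_GRAPH_INDEX = {             # ontology class -> property path to the model entity
        "CHEBI:29101": "bqbiol:isVersionOf",
        "GO:0016324": "bqbiol:isPartOf",
    }

    def compose_sparql(phrases):
        classes = [c for p in phrases for c in TEXT_FEATURE_INDEX.get(p, [])]
        where = "\n  ".join(
            f"?entity {RDF_GRAPH_INDEX[c]} <http://identifiers.org/{c}> ." for c in classes)
        return f"SELECT ?entity WHERE {{\n  {where}\n}}"

    print(compose_sparql(["sodium", "apical plasma membrane"]))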


2012 ◽  
Vol 3 (1) ◽  
pp. 140-143
Author(s):  
Ekta Aggarwal ◽  
Shreeja Nair

Natural Language Processing (NLP) is an area of research and application that explores how computers can be used to understand and manipulate natural language text or speech to do useful things. The paper deals with the concept of a database from which data resources can be fetched and accessed with reduced time complexity. The retrieval techniques are based on the ideas of binary search. A natural language interface refers to words in its own dictionary, as well as to the words in the standard dictionary, in order to interpret a query. The main contribution of this investigation is addressing the problem of improving the accuracy of the query translation process by using the information provided by the database schema.
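
A minimal sketch of binary-search retrieval over a sorted index of schema terms, which is one way to realise the reduced time complexity the abstract refers to (the index contents are hypothetical):

    # O(log n) lookup of a query word in a sorted list of schema keys.
    import bisect

    index_keys = ["age", "department", "name", "salary"]       # kept sorted
    index_rows = {"age": "employees.age", "department": "employees.dept",
                  "name": "employees.name", "salary": "employees.salary"}

    def lookup(term):
        i = bisect.bisect_left(index_keys, term)
        if i < len(index_keys) and index_keys[i] == term:
            return index_rows[term]
        return None

    print(lookup("salary"))   # -> employees.salary
    print(lookup("bonus"))    # -> None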


2021 ◽  
Vol 28 (2) ◽  
pp. 25-38
Author(s):  
Fábio Carlos Moreno ◽  
Cinthyan Sachs C. de Barbosa ◽  
Edio Roberto Manfio

This paper deals with the construction of digital lexicons within the scope of Natural Language Processing. Data structures called hash tables have been shown to generate good results for natural language interfaces to databases, with data dispersion, response speed and programming simplicity as their main features. The desired information is stored by associating each entry with a key through a hashing function, which is responsible for distributing the information in the table. The objective of this paper is to present a tool called Visual TaHs that applies a sparse table to a real lexicon (the Lexicon of Herbs), improving the performance results of several implemented hash functions. Such a structure has achieved satisfactory results in terms of speed and storage when compared to conventional databases and can work in various media, such as desktop, Web and mobile.
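
A minimal sketch of such a hash-table lexicon, in which a hashing function distributes each headword into a bucket of a sparse table (the hash function and entries are illustrative, not the Visual TaHs implementation):

    # Store and retrieve lexicon entries through a simple hash table with chaining.
    TABLE_SIZE = 101                      # a prime size helps spread the keys

    def h(word):
        """Polynomial rolling hash mapped into the table."""
        value = 0
        for ch in word:
            value = (value * 31 + ord(ch)) % TABLE_SIZE
        return value

    table = [[] for _ in range(TABLE_SIZE)]   # buckets resolve collisions by chaining

    def insert(word, entry):
        table[h(word)].append((word, entry))

    def lookup(word):
        return next((e for w, e in table[h(word)] if w == word), None)

    insert("camomile", {"class": "herb", "uses": ["tea", "infusion"]})
    print(lookup("camomile"))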


2017 ◽  
Vol 44 (4) ◽  
pp. 526-551 ◽  
Author(s):  
Abdulgabbar Saif ◽  
Nazlia Omar ◽  
Mohd Juzaiddin Ab Aziz ◽  
Ummi Zakiah Zainodin ◽  
Naomie Salim

Wikipedia has become a high-coverage knowledge source which has been used in many research areas such as natural language processing, text mining and information retrieval. Several methods have been introduced for extracting explicit or implicit relations from Wikipedia to represent the semantics of concepts/words. However, the main challenge in semantic representation is how to incorporate different types of semantic relations to capture more semantic evidence of the associations of concepts. In this article, we propose a semantic concept model that incorporates different types of semantic features extracted from Wikipedia. For each concept that corresponds to an article, four semantic features are introduced: template links, categories, salient concepts and topics. The proposed model is based on the probability distributions that are defined for these semantic features of a Wikipedia concept. The template links and categories are document-level features which are directly extracted from the structured information included in the article. On the other hand, the salient concepts and topics are corpus-level features which are extracted to capture implicit relations among concepts. For the salient-concepts feature, a distributional method is applied to the hypertext corpus to extract this feature for each Wikipedia concept; the probability product kernel is then used to improve the weight of each concept in this feature. For the topic feature, Labelled Latent Dirichlet Allocation is adapted to the supervised multi-label structure of Wikipedia to train the probabilistic model of this feature. Finally, we use linear interpolation to incorporate these semantic features into the probabilistic model that estimates the semantic relation probability of a specific concept over Wikipedia articles. The proposed model is evaluated on 12 benchmark datasets in three natural language processing tasks: measuring the semantic relatedness of concepts/words in general and in the biomedical domain, semantic textual relatedness measurement, and measuring the semantic compositionality of noun compounds. The model is also compared with five methods that depend on separate semantic features in Wikipedia. Experimental results show that the proposed model achieves promising results in the three tasks and outperforms the baseline methods in most of the evaluation datasets. This implies that incorporating explicit and implicit semantic features is useful for representing the semantics of concepts in Wikipedia.
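
An illustrative sketch of the final linear-interpolation step, in which the relation probability of a target concept given a source article is a weighted sum over the per-feature distributions (the weights and probability values below are made up; the paper does not publish these numbers):

    # P(target | source) = sum over features f of lambda_f * P_f(target | source)
    FEATURES = ("template_links", "categories", "salient_concepts", "topics")
    WEIGHTS = {"template_links": 0.2, "categories": 0.2,
               "salient_concepts": 0.3, "topics": 0.3}    # interpolation weights, sum to 1

    def relation_probability(per_feature_probs):
        return sum(WEIGHTS[f] * per_feature_probs.get(f, 0.0) for f in FEATURES)

    # e.g. hypothetical probabilities of the concept "Neuron" given the article "Synapse"
    print(relation_probability({"template_links": 0.10, "categories": 0.05,
                                "salient_concepts": 0.40, "topics": 0.25}))
    # -> 0.2*0.10 + 0.2*0.05 + 0.3*0.40 + 0.3*0.25 = 0.225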


Author(s):  
Francisco Claude ◽  
Daniil Galaktionov ◽  
Roberto Konow ◽  
Susana Ladra ◽  
Óscar Pedreira

Author profiling consists in determining demographic attributes of the author of a given document, such as gender, age, nationality, language, religion, and others. This task, which has applications in fields such as forensics, security, or marketing, has been approached from different areas, especially from linguistics and natural language processing, by extracting different types of features from training documents, usually content- and style-based features. In this paper we address the problem by using several compression-inspired strategies that generate different models without analyzing or extracting specific features from the textual content, making them style-oblivious approaches. We analyze the behavior of these techniques, combine them, and compare them with other state-of-the-art methods. We show that they can be competitive in terms of accuracy, giving the best predictions for some domains, and that they are efficient in time performance.
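
A sketch of one common compression-inspired strategy: assign the document to the demographic class whose training corpus compresses it best, with no explicit feature extraction. The training texts and classes below are illustrative, and the paper evaluates several such strategies rather than exactly this one:

    # Classify by the extra compressed bytes a document costs after each class corpus.
    import zlib

    def extra_bytes(corpus, doc):
        base = len(zlib.compress(corpus.encode()))
        return len(zlib.compress((corpus + " " + doc).encode())) - base

    training = {
        "teens":  "lol omg gonna see u later thats so cool haha",
        "adults": "please find attached the report we discussed in the meeting",
    }
    unknown = "omg that was so cool haha see u"

    predicted = min(training, key=lambda cls: extra_bytes(training[cls], unknown))
    print(predicted)   # likely 'teens': the shared vocabulary compresses better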


Author(s):  
Prasenjit Mukherjee ◽  
Atanu Chattopadhyay ◽  
Baisakhi Chakraborty ◽  
Debashis Nandi

Extraction of knowledge data from a knowledge database using natural language queries is a difficult task. Different types of natural language processing (NLP) techniques have been developed to handle this knowledge data extraction task. This paper proposes an automated query-response model termed the Extended Automated Knowledge Provider System (EAKPS) that can manage various types of natural language queries from users. The EAKPS uses a combination-based technique and can handle assertive, interrogative, imperative, compound and complex query sentences. The EAKPS algorithm generates Structured Query Language (SQL) for each natural language query to extract knowledge data from the knowledge database resident within the EAKPS. Extraction of nouns or noun phrases is another issue in natural language query processing. Most of the time, a determiner, preposition or conjunction is prefixed to a noun or noun phrase, and it is difficult to identify the noun/noun phrase with its prefix during query processing. The proposed system is able to identify these prefixes and extract exact nouns or noun phrases from natural language queries without any manual intervention.
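
A toy sketch of the prefix issue described above: leading determiners, prepositions, conjunctions and question words are stripped before the remaining tokens are treated as noun phrases and placed into an SQL lookup. The word lists, table and column names are illustrative only, not the EAKPS algorithm itself:

    # Strip function-word prefixes, then build a simple SQL query from the noun phrases.
    STOP_WORDS = {"the", "a", "an", "of", "in", "on", "and", "or", "for",
                  "what", "who", "where", "is", "are"}

    def extract_noun_phrases(question):
        tokens = question.rstrip("?.").split()
        return [w for w in tokens if w.lower() not in STOP_WORDS]

    def to_sql(question):
        terms = extract_noun_phrases(question)
        conditions = " OR ".join(f"topic LIKE '%{t}%'" for t in terms)
        return f"SELECT answer FROM knowledge WHERE {conditions}"

    print(to_sql("What is the capital of France?"))
    # -> SELECT answer FROM knowledge WHERE topic LIKE '%capital%' OR topic LIKE '%France%'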

