Natural language interface to video database

2001 ◽  
Vol 7 (1) ◽  
pp. 1-27 ◽  
Author(s):  
T. R. GAYATRI ◽  
S. RAMAN

In this paper, we discuss a natural language interface to a database of structured textual descriptions in the form of annotations of video objects. The interface maps the natural language query input onto the annotation structures. The language processing is done in three phases: deriving expectations and implications from the input words; disambiguating noun implications and slot-filling prepositional expectations; and finally, disambiguating verbal expectations. The system has been tested with different types of user inputs, including ill-formed sentences, and studied for erroneous inputs and for different types of portability issues.
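
As a rough illustration of the slot-filling phase only (the lexicon entries, slot names and query below are hypothetical, not the authors' implementation), a head noun in the query can be modelled as raising expectations that later prepositional phrases fill:

    # Hypothetical sketch: expectation-driven slot filling for a video-annotation query.
    # Lexicon entries, slot names and the query are illustrative, not from the paper.
    LEXICON = {
        "scene": {"expects": {"with": "object", "in": "location"}},
        "clip":  {"expects": {"of": "object", "by": "author"}},
    }

    def fill_slots(tokens):
        """Walk the query left to right, filling prepositional expectations."""
        frame, expects = {}, {}
        for i, word in enumerate(tokens):
            if word in LEXICON:                 # head noun raises expectations
                frame["head"] = word
                expects = LEXICON[word]["expects"]
            elif word in expects:               # preposition selects a slot
                slot = expects[word]
                frame[slot] = " ".join(tokens[i + 1:i + 2])  # naive filler: next token
        return frame

    print(fill_slots("scene with ball in park".split()))
    # -> {'head': 'scene', 'object': 'ball', 'location': 'park'}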

Author(s):  
Maitri Patel and Dr Hemant D Vasava

Data, information and knowledge are central to this rapidly moving and growing world: almost any kind of information can now be found on the Internet. This is very useful, including for the academic world, but alongside it plagiarism is widely practised. Plagiarism degrades the originality of work, and fraudulently using someone's original work without acknowledging them is becoming common. Sometimes teachers or professors cannot identify the plagiarised material, so higher-education systems nowadays use different types of comparison tools. Our idea is to match a number of different documents, such as student assignments, against each other to find out whether students copied each other's work, and also to compare an ideal answer sheet for a particular subject examination with the students' test sheets. In both cases the documents are compared and ranked on the basis of their similarity; both approaches are of one kind, namely document comparison. Many methods for identifying plagiarism are already in use, so we can compare them and develop them further where needed.
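
A minimal sketch of this kind of similarity-based ranking, assuming a simple bag-of-words cosine measure (one common way to compare documents, not necessarily the authors' method; the answer texts are made up):

    # Rank student submissions by similarity to an ideal answer sheet.
    import math
    from collections import Counter

    def cosine(a, b):
        """Cosine similarity between two bag-of-words vectors."""
        va, vb = Counter(a.lower().split()), Counter(b.lower().split())
        dot = sum(va[w] * vb[w] for w in va)
        norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
        return dot / norm if norm else 0.0

    answers = {"student_A": "the cell membrane controls transport",
               "student_B": "transport is controlled by the cell membrane",
               "student_C": "plants convert light into chemical energy"}
    ideal = "the cell membrane controls transport of molecules"

    for name, text in sorted(answers.items(), key=lambda kv: -cosine(ideal, kv[1])):
        print(name, round(cosine(ideal, text), 2))

The same pairwise measure can also be applied between the submissions themselves to flag pairs of students whose work is suspiciously similar.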


2013 ◽  
Vol 765-767 ◽  
pp. 1684-1688
Author(s):  
Xiao Na Ma ◽  
Bo Wei Li ◽  
Lin Wang

In order to allow users who have not studied a computer-related major to find useful information in a manner familiar to them, searching interactively using Chinese natural language has become a hot research topic. However, because of the complexity of Chinese, building a formal model of the language remains a difficult research problem. In this paper, by analysing the types of query sentences expressed in SQL over a specific database, we put forward the rules and context-free grammars of a restricted Chinese that resolve the comprehension difficulties in natural language processing. Aiming at this comprehension processing, we give a detailed description of the algorithm that translates a restricted-Chinese natural language query into SQL.
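
A minimal sketch of rule-based translation from a restricted query pattern to SQL (the pattern, table and column names are hypothetical, and the restricted language is shown in English rather than Chinese for readability; the paper's grammar is richer than a single regular-expression rule):

    # Map one restricted sentence pattern onto an SQL template.
    import re

    RULES = [
        # "find <column> of students whose <column> is <value>"
        (re.compile(r"find (\w+) of students whose (\w+) is (\w+)", re.IGNORECASE),
         "SELECT {0} FROM students WHERE {1} = '{2}'"),
    ]

    def to_sql(query):
        for pattern, template in RULES:
            m = pattern.match(query)
            if m:
                return template.format(*m.groups())
        return None

    print(to_sql("Find name of students whose grade is A"))
    # -> SELECT name FROM students WHERE grade = 'A'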


2004 ◽  
Vol 9 (1) ◽  
pp. 53-68 ◽  
Author(s):  
Montserrat Arévalo Rodríguez ◽  
Montserrat Civit Torruella ◽  
Maria Antònia Martí

In the field of corpus linguistics, Named Entity treatment includes the recognition and classification of different types of discursive elements such as proper names, dates, times, etc. These discursive elements play an important role in different Natural Language Processing applications and techniques such as Information Retrieval, Information Extraction, translation memories, document routers, etc.
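
A toy illustration of recognising and classifying a few of these discursive elements with surface patterns (the rules and the example sentence are illustrative only, not the authors' system):

    # Pattern-based recognition and classification of a few named-entity types.
    import re

    PATTERNS = {
        "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
        "TIME": re.compile(r"\b\d{1,2}:\d{2}\b"),
        "PROPER_NAME": re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b"),
    }

    def tag_entities(text):
        return [(label, m.group()) for label, rx in PATTERNS.items() for m in rx.finditer(text)]

    print(tag_entities("Maria Lopez arrived on 12/05/2003 at 09:30."))
    # -> [('DATE', '12/05/2003'), ('TIME', '09:30'), ('PROPER_NAME', 'Maria Lopez')]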


2019 ◽  
Author(s):  
Yuda Munarko ◽  
Dewan M. Sarwar ◽  
Koray Atalag ◽  
David P. Nickerson

Motivation: Semantic annotation is a crucial step to assure reusability and reproducibility of biosimulation models in biology and physiology. For this purpose, the COmputational Modeling in BIology NEtwork (COMBINE) community recommends the use of the Resource Description Framework (RDF). The RDF implementation provides the flexibility of model entity searching (e.g. flux of sodium across the apical plasma membrane) by utilising SPARQL. However, the rigidity and complexity of SPARQL syntax, together with the nature of a semantic annotation, which is not merely a simple triple but forms a tree-like structure, can cause difficulty. Therefore, an interface to convert a natural language query to SPARQL is beneficial.

Results: We propose NLIMED, a natural language query to SPARQL interface to retrieve model entities from biosimulation models. Our interface can be applied to various repositories utilising RDF, such as the PMR and BioModels. We evaluate our interface by collecting RDF from the biosimulation models coded using CellML in the PMR. First, we extract RDF as a tree structure and then store each subtree of a model entity as a modified triple of model entity name, path, and ontology class in the RDF Graph Index. We also extract the ontology classes' textual metadata from BioPortal and CellML and manage it in the Text Feature Index. With the Text Feature Index, we annotate phrases produced by the NLQ Parser (Stanford parser or NLTK parser) into ontology classes. Finally, the detected ontology classes are composed into SPARQL by incorporating the RDF Graph Index. Our annotator is far more powerful than the available service provided by BioPortal, with an F-measure of 0.756, and our SPARQL composer can find all possible SPARQL in the collection based on the annotation results. We have already implemented our interface in the Epithelial Modelling Platform tool.

Availability: https://github.com/napakalas/NLIMED
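
A simplified sketch of the pipeline described above: query phrases are annotated with ontology classes via a text index, and the stored graph paths are then assembled into SPARQL. The index contents, property paths and SPARQL skeleton below are hypothetical simplifications, not NLIMED's actual data structures:

    # Annotate phrases to ontology classes, then compose a SPARQL query.
    TEXT_FEATURE_INDEX = {          # term -> candidate ontology classes
        "sodium": ["CHEBI:29101"],
        "apical plasma membrane": ["GO:0016324"],
    }
    RDF_GRAPH_INDEX = {             # ontology class -> property path to the model entity
        "CHEBI:29101": "bqbiol:isVersionOf",
        "GO:0016324": "bqbiol:isPartOf",
    }

    def compose_sparql(phrases):
        classes = [c for p in phrases for c in TEXT_FEATURE_INDEX.get(p, [])]
        where = "\n  ".join(
            f"?entity {RDF_GRAPH_INDEX[c]} <http://identifiers.org/{c}> ." for c in classes)
        return f"SELECT ?entity WHERE {{\n  {where}\n}}"

    print(compose_sparql(["sodium", "apical plasma membrane"]))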


2012 ◽  
Vol 3 (1) ◽  
pp. 140-143
Author(s):  
Ekta Aggarwal ◽  
Shreeja Nair

Natural Language Processing (NLP) is an area of research and application that explores how computers can be used to understand and manipulate natural language text or speech to do useful things. The paper deals with the concept of a database from which data resources can be fetched and accessed with reduced time complexity. The retrieval techniques are based on the ideas of binary search. A natural language interface refers to words in its own dictionary, as well as to the words in the standard dictionary, in order to interpret a query. The main contribution of this investigation is addressing the problem of improving the accuracy of the query translation process by using the information provided by the database schema.
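
A minimal sketch of binary-search retrieval over a sorted index of schema terms, which is one way to realise the reduced time complexity the abstract refers to (the index contents are hypothetical):

    # O(log n) lookup of a query word in a sorted list of schema keys.
    import bisect

    index_keys = ["age", "department", "name", "salary"]       # kept sorted
    index_rows = {"age": "employees.age", "department": "employees.dept",
                  "name": "employees.name", "salary": "employees.salary"}

    def lookup(term):
        i = bisect.bisect_left(index_keys, term)
        if i < len(index_keys) and index_keys[i] == term:
            return index_rows[term]
        return None

    print(lookup("salary"))   # -> employees.salary
    print(lookup("bonus"))    # -> None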


2021 ◽  
Vol 28 (2) ◽  
pp. 25-38
Author(s):  
Fábio Carlos Moreno ◽  
Cinthyan Sachs C. de Barbosa ◽  
Edio Roberto Manfio

This paper deals with the construction of digital lexicons within the scope of Natural Language Processing. Data structures called hash tables have been shown to generate good results for natural language interfaces to databases, with data dispersion, response speed and programming simplicity as their main features. The desired information is stored by associating each entry with a key through a hashing function, which is responsible for distributing the information in the table. The objective of this paper is to present a tool called Visual TaHs that applies a sparse table to a real lexicon (the Lexicon of Herbs), improving the performance results of several implemented hash functions. Such a structure has achieved satisfactory results in terms of speed and storage when compared to conventional databases and can work in various media, such as desktop, Web and mobile.
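
A minimal sketch of such a hash-table lexicon, in which a hashing function distributes each headword into a bucket of a sparse table (the hash function and entries are illustrative, not the Visual TaHs implementation):

    # Store and retrieve lexicon entries through a simple hash table with chaining.
    TABLE_SIZE = 101                      # a prime size helps spread the keys

    def h(word):
        """Polynomial rolling hash mapped into the table."""
        value = 0
        for ch in word:
            value = (value * 31 + ord(ch)) % TABLE_SIZE
        return value

    table = [[] for _ in range(TABLE_SIZE)]   # buckets resolve collisions by chaining

    def insert(word, entry):
        table[h(word)].append((word, entry))

    def lookup(word):
        return next((e for w, e in table[h(word)] if w == word), None)

    insert("camomile", {"class": "herb", "uses": ["tea", "infusion"]})
    print(lookup("camomile"))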


2017 ◽  
Vol 44 (4) ◽  
pp. 526-551 ◽  
Author(s):  
Abdulgabbar Saif ◽  
Nazlia Omar ◽  
Mohd Juzaiddin Ab Aziz ◽  
Ummi Zakiah Zainodin ◽  
Naomie Salim

Wikipedia has become a high-coverage knowledge source which has been used in many research areas such as natural language processing, text mining and information retrieval. Several methods have been introduced for extracting explicit or implicit relations from Wikipedia to represent the semantics of concepts/words. However, the main challenge in semantic representation is how to incorporate different types of semantic relations to capture more semantic evidence of the associations of concepts. In this article, we propose a semantic concept model that incorporates different types of semantic features extracted from Wikipedia. For each concept that corresponds to an article, four semantic features are introduced: template links, categories, salient concepts and topics. The proposed model is based on the probability distributions that are defined for these semantic features of a Wikipedia concept. The template links and categories are document-level features which are directly extracted from the structured information included in the article. On the other hand, the salient concepts and topics are corpus-level features which are extracted to capture implicit relations among concepts. For the salient-concepts feature, a distributional method is applied to the hypertext corpus to extract this feature for each Wikipedia concept; the probability product kernel is then used to improve the weight of each concept in this feature. For the topic feature, Labelled Latent Dirichlet Allocation is adapted to the supervised multi-label structure of Wikipedia to train the probabilistic model of this feature. Finally, we use linear interpolation to incorporate these semantic features into the probabilistic model that estimates the semantic relation probability of a specific concept over Wikipedia articles. The proposed model is evaluated on 12 benchmark datasets in three natural language processing tasks: measuring the semantic relatedness of concepts/words in general and in the biomedical domain, semantic textual relatedness measurement, and measuring the semantic compositionality of noun compounds. The model is also compared with five methods that depend on separate semantic features in Wikipedia. Experimental results show that the proposed model achieves promising results in the three tasks and outperforms the baseline methods in most of the evaluation datasets. This implies that incorporating explicit and implicit semantic features is useful for representing the semantics of concepts in Wikipedia.
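
An illustrative sketch of the final linear-interpolation step, in which the relation probability of a target concept given a source article is a weighted sum over the per-feature distributions (the weights and probability values below are made up; the paper does not publish these numbers):

    # P(target | source) = sum over features f of lambda_f * P_f(target | source)
    FEATURES = ("template_links", "categories", "salient_concepts", "topics")
    WEIGHTS = {"template_links": 0.2, "categories": 0.2,
               "salient_concepts": 0.3, "topics": 0.3}    # interpolation weights, sum to 1

    def relation_probability(per_feature_probs):
        return sum(WEIGHTS[f] * per_feature_probs.get(f, 0.0) for f in FEATURES)

    # e.g. hypothetical probabilities of the concept "Neuron" given the article "Synapse"
    print(relation_probability({"template_links": 0.10, "categories": 0.05,
                                "salient_concepts": 0.40, "topics": 0.25}))
    # -> 0.2*0.10 + 0.2*0.05 + 0.3*0.40 + 0.3*0.25 = 0.225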


Author(s):  
Francisco Claude ◽  
Daniil Galaktionov ◽  
Roberto Konow ◽  
Susana Ladra ◽  
Óscar Pedreira

Author profiling consists in determining demographic attributes of the author of a given document, such as gender, age, nationality, language, religion, and others. This task, which has applications in fields such as forensics, security, or marketing, has been approached from different areas, especially from linguistics and natural language processing, by extracting different types of features from training documents, usually content- and style-based features. In this paper we address the problem by using several compression-inspired strategies that generate different models without analyzing or extracting specific features from the textual content, making them style-oblivious approaches. We analyze the behavior of these techniques, combine them, and compare them with other state-of-the-art methods. We show that they can be competitive in terms of accuracy, giving the best predictions for some domains, and that they are efficient in time performance.
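
A sketch of one common compression-inspired strategy: assign the document to the demographic class whose training corpus compresses it best, with no explicit feature extraction. The training texts and classes below are illustrative, and the paper evaluates several such strategies rather than exactly this one:

    # Classify by the extra compressed bytes a document costs after each class corpus.
    import zlib

    def extra_bytes(corpus, doc):
        base = len(zlib.compress(corpus.encode()))
        return len(zlib.compress((corpus + " " + doc).encode())) - base

    training = {
        "teens":  "lol omg gonna see u later thats so cool haha",
        "adults": "please find attached the report we discussed in the meeting",
    }
    unknown = "omg that was so cool haha see u"

    predicted = min(training, key=lambda cls: extra_bytes(training[cls], unknown))
    print(predicted)   # likely 'teens': the shared vocabulary compresses better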


Author(s):  
Prasenjit Mukherjee ◽  
Atanu Chattopadhyay ◽  
Baisakhi Chakraborty ◽  
Debashis Nandi

Extraction of knowledge data from a knowledge database using natural language queries is a difficult task. Different types of natural language processing (NLP) techniques have been developed to handle this knowledge data extraction task. This paper proposes an automated query-response model termed the Extended Automated Knowledge Provider System (EAKPS) that can manage various types of natural language queries from users. The EAKPS uses a combination-based technique and can handle assertive, interrogative, imperative, compound and complex query sentences. The EAKPS algorithm generates Structured Query Language (SQL) for each natural language query to extract knowledge data from the knowledge database resident within the EAKPS. Extraction of nouns or noun phrases is another issue in natural language query processing. Most of the time, a determiner, preposition or conjunction is prefixed to a noun or noun phrase, and it is difficult to identify the noun/noun phrase with its prefix during query processing. The proposed system is able to identify these prefixes and extract exact nouns or noun phrases from natural language queries without any manual intervention.
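
A toy sketch of the prefix issue described above: leading determiners, prepositions, conjunctions and question words are stripped before the remaining tokens are treated as noun phrases and placed into an SQL lookup. The word lists, table and column names are illustrative only, not the EAKPS algorithm itself:

    # Strip function-word prefixes, then build a simple SQL query from the noun phrases.
    STOP_WORDS = {"the", "a", "an", "of", "in", "on", "and", "or", "for",
                  "what", "who", "where", "is", "are"}

    def extract_noun_phrases(question):
        tokens = question.rstrip("?.").split()
        return [w for w in tokens if w.lower() not in STOP_WORDS]

    def to_sql(question):
        terms = extract_noun_phrases(question)
        conditions = " OR ".join(f"topic LIKE '%{t}%'" for t in terms)
        return f"SELECT answer FROM knowledge WHERE {conditions}"

    print(to_sql("What is the capital of France?"))
    # -> SELECT answer FROM knowledge WHERE topic LIKE '%capital%' OR topic LIKE '%France%'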

