Enhancing the Performance of Semantic Search in Bengali using Neural Net and other Classification Techniques

Searching the internet for information is one of the most important activities for any user. 'Syntactic search' relies on keyword-based matching, and its accuracy is improved by applying filters such as location, preference and user history. However, the user's query and the best available answer on the internet may share no terms, or only a negligible number of terms; in such cases syntactic search cannot produce the desired output, and 'semantic search' becomes essential. Implementing semantic search for Bengali is challenging because resources such as WordNet, ontologies and annotations are unavailable. This work describes an end-to-end algorithm to improve the accuracy of semantic search using four classification techniques: ANN, Decision Tree, SVM and Naïve Bayes. The dataset comes from the TDIL project of the Ministry of Electronics and IT, Govt. of India; the repository contains 86 categories of text with more than a million sentences. After the impressive results for Bengali, test runs on other Indian languages also achieved very good results. This research is highly useful for automatic question answering systems, semantic similarity analysis, e-governance and m-governance.
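As a rough illustration of the comparison the abstract describes, the sketch below trains the four named classifiers on TF-IDF features with scikit-learn. It is not the authors' code; the sentences, labels and pipeline settings are placeholders, since the TDIL Bengali corpus is not reproduced here.

```python
# Minimal sketch (not the paper's implementation): comparing the four classifiers
# named in the abstract on a toy text-categorization task with TF-IDF features.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import LinearSVC
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import cross_val_score

# Hypothetical stand-in data; Bengali sentences from the TDIL categories would go here.
sentences = [
    "cricket match score update", "election results announced today",
    "football league final tonight", "parliament passes new bill",
    "tennis open champion crowned", "minister addresses press meet",
]
labels = ["sports", "politics", "sports", "politics", "sports", "politics"]

classifiers = {
    "ANN":           MLPClassifier(hidden_layer_sizes=(64,), max_iter=500),
    "Decision Tree": DecisionTreeClassifier(),
    "SVM":           LinearSVC(),
    "Naive Bayes":   MultinomialNB(),
}

for name, clf in classifiers.items():
    pipe = make_pipeline(TfidfVectorizer(), clf)
    scores = cross_val_score(pipe, sentences, labels, cv=3)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```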

Author(s):  
Radha Guha

Background: In the era of information overload it is very difficult for a human reader to quickly make sense of the vast information available on the internet. Even for a specific domain such as a college or university website, it may be difficult for a user to browse through all the links to find relevant answers quickly. Objective: In this scenario, the design of a chat-bot that can answer questions about college information and compare colleges will be very useful and novel. Methods: In this paper a novel conversational-interface chat-bot application with information retrieval and text summarization skills is designed and implemented. First, the chat-bot has a simple dialog skill: when it can understand the user's query intent, it responds from a stored collection of answers. Second, for unknown queries, the chat-bot can search the internet and then perform text summarization using advanced techniques of natural language processing (NLP) and text mining (TM). Results: The NLP capabilities of information retrieval and text summarization using the machine learning techniques of Latent Semantic Analysis (LSA), Latent Dirichlet Allocation (LDA), Word2Vec, Global Vectors (GloVe) and TextRank are reviewed and compared before being implemented for the chat-bot design. The chat-bot improves the user experience considerably by answering specific queries concisely, which takes less time than reading an entire document. Students, parents and faculty can obtain a variety of information, such as admission criteria, fees, course offerings, notice board, attendance, grades, placements, faculty profiles, research papers and patents, more efficiently. Conclusion: The purpose of this paper was to follow the advancements in NLP technologies and implement them in a novel application.
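Of the techniques the abstract reviews, TextRank lends itself to a compact sketch: sentences are nodes in a similarity graph and PageRank picks the most central ones. The snippet below is an illustrative stand-in built on scikit-learn and networkx, not the chat-bot's actual summarizer, and its naive period-based sentence splitting is only for demonstration.

```python
# Minimal TextRank-style extractive summarizer (illustrative sketch only).
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def textrank_summary(text, n_sentences=2):
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    tfidf = TfidfVectorizer().fit_transform(sentences)
    sim = cosine_similarity(tfidf)                 # sentence-to-sentence similarity
    graph = nx.from_numpy_array(sim)               # weighted similarity graph
    scores = nx.pagerank(graph)                    # centrality of each sentence
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    top = sorted(ranked[:n_sentences])             # keep original sentence order
    return ". ".join(sentences[i] for i in top) + "."

print(textrank_summary(
    "Admissions open in June. Fees are listed on the website. "
    "The campus has ten departments. Placement statistics are updated yearly."
))
```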


Author(s):  
Cao Liu ◽  
Shizhu He ◽  
Kang Liu ◽  
Jun Zhao

Because they can produce natural language responses, natural answers are favored in real-world Question Answering (QA) systems. Generative models learn to automatically generate natural answers from large-scale question-answer pairs (QA-pairs). However, they suffer from the uncontrollable and uneven quality of QA-pairs crawled from the internet. To address this problem, we propose a curriculum learning based framework for natural answer generation (CL-NAG), which can take full advantage of the valuable learning data in a noisy, uneven-quality corpus. Specifically, we employ two practical measures to automatically estimate the quality (complexity) of QA-pairs. Based on these measurements, CL-NAG first uses simple and low-quality QA-pairs to learn a basic model, and then gradually learns to produce better answers with richer content and more complete syntax from more complex and higher-quality QA-pairs. In this way, all valuable information in the noisy, uneven-quality corpus can be fully exploited. Experiments demonstrate that CL-NAG outperforms the state of the art, improving accuracy by 6.8% and 8.7% on simple and complex questions, respectively.
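The core curriculum idea, ordering training pairs by an automatic complexity score and introducing harder material in stages, can be sketched as below. This is not the CL-NAG code; the complexity proxy and staging scheme are invented for illustration, and the actual model update is omitted.

```python
# Sketch of curriculum scheduling: sort QA-pairs by a complexity score and train in
# stages from simple to complex. The scoring function is a hypothetical placeholder.
def complexity(qa_pair):
    question, answer = qa_pair
    return len(set(answer.split()))     # proxy: answers with more distinct tokens are "harder"

def curriculum_stages(qa_pairs, n_stages=3):
    ordered = sorted(qa_pairs, key=complexity)
    stage_size = max(1, len(ordered) // n_stages)
    for s in range(n_stages):
        # Each stage adds harder pairs on top of everything seen so far.
        yield ordered if s == n_stages - 1 else ordered[: (s + 1) * stage_size]

qa_pairs = [
    ("capital of france?", "Paris."),
    ("why is the sky blue?", "Sunlight scatters off air molecules, and blue light scatters most."),
    ("2+2?", "4"),
]
for stage, batch in enumerate(curriculum_stages(qa_pairs), 1):
    print(f"stage {stage}: training on {len(batch)} pairs")   # model update would go here
```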


Webology ◽  
2021 ◽  
Vol 18 (SI02) ◽  
pp. 21-31
Author(s):  
P. Mahalakshmi ◽  
N. Sabiyath Fathima

In conventional information retrieval systems, keywords are used to index and retrieve documents for a user query. When more than one keyword is used to express a single concept in documents and queries, keyword-based retrieval systems produce inaccurate and incomplete results. In addition, manual intervention is required to determine the semantic relationships between related keywords in order to produce accurate results, which has paved the way for semantic search. Various research has been carried out on concept-based information retrieval to tackle the difficulties caused by conventional keyword search and by semantic search systems. This paper elucidates the various text representations responsible for retrieving relevant search results, the approaches and evaluations carried out in conceptual information retrieval, and the challenges faced by existing research, in order to set out requirements for future work. In addition, the conceptual information that existing systems extract from different sources to build semantic representations is discussed.
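The vocabulary-mismatch problem that motivates concept-based retrieval is easy to demonstrate. The toy example below (not from the paper) uses a purely keyword-based TF-IDF ranker: the relevant document shares no surface terms with the query and scores zero, while an irrelevant document that happens to share one word outranks it.

```python
# Illustration of vocabulary mismatch in keyword-only retrieval.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "physicians treat cardiac illness",              # relevant, but no words shared with the query
    "a museum exhibit about heart-shaped jewellery", # irrelevant, but shares the token "heart"
]
query = ["doctors cure heart disease"]

vec = TfidfVectorizer().fit(docs + query)
scores = cosine_similarity(vec.transform(query), vec.transform(docs))[0]
print(dict(zip(docs, scores.round(2))))  # the irrelevant document outranks the relevant one
```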


2016 ◽  
Vol 6 (2) ◽  
pp. 41-65 ◽  
Author(s):  
Sheetal A. Takale ◽  
Prakash J. Kulkarni ◽  
Sahil K. Shah

Information available on the internet is huge, diverse and dynamic. Current search engines act as intelligent assistants for internet users: for a query, they provide a list of the best-matching or most relevant web pages. However, the information answering a query is often spread across multiple returned pages, which degrades the quality of the search results; the search engines are drowning in information but starving for knowledge. Here, we present query-focused extractive summarization of search engine results. We propose a two-level summarization process: identification of relevant theme clusters, and selection of top-ranking sentences to form a summarized result for the user query. A new approach to semantic similarity computation using semantic roles and semantic meaning is proposed. Document clustering is achieved effectively by applying the MDL principle, and sentence clustering and ranking are done using SNMF (symmetric non-negative matrix factorization). Experiments demonstrate the effectiveness of the system in semantic text understanding, document clustering and summarization.
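The second-level step, grouping sentences into themes and ranking within each theme, can be sketched with scikit-learn. Standard NMF on TF-IDF is used here only as a simple stand-in for the SNMF the paper applies, and the sentences and component count are invented for illustration.

```python
# Sketch of theme-based sentence clustering and ranking (stand-in for SNMF).
import numpy as np
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

sentences = [
    "The hotel offers free breakfast.",
    "Breakfast is served from seven to ten.",
    "Trains to the city run every hour.",
    "The railway station is close to the hotel.",
]
X = TfidfVectorizer().fit_transform(sentences)
W = NMF(n_components=2, init="nndsvda", random_state=0).fit_transform(X)  # sentence-theme weights

clusters = W.argmax(axis=1)                       # theme assignment per sentence
for k in range(2):
    idx = np.where(clusters == k)[0]
    if len(idx) == 0:                             # skip a theme that attracted no sentences
        continue
    top = idx[np.argmax(W[idx, k])]               # highest-weighted sentence in the theme
    print(f"theme {k}: {sentences[top]}")
```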


2019 ◽  
Vol 28 (3) ◽  
pp. 455-464 ◽  
Author(s):  
M. Anand Kumar ◽  
B. Premjith ◽  
Shivkaran Singh ◽  
S. Rajendran ◽  
K. P. Soman

Abstract In recent years, multilingual content on the internet has grown exponentially together with the evolution of the internet. Regional-language users are excluded from this multilingual content because of the language barrier, so machine translation between languages is the only practical way to make it available to them. Machine translation is the process of translating a text from one language to another. Machine translation has already been investigated well for English and other European languages, but it is still at a nascent stage for Indian languages. This paper presents an overview of the Machine Translation in Indian Languages shared task conducted on September 7–8, 2017, at Amrita Vishwa Vidyapeetham, Coimbatore, India. The shared task focused on the development of English-Tamil, English-Hindi, English-Malayalam and English-Punjabi language pairs, with the following objectives: (a) to examine state-of-the-art machine translation systems when translating from English to Indian languages; (b) to investigate the challenges faced in translating from English to Indian languages; (c) to create an open-source parallel corpus for Indian languages, which is currently lacking. Evaluating machine translation output is another challenging task, especially for Indian languages. In this shared task, we evaluated the participants' outputs with the help of human annotators. As far as we know, this is the first shared task that relies completely on human evaluation.
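For a human evaluation of this kind, per-sentence ratings from several annotators must be aggregated into a system-level score. The sketch below is purely illustrative; the shared task's actual rating scheme, scales and teams are not reproduced here.

```python
# Illustrative aggregation of human adequacy/fluency ratings per system (hypothetical data).
from statistics import mean

# ratings[system][annotator] = list of (adequacy, fluency) scores on a 1-5 scale
ratings = {
    "team_A": {"ann1": [(4, 5), (3, 4)], "ann2": [(4, 4), (3, 3)]},
    "team_B": {"ann1": [(2, 3), (3, 3)], "ann2": [(3, 2), (2, 3)]},
}

for system, by_annotator in ratings.items():
    all_scores = [s for scores in by_annotator.values() for s in scores]
    adequacy = mean(a for a, _ in all_scores)
    fluency = mean(f for _, f in all_scores)
    print(f"{system}: adequacy {adequacy:.2f}, fluency {fluency:.2f}")
```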


AI Magazine ◽  
2015 ◽  
Vol 36 (4) ◽  
pp. 61-70 ◽  
Author(s):  
Daniel M. Russell

For the vast majority of queries (for example, navigation, simple fact lookup, and others), search engines do extremely well. Their ability to quickly provide answers to queries is a remarkable testament to the power of many of the fundamental methods of AI. They also highlight many of the issues that are common to sophisticated AI question-answering systems. It has become clear that people think of search programs in ways that are very different from traditional information sources. Rapid and ready-at-hand access, depth of processing, and the way they enable people to offload some ordinary memory tasks suggest that search engines have become more of a cognitive amplifier than a simple repository or front-end to the Internet. Like all sophisticated tools, people still need to learn how to use them. Although search engines are superb at finding and presenting information—up to and including extracting complex relations and making simple inferences—knowing how to frame questions and evaluate their results for accuracy and credibility remains an ongoing challenge. Some questions are still deep and complex, and still require knowledge on the part of the search user to work through to a successful answer. And the fact that the underlying information content, user interfaces, and capabilities are all in a continual state of change means that searchers need to continually update their knowledge of what these programs can (and cannot) do.


2017 ◽  
Vol 18 (3) ◽  
pp. 12 ◽  
Author(s):  
Robert M. Stonehill ◽  
Lynn Smarte

In recent years, the Educational Resources Information Center (ERIC) system has undergone tremendous changes in the kinds of products and services it offers and the methods by which users can access them. AskERIC, a computer network-based question-answering service and virtual library, exemplifies these changes. This article describes AskERIC, other ERIC gopher sites, the National Parent Information Network, ERIC listserv activity on the Internet, and ERIC's offerings on commercial online services. It also lists resources for librarians who do training sessions on ERIC and sketches ERIC's future direction.


2020 ◽  
pp. 60-67
Author(s):  
Stephanie Imelda Pella ◽  
Frans Likadja ◽  
Molina Odja ◽  
Wenefrida T Ina

The purpose of this research is to design and implement an attendance system based on the internet of things (IoT). The proposed system integrates two types of attendance systems, a face-recognition-based attendance system (FRA) and a fingerprint-based attendance system (FPA), with a central server. The FRA was developed on a Raspberry Pi mini-computer using the Python programming language and the OpenCV library. The FPA, on the other hand, was developed using a NodeMCU ESP8266 and an AS608 fingerprint scanner with the Adafruit Fingerprint library. Both the FRA and FPA are connected to a web server with a database engine over the internet and send attendance data using the HTTP POST method. The server was developed using the Apache web server, the PHP programming language and the MySQL database engine. It serves two main purposes: to record the attendance data sent by the FPA and FRA, and to generate an attendance report based on the user query. System testing was done on a local network. The results showed that both the subsystems and the integrated system worked well.
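The device-to-server exchange described above can be sketched from the FRA side, which already runs Python on the Raspberry Pi. The endpoint URL, field names and user identifier below are hypothetical; the paper does not publish its API.

```python
# Minimal sketch of posting an attendance record to the PHP/MySQL server over HTTP POST.
import requests
from datetime import datetime

SERVER_URL = "http://192.168.1.10/attendance/record.php"    # assumed local-network endpoint

def send_attendance(user_id, method):
    payload = {
        "user_id": user_id,
        "method": method,                                    # "face" or "fingerprint"
        "timestamp": datetime.now().isoformat(timespec="seconds"),
    }
    response = requests.post(SERVER_URL, data=payload, timeout=5)  # form-encoded, read by $_POST
    response.raise_for_status()                              # fail loudly if the server rejects it
    return response.text

# After a successful face match on the Raspberry Pi:
# print(send_attendance("EMP042", "face"))
```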


2018 ◽  
Vol 7 (4.7) ◽  
pp. 148 ◽  
Author(s):  
Shilpa S. Laddha ◽  
Dr. Pradip M. Jawandhiya

Semantic search is an area of research that focuses on the meaning of the terms used in a user query. An ontology plays a significant role in defining the concepts and the relationships between terms in a domain. Since the understanding of concepts is domain specific, ontology creation is also domain specific: a query interpreted in the tourism domain can have a different meaning in another domain. This paper presents a prototype of an ontology-based information retrieval interface that can save users' time by returning more relevant, precise and efficient search results than traditional search interfaces.
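The kind of ontology lookup such an interface relies on can be sketched with rdflib and SPARQL. The ontology file, namespace and class names below are invented for illustration; the paper's actual tourism ontology is not reproduced here.

```python
# Hedged sketch of querying a domain ontology with SPARQL via rdflib.
from rdflib import Graph

g = Graph()
g.parse("tourism.owl", format="xml")   # hypothetical domain ontology file

# Find resources typed as a (hypothetical) Beach destination and their labels.
query = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX tour: <http://example.org/tourism#>
SELECT ?place ?label WHERE {
    ?place a tour:Beach ;
           rdfs:label ?label .
}
"""
for place, label in g.query(query):
    print(place, label)
```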

