Ranking Top Similar Documents for User Query Based on Normalized Vector Cosine Similarity Model

2020 ◽  
Vol 17 (9) ◽  
pp. 4531-4534
Author(s):  
Deepa Yogish ◽  
T. N. Manjunath ◽  
H. K. Yogish ◽  
Ravindra S. Hegadi

As technology develops, the volume of information in every field, including literature, technology, science, and medicine, is growing at a rapid pace. Extracting the documents relevant to a user query from a huge digital collection is an interesting problem. Document similarity techniques are used in many applications, such as text categorization, plagiarism detection, document clustering, information retrieval, machine translation, and question answering systems. Many algorithms have been developed for this purpose that take a document or input query and match it against document databases. This paper proposes a novel approach that vectorizes each document and the query with a normalized TF-IDF method and applies a cosine similarity function to extract the top 3 documents for a user query.
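The pipeline described in this abstract (normalized TF-IDF vectors, cosine similarity, top 3 results) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, and the function names `build_idf`, `tfidf_vector`, and `top_k` are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real system would
    # also strip punctuation and stop words.
    return text.lower().split()


def build_idf(docs_tokens):
    n = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))  # count each term once per document
    # Assumption: smoothed IDF, log((1 + n) / (1 + df)) + 1.
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(tokens, idf):
    tf = Counter(tokens)
    # Terms unseen in the collection have no IDF and are ignored.
    vec = {t: (c / len(tokens)) * idf[t] for t, c in tf.items() if t in idf}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    # L2-normalize so that every non-empty vector has unit length.
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(u, v):
    # Both vectors are unit length, so the dot product is the cosine.
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def top_k(query, docs, k=3):
    docs_tokens = [tokenize(d) for d in docs]
    idf = build_idf(docs_tokens)
    doc_vecs = [tfidf_vector(t, idf) for t in docs_tokens]
    q_vec = tfidf_vector(tokenize(query), idf)
    scored = [(cosine(q_vec, dv), i) for i, dv in enumerate(doc_vecs)]
    # Highest score first; ties are broken by document index.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(i, score) for score, i in scored[:k]]


# Hypothetical mini-collection, invented for illustration.
docs = [
    "cosine similarity ranks documents",
    "neural networks generate text",
    "tf idf weights terms in documents",
    "farmers observe plant diseases",
]
results = top_k("rank documents with cosine similarity", docs, k=3)
for index, score in results:
    print(index, round(score, 3))
```

Because the vectors are normalized up front, the similarity reduces to a sparse dot product, which keeps the ranking step cheap for long documents.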

Repositor ◽  
2020 ◽  
Vol 2 (9) ◽  
Author(s):  
Rizky Heriawan Prayogo Tanjung ◽  
Maskur Maskur ◽  
Nur Hayatin

Question answering applications available today still rely on keyword matching to search for answers. An automatic question answering system is a system that automatically tries to retrieve the correct information for questions asked by the user. Questions can be developed to make it easier to answer questions about software engineering. This application uses the cosine similarity method, which is one solution for finding the desired answer to a question accurately and is useful for word processing systems. With this method, automatic question answering can find the data the asker wants by displaying the answer with the highest weight as the most appropriate answer. The first answer, that is, the answer with the highest weight produced by the system, is the correct answer according to the system's assessment, the expert's assessment, and Kappa testing. Testing with the Kappa statistic gives the best Kappa value for the first answer (the answer with the highest weight). This value shows that the system can be used to determine the similarity between question and answer use cases.
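The abstract does not say which Kappa statistic was used. As a hedged illustration, the sketch below computes Cohen's kappa for two raters, here a system and an expert judging the same answers. The function name `cohen_kappa` and the label data are hypothetical and invented for the example.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    # Agreement expected by chance, from each rater's label frequencies.
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical judgments on eight answers (not data from the paper).
system = ["correct", "correct", "wrong", "correct",
          "wrong", "correct", "correct", "wrong"]
expert = ["correct", "correct", "wrong", "correct",
          "correct", "correct", "wrong", "wrong"]
kappa = cohen_kappa(system, expert)
print(round(kappa, 3))
```

In this toy case the raters agree on 6 of 8 items, but kappa is only about 0.47 because roughly half of that agreement would be expected by chance.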


Author(s):  
Manvi Breja

User profiling is one of the main issues faced when implementing an efficient question answering system: a user profile is built from the data posed by the user, capturing their domain of interest. The paper presents a method for predicting the next questions related to the initial question that the user submits to the question answering search engine. A novel association rule mining approach is highlighted, in which information is extracted from the log of questions previously submitted to the question answering search engine using algorithms for mining association rules, and the set of questions that the user will submit in the next session is predicted. Using this approach, the question answering system keeps the relevant answers to those next questions in its repository, providing a speedy response to the user and thus increasing the efficiency of the system.
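The idea of mining a question log for rules of the form "question A is followed by question B" can be sketched as below. The abstract does not name a specific algorithm, so this is a simplified pairwise support and confidence computation rather than the paper's method. The function names, thresholds, and session data are hypothetical.

```python
from collections import Counter, defaultdict


def mine_next_question_rules(sessions, min_support=0.3, min_confidence=0.5):
    """Map each question to (later question, confidence) rules, best first."""
    n = len(sessions)
    antecedent_count = Counter()  # sessions containing question a
    pair_count = Counter()        # sessions where a is later followed by b
    for session in sessions:
        pairs = set()
        for i, question in enumerate(session):
            for later in session[i + 1:]:
                if later != question:
                    pairs.add((question, later))
        antecedent_count.update(set(session))
        pair_count.update(pairs)
    rules = defaultdict(list)
    for (a, b), count in pair_count.items():
        support = count / n
        confidence = count / antecedent_count[a]
        if support >= min_support and confidence >= min_confidence:
            rules[a].append((b, confidence))
    for a in rules:
        # Highest confidence first; ties are broken alphabetically.
        rules[a].sort(key=lambda rule: (-rule[1], rule[0]))
    return dict(rules)


def predict_next(rules, question, k=2):
    # Questions whose answers could be cached ahead of time.
    return [b for b, _ in rules.get(question, [])[:k]]


# Hypothetical session log; each label stands for one submitted question.
sessions = [
    ["tfidf", "cosine", "ranking"],
    ["tfidf", "cosine"],
    ["tfidf", "ranking"],
    ["gru", "rnn"],
]
rules = mine_next_question_rules(sessions)
print(predict_next(rules, "tfidf"))
```

A system along these lines could prefetch the answers to the predicted questions so that they are already in the repository when the user asks them.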


2017 ◽  
Vol 11 (03) ◽  
pp. 345-371
Author(s):  
Avani Chandurkar ◽  
Ajay Bansal

Since the inception of the World Wide Web, the amount of data present on the Internet has become tremendous. This makes the task of navigating through this enormous amount of data quite difficult for the user. As users struggle to navigate through this wealth of information, the need for an automated system that can extract the required information becomes urgent. This paper presents a Question Answering system to ease the process of information retrieval. Question Answering systems have been around for quite some time and are a sub-field of information retrieval and natural language processing. The task of any Question Answering system is to seek an answer to a free-form factual question. The difficulty of pinpointing and verifying the precise answer makes question answering more challenging than the simple information retrieval done by search engines. The research objective of this paper is to develop a novel approach to Question Answering based on a composition of conventional approaches from Information Retrieval (IR) and Natural Language Processing (NLP). The focus is on using a structured and annotated knowledge base instead of an unstructured one. The knowledge base used here is DBpedia, and the final system is evaluated on the Text REtrieval Conference (TREC) 2004 questions dataset.


2019 ◽  
Vol 9 (1) ◽  
pp. 88-106
Author(s):  
Irphan Ali ◽  
Divakar Yadav ◽  
Ashok Kumar Sharma

A question answering system aims to provide a correct and quick answer to a user's query from a knowledge base. Due to the growth of digital information on the web, information retrieval systems have become a necessity. Most recent question answering systems consult knowledge bases to answer a question, after parsing and transforming natural language queries into knowledge base-executable forms. In this article, the authors propose a semantic web-based approach to question answering that uses natural language processing to analyze and understand the user query. It employs a “Total Answer Relevance Score” to measure the relevance of each answer returned by the system. The results obtained are quite promising. The real-time performance of the system has been evaluated on the answers extracted from the knowledge base.


Author(s):  
Pratheek I ◽  
Joy Paulose

Generating sequences of characters using a Recurrent Neural Network (RNN) is a tried and tested method for creating unique and context-aware words, and is fundamental in Natural Language Processing tasks. These types of neural networks can also be used as a question-answering system. The main drawback of most of these systems is that they work from a factoid database of information, and when queried about new and current information, the responses are usually poor. In this paper, the author proposes a novel approach to finding answer keywords in a given body of news text or headline, based on the query that was asked, where the query concerns current affairs or recent news, using the Gated Recurrent Unit (GRU) variant of RNNs. This ensures that the answers provided are relevant to the content of the query that was put forth.
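For readers unfamiliar with the GRU mentioned in this abstract, one common formulation of a single GRU step is given here. The notation is standard textbook notation rather than the paper's: $x_t$ is the input, $h_{t-1}$ the previous hidden state, $\sigma$ the logistic sigmoid, and $\odot$ the element-wise product. References differ on whether $z_t$ or $1 - z_t$ multiplies the candidate state, so the last line is one convention among two.

```latex
\begin{align}
z_t &= \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right) && \text{(update gate)} \\
r_t &= \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right) && \text{(reset gate)} \\
\tilde{h}_t &= \tanh\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right) && \text{(candidate state)} \\
h_t &= \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t && \text{(new hidden state)}
\end{align}
```

The update gate controls how much of the previous state is carried forward, which is what lets the network retain context across a long headline or news passage.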


2021 ◽  
Vol 35 (4) ◽  
pp. 301-306
Author(s):  
Godavarthi Deepthi ◽  
A. Mary Sowjanya

In natural language processing, various tasks can be implemented with the features provided by word embeddings. However, word embeddings alone are not sufficient for obtaining embeddings of larger chunks such as sentences. Sentence embeddings can be used to resolve this issue. In sentence embeddings, complete sentences along with their semantic information are represented as vectors, so that the machine can more easily understand the context. In this paper, we propose a Question Answering System (QAS) based on sentence embeddings. Our goal is to obtain the text from the provided context for a user query by extracting the sentence in which the correct answer is present. Traditionally, InferSent models have been used on SQuAD for building QAS. In recent times, Universal Sentence Encoder with USECNN and USETrans have been developed. In this paper, we use another variant of the Universal Sentence Encoder, the Deep Averaging Network (DAN), to obtain pre-trained sentence embeddings. The results on the SQuAD 2.0 dataset indicate that our approach (USE with DAN) performs well compared to Facebook's InferSent embedding.
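The sentence-selection idea can be illustrated with a toy sketch: embed the question and each context sentence, then return the sentence closest to the question. This is not the Universal Sentence Encoder. It shows only the averaging step that gives a Deep Averaging Network its name, omits the feed-forward layers, and uses invented 3-dimensional word vectors (`TOY_VECTORS`) in place of pre-trained embeddings.

```python
import math

# Hypothetical word vectors, invented for illustration only.
TOY_VECTORS = {
    "paris": [0.9, 0.1, 0.0],
    "capital": [0.8, 0.2, 0.1],
    "france": [0.85, 0.15, 0.05],
    "river": [0.1, 0.9, 0.2],
    "seine": [0.2, 0.85, 0.1],
    "flows": [0.0, 0.8, 0.3],
}


def embed(sentence):
    # Average the vectors of known words; unknown words are skipped.
    vectors = [TOY_VECTORS[w] for w in sentence.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def best_sentence(question, sentences):
    # Return the context sentence whose embedding is closest to the question.
    q_vec = embed(question)
    return max(sentences, key=lambda s: cosine(q_vec, embed(s)))


context = [
    "paris is the capital of france",
    "the seine river flows north",
]
answer_sentence = best_sentence("what is the capital of france", context)
print(answer_sentence)
```

With real pre-trained embeddings the same selection loop applies; only `embed` would change.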


2019 ◽  
Vol 15 (3) ◽  
pp. 79-100
Author(s):  
Watanee Jearanaiwongkul ◽  
Frederic Andres ◽  
Chutiporn Anutariya

Nowadays, farmers can search for treatments for their plants using search engines and applications. Most existing works take the form of rule-based question answering platforms. However, a farmer may report an observation incorrectly. This work argues that diseases and treatments should be determined from a set of related observations. Thus, we develop a theoretical framework for systems that manage a farmer's observation data. We investigate and formalize desirable characteristics of such systems. Each observation is tagged with a geolocation, from which related contextual data can be found. The framework is formalized algebraically, identifying the required types and functions. Its key characteristics are described by: (1) the defined type called warncons for representing observation data; (2) the similarity function for warncons; and (3) the warncons composition function for composing similar warncons. Finally, we show that the framework helps enrich observation data and improves advice-finding.

