Design of a Novel Search Engine for Prospective Question Answering

2014 ◽  
Vol 4 (2) ◽  
pp. 19-40
Author(s):  
Rosy Madaan ◽  
A.K. Sharma ◽  
Ashutosh Dixit

Question answering offers a more intuitive approach to information processing. A number of approaches have been used for answering questions. In this paper, we propose a question answering system that uses blogs as its source of information. The system crawls blog pages, summarizes them, and then indexes and ranks the summarized content. The user asks a question and gets one or more answers in response. The answers obtained are better than those provided by existing QA systems that rely on general web pages. Experimental results are promising and show that the responses given by the proposed system are better than those given by existing QA systems.

2011 ◽  
Vol 52-54 ◽  
pp. 1218-1225
Author(s):  
Zheng Yu Zhu ◽  
Chun Lei Yu ◽  
Shu Jia Dong ◽  
Jie He

Current popular search engines are built to serve all users, independent of the needs of any individual user. This paper proposes a personalized query expansion method based on the user's historical interested Web pages (UHIWPs) and the user's historical query terms (UHQTs). When a user submits a query keyword to a search engine, the new algorithm automatically locates the user's implicit search intention and dynamically computes term-term associations from the user's UHIWPs and UHQTs. More personalized expansion terms are then generated and submitted to the search engine together with the query keyword. As a result, different users can receive different search results even when they input the same query keywords. Experimental results show that this method achieves better average precision than the current algorithm.
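
As a rough illustration of expanding a query from a user's history, the sketch below counts which terms co-occur with the query in previously visited pages and appends the most frequent ones. It is a simplified stand-in, not the paper's algorithm: the function name `expand_query`, the whitespace tokenization, and the sample history are all assumptions made for this example.

```python
from collections import Counter


def expand_query(query, history_pages, k=2):
    """Return the query terms plus up to k terms that co-occur with them
    most often in the user's historical pages (hypothetical helper)."""
    query_terms = query.lower().split()
    query_set = set(query_terms)
    cooccurrence = Counter()
    for page in history_pages:
        terms = set(page.lower().split())
        if query_set & terms:
            # Count each other term once per page that mentions the query.
            cooccurrence.update(terms - query_set)
    # Sort by count (descending), then alphabetically for a stable order.
    ranked = sorted(cooccurrence.items(), key=lambda item: (-item[1], item[0]))
    return query_terms + [term for term, _ in ranked[:k]]


# Made-up browsing history for one user.
history = [
    "jaguar car dealer price",
    "jaguar car engine review",
    "python snake habitat",
]
expanded = expand_query("jaguar", history, k=1)
```

With a different history (for example, pages about wildlife), the same keyword would pick up different expansion terms, which is the behaviour the abstract describes.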


2014 ◽  
Vol 23 (04) ◽  
pp. 1460014 ◽  
Author(s):  
Georgios Stratogiannis ◽  
Georgios Siolas ◽  
Andreas Stafylopatis

We describe a system that performs semantic Question Answering based on the combination of classic Information Retrieval methods with semantic ones. First, we use a search engine to gather web pages and then apply a noun phrase extractor to extract all the candidate answer entities from them. Candidate entities are ranked using a linear combination of two IR measures to pick the most relevant ones. For each one of the top ranked candidate entities we find the corresponding Wikipedia page. We then propose a novel way to exploit Semantic Information contained in the structure of Wikipedia. A vector is built for every entity from Wikipedia category names by splitting and lemmatizing the words that form them. These vectors maintain Semantic Information in the sense that we are given the ability to measure semantic closeness between the entities. Based on this, we apply an intelligent clustering method to the candidate entities and show that candidate entities in the biggest cluster are the most semantically related to the ideal answers to the query. Results on the topics of the TREC 2009 Related Entity Finding task dataset show promising performance.
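
The idea of building entity vectors from category names and comparing them can be sketched as follows. This is a minimal illustration under stated assumptions: lemmatization is replaced by simple lowercasing, cosine similarity is used as the closeness measure (the abstract does not name one), and the category names are invented for the example.

```python
import math
from collections import Counter


def category_vector(category_names):
    """Build a bag-of-words vector from category names.
    Lowercasing stands in for the lemmatization step (an assumption)."""
    words = []
    for name in category_names:
        words.extend(name.lower().split())
    return Counter(words)


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical category lists for three candidate entities.
band_a = category_vector(["American rock bands", "Musical groups"])
band_b = category_vector(["British rock bands"])
river = category_vector(["Rivers of France"])

sim_bands = cosine(band_a, band_b)
sim_unrelated = cosine(band_a, river)
```

Entities whose category vectors share words score higher, so a clustering step run over these similarities would tend to group the two bands and leave the river apart.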


2013 ◽  
Vol 850-851 ◽  
pp. 745-750
Author(s):  
Ying Hong ◽  
Zeng Min Geng

In light of the deficiencies of general search engine technology in professional retrieval, this paper researches and designs a search engine system for professional fields (SESPF for short). The system automatically crawls web pages with a spider program. It introduces a professional dictionary and filters web page information according to certain rules. At the same time, the system improves the PageRank algorithm and the Lucene web page ranking algorithm. Experimental results show that the system achieves higher precision in professional field retrieval than a general search engine.


2022 ◽  
Vol 12 (1) ◽  
pp. 0-0

Understanding what the user actually needs from a question is crucial in non-factoid why-question answering, because why-questions are complex and their interpretation involves ambiguity and redundancy. The precise requirement is to determine the focus of the question and reformulate it accordingly so that the expected answers can be retrieved. The paper analyzes different types of why-questions and proposes an algorithm for each class that determines the focus and reformulates the question into a query by appending focal terms and the cue phrase 'because'. Further, a user interface is implemented that takes a why-question as input, applies the different question-processing components, reformulates the question, and finally retrieves web pages by posing the query to the Google search engine. To measure the accuracy of the process, users are asked to score, from 1 to 10, how relevant the retrieved web pages are according to their understanding. The results show a maximum precision of 89% for informational why-questions and a minimum of 48% for opinionated why-questions.
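
A toy version of the reformulation step might look like the sketch below. It is not one of the paper's per-class algorithms: the stopword list, the rule "focal terms are whatever remains after stopword removal", and the function name `reformulate` are assumptions chosen to keep the example short.

```python
# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"why", "do", "does", "did", "is", "are", "was", "were",
             "the", "a", "an"}


def reformulate(question):
    """Turn a why-question into a keyword query ending in 'because'."""
    tokens = question.lower().rstrip("?").split()
    # Treat the remaining non-stopword tokens as the focal terms.
    focal_terms = [t for t in tokens if t not in STOPWORDS]
    return " ".join(focal_terms + ["because"])


query = reformulate("Why is the sky blue?")
```

The resulting string is what would be posed to the search engine; appending the cue phrase biases retrieval toward pages that contain an explanation.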


2014 ◽  
Vol 687-691 ◽  
pp. 2513-2516
Author(s):  
Mei Ni Yang ◽  
Hui Ling Sun ◽  
Jing Shen

A product often has several different names in Chinese. Obtaining these aliases is very important in e-commerce, online advertising, and similar applications. The Chinese aliases of a product usually appear in the titles of web pages on which the product is sold. Such titles are collected using a search engine, and a conditional random field is then used to extract the aliases from them. To improve performance, distributed representations of words are employed as features in the conditional random field. The method is tested on real data, and the experimental results are analyzed.


2014 ◽  
Vol 2 (2) ◽  
pp. 103-112 ◽  
Author(s):  
Taposh Kumar Neogy ◽  
Harish Paruchuri

The essence of a web page is an inherently predisposed issue, one that is built on behaviors, interests, and intelligence. There are a great many reasons web pages are critical to the modern world, and the matter cannot be overemphasized. The meteoric growth of the internet is one of the most potent factors making it hard for search engines to provide actionable results. Search engines store web pages in classified directories. To store these pages, some engines rely on the expertise of real people. Most pages are classified by automated means, but the human factor is dominant in their success. From experimental results, we can deduce that the most effective way to automate the classification of web pages for search engines is the integration of machine learning.


Author(s):  
Fei Liu ◽  
Jing Liu ◽  
Zhiwei Fang ◽  
Richang Hong ◽  
Hanqing Lu

Learning effective interactions between multi-modal features is at the heart of visual question answering (VQA). A common defect of existing VQA approaches is that they consider only a very limited number of interactions, which may not be enough to model the latent complex image-question relations that are necessary for accurately answering questions. Therefore, in this paper, we propose a novel DCAF (Densely Connected Attention Flow) framework for modeling dense interactions. It densely connects all pairwise layers of the network via Attention Connectors, capturing fine-grained interplay between image and question across all hierarchical levels. The proposed Attention Connector efficiently connects the multi-modal features at any two layers with symmetric co-attention, and produces interaction-aware attention features. Experimental results on three publicly available datasets show that the proposed method achieves state-of-the-art performance.


2013 ◽  
Vol 312 ◽  
pp. 791-795
Author(s):  
Xiang Lin Zuo ◽  
Wen Bo Wang ◽  
Ying Wang ◽  
Wan Li Zuo

The past decade has witnessed the rapid development of search engines, which have become an indispensable part of everyday life. However, people are no longer satisfied with access to ordinary information and may instead pay more attention to fresh information. This demand poses challenges to traditional search engines, which are more concerned with the relevance and importance of web pages. A search engine comprises three modules: crawler, indexer and searcher. Changes are needed in all three parts to improve a search engine's freshness. This paper investigates the first part, the crawler. We analyze the requirements for a real-time crawler and propose a novel real-time crawler based on a more accurate estimation of refresh time. Experimental results demonstrate that the proposed real-time crawler can help a search engine improve its freshness.
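
To make the notion of estimating a refresh time concrete, the sketch below uses a deliberately naive estimator: it divides the total observation time by the number of visits on which the page was seen to have changed. This is an assumption for illustration only; the abstract does not say which estimator the paper uses, and `estimate_refresh_interval` is a hypothetical name.

```python
def estimate_refresh_interval(visits_changed, visit_interval_hours):
    """Estimate how often (in hours) a page should be re-crawled.

    visits_changed: one boolean per past visit, True if the page had
    changed since the previous visit.
    visit_interval_hours: fixed spacing between those past visits.
    Returns None when no change was ever observed.
    """
    changes = sum(visits_changed)
    if changes == 0:
        return None
    total_hours = len(visits_changed) * visit_interval_hours
    # Average observed time between changes.
    return total_hours / changes


# Page visited every 6 hours; it had changed on 2 of the 4 visits.
interval = estimate_refresh_interval([True, False, True, False], 6)
```

A crawler could schedule pages with short estimated intervals more often. Note that this simple count underestimates the change rate of pages that change several times between visits.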


2013 ◽  
Vol 718-720 ◽  
pp. 2040-2044 ◽  
Author(s):  
Xiao Ping Xian ◽  
Yue Guang Li

To improve search engine retrieval quality, this paper proposes combining page pre-classification by user interest, based on the vector space model, with a content-related PageRank algorithm, so that the weights from both aspects influence the PR value of web pages. In simulation experiments, the improved algorithm achieves better precision than the original algorithm.
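
One standard way to let per-page weights influence PageRank values is to bias the teleport step toward preferred pages (personalized PageRank). The sketch below shows that technique, not the paper's specific weighting scheme; the graph, the preference weights, and the function name are assumptions for illustration.

```python
def personalized_pagerank(links, preference, damping=0.85, iterations=50):
    """Power-iteration PageRank whose teleport step follows `preference`.

    links: dict mapping each page to the list of pages it links to.
    preference: dict of non-negative weights per page (for example,
    derived from a user's interests); every key must be a page in `links`.
    """
    pages = list(links)
    total = sum(preference.values())
    teleport = {p: preference.get(p, 0.0) / total for p in pages}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {p: (1 - damping) * teleport[p] for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # Split this page's rank evenly among its outlinks.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dangling page: redistribute its rank via the teleport vector.
                for p in pages:
                    new_rank[p] += damping * rank[page] * teleport[p]
        rank = new_rank
    return rank


# Tiny hypothetical graph; page "c" matches the user's interests best.
links = {"a": ["b"], "b": ["a"], "c": ["a"]}
preference = {"a": 1.0, "b": 1.0, "c": 2.0}
ranks = personalized_pagerank(links, preference)
```

Changing `preference` shifts rank toward pages in the preferred class without altering the link structure, which is one way the two kinds of weight in the abstract could be combined.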


2016 ◽  
Vol 1 (1) ◽  
pp. 45-52
Author(s):  
Palupi Puspitorini

The aim of this study was to select the best sources of auxin for stimulating shoot growth in pineapple plant cuttings. The research was arranged in a completely randomized design (CRD) with 4 treatments and 6 replications. The data were analyzed statistically using DMRT. The treatments were: no treatment (0%), cow urine at a concentration of 25%, young coconut water at a concentration of 25%, and Rootone F at 100 mg per cutting. The results showed that cow urine at 25% and Rootone F at 100 mg gave the best results in stimulating shoot growth in pineapple stem cuttings. The experiment concluded that cuttings given these natural hormones performed better than cuttings given no hormone.

