An Approach to Trie Based Keyword Search for Search Engines

2017 ◽  
Vol 6 (1) ◽  
pp. 1-16
Author(s):  
Pranav Murali

Search engines use indexing techniques to minimize the time taken to find the information relevant to a search query. They maintain a keyword list that may reside either in memory or in external storage, such as a hard disk. While a pure binary search can be used for this purpose, it suffers from performance issues when keywords are stored in external storage. Some search engine implementations use a B-tree and sparse indexes to reduce access time. This paper aims to reduce keyword access time further. It presents a keyword search technique that combines a trie data structure with a new keyword prefixing method. Experimental results show a good improvement in performance over pure binary search. The merits of incorporating the trie-based approach into contemporary indexing methods are also discussed. The keyword prefixing method is described, and some salient steps in the keyword generation process are outlined.
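To illustrate the intuition with a minimal in-memory sketch (the paper's keyword prefixing method and external-storage layout are not reproduced here): trie lookup cost is proportional to the length of the keyword rather than to the logarithm of the number of keywords, and shared prefixes are stored only once.

```python
# Minimal in-memory trie sketch for keyword lookup; an assumed
# illustration, not the paper's algorithm or on-disk layout.

class TrieNode:
    def __init__(self):
        self.children = {}    # char -> TrieNode
        self.postings = None  # e.g. document IDs; None if not a keyword

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, keyword, postings):
        node = self.root
        for ch in keyword:
            node = node.children.setdefault(ch, TrieNode())
        node.postings = postings

    def search(self, keyword):
        """Exact lookup in O(len(keyword)), independent of keyword count."""
        node = self.root
        for ch in keyword:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.postings

trie = Trie()
trie.insert("search", [1, 4])
trie.insert("searching", [2])
print(trie.search("search"))  # [1, 4]
```

A disk-resident variant would additionally need to pack trie nodes into storage pages, which is where the comparison with B-trees and sparse indexes becomes relevant.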

Information retrieval has become a buzzword in today's era of advanced computing. A tremendous amount of information is available over the Internet in the form of documents, which may be structured or unstructured, and retrieving relevant information from such a large pool is difficult. Traditional keyword-based search engines are unable to deliver the desired relevant results, as they search the web only on the basis of the keywords present in the submitted query. In contrast, ontology-based semantic search engines provide relevant results quickly, because the information stored in the semantic web is more meaningful. This paper gives a comparative study of ontology-based search engines against keyword-based ones. A few engines of each type were selected, and the same queries were run on each; the returned results were classified as relevant or non-relevant in order to compare the precision achieved by the two approaches.
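For reference, under the standard information retrieval definition, the precision this comparison relies on is

```latex
\mathrm{Precision} = \frac{|\text{retrieved results judged relevant}|}{|\text{retrieved results}|}
```

so classifying each returned result as relevant or non-relevant is exactly the labeling this measure requires.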


Mathematics ◽  
2021 ◽  
Vol 9 (3) ◽  
pp. 238
Author(s):  
Yuna Hur ◽  
Jaechoon Jo

A significant amount of digital cultural content is shared online, but learners do not know where subject-matter content resides or how to find it. There is therefore a need for a service that improves educational quality by effectively providing relevant information in response to searches for content useful to learners. This study developed an intelligent information system that effectively searches and visualizes digital cultural contents, and tested its usability and utility. The system collects data on digital cultural contents, classifies them automatically, and creates content triple data in order to display results automatically in a 3D timeline, a knowledge network map, and a keyword relation network map through content search, triple search, and keyword search. We also conducted a survey and in-depth interviews to verify user satisfaction with the system's use and utility. For the experiment, we developed survey questions measuring user satisfaction and conducted in-depth interviews on the system's utility with a total of 65 subjects. The results show that responses regarding satisfaction with both use and utility were generally “satisfied”. In addition, system stability was rated “high”.
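The paper's exact triple schema is not given here; as an illustrative sketch only, content triples are commonly (subject, predicate, object) records over which a triple search filters by field and a keyword search matches any field.

```python
# Illustrative sketch only: a (subject, predicate, object) triple store
# with simple triple and keyword search. The field layout and the
# sample data are assumptions, not the paper's actual schema.

triples = [
    ("Hunminjeongeum", "created_in", "1443"),
    ("Hunminjeongeum", "category", "document"),
    ("Seokguram", "category", "architecture"),
]

def triple_search(subject=None, predicate=None, obj=None):
    """Return triples matching every field that is not None."""
    return [t for t in triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)]

def keyword_search(keyword):
    """Return triples containing the keyword in any field."""
    return [t for t in triples if any(keyword in field for field in t)]

print(triple_search(subject="Hunminjeongeum"))
print(keyword_search("architecture"))
```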


Database ◽  
2021 ◽  
Vol 2021 ◽  
Author(s):  
Valerio Arnaboldi ◽  
Jaehyoung Cho ◽  
Paul W Sternberg

Abstract Finding relevant information in newly published scientific papers is becoming increasingly difficult due to the pace at which articles are published every year as well as the increasing amount of information per paper. Biocuration and model organism databases provide a map for researchers to navigate the complex structure of the biomedical literature by distilling knowledge into curated and standardized information. In addition, scientific search engines such as PubMed and text-mining tools such as Textpresso allow researchers to easily search for specific biological aspects in newly published papers, facilitating knowledge transfer. However, digesting the information returned by these systems—often a large number of documents—still requires considerable effort. In this paper, we present Wormicloud, a new tool that summarizes scientific articles graphically through word clouds. The tool is aimed at facilitating the discovery of new experimental results not yet curated by model organism databases and is designed for both researchers and biocurators. Wormicloud is customized for the Caenorhabditis elegans literature and provides several advantages over existing solutions, including the ability to perform full-text searches through Textpresso, which yields more accurate results than other existing literature search engines. Wormicloud is integrated through direct links from gene interaction pages in WormBase. Additionally, it allows analysis of the gene sets obtained from literature searches with other WormBase tools such as SimpleMine and Gene Set Enrichment. Database URL: https://wormicloud.textpressolab.com
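Word clouds of this kind are typically driven by term frequencies over the retrieved texts; a minimal sketch of that aggregation step (Wormicloud's actual pipeline, including its Textpresso full-text search, stop-word handling, and weighting, is not reproduced here):

```python
# Sketch of the term-frequency aggregation behind a word cloud; the
# sample abstracts and stop-word list are illustrative assumptions.

from collections import Counter
import re

abstracts = [
    "daf-16 mediates lifespan extension in daf-2 mutants",
    "insulin signaling and daf-16 regulate stress resistance",
]

STOP_WORDS = {"and", "in", "the", "of"}

counts = Counter(
    word
    for text in abstracts
    for word in re.findall(r"[a-z0-9-]+", text.lower())
    if word not in STOP_WORDS
)

# Font size in the rendered cloud is usually proportional to frequency.
for word, n in counts.most_common(5):
    print(word, n)
```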


2021 ◽  
Author(s):  
ZEGOUR Djamel Eddine

Abstract Today, Red-Black trees are a popular data structure, typically used to implement dictionaries, associative arrays, and symbol tables within compilers (C++, Java, …) and many other systems. In this paper, we present an improvement to the delete algorithm of this kind of binary search tree. The proposed algorithm is promising: it colors the tree differently while reducing color changes by about 29%. Moreover, the maintenance operations that re-establish the Red-Black tree balance properties are reduced by about 11%. As a consequence, the proposed algorithm saves about 4% in running time when insert and delete operations are used together, while preserving the search performance of the standard algorithm.
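For context, the balance properties that those maintenance operations re-establish can be stated as executable checks; a minimal validator sketch, with an assumed node layout rather than the paper's modified algorithm:

```python
# Sketch of a checker for the Red-Black invariants that maintenance
# operations (recoloring, rotations) re-establish after a delete:
#   1. a red node has no red child;
#   2. every root-to-leaf path has the same number of black nodes.
# The node layout is an assumption; the paper's improved delete
# algorithm is not reproduced here.

RED, BLACK = "red", "black"

class Node:
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color, self.left, self.right = key, color, left, right

def black_height(node):
    """Return the black height of the subtree, or None on violation."""
    if node is None:
        return 1  # nil leaves count as black
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                return None  # red node with red child
    lh, rh = black_height(node.left), black_height(node.right)
    if lh is None or rh is None or lh != rh:
        return None  # unequal black heights
    return lh + (1 if node.color == BLACK else 0)

def is_valid_red_black(root):
    return (root is None or root.color == BLACK) and black_height(root) is not None

tree = Node(10, BLACK, Node(5, RED), Node(15, RED))
print(is_valid_red_black(tree))  # True
```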


2018 ◽  
Vol 52 (4-5) ◽  
pp. 1107-1121 ◽  
Author(s):  
Javad Tayyebi ◽  
Abumoslem Mohammadi ◽  
Seyyed Mohammad Reza Kazemi

Given a network G(V, A, u) with two specific nodes, a source node s and a sink node t, the reverse maximum flow problem is to increase the capacities of some arcs (i, j) as little as possible, under bound constraints on the modifications, so that the maximum flow value from s to t in the modified network is at least a prescribed value v0. In this paper, we study the reverse maximum flow problem when the capacity modifications are measured by the weighted Chebyshev distance. We present an efficient algorithm that solves the problem in two phases. The first phase applies binary search to find an interval containing the optimal value. The second phase uses a discrete Newton method to obtain the optimal value exactly. Finally, computational experiments are conducted to observe the performance of the proposed algorithm.
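The first phase relies only on feasibility being monotone in the allowed Chebyshev budget; a generic sketch of that bracketing step, with a hypothetical `feasible` oracle standing in for the max-flow test on the modified network:

```python
# Sketch of phase one only. `feasible(budget)` is a hypothetical oracle
# that would raise arc capacities within the given Chebyshev budget,
# run max-flow, and test whether the flow value reaches v0. It is
# monotone (a larger budget never hurts), which is all the search
# needs; the paper's discrete Newton phase then refines the interval
# to the exact optimum.

def bracket_optimum(feasible, hi=1.0, tol=1e-6):
    # Grow hi until the target flow value becomes reachable.
    while not feasible(hi):
        hi *= 2
    lo = 0.0
    # Shrink [lo, hi] to a small interval containing the optimal budget.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if feasible(mid):
            hi = mid
        else:
            lo = mid
    return lo, hi

# Toy stand-in oracle: feasible once the budget reaches 3.7.
lo, hi = bracket_optimum(lambda b: b >= 3.7)
print(lo, hi)  # a narrow interval around 3.7
```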


Author(s):  
Novario Jaya Perdana

The accuracy of search results from a search engine depends on the keywords used. Keywords that carry too little information reduce the accuracy of the search results, which makes finding information on the Internet hard work. In this research, software was built to create document keyword sequences. The software uses Google Latent Semantic Distance, which can extract relevant information from a document. The extracted information is expressed as sequences of specific words that can be used as keyword recommendations in search engines. The results show that the implementation of the method for creating document keyword recommendations achieved high accuracy and found the most relevant information in the top search results.
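The paper's Google Latent Semantic Distance is not reproduced here; as a hedged illustration of the general idea, the related Normalized Google Distance (NGD) scores the semantic relatedness of two terms from search hit counts:

```python
# Sketch of the related Normalized Google Distance (NGD), which scores
# two terms by how often they co-occur in search results; the paper's
# Google Latent Semantic Distance may differ. Hit counts are assumed
# inputs here, not live API calls.

from math import log

def ngd(hits_x, hits_y, hits_xy, total_pages):
    """Smaller values mean the terms are more semantically related."""
    fx, fy, fxy = log(hits_x), log(hits_y), log(hits_xy)
    return (max(fx, fy) - fxy) / (log(total_pages) - min(fx, fy))

# Toy numbers: two terms that co-occur often score a small distance.
print(ngd(46_700_000, 12_100_000, 3_930_000, 25_000_000_000))
```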


2020 ◽  
pp. 624-650
Author(s):  
Luis Terán

With the introduction of Web 2.0, which turns users into content generators, finding relevant information has become even more complex. To tackle this problem of information overload, a number of techniques have been introduced, including search engines, the Semantic Web, and recommender systems, among others. The use of recommender systems for e-Government is a research topic intended to improve the interaction among public administrations, citizens, and the private sector by reducing information overload on e-Government services. In this chapter, the use of recommender systems in eParticipation is presented, along with a brief description of the eGovernment framework used and the participation levels proposed to enhance participation. The highest level of participation is known as eEmpowerment, where decision-making is placed on the side of citizens. Finally, a set of examples for the different eParticipation types is presented to illustrate the use of recommender systems.


The Dark Web ◽  
2018 ◽  
pp. 359-374
Author(s):  
Dilip Kumar Sharma ◽  
A. K. Sharma

ICT plays a vital role in human development through information extraction, and it includes computer networks and telecommunication networks. One of the important modules of ICT is computer networks, which are the backbone of the World Wide Web (WWW). Search engines are computer programs that browse and extract information from the WWW in a systematic and automatic manner. This paper examines the three main components of search engines: the Extractor, a web crawler that starts from a URL; the Analyzer, an indexer that processes the words on each web page and stores the resulting index in a database; and the Interface Generator, a query handler that interprets the needs and preferences of the user. The paper considers both the information available on the surface web through general web pages and the hidden information behind query interfaces, called the deep web. It emphasizes extracting relevant information so that the preferred content appears as the first result for the user's search query, and it discusses the deep web with an analysis of a few existing deep web search engines.
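A minimal sketch of how the three named components fit together, run over in-memory stand-in pages (a real Extractor would fetch URLs and follow links; all names and data here are illustrative):

```python
# Minimal sketch of the three components named above, run over
# in-memory pages instead of the live WWW; all names and sample data
# are illustrative assumptions.

import re

PAGES = {  # stand-in for pages fetched from URLs
    "a.html": "deep web content behind query interfaces",
    "b.html": "surface web pages are easy to crawl",
}

def extractor(seed_urls):
    """Extractor: a crawler that starts from seed URLs and yields pages."""
    for url in seed_urls:
        yield url, PAGES[url]

def analyzer(pages):
    """Analyzer: an indexer mapping each word to the pages containing it."""
    index = {}
    for url, text in pages:
        for word in re.findall(r"\w+", text.lower()):
            index.setdefault(word, set()).add(url)
    return index

def interface_generator(index, query):
    """Interface Generator: return pages matching every query word."""
    sets = [index.get(w, set()) for w in query.lower().split()]
    return set.intersection(*sets) if sets else set()

index = analyzer(extractor(PAGES))
print(interface_generator(index, "deep web"))  # {'a.html'}
```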

