When 57,300,000 Full Text Search Results Are Just Too Many

Author(s):  
Pat Case

The Web changed the paradigm for full-text search. Searching Google for search engines returns 57,300,000 results at this writing, an impressive result set. Web search engines favor simple searches, speed, and relevance ranking. The end user most often finds a wanted result or two within the first page of search results. This new paradigm is less useful for searching collections of homogeneous data and documents than it is for searching the web. When searching collections, end users may need to review everything in the collection on a topic, may want a clean result set of only those six high-quality results, or may need to confirm that there are no wanted results, because finding nothing within a collection sometimes answers a question about a topic or collection. To accomplish these tasks, end users may need richer search functionality to return small, manageable result sets. The W3C XQuery and XPath Full Text Recommendation (XQFT) offers extensive end-user functionality, restoring the control that librarians and expert searchers enjoyed before the Web. XQFT offers more end-user functionality and control than any previous full-text search standard: more match options, more logical operators, more proximity operators, more ways to return a manageable result set. XQFT searches are also fully composable with XQuery string, number, date, and node queries, bringing the power of full-text search and database querying together for the first time. XQFT searches run directly against XML, enabling searches on any elements or attributes. XQFT implementations are standards-driven, based on shared semantics and syntax, so a search written for one implementation is portable and may be used in others.
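A minimal sketch of what such an XQFT query can look like, with match options and a proximity constraint, sent to one XQFT-capable processor (BaseX) over its REST interface. The collection name, element names, URL, and credentials below are assumptions, not taken from the article:

```python
# Illustrative XQuery Full Text (XQFT) query; the element names ("doc", "title"),
# collection name, BaseX REST URL, and credentials are assumptions.
import requests

# Phrase-like search with match options (stemming, case insensitivity) and a
# proximity filter, restricted to a single element -- the kind of end-user
# control the abstract describes.
XQFT_QUERY = """
for $d in collection('reports')//doc
where $d/title contains text { "search", "engine" } all words
      using stemming using case insensitive
      distance at most 3 words
return $d/title
"""

# BaseX (one XQFT implementation) accepts queries over REST; URL/port assumed.
response = requests.get(
    "http://localhost:8984/rest",
    params={"query": XQFT_QUERY},
    auth=("admin", "admin"),
)
print(response.text)
```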

Author(s):  
Pavel Šimek ◽  
Jiří Vaněk ◽  
Jan Jarolímek

The majority of Internet users use the global network to search for information using full-text search engines such as Google, Yahoo!, or Seznam. Website operators therefore try, with the help of different optimization techniques, to reach the top places in the results of full-text search engines. This is where Search Engine Optimization and Search Engine Marketing become important, because ordinary users usually follow links only on the first few pages of full-text search results for given keywords, and in catalogs they rely primarily on links placed higher in each category's hierarchy. The key to success is the application of optimization methods that address keywords, the structure and quality of content, domain names, individual pages, and the quantity and reliability of backlinks. The process is demanding, long-lasting, and without a guaranteed outcome. A website operator without advanced analytical tools cannot identify the contribution of the individual documents that make up the entire website. If website operators want an overview of their documents and of the website as a whole, it is appropriate to quantify these positions in a specific way, depending on specific keywords. This purpose is served by the quantification of the competitive value of documents, which in turn determines the global competitive value of the website. The quantification of competitive values is performed against a specific full-text search engine, and the results can be, and often are, different for each engine. According to published reports by the ClickZ agency and Market Share, Google is the most widely used search engine by number of searches among English-speaking users, with a market share of more than 80%. The overall procedure for quantifying competitive values is the same in all cases; however, the initial step, the analysis of keywords, depends on the choice of full-text search engine.
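The abstract does not give the authors' formula, but the general idea of position-based quantification can be illustrated with a purely hypothetical sketch. The keywords, rank positions, and the 1/rank weighting below are illustrative assumptions, not the paper's method:

```python
# Hypothetical illustration of position-based "competitive value" scoring.
# The keywords, rank positions, and the 1/rank weighting are assumptions;
# the paper's actual quantification procedure is not reproduced here.

# Rank positions (1 = top result) of each document of a site in one
# full-text search engine's results, per keyword; None = not ranked.
ranks = {
    "doc_a.html": {"precision farming": 4, "crop sensors": None},
    "doc_b.html": {"precision farming": 12, "crop sensors": 2},
}

def document_value(keyword_ranks):
    """Sum a simple decaying weight over all keywords the document ranks for."""
    return sum(1.0 / r for r in keyword_ranks.values() if r is not None)

doc_values = {doc: document_value(kw) for doc, kw in ranks.items()}
site_value = sum(doc_values.values())   # global value of the whole site

for doc, value in sorted(doc_values.items(), key=lambda x: -x[1]):
    print(f"{doc}: {value:.3f}")
print(f"site: {site_value:.3f}")
```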


Author(s):  
Namik Delilovic

Searching for content in present digital libraries is still very primitive; most websites provide a search field where users can enter information such as a book title, author name, or terms they expect to be found in the book. Some platforms provide advanced search options, which allow users to narrow the results by specific parameters such as year, author name, publisher, and the like. Currently, when users find a book that might be of interest to them, this search process ends; only a full-text search or the references at the end of the book may provide some additional pointers. In this chapter, the author gives an example of how a user could continuously receive recommendations for additional content even while reading an article, using current machine learning and artificial intelligence techniques.
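The chapter itself does not fix a particular algorithm; one minimal, content-based sketch of the idea is to score candidate documents against the passage currently being read with TF-IDF similarity. The library, passage, and use of scikit-learn below are assumptions for illustration, not the chapter's method:

```python
# Minimal content-based recommendation sketch (not the chapter's method):
# rank candidate documents by similarity to the passage currently being read.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

library = {
    "intro_to_ir.txt": "Inverted indexes, ranking and evaluation of retrieval systems.",
    "ml_basics.txt": "Supervised learning, features, overfitting and model selection.",
    "xml_databases.txt": "Storing and querying XML documents in native databases.",
}

currently_reading = "Ranking functions and evaluation measures for search engines."

vectorizer = TfidfVectorizer(stop_words="english")
doc_matrix = vectorizer.fit_transform(library.values())
passage_vec = vectorizer.transform([currently_reading])

scores = cosine_similarity(passage_vec, doc_matrix)[0]
recommendations = sorted(zip(library.keys(), scores), key=lambda x: -x[1])
for name, score in recommendations:
    print(f"{score:.2f}  {name}")
```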


Author(s):  
Cornelia Gyorodi ◽  
Robert Gyorodi ◽  
George Pecherle ◽  
George Mihai Cornea

In this article we explain how to create a search engine using the powerful MySQL full-text search. The ever-increasing demands of the web require cheap yet elaborate search options. One of the most important issues for a search engine is the capacity to order its result set by relevance and to provide the user with suggestions in the case of a spelling mistake or a small result set. To fulfill these requirements we chose MySQL full-text search, an option suitable for small to medium scale websites. To provide sounds-like capabilities, a second table is created containing a bag of words from the main table together with the corresponding metaphone codes. When a suggestion is needed, this table is queried for the metaphone of the searched word, and a suggestion is computed from the result set.
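A sketch of the two pieces described above: a relevance-ordered MATCH ... AGAINST search, plus a metaphone lookup for suggestions. The table and column names are assumptions, and the jellyfish library's metaphone() stands in for whatever metaphone implementation the authors used:

```python
# Sketch of MySQL full-text search plus metaphone-based suggestions.
# Table/column names ("articles", "word_metaphones", ...) are assumptions.
import mysql.connector
import jellyfish  # metaphone() stands in for the authors' implementation

conn = mysql.connector.connect(host="localhost", user="app",
                               password="secret", database="site")
cur = conn.cursor()

def search(term):
    # Requires a FULLTEXT index on (title, body); results ordered by relevance.
    cur.execute(
        "SELECT id, title, MATCH(title, body) AGAINST (%s IN NATURAL LANGUAGE MODE) AS score "
        "FROM articles "
        "WHERE MATCH(title, body) AGAINST (%s IN NATURAL LANGUAGE MODE) "
        "ORDER BY score DESC LIMIT 20",
        (term, term),
    )
    return cur.fetchall()

def suggest(term):
    # The second table maps each word of the main table to its metaphone code.
    cur.execute(
        "SELECT word FROM word_metaphones WHERE metaphone = %s LIMIT 5",
        (jellyfish.metaphone(term),),
    )
    return [row[0] for row in cur.fetchall()]

results = search("serch engine")          # misspelled on purpose
if not results:
    print("Did you mean:", suggest("serch"))
```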


2012 ◽  
Vol 532-533 ◽  
pp. 1282-1286
Author(s):  
Zhi Chao Lin ◽  
Lei Sun ◽  
Xiao Liu

A lot of information is contained in the World Wide Web, and obtaining the required resources quickly and accurately through content-based search engines has become a research focus. Most current full-text web search tools, such as Lucene, a widely used open-source retrieval library in the information retrieval field, are purely keyword based. This may not be sufficient for users retrieving information from the web. In this paper, we employ a method to overcome the limitations of current full-text search engines, represented here by Lucene. We propose a Query Expansion and Information Retrieval approach that can help users acquire more accurate content from the web. The Query Expansion component finds expanded candidate words for the query word through WordNet, which contains synonyms in several different senses; in the Information Retrieval component, the query word and its candidate words are used together as the input of the search module to get the result items. Furthermore, we can put the result items into different classes based on the expansion. Some experiments and their results are described in the latter part of this paper.
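A small sketch of the Query Expansion step using WordNet via NLTK. The paper uses Lucene and its own module layout; the grouping by sense and the final OR-combined query string here are simplified assumptions:

```python
# Query expansion sketch: collect WordNet synonyms of the query word,
# grouped by sense, then hand an OR-combined query to a keyword engine.
# Requires: nltk.download("wordnet")
from nltk.corpus import wordnet as wn

def expand(query_word):
    """Return {sense_name: [candidate words]} for the query word."""
    senses = {}
    for synset in wn.synsets(query_word):
        lemmas = [l.replace("_", " ") for l in synset.lemma_names()
                  if l.lower() != query_word.lower()]
        if lemmas:
            senses[synset.name()] = lemmas
    return senses

query = "car"
senses = expand(query)
for sense, words in senses.items():
    print(sense, "->", words)

# Simplified combined query for a keyword-based engine such as Lucene;
# the real system feeds the query word plus candidates into its search module.
candidates = {w for words in senses.values() for w in words}
combined_query = " OR ".join([query] + sorted(candidates))
print(combined_query)
```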


2009 ◽  
pp. 931-939
Author(s):  
László Kovács ◽  
Domonkos Tikk

Current databases are able to store several terabytes of free-text documents. From the user's viewpoint, the main purpose of a database is efficient information retrieval. In the case of textual data, information retrieval mostly concerns the selection and ranking of documents. We present here Oracle's particular solution: to make full-text querying more efficient, a special engine was developed that prepares full-text queries and provides a set of language-specific and semantic query operators.
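As an illustration of what such operators look like on the application side, here is a sketch of an Oracle Text CONTAINS query combining a stem operator ($) with a proximity operator (NEAR) and ranking by SCORE. The table and column names, connection details, and the python-oracledb client are assumptions; the exact operator set discussed in the chapter may differ:

```python
# Sketch of an Oracle Text query using language/semantic operators.
# Table "docs", column "body", and connection details are assumptions.
import oracledb

conn = oracledb.connect(user="app", password="secret", dsn="localhost/XEPDB1")
cur = conn.cursor()

# $search expands over word stems; NEAR constrains proximity;
# SCORE(1) returns the relevance computed by the CONTAINS labeled 1.
cur.execute(
    """
    SELECT SCORE(1) AS score, title
      FROM docs
     WHERE CONTAINS(body, '$search AND NEAR((full, text), 5)', 1) > 0
     ORDER BY SCORE(1) DESC
    """
)
for score, title in cur.fetchall():
    print(score, title)
```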


Author(s):  
A.B. Veretennikov

The problem of proximity full-text search is considered. If a search query contains frequently occurring words, then multi-component key indexes improve search speed in comparison with ordinary inverted indexes. It has been shown that search speed can be increased by up to 130 times when queries consist of frequently occurring words. In this paper, we investigate how the multi-component key index architecture affects the quality of the search. We consider several well-known relevance-ranking methods proposed by different authors. Using these methods, we perform the search first in an ordinary inverted index and then in an index enhanced with multi-component key indexes. The results show that with multi-component key indexes we obtain search results that are very close, in terms of relevance ranking, to those obtained by means of ordinary inverted indexes.
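The paper evaluates the author's own index architecture; the following is only a toy illustration of the underlying idea, a supplementary index keyed by pairs of frequently occurring words that stores their positions, so proximity queries over common words avoid merging long single-word posting lists. The data structures, lexicon, and distance threshold are simplified assumptions:

```python
# Toy illustration of a two-component key index for proximity search:
# pairs of frequently occurring words are indexed together with positions,
# so queries made of common words need not merge long single-word postings.
from collections import defaultdict

FREQUENT = {"to", "be", "or", "not"}           # assumed high-frequency lexicon
MAX_DISTANCE = 5                                # index pairs within this distance

docs = {1: "to be or not to be that is the question".split()}

pair_index = defaultdict(list)                  # (w1, w2) -> [(doc, pos1, pos2)]
for doc_id, words in docs.items():
    for i, w1 in enumerate(words):
        if w1 not in FREQUENT:
            continue
        for j in range(i + 1, min(i + 1 + MAX_DISTANCE, len(words))):
            if words[j] in FREQUENT:
                pair_index[(w1, words[j])].append((doc_id, i, j))

def proximity_search(w1, w2, max_distance):
    """Answer 'w1 near w2' directly from the pair index."""
    return [(d, i, j) for d, i, j in pair_index[(w1, w2)] if j - i <= max_distance]

print(proximity_search("to", "be", 2))   # [(1, 0, 1), (1, 4, 5)]
```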


2021 ◽  
Vol 50 (2) ◽  
pp. 375-389
Author(s):  
Waheed Iqbal ◽  
Waqas Ilyas Malik ◽  
Faisal Bukhari ◽  
Khaled Mohamad Almustafa ◽  
Zubiar Nawaz

An efficient full-text search is achieved by indexing the raw data with an additional 20 to 30 percent storage cost. In the context of Big Data, this additional storage space is huge and introduces challenges to entertain full-text search queries with good performance. It also incurs overhead to store, manage, and update the large-size index. In this paper, we propose and evaluate a method to minimize the index size to offer full-text search over Big Data using an automatic extractive-based text summarization method. To evaluate the effectiveness of the proposed approach, we used two real-world datasets. We indexed actual and summarized datasets using Apache Lucene and studied average simple overlapping, Spearman's rho correlation, and average ranking score measures of search results obtained using different search queries. Our experimental evaluation shows that automatic text summarization is an effective method to reduce the index size significantly. We obtained a maximum of 82% reduction in index size with 42% higher relevance of the search results using the proposed solution to minimize the full-text index size.
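A minimal sketch of the idea: apply an extractive summarizer before indexing so that only the summary is fed to the indexer. The simple frequency-based scorer and the 30% retention ratio below stand in for whatever summarization method the paper actually used, and the Lucene indexing step is assumed to happen downstream:

```python
# Sketch: shrink what gets indexed by keeping only the highest-scoring
# sentences of each document (generic frequency-based extractive summary;
# not the paper's exact summarizer). The summary is what would be indexed.
import re
from collections import Counter

def summarize(text, keep_ratio=0.3):
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(words)

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    keep = max(1, int(len(sentences) * keep_ratio))
    top = sorted(sorted(sentences, key=score, reverse=True)[:keep],
                 key=sentences.index)           # restore original order
    return " ".join(top)

document = ("Full-text indexes add twenty to thirty percent storage overhead. "
            "For very large collections this overhead becomes expensive. "
            "Summarizing documents before indexing keeps the index small. "
            "The weather today is unrelated to indexing.")

summary = summarize(document)
print(summary)   # this shorter text is what would be fed to the Lucene indexer
```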

