Research the Key Technologies of the Mongolian Full-Text Retrieval Based on Lucene

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.347-350.2185 ◽

2013 ◽

Vol 347-350 ◽

pp. 2185-2190

Author(s):

Guo Qiang Ding ◽

Min Lin

Keyword(s):

Search Engine ◽

Full Text ◽

Text Retrieval ◽

Text Search ◽

Full Text Search ◽

Search Service ◽

Key Technologies ◽

Full Text Retrieval ◽

Key Issues

Building on an in-depth understanding of Lucene full-text retrieval technology, this paper applies it to Mongolian text search. First, it identifies several key issues that must be addressed to implement Mongolian text search and gives corresponding solutions for achieving Mongolian full-text retrieval in Lucene. Second, it provides a fast, accurate, and comprehensive full-text search service for Mongolian information, which plays a key role in promoting the development of Mongolian search engines.

Design and Implementation of Full-Text Retrieval System for People’s Daily Annotated Corpus

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.135-136.369 ◽

2011 ◽

Vol 135-136 ◽

pp. 369-374

Author(s):

Yang Sen Zhang ◽

Gai Juan Huang

Keyword(s):

Full Text ◽

Retrieval System ◽

Search Algorithm ◽

Text Retrieval ◽

Index Structure ◽

Text Search ◽

Full Text Search ◽

People’s Daily ◽

Full Text Retrieval ◽

People's Daily

In this paper, we design and implement an efficient full-text retrieval system for the basic annotated People's Daily Corpus, based on inverted index technology. Guided by the characteristics of the corpus data, we thoroughly analyze the methods and strategies for implementing the system. After comparing various schemes, we propose a three-level index structure of Chinese character, word, and address set, and give the design of the index dictionary structure at each level. After converting the unstructured People’s Daily corpus into structured index data, we implement the full-text search algorithm corresponding to the proposed index structure. Experimental results show that the proposed search algorithm meets the target of "ten million Chinese characters, response in a second" and improves the speed of full-text search over the People's Daily Corpus.
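
The three-level structure described above (character, then word, then address set) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `build_index` and `search`, the use of the first character as the top-level key, and the `(sentence_id, word_offset)` address format are all assumptions made for illustration, and the corpus is assumed to be already segmented into non-empty words.

```python
from collections import defaultdict

def build_index(sentences):
    """Build a toy three-level index from pre-segmented sentences.

    Assumed layout (hypothetical, for illustration only):
      level 1: first character -> set of words starting with it
      level 2: word -> address set
      level 3: address set = list of (sentence_id, word_offset)
    """
    char_level = defaultdict(set)
    word_level = defaultdict(list)
    for sid, words in enumerate(sentences):
        for offset, word in enumerate(words):
            char_level[word[0]].add(word)
            word_level[word].append((sid, offset))
    return char_level, word_level

def search(char_level, word_level, query):
    """Return the address set of `query`, or [] if it is not indexed."""
    # Level 1: the first character narrows the candidate words.
    if not query or query not in char_level.get(query[0], ()):
        return []
    # Levels 2 and 3: the word entry holds its address set.
    return word_level[query]

corpus = [["人民", "日报", "发表", "社论"], ["人民", "生活", "水平", "提高"]]
chars, words = build_index(corpus)
print(search(chars, words, "人民"))  # [(0, 0), (1, 0)]
```

Because lookups go through dictionaries rather than scanning the text, query time in this sketch depends on the size of the matching address set and not on the size of the corpus.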

Full-Text Search Engine Technology Research Based on Lucene

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.952.355 ◽

2014 ◽

Vol 952 ◽

pp. 355-358 ◽

Cited By ~ 1

Author(s):

Chun Feng Xu ◽

Hong Hua Xu

Keyword(s):

Search Engine ◽

Full Text ◽

Chinese Word Segmentation ◽

Text Search ◽

Full Text Search ◽

Engine Model ◽

Main Emphasis ◽

Full Text Retrieval ◽

Set Up ◽

Important Branch

As an important branch of modern information retrieval technology, full-text search is not only an important tool for dealing with unstructured data but also one of the mainstream technologies of search engines. This paper begins with an in-depth study of the working principles and processes of the search engine model and, building on that background, discusses Lucene's architecture. The main emphasis is placed on some basic algorithms for Chinese word segmentation and relevance ranking. Finally, we build a Lucene-based full-text retrieval system by applying these technologies.
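
The abstract does not say which segmentation algorithms the paper covers. As one example of a basic dictionary-based approach, the sketch below shows forward maximum matching; the function name `forward_max_match`, the `max_len` parameter, and the tiny dictionary are assumptions for illustration and are not taken from the paper or from Lucene.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Segment `text` greedily, preferring the longest dictionary word.

    A single character is always accepted as a fallback, so the loop
    makes progress even for text that is not in the dictionary.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until there is a hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens

dictionary = {"全文", "检索", "全文检索", "搜索", "引擎", "搜索引擎"}
print(forward_max_match("全文检索搜索引擎", dictionary))
```

The choice of `max_len` trades recall of long compound terms against finer-grained tokens: with `max_len=2` the same input is split into four two-character words instead of two four-character words.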

Application of Full Text Search Engine Based on Lucene

Advances in Internet of Things ◽

10.4236/ait.2012.24013 ◽

2012 ◽

Vol 02 (04) ◽

pp. 106-109 ◽

Cited By ~ 6

Author(s):

Rujia Gao ◽

Danying Li ◽

Wanlong Li ◽

Yaze Dong

Keyword(s):

Search Engine ◽

Full Text ◽

Text Search ◽

Full Text Search

Markup as Index Interface

Proceedings of Balisage: The Markup Conference 2015 ◽

10.4242/balisagevol15.holstege01 ◽

2015 ◽

Author(s):

Mary Holstege

Keyword(s):

Information Content ◽

Search Engine ◽

Full Text ◽

List Type ◽

Text Search ◽

Full Text Search ◽

External Knowledge ◽

Additional Information ◽

Inverted Indexes ◽

Different Content

To a search engine, indexes are specified by the content: the words, phrases, and characters that are actually present tell the search engine what inverted indexes to create. Other external knowledge can be applied to add to this inventory of indexes. For example, knowledge of the document language can lead to indexes for word stems or decompounding. These can unify different content into the same index or split the same content into multiple indexes. That is, different words manifest in the content can be unified under a single search key, and the same word can have multiple manifestations under different search keys. Turning this around, the indexes represent the retrievable information content in the document. Full text search is not an either/or yes/no system, but one of relative fit (scoring). Precision balances against recall, mediated by scoring. The search engine perspective offers a different way to think about markup: As a specification of the retrievable information content of the document. As something that can, with additional information, unify different markup or provide multiple distinct views of the same markup. As something that can be present to greater or lesser degrees, with a goodness of match (scoring). As a specification that can be adjusted to balance precision and recall. What does this search engine perspective on markup mean, concretely? Can we use it to reframe some persistent conundrums, such as vocabulary resolution and overlap? Let's see.
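
The idea that different words can share one search key, while one word can appear under several keys, can be sketched as follows. The suffix rules are a toy stand-in for a real stemmer, and the key layout `("exact", word)` / `("stem", stem)` and the score weights 2 and 1 are assumptions chosen only to show relative fit rather than a yes/no match.

```python
from collections import defaultdict

SUFFIXES = ("ing", "es", "s")  # toy rules, not a real stemming algorithm

def stem(word):
    """Strip one known suffix if at least three characters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def index_keys(word):
    """One word is indexed under two keys: its exact form and its stem."""
    w = word.lower()
    return {("exact", w), ("stem", stem(w))}

def build(docs):
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            for key in index_keys(word):
                index[key].add(doc_id)
    return index

def search(index, term):
    """Score documents: exact hits weigh more than stem-only hits."""
    t = term.lower()
    scores = defaultdict(int)
    for doc_id in index.get(("exact", t), ()):
        scores[doc_id] += 2  # assumed weight
    for doc_id in index.get(("stem", stem(t)), ()):
        scores[doc_id] += 1  # assumed weight
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

docs = {1: "indexing markup", 2: "indexes of markup", 3: "index"}
idx = build(docs)
print(search(idx, "indexes"))  # [(2, 3), (1, 1), (3, 1)]
```

Searching only the stem keys raises recall (all three documents match), while the extra weight on exact keys restores some precision by ranking the literal match first.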

A Multi-tenant Fair Share Approach to Full-text Search Engine

2016 Seventh International Workshop on Data-Intensive Computing in the Clouds (DataCloud) ◽

10.1109/datacloud.2016.010 ◽

2016 ◽

Author(s):

Zong Peng ◽

Beth Plale

Keyword(s):

Search Engine ◽

Full Text ◽

Fair Share ◽

Text Search ◽

Full Text Search

Full-Text Search Engine using MySQL

International Journal of Computers Communications & Control ◽

10.15837/ijccc.2010.5.2233 ◽

2010 ◽

Vol 5 (5) ◽

pp. 735

Author(s):

Cornelia Gyorodi ◽

Robert Gyorodi ◽

George Pecherle ◽

George Mihai Cornea

Keyword(s):

Search Engine ◽

Full Text ◽

Bag Of Words ◽

Text Search ◽

Full Text Search ◽

Medium Scale ◽

The Web ◽

Spelling Mistake

In this article we explain how to create a search engine using MySQL's full-text search. The ever-increasing demands of the web require cheap yet elaborate search options. One of the most important requirements for a search engine is the ability to order its result set by relevance and to offer the user suggestions in the case of a spelling mistake or a small result set. To meet this requirement, we use MySQL's full-text search, an option suitable for small to medium-scale websites. To provide sound-alike capabilities, a second table is created that contains a bag of words from the main table together with the corresponding metaphone of each word. When a suggestion is needed, this table is queried for the metaphone of the searched word, and a suggestion is computed from the result set.
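
The suggestion mechanism can be sketched in a few lines. This is not the article's MySQL schema: an in-memory dictionary stands in for the second table, and `phonetic_key` is a deliberately simplified sound-alike code (first letter kept, later vowels dropped, repeated letters collapsed), not the real Metaphone algorithm. All names here are hypothetical.

```python
from collections import defaultdict

def phonetic_key(word):
    """Simplified sound-alike key; a stand-in for Metaphone."""
    w = "".join(ch for ch in word.upper() if ch.isalpha())
    if not w:
        return ""
    key = [w[0]]
    for ch in w[1:]:
        if ch in "AEIOUY":
            continue  # drop vowels after the first letter
        if ch != key[-1]:
            key.append(ch)  # collapse immediate repeats
    return "".join(key)

def build_word_table(rows):
    """Bag of words from the main table, grouped by phonetic key."""
    table = defaultdict(set)
    for text in rows:
        for word in text.lower().split():
            table[phonetic_key(word)].add(word)
    return table

def suggest(table, word):
    """Look up words that share the searched word's phonetic key."""
    candidates = table.get(phonetic_key(word), set())
    return sorted(candidates - {word.lower()})

rows = ["full text search engine", "mysql search options"]
table = build_word_table(rows)
print(suggest(table, "serch"))  # ['search']
```

Storing the key next to each word means a suggestion needs only one equality lookup on the key, which is the property that makes the second-table design cheap for small to medium-scale sites.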

Quantification of competitive value of documents

Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis ◽

10.11118/actaun200957050285 ◽

2009 ◽

Vol 57 (5) ◽

pp. 285-290

Author(s):

Pavel Šimek ◽

Jiří Vaněk ◽

Jan Jarolímek

Keyword(s):

Search Engine ◽

Market Share ◽

Full Text ◽

Search Engines ◽

Web Site ◽

Optimization Techniques ◽

Text Search ◽

Full Text Search ◽

Google Search ◽

The Web

Most Internet users search the global network for information using full-text search engines such as Google, Yahoo!, or Seznam. Operators of web presentations use various optimization techniques to reach the top positions in full-text search results. This is where Search Engine Optimization and Search Engine Marketing become important, because ordinary users usually follow links only on the first few result pages for given keywords, and in catalogs they mainly use the links placed higher in the hierarchy of each category. The key to success is applying optimization methods that address keywords, the structure and quality of content, domain names, individual sites, and the quantity and reliability of backlinks. The process is demanding and lengthy, and its outcome is not guaranteed. Without advanced analytical tools, a website operator cannot identify the contribution of the individual documents that make up the website. If operators want an overview of their documents and of the website as a whole, it is appropriate to quantify these positions in a specific way, depending on specific keywords. The quantification of the competitive value of documents serves this purpose, and it in turn determines the global competitive value of a website. Quantification of competitive values is performed on a specific full-text search engine; the results can be, and often are, different for each engine. According to published reports from the ClickZ agency or Market Share, Google is the search engine most widely used by English-speaking users by number of searches, with a market share of more than 80%. The overall quantification procedure is the same for every engine; however, the initial step, keyword analysis, depends on the choice of full-text search engine.
