A Detailed Study of Distributed Indexed Search Techniques using SOLR

For a web application backed by a relational database (RDBMS), searching a table with millions of rows or running a query that joins multiple tables can carry a heavy performance cost. Such backend operations generally make the website extremely slow. Document-based inverted indexing can be a useful solution in these cases. SOLR is a standalone enterprise search server with a REST-like API. Its major features include powerful full-text search, hit highlighting, faceted search, near real-time indexing, dynamic clustering, database integration, NoSQL features, rich document (e.g., Word, PDF and more) parsing, geospatial search, and built-in security. Databases and SOLR have complementary strengths and weaknesses. SQL supports only very simple wildcard-based text search with some simple normalization, such as matching upper case to lower case. The problem is that these searches are full table scans. In SOLR, all searchable words are stored in an inverted index, which can be searched orders of magnitude faster. However, designing this framework is quite challenging. This paper discusses highly reliable, scalable and fault-tolerant techniques that can help in setting up distributed indexing, replication and load-balanced querying with a centralized configuration.
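
The contrast drawn in this abstract between a full scan and an inverted index can be illustrated with a minimal Python sketch. It is not SOLR's implementation: the function names (`tokenize`, `build_inverted_index`, `search`, `scan`) and the sample documents are assumptions made for illustration, and the analyzer is deliberately simplistic.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms (a toy analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """AND query: intersect the posting sets of all query terms."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


def scan(docs, query):
    """Baseline that inspects every document, analogous to a full table scan."""
    terms = tokenize(query)
    return {
        doc_id
        for doc_id, text in docs.items()
        if all(term in tokenize(text) for term in terms)
    }


# Hypothetical sample data, used only to exercise the functions above.
docs = {
    1: "Solr is a search server",
    2: "Full table scans are slow",
    3: "Solr supports faceted search",
}
index = build_inverted_index(docs)
print(search(index, "solr search"))
print(scan(docs, "solr search"))
```

Both functions return the same document ids; the difference is that `search` touches only the posting sets of the query terms, whereas `scan` re-reads every document on every query.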

Improving full text search performance through textual analysis

Information Processing & Management ◽ 10.1016/0306-4573(93)90083-p ◽ 1993 ◽ Vol 29 (5) ◽ pp. 615-632 ◽ Cited By ~ 2

Author(s): Mavis Molto

Keyword(s): Full Text ◽ Textual Analysis ◽ Search Performance ◽ Text Search ◽ Full Text Search

GPU Computation for Online Realtime Multi-Pattern Matching

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.284-287.3428 ◽ 2013 ◽ Vol 284-287 ◽ pp. 3428-3432 ◽ Cited By ~ 2

Author(s): Yu Hsiu Huang ◽ Richard Chun Hung Lin ◽ Ying Chih Lin ◽ Cheng Yi Lin

Keyword(s): Parallel Computation ◽ Pattern Matching ◽ Full Text ◽ Network Performance ◽ Text Search ◽ Full Text Search ◽ Network Intrusion ◽ Speed Up ◽ Set Up ◽ GPU Implementation

Most applications of traditional full-text search, e.g., webpage search, work offline: a text search engine previews the texts and builds the related index in advance. However, applications of online real-time full-text search, e.g., network intrusion detection and prevention systems (IDPS), are hard to implement on commodity hardware. Such systems are expensive and inflexible as more and more new virus patterns appear; the text cannot be previewed, and the search must be completed online in real time. Additionally, an IDPS needs multi-pattern matching so that malicious packets can be removed immediately from normal ones without degrading network performance. To address the problem of real-time multi-pattern matching, we implement two sequential algorithms, Wu-Manber and Aho-Corasick, on a GPU parallel computation platform. Both pattern matching algorithms are well suited to cases with a large number of patterns. In addition, they are easy to extend on a GPU parallel computation platform to satisfy the real-time requirement. Our experimental results show that the throughput of the GPU implementation is about five to seven times that of the CPU. Therefore, pattern matching on a GPU offers an attractive solution for IDPS to speed up the detection of malicious packets among normal traffic, given its lower cost, easy expansion and better performance.
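
As an illustration of the multi-pattern matching this abstract refers to, the following is a minimal sequential (CPU-only) sketch of the Aho-Corasick algorithm in Python. It is not the authors' GPU implementation; the function names `build_automaton` and `find_all` and the sample patterns are assumptions made for illustration.

```python
from collections import deque


def build_automaton(patterns):
    """Build the Aho-Corasick trie, failure links and output lists."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # fail[state] -> state for the longest proper suffix in the trie
    out = [[]]    # out[state] -> patterns that end at this state

    # Phase 1: insert every pattern into the trie.
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pattern)

    # Phase 2: breadth-first traversal to compute failure links.
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit matches that end at the failure target.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_all(text, automaton):
    """Return (start_index, pattern) for every occurrence, in one pass over text."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in out[state]:
            matches.append((i - len(pattern) + 1, pattern))
    return matches


# Hypothetical patterns and input, used only to exercise the functions above.
automaton = build_automaton(["he", "she", "his", "hers"])
print(find_all("ushers", automaton))
```

The text is scanned once regardless of how many patterns are loaded, which is why this family of algorithms suits workloads with a large pattern set.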

Score-consistent algebraic optimization of full-text search queries with GRAFT

Proceedings of the 2011 international conference on Management of data - SIGMOD '11 ◽ 10.1145/1989323.1989404 ◽ 2011 ◽ Cited By ~ 1

Author(s): Nathan Bales ◽ Alin Deutsch ◽ Vasilis Vassalos

Keyword(s): Full Text ◽ Text Search ◽ Full Text Search ◽ Search Queries ◽ Algebraic Optimization

Application of Full Text Search Engine Based on Lucene

Advances in Internet of Things ◽ 10.4236/ait.2012.24013 ◽ 2012 ◽ Vol 02 (04) ◽ pp. 106-109 ◽ Cited By ~ 6

Author(s): Rujia Gao ◽ Danying Li ◽ Wanlong Li ◽ Yaze Dong

Keyword(s): Search Engine ◽ Full Text ◽ Text Search ◽ Full Text Search

Development of the Multilingual Collaboration System for Farmers of Several Countries (2) : Multilingual Full Text Search System

Journal of the Faculty of Agriculture, Kyushu University ◽ 10.5109/4605 ◽ 2004 ◽ Vol 49 (2) ◽ pp. 441-448

Author(s): Kang Oh Lee ◽ Kei Nakaji ◽ Yoichi Nada

Keyword(s): Full Text ◽ Text Search ◽ Search System ◽ Full Text Search

Experimental simulation on incremental three-gram index for two-gram full-text search systems

SMC'03 Conference Proceedings. 2003 IEEE International Conference on Systems, Man and Cybernetics. Conference Theme - System Security and Assurance (Cat. No.03CH37483) ◽ 10.1109/icsmc.2003.1245750 ◽ 2004 ◽ Cited By ~ 1

Author(s): H. Yamamoto ◽ S. Ohmi ◽ H. Tsuji

Keyword(s): Full Text ◽ Experimental Simulation ◽ Text Search ◽ Full Text Search ◽ Search Systems

“Dynamic” Syntax Model in Automated Language Analysis Systems for Increasing Full-Text Search Systems Efficiency

Emerging Intelligent Technologies in Industry - Studies in Computational Intelligence ◽ 10.1007/978-3-642-22732-5_14 ◽ 2011 ◽ pp. 157-166

Author(s): Marcin Karwinski

Keyword(s): Full Text ◽ Text Search ◽ Full Text Search ◽ Language Analysis ◽ Dynamic Syntax ◽ Search Systems

Recommender Systems in Digital Libraries Using Artificial Intelligence and Machine Learning

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Handbook of Research on Methodologies and Applications of Supercomputing ◽ 10.4018/978-1-7998-7156-9.ch012 ◽ 2021 ◽ pp. 162-178

Author(s): Namik Delilovic

Keyword(s): Artificial Intelligence ◽ Machine Learning ◽ Digital Libraries ◽ Full Text ◽ Text Search ◽ Full Text Search ◽ Artificial Intelligence Techniques ◽ Search Results ◽ Advanced Search ◽ Search Field

Searching for content in present-day digital libraries is still very primitive; most websites provide a search field where users can enter information such as a book title, an author name, or terms they expect to find in the book. Some platforms provide advanced search options, which allow users to narrow the search results by specific parameters such as year, author name, publisher, and the like. Currently, when users find a book that might be of interest to them, the search process ends; only a full-text search or the references at the end of the book may provide some additional pointers. In this chapter, the author gives an example of how a user could continuously receive recommendations for additional content, even while reading the article, using present-day machine learning and artificial intelligence techniques.

ASH: A New Tool for Automated and Full-Text Search in Systematic Literature Reviews

Computational Science – ICCS 2021 - Lecture Notes in Computer Science ◽ 10.1007/978-3-030-77967-2_30 ◽ 2021 ◽ pp. 362-369

Author(s): Marek Sośnicki ◽ Lech Madeyski

Keyword(s): Full Text ◽ Text Search ◽ Full Text Search ◽ Literature Reviews

SQLite as a Replacement for Lucene.Net in Online Store Product Search

Jurnal CoSciTech (Computer Science and Information Technology) ◽ 10.37859/coscitech.v1i2.2204 ◽ 2020 ◽ Vol 1 (2) ◽ pp. 36-43

Author(s): Imam Farisi ◽ Mukhaimy Gazali ◽ Rudy Anshari

Keyword(s): Full Text ◽ Search Process ◽ Text Search ◽ Search System ◽ Full Text Search ◽ Database Table ◽ Search Speed ◽ Product Search ◽ Online Stores ◽ Two Alternatives

A business can promote its products through an online shop, which lets consumers find out which products the store sells. Consumers looking for a product can search on various attributes of the goods, which are scattered across several fields of a database table and may even be spread across different tables. All attributes that point to an item are collected into one document, which is then searched using a full-text search system. Two alternative full-text search systems were selected: Lucene.Net and SQLite Full Text Search. Before being put into use, both alternatives were tested. In document storage size, Lucene.Net is superior by 6.78 times. In the speed of writing search documents, SQLite is superior by between 1,875 times and 5,197 times. In key search speed, Lucene.Net was superior by between 1,169 and 1,698 times. Considering these speeds and the fact that Lucene.Net Core is still in the beta stage of development, SQLite Full Text Search is suitable for the product search process in the online store.
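
The idea of collecting an item's attributes into one searchable document can be sketched with SQLite's FTS5 extension through Python's standard `sqlite3` module. This is a minimal illustration rather than the authors' setup: the table name `product_docs`, its columns, and the sample products are hypothetical, and the sketch assumes the SQLite library bundled with Python was compiled with FTS5 enabled.

```python
import sqlite3


def build_product_index(products):
    """Create an in-memory FTS5 table holding one search document per product."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: every column of an FTS5 table is full-text indexed.
    conn.execute("CREATE VIRTUAL TABLE product_docs USING fts5(name, description)")
    conn.executemany(
        "INSERT INTO product_docs(rowid, name, description) VALUES (?, ?, ?)",
        products,
    )
    conn.commit()
    return conn


def search_products(conn, query):
    """Return (rowid, name) of products matching all query terms, best match first."""
    rows = conn.execute(
        "SELECT rowid, name FROM product_docs WHERE product_docs MATCH ? ORDER BY rank",
        (query,),
    )
    return [(rowid, name) for rowid, name in rows]


# Hypothetical sample data, used only to exercise the functions above.
products = [
    (1, "Red cotton shirt", "Short sleeve shirt in bright red"),
    (2, "Blue denim jeans", "Slim fit jeans"),
    (3, "Running shoes", "Lightweight red shoes"),
]
conn = build_product_index(products)
print(search_products(conn, "red shirt"))
```

A multi-term query such as `red shirt` is treated by FTS5 as an implicit AND across all indexed columns, so a product matches when its combined attributes contain every term.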
