Locality-Sensitive Hashing for Information Retrieval System on Multiple GPGPU Devices

Building a real-time information retrieval system is challenging, especially for high-dimensional big data. To structure big data, many hashing algorithms have been proposed that map similar data items to the same bucket to speed up search. Locality-Sensitive Hashing (LSH) is a common approach for reducing the number of dimensions of a data set, using a family of hash functions and a hash table. The LSH hash table is an additional component that indexes hash values (keys) for the corresponding data items. We previously proposed the Dynamic Locality-Sensitive Hashing (DLSH) algorithm with a dynamically structured hash table, optimized for storage in main memory and in General-Purpose computation on Graphics Processing Units (GPGPU) memory. This supports the handling of constantly updated data sets, such as song, image, or text databases. The DLSH algorithm works effectively with data sets that are updated frequently and is compatible with parallel processing. However, a single GPGPU device is inadequate for processing big data because of the small memory capacity of GPGPU devices. When using multiple GPGPU devices for searching, an effective search algorithm is needed to balance the jobs. In this paper, we propose an extension of DLSH for big data sets using multiple GPGPUs, in order to increase the capacity and performance of the information retrieval system. We also propose different search strategies on multiple DLSH clusters to suit our parallelized system. With significant results in terms of performance and accuracy, we show that DLSH can be applied to real-life dynamic database systems.
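To make the roles of the hash family and the hash table concrete, here is a minimal sketch of a generic random-hyperplane LSH index. It is not the DLSH algorithm from the abstract, whose dynamic table layout and multi-GPGPU search strategies are not described here. The class name `HyperplaneLSH` and the parameters `num_bits` and `seed` are hypothetical names chosen for illustration.

```python
import random
from collections import defaultdict


class HyperplaneLSH:
    """Toy random-hyperplane LSH index (illustrative only, not DLSH)."""

    def __init__(self, dim, num_bits=8, seed=0):
        rng = random.Random(seed)
        # Hash family: each function is the sign of a dot product
        # with one random Gaussian vector (a hyperplane normal).
        self.planes = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)]
            for _ in range(num_bits)
        ]
        # Hash table: key (packed sign bits) -> list of (item_id, vector).
        self.table = defaultdict(list)

    def key(self, vec):
        """Pack the sign bits of all hyperplane projections into one int."""
        bits = 0
        for plane in self.planes:
            dot = sum(p * v for p, v in zip(plane, vec))
            bits = (bits << 1) | (1 if dot >= 0 else 0)
        return bits

    def insert(self, item_id, vec):
        """Store an item in the bucket selected by its hash key."""
        self.table[self.key(vec)].append((item_id, vec))

    def query(self, vec):
        """Return ids of items that share the query's bucket."""
        bucket = self.table.get(self.key(vec), [])
        return [item_id for item_id, _ in bucket]
```

Vectors that point in similar directions tend to fall on the same side of most hyperplanes and therefore share a key, so a query only needs to inspect one bucket instead of the whole data set. Practical systems typically use several such tables to trade memory for recall.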

Information retrieval system based semantique and big data

Procedia Computer Science, 2019, Vol 151, pp. 1108-1113. DOI: 10.1016/j.procs.2019.04.157. Cited by: ~3

Author(s): Youssef CHOUNI, Mohamed ERRITALI, Youness MADANI, Hanane EZZIKOURI

Keyword(s): Big Data, Information Retrieval, Retrieval System, Information Retrieval System

Big data Curation: Enhanced Information Retrieval System

2017. DOI: 10.22161/ijaers/nctet.2017.21

Author(s): K. Naresh, A. BasiReddy, S. Swarnalatha

Keyword(s): Big Data, Information Retrieval, Retrieval System, Information Retrieval System, Data Curation

Data Set Generation for the Attributes of the Words of the Holy Quran: Information Retrieval System for E-Learning

2013 Taibah University International Conference on Advances in Information Technology for the Holy Quran and Its Sciences, 2013. DOI: 10.1109/nooric.2013.53. Cited by: ~1

Author(s): Haq Nawaz, Yasir Saleem

Keyword(s): Information Retrieval, Retrieval System, Information Retrieval System, Data Set, Holy Quran, E Learning, The Holy Quran

The Research and Implementation of File Information Retrieval System Based on Big Data Semantic

Proceedings of the Advances in Materials, Machinery, Electrical Engineering (AMMEE 2017), 2017. DOI: 10.2991/ammee-17.2017.103

Author(s): Zebo Zhu, Baochuan Lin

Keyword(s): Big Data, Information Retrieval, Retrieval System, Information Retrieval System

Cloud assisted big data information retrieval system for critical data supervision in disaster regions

Computer Communications, 2020, Vol 151, pp. 548-555. DOI: 10.1016/j.comcom.2019.11.028

Author(s): Chunmei Wang, Fang Qin, Dinesh Jackson Samuel R.

Keyword(s): Big Data, Information Retrieval, Retrieval System, Information Retrieval System, Critical Data

Essential Issues to Consider for a Manufacturing Data Query System Based on Graph

Lecture Notes in Mechanical Engineering - Advances on Mechanics, Design Engineering and Manufacturing III, 2021, pp. 347-353. DOI: 10.1007/978-3-030-70566-4_55

Author(s): Lise Kim, Esma Yahia, Frédéric Segonds, Philippe Veron, Victor Fau

Keyword(s): Information Retrieval, Manufacturing Industry, Retrieval System, Information Retrieval System, Use Case, Graph Database, Industry Data, Data Set, Industrial Use, Query System

Abstract: Manufacturing industry data are distributed, heterogeneous and numerous, which creates several challenges, including fast, exhaustive and relevant querying of the data. To address this challenge, the authors consider an information retrieval system based on a graph database. In this paper, the authors focus on determining the essential functions to consider in this context. The authors define a three-step methodology using root cause analysis and resolution. This methodology is then applied to a data set and queries representative of an industrial use case. As a result, the authors list four major issues to consider and discuss their potential resolutions.
