Massive picture retrieval system based on big data image mining

2021 ◽  
Vol 121 ◽  
pp. 54-58
Author(s):  
Kun Zhang ◽  
Kai Chen ◽  
Binghui Fan

2019 ◽  
Vol 151 ◽  
pp. 1108-1113 ◽  
Author(s):  
Youssef CHOUNI ◽  
Mohamed ERRITALI ◽  
Youness MADANI ◽  
Hanane EZZIKOURI

2020 ◽  
Author(s):  
Saliha Mezzoudj

Recently, the increasing use of mobile devices, such as cameras and smartphones, has resulted in a dramatic increase in the number of images collected every day. Therefore, retrieving and managing these large volumes of images has become a major challenge in the field of computer vision. One solution for efficiently managing image databases is a Content-Based Image Retrieval (CBIR) system. In this chapter, we introduce some fundamental theories of content-based image retrieval for large-scale databases using parallel frameworks. Sections 2 and 3 present the basic methods of content-based image retrieval. Then, as the emphasis of this chapter, we introduce in Section 1.2 a content-based image retrieval system for large-scale image databases. After that, in Sections 5, 6, 7, and 8, we briefly address Big Data and Big Data processing platforms for large-scale image retrieval. Finally, we draw a conclusion in Section 9.


2020 ◽  
Vol 10 (7) ◽  
pp. 2539 ◽  
Author(s):  
Toan Nguyen Mau ◽  
Yasushi Inoguchi

It is challenging to build a real-time information retrieval system, especially for systems with high-dimensional big data. To structure big data, many hashing algorithms have been proposed that map similar data items to the same bucket to speed up the search. Locality-Sensitive Hashing (LSH) is a common approach for reducing the number of dimensions of a data set by using a family of hash functions and a hash table. The LSH hash table is an additional component that supports the indexing of hash values (keys) for the corresponding data items. We previously proposed the Dynamic Locality-Sensitive Hashing (DLSH) algorithm with a dynamically structured hash table, optimized for storage in main memory and in General-Purpose computation on Graphics Processing Units (GPGPU) memory. This supports the handling of constantly updated data sets, such as song, image, or text databases. The DLSH algorithm works effectively with data sets that are updated with high frequency and is compatible with parallel processing. However, a single GPGPU device is inadequate for processing big data because of the small memory capacity of GPGPU devices. When using multiple GPGPU devices for searching, we need an effective search algorithm to balance the jobs. In this paper, we propose an extension of DLSH for big data sets using multiple GPGPUs, in order to increase the capacity and performance of the information retrieval system. Different search strategies on multiple DLSH clusters are also proposed to suit our parallelized system. With significant results in terms of performance and accuracy, we show that DLSH can be applied to real-life dynamic database systems.
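The idea of hashing similar items into the same bucket can be illustrated with a minimal sketch. It is a generic random-hyperplane LSH table in plain Python, not the DLSH algorithm or its GPGPU implementation described in the abstract; the names `make_hyperplanes`, `hash_vector`, and `LSHTable` are hypothetical and chosen only for this example.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_vector(vec, hyperplanes):
    """Return one sign bit per hyperplane; vectors pointing in similar
    directions tend to agree on most bits."""
    return tuple(
        1 if sum(h * x for h, x in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )


class LSHTable:
    """A single hash table mapping bit signatures (keys) to item ids."""

    def __init__(self, num_bits, dim, seed=0):
        self.hyperplanes = make_hyperplanes(num_bits, dim, seed)
        self.buckets = defaultdict(list)

    def insert(self, item_id, vec):
        # Items can be added at any time, as in a frequently updated data set.
        self.buckets[hash_vector(vec, self.hyperplanes)].append(item_id)

    def query(self, vec):
        # Candidates are the items that share the query's bucket.
        return list(self.buckets.get(hash_vector(vec, self.hyperplanes), []))


table = LSHTable(num_bits=8, dim=3, seed=42)
table.insert("a", [1.0, 2.0, 3.0])
candidates = table.query([2.0, 4.0, 6.0])  # same direction, so same bucket
```

A query only inspects one bucket instead of the whole data set, which is where the speed-up comes from. Practical systems usually combine several such tables to reduce missed neighbours.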


Author(s):  
Mohammed Erritali ◽  
Abderrahim Beni-Hssane ◽  
Marouane Birjali ◽  
Youness Madani

<p>Semantic indexing and document similarity are important information retrieval problems in Big Data, with broad applications. In this paper, we investigate the MapReduce programming model as a specific framework for managing distributed processing over a large number of documents. Then we study the state of the art of different approaches for computing the similarity of documents. Finally, we propose our approach to semantic similarity measures using WordNet as an external semantic network resource. For evaluation, we compare the proposed approach with other previously presented approaches by using our new MapReduce algorithm. Experimental results reveal that our proposed approach outperforms the state-of-the-art approaches in running time and improves the measurement of semantic similarity.</p>
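As a rough illustration of how the MapReduce model can be applied to document similarity, the sketch below simulates the map, shuffle, and reduce steps in a single Python process. It is not the authors' algorithm: it computes a purely lexical Jaccard similarity rather than a WordNet-based semantic measure, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `jaccard_similarities`) are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(doc_id, text):
    """Map: emit one (term, doc_id) pair per distinct term in the document."""
    for term in set(text.lower().split()):
        yield term, doc_id


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as a framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(term, doc_ids):
    """Reduce: for each term, emit a count of 1 for every pair of documents
    that share it."""
    for a, b in combinations(sorted(doc_ids), 2):
        yield (a, b), 1


def jaccard_similarities(docs):
    """Chain the two rounds and turn shared-term counts into Jaccard scores."""
    term_pairs = [kv for d, text in docs.items() for kv in map_phase(d, text)]
    postings = shuffle(term_pairs)
    pair_counts = shuffle(
        kv for term, ids in postings.items() for kv in reduce_phase(term, ids)
    )
    sizes = {d: len(set(text.lower().split())) for d, text in docs.items()}
    return {
        (a, b): sum(ones) / (sizes[a] + sizes[b] - sum(ones))
        for (a, b), ones in pair_counts.items()
    }


docs = {
    "d1": "big data image",
    "d2": "big data text",
    "d3": "semantic wordnet",
}
scores = jaccard_similarities(docs)
```

Only document pairs that share at least one term ever appear in the output, so unrelated pairs such as `d1` and `d3` are never compared. In a real deployment each phase would run as distributed tasks rather than in-memory loops.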

