A Bloom Filter Application for Processing Big Datasets through MapReduce Framework

Data streams today are massive and change rapidly. Their uses range from basic logical and scientific applications to critical business and financial ones. In the online phase, useful information is abstracted from the stream and represented as micro-clusters. In the offline phase, micro-clusters are merged to form macro-clusters. The DBSTREAM technique captures the density between micro-clusters by means of a shared density graph in the online phase. The density data in this graph is then used in reclustering to improve cluster formation, but DBSTREAM takes more time when handling corrupted data points. In this paper, an early pruning algorithm is applied before pre-processing of the data, and a Bloom filter is used to recognize corrupted information. Our experiments on real-time datasets show that this approach improves the efficiency of macro-clusters by 90% and generates more micro-clusters in a short time.
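The abstract does not say how the Bloom filter is populated or queried, so the following is only a minimal sketch under stated assumptions: keys of records already known to be corrupted are inserted into the filter, and incoming stream records whose keys test positive are pruned before clustering. The names `BloomFilter` and `prune_stream`, the filter size, and the hash scheme (double hashing over a SHA-256 digest) are illustrative choices, not the authors' design.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter; sizes and hash scheme are illustrative assumptions."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive all probe positions from one digest via double hashing.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd, non-zero stride
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


def prune_stream(records, corrupted_filter):
    """Yield only the (key, value) records whose key is definitely not flagged."""
    for key, value in records:
        if key not in corrupted_filter:
            yield key, value
```

A Bloom filter never produces false negatives, so every flagged key is dropped. It can produce false positives, so a small fraction of clean records may be pruned as well; that rate depends on the filter size and the number of hash functions.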

P2P Probabilistic Routing Algorithm Based on Data Copying and Bloom Filter

Journal of Software, 2011, Vol 22 (4), pp. 773-781, DOI 10.3724/sp.j.1001.2011.03757

Author(s): Gui-Ming ZHU, De-Ke GUO, Shi-Yao JIN

Keyword(s): Routing Algorithm, Bloom Filter, Probabilistic Routing

ODBF: A P2P Weak State Routing Scheme Based on Operative Decaying Bloom Filter

Chinese Journal of Computers, 2012, Vol 35 (5), pp. 910-917, DOI 10.3724/sp.j.1016.2012.00910

Author(s): Gui-Ming ZHU, De-Ke GUO, Shi-Yao JIN

Keyword(s): Bloom Filter, Weak State, Routing Scheme

Multi-keyword search over P2P based on Bloom filter

Journal of Computer Applications, 2010, Vol 30 (9), pp. 2335-2338, DOI 10.3724/sp.j.1087.2010.02335

Author(s): Hua-yun YAN, Ji-hong GUAN

Keyword(s): Keyword Search, Bloom Filter

HF-BF: A Hotness-aware Fine-grained Bloom Filter for Unique Address Checking in IoT Blockchain

2020 IEEE 22nd International Conference on High Performance Computing and Communications; IEEE 18th International Conference on Smart City; IEEE 6th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), 2020, DOI 10.1109/hpcc-smartcity-dss50907.2020.00136

Author(s): Wenbin Zhu, Qun Ma, Zhaoyan Shen, Tianyu Wang, Liang Ma, ...

Keyword(s): Bloom Filter, Fine Grained

Accelerating Data Shuffling in MapReduce Framework with Scale-up NUMA Computing Architecture

24th High Performance Computing Symposium, 2016, DOI 10.22360/springsim.2016.hpc.005

Keyword(s): Scale Up, Mapreduce Framework, Computing Architecture

Bloom Filter-Based Parallel Architecture for Accelerating Equi-Join Operation on FPGA

Electronics, 2021, Vol 10 (15), pp. 1778, DOI 10.3390/electronics10151778

Author(s): Binhao He, Meiting Xue, Shubiao Liu, Wei Luo

Keyword(s): Relational Databases, Parallel Architecture, Operation Time, Bloom Filter, Search Tree, Binary Search Tree, Maximum Acceleration, Data Intensive, Match Rate, Field Programmable

As one of the most important operations in relational databases, the join is data-intensive and time-consuming. Thus, offloading this operation using field-programmable gate arrays (FPGAs) has attracted much interest and has been broadly researched in recent years. However, the available SRAM-based join architectures are often resource-intensive, power-consuming, or low-throughput. Besides, a lower match rate does not lead to a shorter operation time. To address these issues, a Bloom filter (BF)-based parallel join architecture is presented in this paper. This architecture first leverages the BF to discard the tuples that are not in the join result and classifies the remaining tuples into different channels. Second, a binary search tree is used to reduce the number of comparisons. The proposed method was implemented on a Xilinx FPGA, and the experimental results show that under a match rate of 50%, our architecture achieved a high join throughput of 145.8 million tuples per second and a maximum acceleration factor of 2.3 compared to the existing SRAM-based join architectures.
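The two-stage idea in this abstract (a Bloom filter discards tuples that cannot join, then a search structure limits comparisons for the survivors) can be illustrated in software. The sketch below is not the paper's FPGA architecture: it omits the channel classification, assumes build-side keys are unique, and uses a sorted list with `bisect` as a stand-in for the binary search tree. All function names and the filter parameters are hypothetical.

```python
import bisect
import hashlib


def bloom_positions(key, num_bits, num_hashes):
    """Probe positions for a key, via double hashing over a SHA-256 digest."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big") | 1  # odd, non-zero stride
    return [(h1 + i * h2) % num_bits for i in range(num_hashes)]


def build_bloom(keys, num_bits=4096, num_hashes=3):
    """Return the filter as an integer bitmap with the keys' bits set."""
    bitmap = 0
    for key in keys:
        for pos in bloom_positions(key, num_bits, num_hashes):
            bitmap |= 1 << pos
    return bitmap


def maybe_contains(bitmap, key, num_bits=4096, num_hashes=3):
    """False means the key is definitely absent; True means it may be present."""
    return all(
        (bitmap >> pos) & 1 for pos in bloom_positions(key, num_bits, num_hashes)
    )


def bloom_equi_join(build_rows, probe_rows):
    """Join (key, payload) rows on key; build-side keys are assumed unique."""
    build_sorted = sorted(build_rows, key=lambda row: row[0])
    keys = [row[0] for row in build_sorted]
    bitmap = build_bloom(keys)

    result = []
    for key, payload in probe_rows:
        # Stage 1: drop probe tuples the filter rules out.
        if not maybe_contains(bitmap, key):
            continue
        # Stage 2: binary search confirms the match and rejects false positives.
        idx = bisect.bisect_left(keys, key)
        if idx < len(keys) and keys[idx] == key:
            result.append((key, build_sorted[idx][1], payload))
    return result
```

Because the binary search re-checks every survivor, Bloom filter false positives cost extra comparisons but never produce wrong join results.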
