Bloom filters
Recently Published Documents


TOTAL DOCUMENTS: 510 (FIVE YEARS: 98)
H-INDEX: 37 (FIVE YEARS: 2)

2022 ◽  
Vol 22 (1) ◽  
Author(s):  
Sean Randall ◽  
Helen Wichmann ◽  
Adrian Brown ◽  
James Boyd ◽  
Tom Eitelhuber ◽  
...  

Abstract. Background: Privacy-preserving record linkage (PPRL) methods using Bloom filters have shown promise for use in operational linkage settings. However, real-world evaluations are required to confirm their suitability in practice. Methods: Extracts of records from the Western Australian (WA) Hospital Morbidity Data Collection 2011–2015 and WA Death Registrations 2011–2015 were encoded to Bloom filters and then linked using privacy-preserving methods. Results were compared to a traditional, un-encoded linkage of the same datasets using the same blocking criteria to enable direct investigation of the comparison step. The encoded linkage was carried out in a blinded setting, where there was no access to un-encoded data or a ‘truth set’. Results: The PPRL method using Bloom filters provided similar linkage quality to the traditional un-encoded linkage, with 99.3% of ‘groupings’ identical between privacy-preserving and clear-text linkage. Conclusion: The Bloom filter method appears suitable for use in situations where clear-text identifiers cannot be provided for linkage.
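The encoding and comparison steps described in the abstract above can be illustrated with a minimal sketch. It is not the study's implementation: the bigram tokenisation, the filter size of 1000 bits, the 20 hash functions, the `secret` key, and the function names `bigrams`, `encode`, and `dice` are all assumptions chosen for illustration. Each identifier is split into character bigrams, each bigram is hashed to several bit positions, and two encodings are compared with the Dice coefficient so that similar names still score highly without the clear text being exchanged.

```python
import hashlib


def bigrams(value):
    """Split a padded, lower-cased string into its set of overlapping 2-grams."""
    padded = "_" + value.strip().lower() + "_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def encode(value, size=1000, num_hashes=20, secret="shared-key"):
    """Return the set of bit positions that are 1 in the Bloom filter of `value`.

    `size`, `num_hashes`, and `secret` are illustrative assumptions.
    """
    positions = set()
    for gram in bigrams(value):
        for i in range(num_hashes):
            # Keyed hash: the secret is shared by data providers, not the linker.
            digest = hashlib.sha256(f"{secret}|{i}|{gram}".encode()).digest()
            positions.add(int.from_bytes(digest[:8], "big") % size)
    return positions


def dice(a, b):
    """Dice coefficient of two encodings: 2 * |A and B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    print(dice(encode("smith"), encode("smyth")))  # similar spellings
    print(dice(encode("smith"), encode("jones")))  # unrelated names
```

In a linkage setting, record pairs whose score exceeds a chosen threshold would be treated as candidate matches; the threshold itself is a tuning decision not covered here.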


2022 ◽  
Vol 4 (2) ◽  
Author(s):  
Hiroyuki Kano ◽  
Keisuke Hakuta

Abstract. A private set intersection protocol is a secure multi-party computation protocol that allows participants to compute the intersection of their sets without revealing the sets to each other. Ion et al. proposed the private intersection-sum protocol (PI-Sum), which is a two-party private set intersection protocol. In the PI-Sum, two parties (say Alice and Bob) hold the private sets A and B. Moreover, Bob additionally has a rational integer associated with each element of B. The PI-Sum allows Bob to obtain the sum of the rational integers associated with the elements of A ∩ B. This paper proposes efficiency improvement techniques for the PI-Sum. The proposed techniques are based on Bloom filters, which are probabilistic data structures. More precisely, this paper proposes three protocols that are modifications of the PI-Sum. The proposed protocols are more efficient than the PI-Sum.


2021 ◽  
Author(s):  
Christopher Hampf ◽  
Martin Bialke ◽  
Hauke Hund ◽  
Christian Fegeler ◽  
Stefan Lang ◽  
...  

Abstract. Background: The Federal Ministry of Research and Education funded the Network of University Medicine to establish an infrastructure for pandemic research. This includes the development of a COVID-19 Data Exchange Platform (CODEX) that provides standardised and harmonised data sets for COVID-19 research. Nearly all university hospitals in Germany are part of the project and transmit medical data from their local data integration centres to the CODEX platform. Medical data on a person that has been collected at several sites is to be made available on the CODEX platform in merged form. To enable this, a federated trusted third party (fTTP) will be established, which will allow the pseudonymised merging of the medical data. The fTTP implements privacy-preserving record linkage based on Bloom filters and assigns pseudonyms to enable re-pseudonymisation during data transfer to the CODEX platform. Results: The fTTP was implemented conceptually and technically. For this purpose, the processes that are necessary for data delivery were modelled. The resulting communication relationships were identified and corresponding interfaces were specified. These were developed according to the specifications in FHIR and validated with the help of external partners. Existing tools such as the identity management system E-PIX® were further developed so that sites can generate Bloom filters based on person-identifying information. An extension for the comparison of Bloom filters was implemented for the federated trusted third party. The correct implementation was shown in the form of a demonstrator and the connection of two data integration centres. Conclusions: This article describes how the fTTP was modelled and implemented. In a first expansion stage, the fTTP was connected to two sites as an example and its functionality was demonstrated. Further expansion stages, which are already planned, have been technically specified and will be implemented in the future in order to also handle cases in which the privacy-preserving record linkage produces ambiguous results. The first expansion stage of the fTTP is available in the University Medicine network, and all participating sites will connect to it in the ongoing test phase.


2021 ◽  
Author(s):  
Martin Klein ◽  
Lyudmila Balakireva ◽  
Karolina Holub ◽  
Ingeborg Rudomino ◽  
Drazenko Celjak

2021 ◽  
Author(s):  
Pedro Reviriego ◽  
Ori Rottenstreich ◽  
Shanshan Liu ◽  
Fabrizio Lombardi

2021 ◽  
Author(s):  
Sanjay Kumar Srikakulam ◽  
Sebastian Keller ◽  
Fawaz Dabbaghie ◽  
Robert Bals ◽  
Olga V. Kalinina

Technological advances in next-generation sequencing present new computational challenges: methods are needed to store and query these data in time- and memory-efficient ways. We present MetaProFi (https://github.com/kalininalab/metaprofi), a Bloom filter-based tool that, in addition to supporting nucleotide sequences, can for the first time directly store and query amino acid sequences and translated nucleotide sequences, thus bringing sequence comparison to a more biologically relevant protein level. Owing to the properties of Bloom filters, it has a zero false-negative rate, allows for exact and inexact searches, and leverages disk storage and Zstandard compression to achieve high time and space efficiency. We demonstrate the utility of MetaProFi by indexing UniProtKB datasets at organism level and at sequence level, in addition to indexing the Tara Oceans dataset and the 2585 human RNA-seq experiments, showing that MetaProFi consumes far less disk space than state-of-the-art tools while also improving performance.
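The zero false-negative property and the exact versus inexact search mentioned above can be sketched with a small k-mer index. This is not MetaProFi's code or API: the `BloomFilter` class, the helper names, k = 3, the 4096-bit filter, and the four hash functions are assumptions made for illustration. A sequence is split into k-mers, every k-mer is inserted into the filter, and a query is scored by the fraction of its k-mers that the filter reports as present.

```python
import hashlib


class BloomFilter:
    """A plain Bloom filter over strings (illustrative parameters)."""

    def __init__(self, size_bits, num_hashes):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive all positions from two 64-bit values.
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def kmers(seq, k):
    """All overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def index_sequence(seq, k=3, size_bits=4096, num_hashes=4):
    """Insert every k-mer of `seq` into a fresh filter."""
    bf = BloomFilter(size_bits, num_hashes)
    for kmer in kmers(seq, k):
        bf.add(kmer)
    return bf


def match_fraction(bf, query, k=3):
    """Fraction of the query's k-mers reported present by the filter."""
    parts = kmers(query, k)
    if not parts:
        return 0.0
    return sum(part in bf for part in parts) / len(parts)


if __name__ == "__main__":
    protein = "MKTAYIAKQRQISFVKSHFSRQ"  # made-up amino acid sequence
    bf = index_sequence(protein)
    print(match_fraction(bf, "AKQRQIS"))  # substring of the indexed sequence
    print(match_fraction(bf, "AKQWQIS"))  # one substituted residue
```

An exact search would require a fraction of 1.0, while an inexact search accepts any fraction above a chosen threshold. Because bits are only ever set, a k-mer that was inserted is always reported as present; false positives remain possible.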


2021 ◽  
Vol 2021 ◽  
pp. 1-17
Author(s):  
Lin Ge ◽  
Tao Jiang

To protect the privacy of lightweight nodes that rely on Bloom filters in blockchain, this paper proposes a new privacy protection method. Considering the cumulative effect of query information, the node and the Bloom filter are regarded as the two parties of a game. A privacy protection mechanism based on the mixed-strategy Nash equilibrium is proposed to judge information queries. On this basis, a Bloom filter privacy protection algorithm is proposed for the case in which the probability that the queried information and privacy are not leaked is less than the node's privacy protection level. The algorithm is based on variable factor disturbance: it adjusts the number of bits set to 1 in the Bloom filter to improve privacy protection performance in different scenarios. The experiment uses Bitcoin transaction data from 2009 to 2019 as test data to verify the effectiveness, reliability, and superiority of the method.
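The idea of adjusting the number of bits set to 1 can be illustrated with a simple sketch. It is not the paper's variable factor disturbance algorithm and does not model the game-theoretic mechanism; the function names `add_noise` and `false_positive_rate` and the `target_fill` parameter are hypothetical. The sketch sets additional randomly chosen bits so that the filter matches more items than the node actually cares about, which makes it harder for an observer to tell real entries from noise.

```python
import random


def add_noise(bits, target_fill, rng):
    """Return a copy of `bits` with random zero bits set to 1 until the
    fraction of ones reaches `target_fill` (hypothetical tuning knob).

    Bits are only ever set, never cleared, so items inserted by the node
    still match: the disturbance cannot introduce false negatives.
    """
    noisy = list(bits)
    zeros = [i for i, bit in enumerate(noisy) if bit == 0]
    needed = max(0, int(target_fill * len(noisy)) - sum(noisy))
    for i in rng.sample(zeros, min(needed, len(zeros))):
        noisy[i] = 1
    return noisy


def false_positive_rate(bits, num_hashes):
    """Approximate probability that a random non-member matches:
    (fraction of ones) ** num_hashes."""
    return (sum(bits) / len(bits)) ** num_hashes


if __name__ == "__main__":
    original = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
    noisy = add_noise(original, target_fill=0.5, rng=random.Random(42))
    print(false_positive_rate(original, num_hashes=3))
    print(false_positive_rate(noisy, num_hashes=3))
```

The trade-off is bandwidth: a higher fill fraction hides the node's real interests among more false matches, but the node also receives more irrelevant data.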


2021 ◽  
Vol 14 (11) ◽  
pp. 2355-2368
Author(s):  
Tobias Schmidt ◽  
Maximilian Bandle ◽  
Jana Giceva

With today's data deluge, approximate filters are particularly attractive for avoiding expensive operations such as remote data or disk accesses. Among the many filter variants available, it is non-trivial to find the most suitable one and its optimal configuration for a specific use case. We provide open-source implementations of the most relevant filters (Bloom, Cuckoo, Morton, and Xor filters) and compare them in four key dimensions: false-positive rate, space consumption, build throughput, and lookup throughput. We improve upon existing state-of-the-art implementations with a new optimization, radix partitioning, which boosts the build and lookup throughput for large filters by up to 9x and 5x, respectively. Our in-depth evaluation first studies the impact of all available optimizations separately before combining them to determine the optimal filter for specific use cases. While register-blocked Bloom filters offer the highest throughput, the new Xor filters are best suited when optimizing for small filter sizes or low false-positive rates.
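The trade-off between false-positive rate and space consumption can be made concrete for the classic Bloom filter with the standard textbook approximations. The sketch below does not reproduce the paper's measurements and does not apply to Cuckoo, Morton, or Xor filters; the function names are chosen here for illustration. For n keys in m bits with k hash functions, the false-positive rate is approximately (1 - e^(-kn/m))^k, the optimal k is (m/n) ln 2, and reaching a target rate p needs about -ln p / (ln 2)^2 bits per key.

```python
import math


def bits_per_key(target_fpr):
    """Bits per key a classic Bloom filter needs for a target
    false-positive rate, assuming the optimal number of hash functions."""
    return -math.log(target_fpr) / (math.log(2) ** 2)


def optimal_num_hashes(bits_per_key_value):
    """Optimal number of hash functions, k = (m/n) * ln 2, rounded."""
    return max(1, round(bits_per_key_value * math.log(2)))


def expected_fpr(bits_per_key_value, num_hashes):
    """Approximate false-positive rate (1 - e^(-k*n/m))^k, with m/n given
    as bits per key."""
    return (1.0 - math.exp(-num_hashes / bits_per_key_value)) ** num_hashes


if __name__ == "__main__":
    for target in (0.1, 0.01, 0.001):
        bpk = bits_per_key(target)
        k = optimal_num_hashes(bpk)
        print(f"target={target}: {bpk:.2f} bits/key, k={k}, "
              f"fpr={expected_fpr(bpk, k):.5f}")
```

A target rate of 1% works out to roughly 9.6 bits per key with 7 hash functions, which is why lowering the false-positive rate of a Bloom filter quickly becomes expensive in space.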

