Secure Multi-keyword Similarity Search Over Encrypted Data With Security Improvement

2021 ◽  
Vol 17 (2) ◽  
pp. 1-10
Author(s):  
Hussein Mohammed ◽  
Ayad Abdulsada

Searchable encryption (SE) is a tool that enables clients to outsource their encrypted data to external cloud servers with unlimited storage and computing power while retaining the ability to search that data without decryption. Current SE solutions support only single-keyword search, which makes them impractical in real-world scenarios. In this paper, we design and implement a multi-keyword similarity search scheme over encrypted data using locality-sensitive hashing functions and Bloom filters. The proposed scheme tolerates common spelling mistakes and offers enhanced security properties, such as hiding the access and search patterns, at the cost of higher latency. To support similarity search, we use an efficient bi-gram-based method for keyword transformation, which improves the accuracy of the search results. Our scheme employs two non-colluding servers to break the correlation between search queries and search results. Experiments using real-world data illustrate that our scheme is practically efficient, secure, and retains high accuracy.
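To make the bi-gram idea in the abstract above concrete, the following is a minimal sketch, not the authors' construction. It assumes plain salted SHA-256 digests in place of the locality-sensitive hash functions, applies no encryption, and uses hypothetical names (`bigrams`, `bloom_positions`, `keyword_filter`, `jaccard`) and parameters (`size`, `num_hashes`) chosen only for illustration. It shows why a misspelled keyword still overlaps with the correct one: most of their bi-grams, and therefore most of their Bloom filter bits, coincide.

```python
import hashlib


def bigrams(keyword: str) -> set:
    """Split a keyword into its set of adjacent character pairs."""
    word = keyword.lower()
    return {word[i:i + 2] for i in range(len(word) - 1)}


def bloom_positions(item: str, size: int, num_hashes: int) -> set:
    """Derive bit positions for an item from salted SHA-256 digests.

    Assumption: ordinary hashes stand in for the LSH family of the paper.
    """
    positions = set()
    for seed in range(num_hashes):
        digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
        positions.add(int.from_bytes(digest[:8], "big") % size)
    return positions


def keyword_filter(keyword: str, size: int = 256, num_hashes: int = 3) -> set:
    """Return the set bit positions of a Bloom filter holding the keyword's bi-grams."""
    bits = set()
    for gram in bigrams(keyword):
        bits |= bloom_positions(gram, size, num_hashes)
    return bits


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


if __name__ == "__main__":
    correct, typo, other = "secure", "secuer", "cloud"
    print(jaccard(bigrams(correct), bigrams(typo)))
    print(jaccard(bigrams(correct), bigrams(other)))
    print(jaccard(keyword_filter(correct), keyword_filter(typo)))
```

In this sketch the transposition in "secuer" changes only two of the five bi-grams of "secure", so the two filters still share every bit contributed by the unchanged bi-grams.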

2021 ◽  
Vol 39 (15_suppl) ◽  
pp. e18725-e18725
Author(s):  
Ravit Geva ◽  
Barliz Waissengrin ◽  
Dan Mirelman ◽  
Felix Bokstein ◽  
Deborah T. Blumenthal ◽  
...  

e18725 Background: Healthcare data sharing is important for the creation of diverse and large data sets, supporting clinical decision making, and accelerating efficient research to improve patient outcomes. This is especially vital in the case of real-world data analysis. However, stakeholders are reluctant to share their data without assurances of patients' privacy and proper protection of their data sets and of the ways they are used. Homomorphic encryption is a cryptographic capability that can address these issues by enabling computation on encrypted data without ever decrypting it, so that analytics results are obtained without revealing the raw data. The aim of this study is to demonstrate the accuracy of the analytics results and the practical efficiency of the technology. Methods: A real-world data set of colorectal cancer patients' survival data following two different treatment interventions, comprising 623 patients and 24 variables (14,952 items of data), was encrypted using leveled homomorphic encryption implemented in the PALISADE software library. Statistical analysis of key oncological endpoints was blindly performed on both the raw data and the homomorphically encrypted data using descriptive statistics and survival analysis with Kaplan-Meier curves. Results were then compared with an accuracy goal of two decimals. Results: For all variables analyzed, the differences between the raw-data results and the homomorphically encrypted results were within the pre-determined accuracy goal. These differences, together with the practical efficiency of the encrypted computation measured by run time, are presented in the table. Conclusions: This study demonstrates that data encrypted with homomorphic encryption can be statistically analyzed with a precision of at least two decimal places, allowing safe clinical conclusions to be drawn while preserving patients' privacy and protecting data owners' data assets.
Homomorphic encryption allows efficient computation on encrypted data non-interactively and without requiring decryption during computation. Utilizing the technology will empower large-scale cross-institution and cross-stakeholder collaboration, allowing safe international collaborations. Clinical trial information: 0048-19-TLV. [Table: see text]


2008 ◽  
Vol 17 (01) ◽  
pp. 87-107 ◽  
Author(s):  
TIANMING HU ◽  
CHEW LIM TAN ◽  
YONG TANG ◽  
SAM YUAN SUNG ◽  
HUI XIONG ◽  
...  

The duality between document and word clustering naturally leads to the consideration of storing the document dataset in a bipartite graph. With documents and words modeled as vertices on the two sides respectively, partitioning such a graph yields a co-clustering of words and documents. The topic of each cluster can then be represented by the top words and documents that have the highest within-cluster degrees. However, such claims may fail if top words and documents are selected simply because they are very general and frequent. In addition, for words and documents that span several topics, it may not be proper to assign them to a single cluster. In other words, to precisely capture the cluster topic, we need to identify those micro-sets of words/documents that are similar among themselves and, as a whole, representative of their respective topics. Along this line, in this paper, we use hyperclique patterns, that is, strongly affiliated words/documents, to define such micro-sets. We introduce a new bipartite formulation that incorporates both word hypercliques and document hypercliques as super vertices. Our experiments on real-world data sets show that, by co-preserving hyperclique patterns during the clustering process, better clustering results can be obtained in terms of various external clustering validation measures and the cluster topic can be more precisely identified. Also, the partitioned bipartite graph with co-preserved patterns naturally lends itself to different clustering-related functions in search engines. To that end, we illustrate such an application, returning clustered search results for keyword queries. We show that the topic of each cluster with respect to the current query can be identified more accurately with the words and documents from the patterns than with the top ones from the standard bipartite formulation.


Author(s):  
Ayad I. Abdulsada ◽  
Dhafer G. Honi ◽  
Salah Al-Darraji

Many organizations and individuals are attracted to outsourcing their data to remote cloud service providers. To ensure privacy, sensitive data should be encrypted before being hosted. However, encryption disables the direct application of essential data management operations like searching and indexing. Searchable encryption is a cryptographic tool that gives users the ability to search the data while it remains encrypted. However, the existing schemes either support a single exact search, which loses the ability to handle misspelled keywords, or support multi-keyword search that generates very long trapdoors. In this paper, we address the problem of designing a practical multi-keyword similarity scheme that provides short trapdoors and returns the correct results according to their similarity scores. To do so, each document is translated into a compressed trapdoor. Trapdoors are generated using key-based hash functions to ensure their privacy. Only authorized users can issue valid trapdoors. Similarity scores of two textual documents are evaluated by computing the Hamming distance between their corresponding trapdoors. A robust security definition is provided together with its proof. Our experimental results illustrate that the proposed scheme improves the search efficiency compared to the existing schemes. Furthermore, it shows a high level of performance.
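The Hamming-distance comparison of keyed trapdoors described above can be illustrated with a small sketch. It is not the authors' scheme: it assumes HMAC-SHA256 as the key-based hash, a fixed-length bit vector stored in a Python integer, and hypothetical names (`trapdoor`, `hamming_distance`) and a hypothetical `length` parameter. Each keyword sets one bit, so documents with similar keyword sets yield trapdoors that differ in few positions.

```python
import hashlib
import hmac


def trapdoor(keywords, key: bytes, length: int = 128) -> int:
    """Fold keywords into a length-bit integer using HMAC-SHA256 as the keyed hash.

    Assumption: one bit per keyword; the actual compression in the paper may differ.
    """
    bits = 0
    for word in keywords:
        digest = hmac.new(key, word.lower().encode(), hashlib.sha256).digest()
        bits |= 1 << (int.from_bytes(digest[:8], "big") % length)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two trapdoors differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    key = b"illustrative-secret-key"  # hypothetical key, for demonstration only
    document = trapdoor(["cloud", "search", "privacy"], key)
    query = trapdoor(["cloud", "search"], key)
    # A smaller distance means the query is closer to the document.
    print(hamming_distance(document, query))
```

Without the key a party cannot compute the bit position of a keyword, which is the intuition behind restricting valid trapdoors to authorized users. The sketch makes no claim about the formal security of the published scheme.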


Author(s):  
Fairouz Sher Ali ◽  
Hadeel Noori Saad ◽  
Falah Hassan Sarhan ◽  
Bushra Naaeem

Cloud computing has become a revolutionary computing model that provides an economical and flexible strategy for resource sharing and data management. Due to privacy concerns, sensitive data has to be encrypted before being uploaded to the cloud servers. Over the last few years, several keyword searchable encryption works have been described in the literature. However, existing works mostly focus on secure keyword searching and retrieve only Boolean results, which are not yet adequate. In addition, resource-poor mobile networks now play an important role in all application areas, and mobile nodes mostly act as information retrieval endpoints, which makes it important to address this problem. In this paper, we present a secure keyword search scheme based on the Bloom filter (SKS-BF), which enhances the system's usability by ranking the search results according to their relevance scores and retrieving only the most relevant files instead of all the files. Further, Bloom filters (BFs) can accelerate a search process involving a large number of keywords. Extensive experiments and network simulation confirm the efficiency of our proposed scheme.


Author(s):  
Akash Tidke

In this paper, we present a survey of keyword-based searching algorithms. Various searching techniques are used for retrieving encrypted data from cloud servers. This survey involves a comparative study of these keyword-based searching algorithms. It concludes that the multi-keyword ranked search (MRSE) scheme is, so far, the best methodology for searching encrypted data.


2017 ◽  
Vol 107 (1) ◽  
pp. 39-56
Author(s):  
Jakub Kúdela ◽  
Irena Holubová ◽  
Ondřej Bojar

Most of the current methods for mining parallel texts from the web assume that the web pages of a web site share the same structure across languages. We believe that there still exists a non-negligible amount of parallel data spread across sources that do not satisfy this assumption. We propose an approach based on a combination of bivec (a bilingual extension of word2vec) and locality-sensitive hashing, which allows us to efficiently identify pairs of parallel segments located anywhere on the pages of a given web domain, regardless of their structure. We validate our method on realigning segments from a large parallel corpus. Another experiment with real-world data provided by the Common Crawl Foundation confirms that our solution scales to hundreds of terabytes of web-crawled data.
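As a rough illustration of how locality-sensitive hashing can narrow down candidate segment pairs, the sketch below uses random-hyperplane (sign) hashing, one common LSH family. The abstract does not say which family the authors use, so this choice is an assumption. The segment vectors are made-up numbers standing in for bivec embeddings, and `make_hyperplanes`, `signature` and `bucket_candidates` are hypothetical helper names.

```python
import random


def make_hyperplanes(dim: int, num_bits: int, seed: int = 0):
    """Draw num_bits random hyperplanes (normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vector, hyperplanes):
    """Record, per hyperplane, on which side the vector falls (1 or 0)."""
    return tuple(
        1 if sum(h * x for h, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def bucket_candidates(vectors, hyperplanes):
    """Group segment names by signature; segments sharing a bucket are candidates."""
    buckets = {}
    for name, vec in vectors.items():
        buckets.setdefault(signature(vec, hyperplanes), []).append(name)
    return buckets


if __name__ == "__main__":
    planes = make_hyperplanes(dim=4, num_bits=8)
    segments = {
        "en:segment-1": [0.5, -1.0, 2.0, 0.25],  # made-up embedding
        "cs:segment-1": [1.0, -2.0, 4.0, 0.5],   # same direction, scaled
        "cs:segment-2": [-0.5, 1.0, -2.0, 0.3],  # roughly opposite direction
    }
    for sig, names in bucket_candidates(segments, planes).items():
        print(sig, names)
```

Only segments that land in the same bucket need to be compared directly, which is what keeps the number of pairwise comparisons manageable on large crawls.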


2018 ◽  
Vol 10 (5) ◽  
pp. 38
Author(s):  
Jingsha He ◽  
Jianan Wu ◽  
Nafei Zhu ◽  
Muhammad Salman Pathan

2016 ◽  
Vol 22 ◽  
pp. 219
Author(s):  
Roberto Salvatori ◽  
Olga Gambetti ◽  
Whitney Woodmansee ◽  
David Cox ◽  
Beloo Mirakhur ◽  
...  

VASA ◽  
2019 ◽  
Vol 48 (2) ◽  
pp. 134-147 ◽  
Author(s):  
Mirko Hirschl ◽  
Michael Kundi

Background: In randomized controlled trials (RCTs), direct-acting oral anticoagulants (DOACs) showed a superior risk-benefit profile compared with vitamin K antagonists (VKAs) for patients with nonvalvular atrial fibrillation. Patients enrolled in such studies do not necessarily reflect the whole target population treated in real-world practice. Materials and methods: A systematic literature search identified 88 studies including 3,351,628 patients and providing over 2.9 million patient-years of follow-up. Hazard ratios and event rates for the main efficacy and safety outcomes were extracted, and the results for DOACs and VKAs were combined by network meta-analysis. In addition, meta-regression was performed to identify factors responsible for heterogeneity across studies. Results: For stroke and systemic embolism as well as for major bleeding and intracranial bleeding, real-world studies gave virtually the same results as RCTs, with higher efficacy and lower major bleeding risk (for dabigatran and apixaban) and lower risk of intracranial bleeding (all DOACs) compared to VKAs. Results for gastrointestinal bleeding were consistently better for DOACs, and hazard ratios of myocardial infarction were significantly lower in real-world studies than in RCTs for dabigatran and apixaban. A ranking analysis found apixaban to be the safest anticoagulant drug, while rivaroxaban, closely followed by dabigatran, was the most efficacious. Risk of bias and heterogeneity were assessed and had little impact on the overall results. Analysis of effect modification could guide the clinical decision, as no single DOAC was superior or inferior to the others under all conditions. Conclusions: DOACs were at least as efficacious as VKAs. In terms of safety endpoints, DOACs performed better under real-world conditions than in RCTs. The current real-world data showed that differences in efficacy and safety exist between DOACs, despite generally low event rates.
Knowledge about these differences in performance can contribute to more personalized medicine.

