Privacy Preserving k-Nearest Neighbor for Medical Diagnosis in e-Health Cloud

Journal of Healthcare Engineering ◽

10.1155/2018/4073103 ◽

2018 ◽

Vol 2018 ◽

pp. 1-11 ◽

Cited By ~ 7

Author(s):

Jeongsu Park ◽

Dong Hoon Lee

Keyword(s):

Cloud Computing ◽

Medical Diagnosis ◽

Nearest Neighbor ◽

Data Access ◽

Privacy Preserving ◽

K Nearest Neighbor ◽

Diagnosis System ◽

Cloud Servers ◽

Health Cloud ◽

Medical Dataset

Cloud computing is well suited to medical diagnosis in e-health services, where strong computing ability is required. However, despite its substantial benefits, the medical diagnosis field is not yet ready to adopt cloud computing: the data involved are sensitive, so using the cloud raises serious concerns about privacy infringement. For instance, a compromised e-health cloud server might expose the medical dataset outsourced by multiple medical data owners, or infringe on the privacy of a patient inquirer by leaking his or her symptoms or diagnosis result. In this paper, we propose a medical diagnosis system that uses e-health cloud servers in a privacy-preserving manner when medical datasets are owned by multiple data owners. The proposed system is the first to achieve privacy of the medical dataset, symptoms, and diagnosis results and to hide the data access pattern even from the e-health cloud servers that perform computations on the data, while remaining robust against collusion of the entities. As a building block of the proposed diagnosis system, we design a novel privacy-preserving protocol for finding the k data items with the highest similarity (PE-FTK) to a given symptom. The protocol reduces the average running time by 35% compared to that of a previous work in the literature. Moreover, the result of the previous work is probabilistic, i.e., it can contain some error, while the result of our PE-FTK is deterministic, i.e., it is correct without any error probability.
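
The task that PE-FTK solves over protected data can be illustrated in plaintext. The sketch below is not the authors' protocol and involves no encryption; it only shows a deterministic "find the k most similar records, then vote" step. The similarity measure (squared Euclidean distance), the record layout, the sample data, and the names RECORDS, top_k_similar, and diagnose are all assumptions made for illustration.

```python
import heapq

# Hypothetical toy dataset: (symptom feature vector, diagnosis label).
RECORDS = [
    ((1, 0, 1), "flu"),
    ((1, 1, 1), "flu"),
    ((0, 0, 0), "healthy"),
    ((0, 1, 0), "cold"),
    ((0, 0, 1), "healthy"),
]


def top_k_similar(records, symptom, k):
    """Return (label, distance) for the k records closest to `symptom`.

    Assumes squared Euclidean distance; ties are broken by record index,
    so the result is deterministic.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    scored = [
        (sq_dist(features, symptom), idx, label)
        for idx, (features, label) in enumerate(records)
    ]
    nearest = heapq.nsmallest(k, scored)
    return [(label, dist) for dist, _, label in nearest]


def diagnose(records, symptom, k):
    """Majority vote over the k most similar records (ties: smallest label)."""
    votes = {}
    for label, _ in top_k_similar(records, symptom, k):
        votes[label] = votes.get(label, 0) + 1
    return max(sorted(votes), key=votes.get)


if __name__ == "__main__":
    print(top_k_similar(RECORDS, (1, 0, 1), 3))
    print(diagnose(RECORDS, (1, 0, 1), 3))
```

In a privacy-preserving setting, the distance computation and the top-k selection would have to run on encrypted values without revealing which records were selected, which is the part this plaintext sketch leaves out.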

A Privacy Preserving Cloud-Based K-NN Search Scheme with Lightweight User Loads

Computers ◽

10.3390/computers9010001 ◽

2020 ◽

Vol 9 (1) ◽

pp. 1 ◽

Cited By ~ 1

Author(s):

Yeong-Cherng Hsu ◽

Chih-Hsin Hsueh ◽

Ja-Ling Wu

Keyword(s):

Data Privacy ◽

Nearest Neighbor ◽

Search Algorithm ◽

Data Access ◽

Privacy Preserving ◽

Secret Key ◽

K Nearest Neighbor ◽

Sensitive Data ◽

Cloud Data ◽

Cloud Server

With the growing popularity of cloud computing, it is convenient for data owners to outsource their data to a cloud server. By utilizing the massive storage and computational resources in the cloud, data owners can also provide a platform for users to make query requests. However, due to privacy concerns, sensitive data should be encrypted before outsourcing. In this work, a novel privacy-preserving K-nearest neighbor (K-NN) search scheme over the encrypted outsourced cloud dataset is proposed. The problem is to let the cloud server find the K nearest points with respect to an encrypted query on the encrypted dataset, which was outsourced by data owners, and return the search results to the querying user. Compared with other existing methods, our approach makes greater use of cloud resources by shifting most of the required computational load from data owners and query users to the cloud server. In addition, there is no need for data owners to share their secret key with others. In a nutshell, in the proposed scheme, data points and user queries are encrypted attribute-wise and the entire search algorithm is performed in the encrypted domain; therefore, our approach not only preserves data privacy and query privacy but also hides the data access pattern from the cloud server. Moreover, by using a tree structure, the proposed scheme can answer query requests in sublinear time, according to our performance analysis. Finally, experimental results demonstrate the practicability and efficiency of our method.

Supervised Classifier Approach for Intrusion Detection on KDD with Optimal MapReduce Framework Model in Cloud Computing

Recent Patents on Computer Science ◽

10.2174/1573401315666190619113510 ◽

2019 ◽

Vol 12 ◽

Author(s):

M. Ilayaraja ◽

S. Hemalatha ◽

P. Manickam ◽

K. Sathesh Kumar ◽

K. Shankar

Keyword(s):

Machine Learning ◽

Cloud Computing ◽

Intrusion Detection ◽

Decision Tree ◽

Learning Strategies ◽

Nearest Neighbor ◽

Detection System ◽

K Nearest Neighbor ◽

Mapreduce Model ◽

The Web

Cloud computing is the provision of resources or services over the web to clients on demand by cloud providers. It delivers everything as a service over the web in response to client requests, for example operating systems, network equipment, storage, resources, and software. Nowadays, the Intrusion Detection System (IDS) plays a powerful role, helping experts take action when the system is attacked by intrusions. Most intrusion detection frameworks are built on machine learning strategies. The dataset used in this intrusion detection work is Knowledge Discovery in Databases (KDD). This paper detects or classifies intruded data using Machine Learning (ML) with the MapReduce model. The first phase uses the Hadoop MapReduce model to reduce the size of the database, with an optimal weight chosen for the reducer model, and the second phase uses a Decision Tree (DT) classifier to detect intruded data. The DT classifier uses an appropriate classifier to decide the class labels for the non-homogeneous leaf nodes. The decision tree fragment gives a coarse segment profile, while the leaf-level classifier can give information about the attributes that influence the label within a segment. The proposed approach achieves a detection accuracy of 96.21%, compared with existing classifiers such as Neural Network (NN), Naive Bayes (NB), and K Nearest Neighbor (KNN).

Anonymous Authentication for Privacy Preserving of Multimedia Data in the Cloud

Handbook of Research on Multimedia Cyber Security - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-7998-2701-6.ch003 ◽

2020 ◽

pp. 48-72

Author(s):

Sadiq J. Almuairfi ◽

Mamdouh Alenezi

Keyword(s):

Cloud Computing ◽

Cost Saving ◽

Private Information ◽

Reversible Data Hiding ◽

Privacy Preserving ◽

Multimedia Data ◽

Anonymous Authentication ◽

Computationally Expensive ◽

New Research ◽

Cloud Servers

Cloud computing technology provides cost savings and flexible services for users. With the explosion of multimedia data, more and more data owners outsource their personal multimedia data to the cloud. In the meantime, some computationally expensive tasks are also undertaken by cloud servers. However, the outsourced multimedia data and its applications may reveal the data owner's private information because data owners lose control of their data. Recently, this concern has aroused new research interest in privacy-preserving reversible data hiding over outsourced multimedia data. This chapter proposes an Anonymous Authentication Scheme as a relatable, applicable, and appropriate technique that cloud computing professionals can adopt to eliminate the risks and challenges associated with privacy.

Anonymous Authentication for Privacy Preserving of Multimedia Data in the Cloud

Research Anthology on Privatizing and Securing Data ◽

10.4018/978-1-7998-8954-0.ch020 ◽

2021 ◽

pp. 428-452

Author(s):

Sadiq J. Almuairfi ◽

Mamdouh Alenezi

Keyword(s):

Cloud Computing ◽

Cost Saving ◽

Private Information ◽

Reversible Data Hiding ◽

Privacy Preserving ◽

Multimedia Data ◽

Anonymous Authentication ◽

Computationally Expensive ◽

New Research ◽

Cloud Servers

Improving k-Nearest Neighbor Pattern Recognition Models for Privacy-Preserving Data Analysis

2019 IEEE International Conference on Big Data (Big Data) ◽

10.1109/bigdata47090.2019.9006281 ◽

2019 ◽

Author(s):

Walisa Romsaiyud ◽

Henning Schnoor ◽

Wilhelm Hasselbring

Keyword(s):

Pattern Recognition ◽

Data Analysis ◽

Nearest Neighbor ◽

Privacy Preserving ◽

K Nearest Neighbor

Trajectory Clustering and k-NN for Robust Privacy Preserving Spatiotemporal Databases

Algorithms ◽

10.3390/a11120207 ◽

2018 ◽

Vol 11 (12) ◽

pp. 207 ◽

Cited By ~ 2

Author(s):

Elias Dritsas ◽

Maria Trigka ◽

Panagiotis Gerolymatos ◽

Spyros Sioutas

Keyword(s):

Nearest Neighbor ◽

Dimensional Space ◽

Motion Vector ◽

Research Work ◽

Privacy Preserving ◽

Mobile Users ◽

Trajectory Clustering ◽

K Nearest Neighbor ◽

Trajectory Data ◽

Spatiotemporal Databases

In this research work, we studied the problem of privacy preservation in spatiotemporal databases. In particular, we investigated the k-anonymity of mobile users based on real trajectory data. The k-anonymity set consists of the k nearest neighbors. We constructed a motion vector of the form (x,y,g,v), where x and y are the spatial coordinates, g is the angle direction, and v is the velocity of mobile users, and studied the problem in four-dimensional space. We followed two approaches. The former applied only the k-Nearest Neighbor (k-NN) algorithm to the whole dataset, while the latter combined trajectory clustering, based on K-means, with k-NN; that is, it applied k-NN inside a cluster of mobile users with a similar motion pattern (g,v). We defined a metric, called vulnerability, that measures the rate at which k-NNs vary. This metric ranges from 1/k (high robustness) to 1 (low robustness) and represents the probability that the real identity of a mobile user is discovered by a potential attacker. The aim of this work was to prove that, with high probability, the above rate tends to a number very close to 1/k in the clustering method, which means that k-anonymity is highly preserved. Through experiments on real spatial datasets, we evaluated the anonymity robustness, the so-called vulnerability, of the proposed method.
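
The two approaches described above can be sketched as a single function that builds a k-anonymity set from the k nearest neighbors of a user in (x, y, g, v) space, optionally restricted to the user's cluster. This is an illustration only: the plain unscaled Euclidean distance, the sample users, the precomputed cluster labels (standing in for the K-means step), and the names VECTORS, CLUSTERS, and anonymity_set are assumptions, and the vulnerability metric is not computed here.

```python
import math

# Hypothetical motion vectors (x, y, g, v): position, direction angle, speed.
VECTORS = {
    "u1": (0.0, 0.0, 90.0, 10.0),
    "u2": (1.0, 0.0, 90.0, 10.0),
    "u3": (0.0, 2.0, 90.0, 11.0),
    "u4": (0.5, 0.0, 270.0, 10.0),
    "u5": (3.0, 0.0, 90.0, 12.0),
    "u6": (5.0, 5.0, 268.0, 9.0),
}

# Hypothetical cluster labels grouping users with a similar (g, v) pattern.
# In the paper these come from K-means; here they are simply given.
CLUSTERS = {"u1": 0, "u2": 0, "u3": 0, "u4": 1, "u5": 0, "u6": 1}


def anonymity_set(vectors, user, k, clusters=None):
    """Return the IDs of the k users nearest to `user` in (x, y, g, v) space.

    If `clusters` is given, only users in the same cluster are candidates.
    Ties in distance are broken by user ID to keep the result deterministic.
    """
    target = vectors[user]
    candidates = [
        other
        for other in vectors
        if other != user
        and (clusters is None or clusters[other] == clusters[user])
    ]
    candidates.sort(key=lambda other: (math.dist(vectors[other], target), other))
    return candidates[:k]


if __name__ == "__main__":
    print(anonymity_set(VECTORS, "u1", 2))            # whole dataset
    print(anonymity_set(VECTORS, "u4", 1, CLUSTERS))  # within the cluster
```

Because the angle is left in degrees and nothing is normalized, direction differences dominate the distance in this toy data; a real experiment would need to choose a scaling for the four dimensions.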

A Privacy-Preserving Intelligent Medical Diagnosis System Based on Oblivious Keyword Search

Mathematical Problems in Engineering ◽

10.1155/2017/8632183 ◽

2017 ◽

Vol 2017 ◽

pp. 1-7 ◽

Cited By ~ 1

Author(s):

Zhaowen Lin ◽

Xinglin Xiao ◽

Yi Sun ◽

Yudong Zhang ◽

Yan Ma

Keyword(s):

Medical Diagnosis ◽

Keyword Search ◽

Personal Information ◽

Privacy Preserving ◽

Search Process ◽

Health Examination ◽

Diagnosis System ◽

Paillier Cryptosystem ◽

Medical Diagnosis System ◽

Computational Ability

One concern people have is how to obtain a diagnosis online without jeopardizing their privacy. In this paper, we propose a privacy-preserving intelligent medical diagnosis system (IMDS), which can efficiently solve the problem. In IMDS, users submit their health examination parameters to the server in a protected form; this submission process is based on the Paillier cryptosystem and does not reveal any information about their data. The server then retrieves the most likely disease (or multiple diseases) from the database and returns it to the users. In this search process, we use oblivious keyword search (OKS) as the basic framework, which lets the server retain its computational ability while learning no personal information from the users' data. In addition, this paper provides a preprocessing method for the data stored on the server to make our protocol more efficient.
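
The property of the Paillier cryptosystem that makes this kind of protected submission useful is that it is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. The toy sketch below shows only that property, not the IMDS or OKS protocol. The tiny primes (1009 and 1013), the sample values, and the function names are assumptions for illustration, and the parameters are far too small to be secure.

```python
import math
import random


def keygen(p, q):
    """Toy Paillier key generation with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because g = n + 1 gives L(g^lam) = lam mod n
    return (n, n + 1), (lam, mu)


def encrypt(pub, m):
    """Encrypt 0 <= m < n as g^m * r^n mod n^2 with a random r coprime to n."""
    n, g = pub
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover m as L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    n, _ = pub
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    pub, priv = keygen(1009, 1013)  # insecure demo primes
    c_sum = add_encrypted(pub, encrypt(pub, 120), encrypt(pub, 80))
    print(decrypt(pub, priv, c_sum))
```

A server holding only the public key can combine encrypted values this way without ever seeing the underlying health parameters; only the holder of the private key can decrypt the result.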

A Comparative Study on University Admission Predictions Using Machine Learning Techniques

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit2172107 ◽

2021 ◽

pp. 537-548

Author(s):

Prince Golden ◽

Kasturi Mojesh ◽

Lakshmi Madhavi Devarapalli ◽

Pabbidi Naga Suba Reddy ◽

Srigiri Rajesh ◽

...

Keyword(s):

Machine Learning ◽

Nearest Neighbor ◽

Machine Learning Techniques ◽

Support Vector ◽

K Nearest Neighbor ◽

Education Systems ◽

University Admissions ◽

The Troubles ◽

Learning Techniques ◽

Cloud Servers

In this era of cloud computing and machine learning, where all kinds of work are being automated through machine learning techniques running on cloud servers to complete them more efficiently and quickly, what needs to be addressed is how we are changing our education systems and minimizing the troubles related to them with all the advancements in technology. One of the prominent issues facing students has always been graduate admissions and the choice of colleges to apply to. It has always been difficult to decide which university or college to apply to based on the marks obtained during undergraduate study, as applying to a number of universities at once is not only tedious and time-consuming but also expensive. Thus, many machine learning solutions have emerged in recent years to tackle this problem and provide various predictions, estimations, and consultancies so that students can more easily decide to apply to the universities with higher chances of admission. In this paper, we review the machine learning techniques that are prevalent and provide accurate predictions regarding university admissions. We compare different regression models and machine learning methodologies, such as Random Forest, Linear Regression, Stacked Ensemble Learning, Support Vector Regression, Decision Trees, KNN (K-Nearest Neighbor), etc., used by other authors in their works and try to reach a conclusion as to which technique provides better accuracy.

Trajectory Clustering and k-NN for Robust Privacy Preserving k-NN Query Processing in GeoSpark

Algorithms ◽

10.3390/a13080182 ◽

2020 ◽

Vol 13 (8) ◽

pp. 182

Author(s):

Elias Dritsas ◽

Andreas Kanavos ◽

Maria Trigka ◽

Gerasimos Vonitsanos ◽

Spyros Sioutas ◽

...

Keyword(s):

Big Data ◽

Spatial Data ◽

Privacy Preservation ◽

Nearest Neighbor ◽

Data Representation ◽

Privacy Preserving ◽

Temporal Data ◽

K Nearest Neighbor ◽

Trajectory Data ◽

Spatio Temporal

Privacy preservation and anonymity have attracted significant concern from the big data perspective. We believe that forthcoming frameworks and theories will establish several solutions for privacy protection. k-anonymity is considered a key solution that has been widely employed to prevent data re-identification, and it is our concern in this work. Data modeling has also gained significant attention from the big data perspective. It is believed that advancing distributed environments will provide users with several solutions for efficient spatio-temporal data management. GeoSpark is used in the current work, as it is a key solution that has been widely employed for spatial data. Specifically, it works on top of Apache Spark, the main framework used by the research community and organizations for big data transformation, processing, and visualization. To this end, we focused on a trajectory data representation that is applicable to the GeoSpark environment, and a GeoSpark-based approach is designed for the efficient management of real spatio-temporal data. The next step is to gain a deeper understanding of the data through the application of k nearest neighbor (k-NN) queries, either using indexing methods or otherwise. The k-anonymity set computation, which is the main component of privacy preservation evaluation and the main issue of our previous works, is evaluated in the GeoSpark environment. More to the point, the focus here is on the time cost of k-anonymity set computation along with vulnerability measurement. The extracted results are presented in tables and figures for visual inspection.

An Efficient Diagnosis System for Detection of Liver Disease Using a Novel Integrated Method Based on Principal Component Analysis and K-Nearest Neighbor (PCA-KNN)

International Journal of Healthcare Information Systems and Informatics ◽

10.4018/ijhisi.2016100103 ◽

2016 ◽

Vol 11 (4) ◽

pp. 56-69

Author(s):

Aman Singh ◽

Babita Pandey

Keyword(s):

Predictive Value ◽

Nearest Neighbor ◽

Kidney Diseases ◽

Principal Component ◽

Component Analysis ◽

K Nearest Neighbor ◽

Statistical Parameters ◽

Diagnosis System ◽

Integrated Method ◽

Effective Diagnosis

When organ failure is mentioned, people immediately think of kidney diseases. By contrast, there is no such awareness of liver disease and liver failure, despite the fact that this disease is one of the leading causes of mortality worldwide. Therefore, effective diagnosis and timely treatment of patients are paramount. This study accordingly aims to construct an intelligent diagnosis system that integrates principal component analysis (PCA) and k-nearest neighbor (KNN) methods to examine the liver patient dataset. The model combines feature extraction, performed by PCA, with classification, performed by KNN. Prediction results of the proposed system are compared using statistical parameters that include accuracy, sensitivity, specificity, positive predictive value, and negative predictive value. In addition to higher accuracy rates, the model also attained remarkable sensitivity and specificity, which was a challenging task given the uneven variance among attribute values in the dataset.
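
The "feature extraction, then classification" pipeline can be sketched in a few lines. This is a deliberately reduced illustration, not the authors' model: it keeps only the first principal component (found by power iteration instead of a full eigendecomposition), and the two-feature samples, the labels, and all names (TRAIN, LABELS, first_component, project, knn_predict) are assumptions invented for the example.

```python
import math

# Hypothetical two-feature training samples and labels.
TRAIN = [(1.0, 1.1), (1.2, 0.9), (0.9, 1.0), (3.0, 3.2), (3.1, 2.9), (2.9, 3.0)]
LABELS = ["healthy", "healthy", "healthy", "patient", "patient", "patient"]


def column_means(rows):
    """Per-feature mean, used to center the data before PCA."""
    return [sum(col) / len(col) for col in zip(*rows)]


def first_component(rows, means, iterations=200):
    """Leading principal axis via power iteration on the covariance matrix.

    Assumes the all-ones start vector is not orthogonal to that axis,
    which holds for this toy data.
    """
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    d, n = len(means), len(rows)
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    vec = [1.0] * d
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec


def project(row, means, axis):
    """Feature extraction: coordinate of `row` along the principal axis."""
    return sum((x - m) * a for x, m, a in zip(row, means, axis))


def knn_predict(train_scores, labels, score, k):
    """Classification: majority label among the k nearest projected samples."""
    order = sorted(range(len(train_scores)), key=lambda i: abs(train_scores[i] - score))
    votes = [labels[i] for i in order[:k]]
    return max(sorted(set(votes)), key=votes.count)


MEANS = column_means(TRAIN)
AXIS = first_component(TRAIN, MEANS)
SCORES = [project(row, MEANS, AXIS) for row in TRAIN]

if __name__ == "__main__":
    print(knn_predict(SCORES, LABELS, project((1.0, 1.0), MEANS, AXIS), 3))
    print(knn_predict(SCORES, LABELS, project((3.0, 3.1), MEANS, AXIS), 3))
```

With more features, several components would normally be kept and KNN would measure distance in that reduced space; the single-component version is used here only to keep the sketch short.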
