QuPiD Attack: Machine Learning-Based Privacy Quantification Mechanism for PIR Protocols in Health-Related Web Search

With advances in ICT, web search engines (WSEs) have become a preferred source of health-related information published on the Internet. Google alone receives more than one billion health-related queries on a daily basis. However, to provide the results most relevant to the user, WSEs maintain user profiles. These profiles may contain private and sensitive information, such as the user’s health condition and disease status. Health-related queries contain privacy-sensitive information that may infringe the user’s privacy, since the identity of the user is exposed and may be misused by the WSE and third parties. One well-known solution for preserving privacy is to issue queries via a peer-to-peer private information retrieval (PIR) protocol, such as useless user profile (UUP), thereby hiding the user’s identity from the WSE. This paper investigates the level of protection offered by UUP. For this purpose, we present the QuPiD (query profile distance) attack: a machine learning-based attack that evaluates the effectiveness of UUP in privacy protection. The QuPiD attack determines the distance between the user’s profile (web search history) and an upcoming query using our proposed novel feature vector. For the sake of comparison, the experiments were conducted using ten classification algorithms belonging to the tree-based, rule-based, lazy learner, metaheuristic, and Bayesian families. Furthermore, two subsets of an America Online dataset (noisy and clean) were used for experimentation. The results show that, for the clean dataset, the proposed QuPiD attack associates more than 70% of queries with the correct user with a precision of over 72%, while for the noisy dataset it associates more than 40% of queries with the correct user with 70% precision.
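
The idea of measuring the distance between a stored profile and an incoming query can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not describe the paper's feature vector or its classifiers, so a plain bag-of-words cosine similarity stands in for them, and the names `term_vector`, `cosine_similarity`, `closest_user`, and the toy profiles are hypothetical.

```python
import math
from collections import Counter


def term_vector(texts):
    """Bag-of-words term counts over a list of query strings."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_user(profiles, query):
    """Return the user whose search history is most similar to the query.

    Assumption: a stand-in for the paper's classifier, not its method.
    """
    q = term_vector([query])
    scores = {
        user: cosine_similarity(term_vector(history), q)
        for user, history in profiles.items()
    }
    return max(scores, key=scores.get), scores


# Invented toy profiles, not taken from the AOL dataset.
profiles = {
    "user_a": ["diabetes diet plan", "insulin dosage chart"],
    "user_b": ["cheap flights to rome", "rome hotel deals"],
}
best, scores = closest_user(profiles, "insulin pump price")
print(best, scores)
```

An adversary in this simplified setting attributes the anonymized query to whichever stored profile scores highest; a learned classifier, as in the paper, would replace the `max` step.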

Privacy Exposure Measure: A Privacy-Preserving Technique for Health-Related Web Search

Journal of Medical Imaging and Health Informatics ◽

10.1166/jmihi.2019.2709 ◽

2019 ◽

Vol 9 (6) ◽

pp. 1196-1204 ◽

Cited By ~ 1

Author(s):

Rafiullah Khan ◽

Muhammad Arshad Islam ◽

Mohib Ullah ◽

Muhammad Aleem ◽

Muhammad Azhar Iqbal

Keyword(s):

Health Information ◽

Private Information ◽

Web Search ◽

User Behavior ◽

Privacy Preserving ◽

Personal Health Information ◽

Personal Health ◽

Sensitive Information ◽

Private Information Retrieval ◽

Health Related

The increasing use of web search engines (WSEs) for searching healthcare information has resulted in a growing number of users posting personal health information online. A recent survey demonstrates that over 80% of patients use WSEs to seek health information. However, WSEs store these queries for user behavior analysis, result ranking, personalization, targeted advertisements, and other activities. Because health-related queries contain privacy-sensitive information that may infringe the user's privacy, privacy-preserving web search techniques such as anonymizing networks, profile obfuscation, and private information retrieval (PIR) protocols are used to protect the user. In this paper, we propose the Privacy Exposure Measure (PEM), a technique that helps users control their privacy exposure while using PIR protocols. PEM assesses the similarity between the user's profile and a query before the query is posted to the WSE and assists the user in avoiding privacy exposure. The experiments demonstrate a 37.2% difference between user profiles created through a PEM-powered PIR protocol and ordinary user profiles. Moreover, PEM offers more privacy to the user even in the case of a machine-learning attack.
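
A check of the kind PEM performs before a query is posted might look like the sketch below. The abstract does not give PEM's similarity measure or any threshold, so the term-overlap score, the `threshold=0.5` default, and the function names are all assumptions made for illustration.

```python
def tokens(text):
    """Lowercased whitespace tokens of a query, as a set."""
    return set(text.lower().split())


def exposure_score(profile_queries, query):
    """Fraction of the query's terms that already occur in the profile.

    Assumption: a simple overlap ratio, not the measure used by PEM.
    """
    profile_terms = set()
    for past in profile_queries:
        profile_terms |= tokens(past)
    query_terms = tokens(query)
    if not query_terms:
        return 0.0
    return len(query_terms & profile_terms) / len(query_terms)


def should_warn(profile_queries, query, threshold=0.5):
    """Warn when the query looks too much like the existing profile."""
    return exposure_score(profile_queries, query) >= threshold


# Invented example history.
history = ["migraine symptoms", "migraine medication side effects"]
score_related = exposure_score(history, "migraine medication cost")
warn_related = should_warn(history, "migraine medication cost")
warn_unrelated = should_warn(history, "weather tomorrow")
print(score_related, warn_related, warn_unrelated)
```

A query that strongly resembles the stored history is flagged so the user can decide whether to post it, while an unrelated query passes through.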

Early Detection of Cardiovascular Disease using Machine Learning Techniques: An Experimental Study

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.c46570.99320 ◽

2020 ◽

Vol 9 (3) ◽

pp. 635-641

Keyword(s):

Machine Learning ◽

Proper Time ◽

Daily Basis ◽

Machine Learning Algorithms ◽

The Other ◽

Machine Learning Techniques ◽

Entire Body ◽

Time Data ◽

Learning Techniques ◽

Health Related

The human body prioritizes the heart as the second most important organ after the brain. Any disruption in the heart ultimately leads to disruption of the entire body. As members of the modern era, we experience enormous changes on a daily basis that impact our lives in one way or another. Heart disease is among the top five fatal diseases and claims lives worldwide. Therefore, predicting this disease is of prime importance, as it enables the necessary action to be taken at the proper time. Data mining and machine learning extract and refine useful information from massive amounts of data; this is a basic and primary process for defining and discovering useful information and hidden patterns in databases. The flexibility and adaptability of optimization algorithms make them useful for complex non-linear problems. Machine learning techniques are used in the medical sciences to address real health-related issues through early prediction and treatment of various diseases. In this paper, six machine learning algorithms are applied and compared on the basis of their performance. Among all classifiers, the decision tree outperforms the other algorithms with a testing accuracy of 97.29%.

NN-QuPiD Attack: Neural Network-Based Privacy Quantification Model for Private Information Retrieval Protocols

Complexity ◽

10.1155/2021/6651662 ◽

2021 ◽

Vol 2021 ◽

pp. 1-8

Author(s):

Rafiullah Khan ◽

Mohib Ullah ◽

Atif Khan ◽

Muhammad Irfan Uddin ◽

Maha Al-Yahya

Keyword(s):

Neural Network ◽

Information Retrieval ◽

Private Information ◽

Market Research ◽

Web Search ◽

Health Condition ◽

Private Information Retrieval ◽

Quantification Model ◽

Search History ◽

Personal Interests

Web search engines usually keep users’ profiles for multiple purposes, such as result ranking and relevancy, market research, and targeted advertisements. However, a user’s web search history may contain sensitive and private information, such as health condition, personal interests, and affiliations, that may infringe the user’s privacy, since the user’s identity may be exposed and misused by third parties. Numerous techniques are available to address privacy infringement, including Private Information Retrieval (PIR) protocols that use peer nodes to preserve privacy. Previously, we showed that PIR protocols are vulnerable to the QuPiD Attack. In this research, we propose the NN-QuPiD Attack, an improved version of the QuPiD Attack that uses an Artificial Neural Network (RNN) based model to associate queries with their original users. The results show that the NN-QuPiD Attack achieved a Recall of 0.512 with a Precision of 0.923, whereas the simple QuPiD Attack achieved a Recall of 0.49 with a Precision of 0.934 on the same data.

Personalized stratification of back to work risk amidst COVID-19: A machine learning approach (Preprint)

10.2196/preprints.22030 ◽

2020 ◽

Author(s):

Carson Lam ◽

Jacob Calvert ◽

Gina Barnes ◽

Emily Pellegrini ◽

Anna Lynn-Palevsky ◽

...

Keyword(s):

Machine Learning ◽

High Risk ◽

Learning Algorithm ◽

Severe Disease ◽

High Specificity ◽

Population Level ◽

The United States ◽

Health Condition ◽

Available P ◽

Severe Illness

BACKGROUND In the wake of COVID-19, the United States has developed a three-stage plan to outline the parameters for determining when states may reopen businesses and ease travel restrictions. The guidelines also identify subpopulations of Americans that should continue to stay at home because they are at high risk for severe disease should they contract COVID-19. These guidelines were based on population-level demographics rather than individual-level risk factors. As such, they may misidentify individuals who are at high risk for severe illness and who should therefore not return to work until vaccination or widespread serological testing is available. OBJECTIVE This study evaluated a machine learning algorithm for the prediction of serious illness due to COVID-19 using inpatient data collected from electronic health records. METHODS The algorithm was trained to identify patients for whom a diagnosis of COVID-19 was likely to result in hospitalization, and was compared against four U.S. policy-based criteria: age over 65, having a serious underlying health condition, age over 65 or having a serious underlying health condition, and age over 65 and having a serious underlying health condition. RESULTS The algorithm identified 80% of patients at risk for hospitalization due to COVID-19, versus at most 62% identified by government guidelines. The algorithm also achieved a high specificity of 95%, outperforming government guidelines. CONCLUSIONS This algorithm may help to enable a broad reopening of the American economy while ensuring that patients at high risk for serious disease remain home until vaccination and testing become available.
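
The comparison described in METHODS and RESULTS rests on two quantities: sensitivity (the share of hospitalized patients a criterion flags) and specificity (the share of non-hospitalized patients it leaves unflagged). The sketch below shows how the four policy-based criteria could be scored this way. The six patient records are invented for illustration and are not study data; the function and dictionary names are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for boolean labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


# The four rule-based criteria named in the abstract, as predicates.
criteria = {
    "age_over_65": lambda age, cond: age > 65,
    "condition": lambda age, cond: cond,
    "age_or_condition": lambda age, cond: age > 65 or cond,
    "age_and_condition": lambda age, cond: age > 65 and cond,
}

# Invented records: (age, has_serious_condition, was_hospitalized).
patients = [
    (72, True, True),
    (70, False, False),
    (45, True, True),
    (30, False, False),
    (68, True, False),
    (50, False, True),
]

y_true = [hospitalized for _, _, hospitalized in patients]
results = {}
for name, rule in criteria.items():
    y_pred = [rule(age, cond) for age, cond, _ in patients]
    results[name] = sensitivity_specificity(y_true, y_pred)

for name, (sens, spec) in results.items():
    print(f"{name}: sensitivity={sens:.2f} specificity={spec:.2f}")
```

A trained model would be scored with the same function by passing its thresholded predictions as `y_pred`, which puts it on equal footing with the rule-based criteria.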

STUDY CONCERNING TO THE POSSIBILITY OF BICYCLE FOR THE AGED PEOPLE TO USE BICYCLES ON THE DAILY BASIS BOTH AS THE MEANS FOR TRIP OF GOING OUT AND MEANS FOR PROMOTING HEALTH CONDITION

Journal of Japan Society of Civil Engineers Ser D3 (Infrastructure Planning and Management) ◽

10.2208/jscejipm.74.i_897 ◽

2018 ◽

Vol 74 (5) ◽

pp. I_897-I_908

Author(s):

Muneharu KOKURA ◽

Toshiaki SATO ◽

Yasuo YOSHIKAWA

Keyword(s):

Daily Basis ◽

Health Condition ◽

Aged People

Prediction of Pest Insect Appearance Using Sensors and Machine Learning

Sensors ◽

10.3390/s21144846 ◽

2021 ◽

Vol 21 (14) ◽

pp. 4846

Author(s):

Dušan Marković ◽

Dejan Vujičić ◽

Snežana Tanasković ◽

Borislav Đorđević ◽

Siniša Ranđić ◽

...

Keyword(s):

Machine Learning ◽

Relative Humidity ◽

Weather Conditions ◽

Daily Basis ◽

Machine Learning Algorithms ◽

Lower Percentage ◽

Timely Manner ◽

Proposed Model ◽

Set Up ◽

Accuracy Of Prediction

The appearance of pest insects can lead to a loss in yield if farmers do not respond in a timely manner to suppress their spread. Occurrences and numbers of insects can be monitored through insect traps, which requires regularly touring the traps and checking their condition. A more efficient approach is to set up sensor devices with a camera at the traps; these photograph the traps and forward the images to the Internet, where the pest insect’s appearance is predicted by image analysis. Weather conditions, in particular temperature and relative humidity, affect the appearance of some pests, such as Helicoverpa armigera. This paper presents a machine learning model that can predict the appearance of insects during a season on a daily basis, taking into account the air temperature and relative humidity. Several machine learning algorithms for classification were applied and their accuracy for the prediction of insect occurrence is presented (up to 76.5%). Since the data used for testing were given in chronological order according to the days when the measurements were performed, the existing model was expanded to take into account periods of three and five days. The extended method showed better prediction accuracy and a lower percentage of false detections. For a period of five days, the accuracy of the affected detections was 86.3%, while the percentage of false detections was 11%. The proposed machine learning model can help farmers detect the occurrence of pests and save the time and resources needed to check the fields.
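
The extension to three- and five-day periods amounts to building each training example from a window of consecutive days rather than a single day. A minimal sketch of that windowing step is shown below; the record layout `(temperature, humidity, pest_seen)`, the function name, and the readings are assumptions for illustration, since the abstract does not specify how the windows were encoded.

```python
def window_features(daily, window):
    """Build (features, label) rows from chronological daily readings.

    daily: list of (temperature, humidity, pest_seen) tuples, oldest first.
    Each row concatenates the readings of `window` consecutive days and is
    labelled with the pest observation on the last day of the window.
    """
    rows = []
    for end in range(window - 1, len(daily)):
        span = daily[end - window + 1 : end + 1]
        features = []
        for temperature, humidity, _ in span:
            features.extend([temperature, humidity])
        rows.append((features, daily[end][2]))
    return rows


# Invented readings, not data from the paper.
readings = [
    (24.0, 61.0, False),
    (26.5, 58.0, False),
    (29.0, 55.0, True),
    (30.5, 52.0, True),
    (27.0, 60.0, False),
]
rows_3 = window_features(readings, window=3)
rows_5 = window_features(readings, window=5)
print(len(rows_3), rows_3[0])
print(len(rows_5))
```

The resulting rows can be passed to any classifier; a longer window gives the model more weather context per prediction at the cost of fewer training rows.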

Review of visible watermark’s influence on private information-related machine learning techniques

Journal of Physics Conference Series ◽

10.1088/1742-6596/1453/1/012070 ◽

2020 ◽

Vol 1453 ◽

pp. 012070

Author(s):

Yizhe Lyu

Keyword(s):

Machine Learning ◽

Private Information ◽

Machine Learning Techniques ◽

Learning Techniques

Intelligent Detection of False Information in Arabic Tweets Utilizing Hybrid Harris Hawks Based Feature Selection and Machine Learning Models

Symmetry ◽

10.3390/sym13040556 ◽

2021 ◽

Vol 13 (4) ◽

pp. 556

Author(s):

Thaer Thaher ◽

Mahmoud Saheb ◽

Hamza Turabieh ◽

Hamouda Chantar

Keyword(s):

Machine Learning ◽

Social Media ◽

Feature Selection ◽

Language Processing ◽

User Profile ◽

Vital Role ◽

Classification Model ◽

Fake News ◽

False Information ◽

Social Media Platforms

Fake or false information on social media platforms is a significant challenge, as it deliberately misleads users through rumors, propaganda, or deceptive information about a person, organization, or service. Twitter is one of the most widely used social media platforms, especially in the Arab region, where the number of users is steadily increasing, accompanied by an increase in the rate of fake news. This has drawn the attention of researchers seeking to provide a safe online environment free of misleading information. This paper proposes a smart classification model for the early detection of fake news in Arabic tweets utilizing Natural Language Processing (NLP) techniques, Machine Learning (ML) models, and the Harris Hawks Optimizer (HHO) as a wrapper-based feature selection approach. An Arabic Twitter corpus composed of 1862 previously annotated tweets was used to assess the efficiency of the proposed model. The Bag of Words (BoW) model is applied with different term-weighting schemes for feature extraction. Eight well-known learning algorithms are investigated with varying combinations of features, including user-profile, content-based, and word features. The reported results show that Logistic Regression (LR) with the Term Frequency-Inverse Document Frequency (TF-IDF) model ranks best. Moreover, feature selection based on the binary HHO algorithm plays a vital role in reducing dimensionality, thereby enhancing the learning model’s performance for fake news detection. Interestingly, the proposed BHHO-LR model yields an improvement of 5% compared with previous works on the same dataset.
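
The TF-IDF term weighting applied on top of the Bag of Words model can be illustrated with a small standard-library sketch. It uses one common TF-IDF variant (relative term frequency times the natural log of inverse document frequency); the abstract does not say which variant the paper used, and the English toy sentences are invented rather than drawn from the Arabic corpus.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    weight = (count / document length) * ln(N / document frequency).
    Assumption: one common TF-IDF variant, not necessarily the paper's.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for toks in tokenized:
        doc_freq.update(set(toks))
    weights = []
    for toks in tokenized:
        counts = Counter(toks)
        weights.append(
            {
                term: (count / len(toks)) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return weights


# Invented toy documents.
docs = [
    "breaking news vaccine",
    "vaccine rumor spreads",
    "weather news today",
]
weights = tfidf(docs)
print(weights[0])
```

Terms that occur in fewer documents receive higher weights than terms shared across many, which is what lets a linear model such as Logistic Regression focus on distinctive vocabulary.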

Schizophrenia: A Survey of Artificial Intelligence Techniques Applied to Detection and Classification

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18116099 ◽

2021 ◽

Vol 18 (11) ◽

pp. 6099

Author(s):

Joel Weijia Lai ◽

Candice Ke En Ang ◽

U. Rajendra Acharya ◽

Kang Hao Cheong

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Daily Living ◽

Learning Algorithms ◽

Health Condition ◽

Machine Learning Algorithms ◽

Human Cognition ◽

Computer Algorithms ◽

Large Sets ◽

Prevention Methods

Artificial Intelligence in healthcare employs machine learning algorithms to emulate human cognition in the analysis of complicated or large sets of data. Specifically, artificial intelligence taps on the ability of computer algorithms and software with allowable thresholds to make deterministic approximate conclusions. In comparison to traditional technologies in healthcare, artificial intelligence enhances the process of data analysis without the need for human input, producing nearly equally reliable, well defined output. Schizophrenia is a chronic mental health condition that affects millions worldwide, with impairment in thinking and behaviour that may be significantly disabling to daily living. Multiple artificial intelligence and machine learning algorithms have been utilized to analyze the different components of schizophrenia, such as in prediction of disease, and assessment of current prevention methods. These are carried out in hope of assisting with diagnosis and provision of viable options for individuals affected. In this paper, we review the progress of the use of artificial intelligence in schizophrenia.
