HaploHide: A Data Hiding Framework for Privacy Enhanced Sharing of Personal Genetic Data

2019 ◽  
Author(s):  
Arif Harmanci ◽  
Xiaoqian Jiang ◽  
Degui Zhi

Abstract: Personal genetic data is becoming a digital commodity as millions of individuals have direct access to and control of their genetic information. This information must be protected because it can be used for reidentification of, and potential discrimination against, individuals and their relatives. While there is a great incentive to share and use genetic information, there is a limited number of practical approaches for protecting it when individuals would like to make use of their genomes in clinical and recreational settings. To enable privacy-enhanced usage of genomic data by individuals, we propose a crowd-blending-based framework in which portions of the individual’s haplotype are “hidden” within a large sample of other haplotypes. The hiding framework is motivated by the existence of large-scale population panels, which we use to generate the crowd of haplotypes in which the individual’s haplotype is hidden. We demonstrate the usage of hiding in two scenarios: sharing of variant alleles on genes and sharing of GWAS variant alleles. We evaluate the hiding framework by testing reidentification of hidden individuals using numerous measures of individual reidentification. In these settings, we discuss how effective hiding can be accomplished when the adversary does not have access to auxiliary identifying information. Compared to existing approaches for protecting privacy that require substantial changes to the computational infrastructure, e.g., homomorphic encryption, the hiding-based framework does not require any changes to the infrastructure. However, the processing must be performed for every sample in the crowd, and therefore the data processing cost increases with the crowd size.
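The crowd-blending idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `hide_in_crowd` and `recover`, the toy 0/1 haplotype encoding, and the choice to draw decoys directly from the panel (rather than generating them) are all assumptions made for illustration.

```python
import random


def hide_in_crowd(individual, panel, crowd_size, rng=None):
    """Blend one haplotype into a crowd of decoys drawn from a panel.

    Hypothetical helper: returns the crowd and the secret position of the
    individual's haplotype, which only the data owner keeps.
    """
    rng = rng or random.Random()
    if crowd_size > len(panel):
        raise ValueError("panel is smaller than the requested crowd size")
    decoys = rng.sample(panel, crowd_size)          # decoys without replacement
    secret_index = rng.randrange(crowd_size + 1)    # any slot, including the ends
    crowd = decoys[:secret_index] + [individual] + decoys[secret_index:]
    return crowd, secret_index


def recover(results, secret_index):
    """Pick the owner's result out of the per-sample results."""
    return results[secret_index]


# Toy data (assumed encoding): each haplotype is a list of 0/1 alleles.
rng = random.Random(0)
panel = [[rng.randint(0, 1) for _ in range(8)] for _ in range(50)]
individual = [1, 0, 0, 1, 1, 0, 1, 0]

crowd, secret_index = hide_in_crowd(individual, panel, crowd_size=20, rng=rng)

# Stand-in for the external analysis: it must run on every sample in the
# crowd, which is why the processing cost grows with the crowd size.
results = [sum(haplotype) for haplotype in crowd]
own_result = recover(results, secret_index)
```

Only the owner knows `secret_index`, so the service processes all 21 haplotypes without being told which one is real. Whether an adversary can still single it out is the reidentification question the abstract evaluates.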

Author(s):  
Yanbing Bai ◽  
Lu Sun ◽  
Haoyu Liu ◽  
Chao Xie

Large-scale population movements can turn local diseases into widespread epidemics. Understanding the characteristics of population flow in the context of COVID-19 is of great significance for informing epidemiology and formulating scientific and reasonable prevention and control policies. Especially in the post-COVID-19 phase, it is essential to preserve what has been achieved in the fight against the epidemic. Previous research focuses on the travel behavior and patterns of flight and railway passengers, but China also has numerous suburban residents with a modest economic level, and investigating their travel behavior is significant for national stability. However, estimating the impact of COVID-19 on suburban residents’ travel behavior remains challenging because of a lack of suitable data. Here we present bus ticketing data comprising approximately 26,000,000 records from April 2020 to August 2020 for 2705 stations. Our results indicate that suburban residents in southern regions of China are more likely to travel by bus and travel more frequently. With respect to economic level, we find that residents of economically developed regions are more likely to travel or carry out various social activities. From the perspective of the traveling crowd, we find that men and young people are more likely to travel by bus; these groups also make up the main workforce. Our findings indicate that suburban residents’ travel behavior is profoundly affected by the economy and is consistent with the behavior patterns that existed before the COVID-19 outbreak. Verification on typical regions confirms this.


2021 ◽  
Author(s):  
Liang Wang ◽  
Xavier Didelot ◽  
Yuhai Bi ◽  
George Fu Gao

Since the start of the SARS-CoV-2 pandemic in late 2019, several variants of concern (VOCs) have been reported, such as B.1.1.7, B.1.351, P.1, and B.1.617.2. The exact reproduction number Rt for these VOCs is important for determining appropriate control measures. Here, we estimated the transmissibility of VOCs and lineages of SARS-CoV-2 based on genomic data, using Bayesian inference under an epidemiological model to infer the reproduction number (Rt). We analyzed data for multiple VOCs from the same time periods and countries in order to compare their transmissibility while controlling for geographical and temporal factors. Lineage B had a significantly higher transmissibility than lineage A and contributed to the global pandemic to a large extent. In addition, all VOCs had increased transmissibility compared with other lineages in each country, indicating that they are harder to control and present a high risk to public health. All countries should formulate specific prevention and control policies for these VOCs when they are detected, to curb their potential for large-scale spread.


2021 ◽  
Author(s):  
Seungwan Hong ◽  
Jai Hyun Park ◽  
Wonhee Cho ◽  
Hyeongmin Choe ◽  
Jung Hee Cheon

Abstract Background: In iDASH 2020, a secure genome analysis competition, the homomorphic encryption task was to develop a multi-label tumor classification method for predicting the classes of samples based on genetic information. In this scenario, a data holder encrypts a genetic variant dataset from tumor samples and provides the encrypted data to an untrusted server. The server then evaluates the homomorphically encrypted data with its model, which is trained in plaintext using published data or its own genetic data, and outputs the result in an encrypted state so that no genetic information is leaked. Methods: We develop a secure multi-label tumor classification method using CKKS, an approximate homomorphic encryption scheme. First, we propose a new data preprocessing method to reduce the size of large-scale genetic data of tumor samples. Our method targets the dataset from iDASH 2020 competition track I, which originated from The Cancer Genome Atlas (TCGA) and consists of 2,713 samples from 11 types of cancer, with genetic features from more than 25,000 genes. Second, we propose a new data packing method for CKKS ciphertexts that provides a trade-off between the number of ciphertexts and the number of rotations in matrix multiplication. Lastly, we suggest an approximation method for the softmax activation of a neural network model. Results: Our preprocessing method reduces the number of genes from more than 25,000 to 2048 or fewer and achieves a microAUC value of 0.9865 with a 1-layer shallow neural network. Using our model, we successfully compute the tumor classification inference steps on the encrypted test data in 4.5 minutes. 
Despite the use of the approximate softmax function, the microAUC value of our implementation in the encrypted state differs from the plaintext result by less than 10^-3. Conclusions: We present preprocessing and evaluation methods for secure multi-label tumor classification based on approximate homomorphic encryption using a shallow neural network model with the softmax activation function.
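The abstract above does not say how softmax is approximated. As a rough illustration of why an approximation is needed at all, the sketch below computes softmax using only additions and multiplications, which are the operations that schemes such as CKKS evaluate directly. It runs on plain Python floats with no encryption library. The exponential by repeated squaring, the Goldschmidt-style inverse, the function names, and the assumed logit bound `bound` are illustrative choices, not the authors' method.

```python
import math


def approx_exp(x, r=8):
    """Approximate e^x as (1 + x / 2^r)^(2^r) using r squarings."""
    y = 1.0 + x / (2 ** r)
    for _ in range(r):
        y = y * y
    return y


def approx_inverse(s, iters=10):
    """Approximate 1/s for 0 < s < 2 with a Goldschmidt-style product.

    With a = 1 - s, 1/s = (1 + a)(1 + a^2)(1 + a^4)... uses only
    additions and multiplications.
    """
    a = 1.0 - s
    result = 1.0 + a
    for _ in range(iters - 1):
        a = a * a
        result = result * (1.0 + a)
    return result


def approx_softmax(logits, bound=4.0, r=8, iters=10):
    """Softmax from add/multiply only; assumes every logit lies in [-bound, bound]."""
    exps = [approx_exp(x, r) for x in logits]
    total = sum(exps)
    # Public plaintext constant that scales the sum into (0, 1] so the
    # inverse iteration converges.
    scale = 1.0 / (len(logits) * math.exp(bound))
    inv_total = approx_inverse(total * scale, iters) * scale
    return [e * inv_total for e in exps]


logits = [1.0, 2.0, 0.5]
probs = approx_softmax(logits)

# Exact softmax, for comparison only.
denom = sum(math.exp(x) for x in logits)
exact = [math.exp(x) / denom for x in logits]
max_error = max(abs(p - q) for p, q in zip(probs, exact))
```

Raising `r` or `iters` tightens the approximation at the cost of more multiplications, and in a leveled homomorphic scheme each extra multiplication consumes part of the available depth. That cost is why the degree of such approximations matters.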


2020 ◽  
Author(s):  
Miran Kim ◽  
Arif Harmanci ◽  
Jean-Philippe Bossuat ◽  
Sergiu Carpov ◽  
Jung Hee Cheon ◽  
...  

ABSTRACT: Genotype imputation is a fundamental step in genomic data analyses such as GWAS, where missing variant genotypes are predicted using the existing genotypes of nearby ‘tag’ variants. Imputation greatly decreases the genotyping cost and provides high-quality estimates of common variant genotypes. As population panels grow, e.g., the TOPMED Project, genotype imputation is becoming more accurate, but it requires high computational power. Although researchers can outsource genotype imputation, privacy concerns may prohibit genetic data sharing with an untrusted imputation service. To address this problem, we developed the first fully secure genotype imputation by utilizing ultra-fast homomorphic encryption (HE) techniques that can evaluate millions of imputation models in seconds. In HE-based methods, the genotype data is end-to-end encrypted, i.e., encrypted in transit, at rest, and, most importantly, in analysis, and can be decrypted only by the data owner. We compared secure imputation with three state-of-the-art non-secure methods under different settings. We found that HE-based methods provide full genetic data security with comparable or slightly lower accuracy. In addition, HE-based methods have time and memory requirements that are comparable to or even lower than those of the non-secure methods. We provide five different implementations and workflows that make use of three cutting-edge HE schemes (BFV, CKKS, TFHE) developed by the top contestants of the iDASH19 Genome Privacy Challenge. Our results provide strong evidence that HE-based methods can practically perform resource-intensive computations for high-throughput genetic data analysis. In addition, the publicly available codebases provide a reference for the development of secure genomic data analysis methods.


2001 ◽  
Author(s):  
Bradley Olson ◽  
Leonard Jason ◽  
Joseph R. Ferrari ◽  
Leon Venable ◽  
Bertel F. Williams ◽  
...  

2020 ◽  
Vol 39 (4) ◽  
pp. 5449-5458
Author(s):  
A. Arokiaraj Jovith ◽  
S.V. Kasmir Raja ◽  
A. Razia Sulthana

Interference in a Wireless Sensor Network (WSN) predominantly affects the performance of the WSN. Energy consumption in WSNs is one of the greatest concerns in the current generation. This work presents an approach for interference measurement and interference mitigation in a point-to-point network. The nodes are distributed in the network, and interference is measured by grouping the nodes in a region of a specific diameter. Hence this approach is scalable and is extended to large-scale WSNs. Interference is measured in two stages. In the first stage, interference is overcome by allocating time slots to the node stations in Time Division Multiple Access (TDMA) fashion. The node area is split into larger regions and smaller regions. The time slots are allocated to smaller regions in TDMA fashion. A TDMA-based time slot allocation algorithm is proposed in this paper to enable reuse of time slots with minimal interference between smaller regions. In the second stage, the network density and a control parameter are introduced to reduce interference at a minor level within smaller node regions. The algorithm is simulated and the system is tested with a varying control parameter. The node-level interference and the energy dissipation at nodes are captured by varying the node density of the network. The results indicate that the proposed approach measures the interference and mitigates it with minimal energy consumption at nodes and with less transmission overhead.
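The abstract above describes reusing TDMA time slots across smaller regions but does not give the allocation algorithm. The sketch below shows one common reuse pattern under assumed conditions: regions laid out on a rectangular grid, and interference only between a region and its eight immediate neighbors. The function names and the `reuse` parameter are hypothetical and are not taken from the paper.

```python
def assign_slots(rows, cols, reuse=2):
    """Assign a TDMA slot to each grid region with a repeating reuse pattern.

    A region at (r, c) gets slot (r mod reuse) * reuse + (c mod reuse), so
    only reuse * reuse slots are needed regardless of the grid size.
    """
    return {
        (r, c): (r % reuse) * reuse + (c % reuse)
        for r in range(rows)
        for c in range(cols)
    }


def neighbors(r, c, rows, cols):
    """Yield the regions that touch (r, c), including diagonal ones."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) == (0, 0):
                continue
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                yield nr, nc


rows, cols = 4, 6
slots = assign_slots(rows, cols)

# Adjacent regions that transmit in the same slot would interfere.
conflicts = [
    ((r, c), n)
    for (r, c) in slots
    for n in neighbors(r, c, rows, cols)
    if slots[(r, c)] == slots[n]
]
slots_used = len(set(slots.values()))
```

With `reuse=2`, the 24 regions share four slots and no two touching regions transmit at the same time. A larger `reuse` widens the gap between co-slot regions in exchange for a longer TDMA frame.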


Author(s):  
O. Kravchuk ◽  
V. Symonenkov ◽  
I. Symonenkova ◽  
O. Hryhorev

Today, more than forty countries are developing military-purpose robots. A number of unique mobile robots with a wide range of capabilities are already used by combat and intelligence units of the armed forces of developed countries to conduct battlefield intelligence and support tactical groups. At present, the use of the latest information technology in military robotics is being thoroughly investigated, and the creation of highly effective information management systems for land-mobile robotic complexes has entered a new phase. This phase is associated with the use of distributed information and sensory systems and consists in the transition from separate sensors and devices to modular information subsystems that provide access to various data sources and complex methods of information processing. The purpose of the article is to investigate ways to increase the autonomy of land-mobile robotic complexes used in the non-deterministic conditions of modern combat. The relevance of the research stems from the need to create highly effective information and control systems in prospective robotic means for the needs of the Land Forces of Ukraine. The development of the Armed Forces of Ukraine management system based on the criteria adopted by EU and NATO member states is one of the main directions for increasing the effectiveness of the use of forces, and involves achieving the principles and standards necessary for Ukraine to become a member of the EU and NATO. Inherent features of achieving these criteria will be a reduction in the tasks of combined-arms units and the large-scale use of high-precision weapons and land remote-controlled robotic devices. 
According to leading specialists in the field of robotics, automating the information subsystems and components of land-mobile robotic complexes can increase the safety, reliability, error tolerance and effectiveness of robotic means by standardizing the necessary actions with minimal human intervention, that is, it can significantly increase the autonomy of land-mobile robotic complexes for the needs of the Land Forces of Ukraine.


2010 ◽  
Vol 108-111 ◽  
pp. 1158-1163 ◽  
Author(s):  
Peng Cheng Nie ◽  
Di Wu ◽  
Weiong Zhang ◽  
Yan Yang ◽  
Yong He

To improve information management in modern digital agriculture, we combined several modern digital agriculture technologies, namely wireless sensor network (WSN), global positioning system (GPS), geographic information system (GIS) and general packet radio service (GPRS), and applied them to the information collection and intelligent control process. The combination draws on the local multi-channel information collection and low-power wireless transmission of WSN, the stable, low-cost long-distance communication and data transmission of GPRS, the high-precision positioning of DGPS, and the large-scale field information layer management of GIS, and is applied to large-scale field information collection and intelligent management. In this study, wireless sensor network routing nodes are deployed in the sub-areas of the field. These nodes have GPS receiver modules and an electric control mechanism, and their relative positions are determined by GPS. They can monitor the field information in real time and control the equipment for the field application. Once the GPS position information and other field information are collected, they can be transmitted remotely to a PC by GPRS. The PC can then upload the information to the GIS management software. All the field information can be classified into different layers in GIS and shown on the GIS map according to GPS position. Moreover, we have developed remote control software based on GIS. It can send control commands through GPRS to the nodes that have control modules, so that the field application can be managed and controlled in real time. In conclusion, the unattended automatic wireless intelligent technology for field information collection and control can effectively utilize hardware resources, improve the intelligent management of field information, and reduce the cost of information collection and intelligent management.

