Patient Data De-Identification

2020 ◽  
pp. 991-1010 ◽  
Author(s):  
Shweta Yadav ◽  
Asif Ekbal ◽  
Sriparna Saha ◽  
Parth S Pathak ◽  
Pushpak Bhattacharyya

With the rapid growth of clinical text, de-identification of patient Protected Health Information (PHI) has drawn significant attention in the recent past. The task aims at the automatic identification and removal of PHI from medical records. This paper proposes a supervised machine learning technique for patient data de-identification. We provide insight into the de-identification task, its major challenges, techniques to address those challenges, a detailed analysis of the results, and directions for future improvement. We extract several features by studying the properties of the datasets and the domain. We build our model based on the 2014 i2b2 (Informatics for Integrating Biology and the Bedside) de-identification challenge. Experiments show that the proposed system is highly accurate in de-identifying medical records. The system achieves a final recall, precision, and F-score of 95.69%, 99.31%, and 97.46%, respectively.
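As a quick consistency check on the figures quoted above, the sketch below recomputes the F-score from the reported precision and recall. It assumes the abstract's "F-score" is the balanced F1 (the harmonic mean of precision and recall), which the abstract does not state explicitly. The function name `f_score` and the `beta` parameter are illustrative and not taken from the paper.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Figures quoted in the abstract, in percent.
reported_precision = 99.31
reported_recall = 95.69
reported_f = 97.46

# Assumption: the reported F-score is the balanced F1.
computed_f = f_score(reported_precision, reported_recall)

print(f"F1 from reported precision/recall: {computed_f:.2f}%")
print(f"Difference from reported F-score: {abs(computed_f - reported_f):.2f} points")
```

Under this assumption the recomputed value is about 97.47%, which agrees with the reported 97.46% to within the rounding of the published precision and recall.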


2020 ◽  
pp. 1502-1521
Author(s):  
Shweta Yadav ◽  
Asif Ekbal ◽  
Sriparna Saha ◽  
Parth S Pathak ◽  
Pushpak Bhattacharyya

With the rapid growth of clinical text, de-identification of patient Protected Health Information (PHI) has drawn significant attention in the recent past. The task aims at the automatic identification and removal of PHI from medical records. This paper proposes a supervised machine learning technique for patient data de-identification. We provide insight into the de-identification task, its major challenges, techniques to address those challenges, a detailed analysis of the results, and directions for future improvement. We extract several features by studying the properties of the datasets and the domain. We build our model based on the 2014 i2b2 (Informatics for Integrating Biology and the Bedside) de-identification challenge. Experiments show that the proposed system is highly accurate in de-identifying medical records. The system achieves a final recall, precision, and F-score of 95.69%, 99.31%, and 97.46%, respectively.


1996 ◽  
Vol 26 (2) ◽  
pp. 82-87 ◽  
Author(s):  
Jeffrey Braithwaite ◽  
Johanna I Westbrook

This pilot survey examined the views of a sample of health service managers (HSMs) and health information managers (HIMs) undertaking tertiary studies about the application of information technology (IT) in health care. The survey was based on a questionnaire designed as part of a 1994 study of health service executives (HSEs) commissioned by the Australian College of Health Service Executives (ACHSE). We examined views about current and future IT expenditure, satisfaction with IT, impact of IT on quality and efficiency and the future use of electronic medical records and optical disk storage. Results identify differences and some similarities between respondent groups on these issues. The paper explores these differences and similarities and provides insight into the views held by future HSMs and HIMs.


2017 ◽  
Vol 27 (11) ◽  
pp. 3304-3324 ◽  
Author(s):  
Luca Bonomi ◽  
Xiaoqian Jiang

Modern medical research relies on multi-institutional collaborations, which enhance knowledge discovery and data reuse. While these collaborations allow researchers to perform analytics that would be impossible on individual datasets, they often pose significant challenges in the data integration process. Due to the lack of a unique identifier, data integration solutions often have to rely on patients' protected health information (PHI). In many situations, such information cannot leave the institutions or must be strictly protected. Furthermore, the presence of noisy values for these attributes may result in poor overall utility. While much research has been done to address these challenges, most current solutions are designed for a static setting and do not consider the temporal information in the data (e.g., EHR). In this work, we propose a novel approach that uses non-PHI for linking longitudinal patient data. Specifically, our technique captures diagnosis dependencies using patterns, which are shown to provide important indications for linking patient records. Our solution can be used as a standalone technique to perform temporal record linkage using non-PHI data, or it can be combined with Privacy-Preserving Record Linkage (PPRL) solutions when PHI is available. In the latter case, our approach can resolve ambiguities in the linkage results. Experimental evaluations on real datasets demonstrate the effectiveness of our technique.


2013 ◽  
Vol 20 (2) ◽  
pp. 342-348 ◽  
Author(s):  
David Carrell ◽  
Bradley Malin ◽  
John Aberdeen ◽  
Samuel Bayer ◽  
Cheryl Clark ◽  
...  

2018 ◽  
Vol 116 ◽  
pp. 24-32 ◽  
Author(s):  
Liting Du ◽  
Chenxi Xia ◽  
Zhaohua Deng ◽  
Gary Lu ◽  
Shuxu Xia ◽  
...  

2019 ◽  
Vol 08 (02) ◽  
pp. 01-11
Author(s):  
Geetha Mahadevaiah ◽  
M.S Dinesh ◽  
Rithesh Sreenivasan ◽  
Sana Moin ◽  
Andre Dekker

2018 ◽  
Vol 2 (6) ◽  
Author(s):  
Hoala Greevy

The Health Insurance Portability and Accountability Act of 1996 (HIPAA) privacy rule uses Protected Health Information (PHI) to define the type of patient information that’s protected by law.¹ PHI is an important factor for HIPAA compliance. PHI isn’t confined to medical records and test results. Any information distributed by a business associate that can identify a patient and is used or disclosed to a covered entity during the course of care is considered PHI. Even if that information doesn’t reveal a patient’s medical history, it is still considered PHI.

