Use and Understanding of Anonymization and De-Identification in the Biomedical Literature: Scoping Review (Preprint)

2019 ◽  
Author(s):  
Raphaël Chevrier ◽  
Vasiliki Foufi ◽  
Christophe Gaudet-Blavignac ◽  
Arnaud Robert ◽  
Christian Lovis

BACKGROUND The secondary use of health data is central to biomedical research in the era of data science and precision medicine. National and international initiatives, such as the Global Open Findable, Accessible, Interoperable, and Reusable (GO FAIR) initiative, are supporting this approach in different ways (eg, making the sharing of research data mandatory or improving the legal and ethical frameworks). Preserving patients’ privacy is crucial in this context. De-identification and anonymization are the two most common terms used to refer to the technical approaches that protect privacy and facilitate the secondary use of health data. However, it is difficult to find a consensus on the definitions of the concepts or on the reliability of the techniques used to apply them. A comprehensive review is needed to better understand the domain, its capabilities, its challenges, and the ratio of risk between the data subjects’ privacy on one side, and the benefit of scientific advances on the other. OBJECTIVE This work aims at better understanding how the research community comprehends and defines the concepts of de-identification and anonymization. A rich overview should also provide insights into the use and reliability of the methods. Six aspects will be studied: (1) terminology and definitions, (2) backgrounds and places of work of the researchers, (3) reasons for anonymizing or de-identifying health data, (4) limitations of the techniques, (5) legal and ethical aspects, and (6) recommendations of the researchers. METHODS Based on a scoping review protocol designed a priori, MEDLINE was searched for publications discussing de-identification or anonymization and published between 2007 and 2017. The search was restricted to MEDLINE to focus on the life sciences community. The screening process was performed by two reviewers independently. 
RESULTS After searching 7972 records that matched at least one search term, 135 publications were screened and 60 full-text articles were included. (1) Terminology: Definitions of the terms de-identification and anonymization were provided in less than half of the articles (29/60, 48%). When both terms were used (41/60, 68%), their meanings divided the authors into two equal groups (19/60, 32%, each) with opposed views. The remaining articles (3/60, 5%) were equivocal. (2) Backgrounds and locations: Research groups were based predominantly in North America (31/60, 52%) and in the European Union (22/60, 37%). The authors came from 19 different domains; computer science (91/248, 36.7%), biomedical informatics (47/248, 19.0%), and medicine (38/248, 15.3%) were the most prevalent ones. (3) Purpose: The main reason declared for applying these techniques is to facilitate biomedical research. (4) Limitations: Progress is made on specific techniques but, overall, limitations remain numerous. (5) Legal and ethical aspects: Differences exist between nations in the definitions, approaches, and legal practices. (6) Recommendations: The combination of organizational, legal, ethical, and technical approaches is necessary to protect health data. CONCLUSIONS Interest is growing for privacy-enhancing techniques in the life sciences community. This interest crosses scientific boundaries, involving primarily computer science, biomedical informatics, and medicine. The variability observed in the use of the terms de-identification and anonymization emphasizes the need for clearer definitions as well as for better education and dissemination of information on the subject. The same observation applies to the methods. Several legislations, such as the American Health Insurance Portability and Accountability Act (HIPAA) and the European General Data Protection Regulation (GDPR), regulate the domain. 
Using the definitions they provide could help address the variable use of these two concepts in the research community.

2020 ◽  
Author(s):  
Catherine Arnott Smith ◽  
Deahan Yu ◽  
Juan Fernando Maestre ◽  
Uba Backonja ◽  
Andrew Boyd ◽  
...  

BACKGROUND Informatics tools for consumers and patients are important vehicles for facilitating engagement, and the field of consumer health informatics is a key space for exploring the potential of these tools. To understand research findings in this complex and heterogeneous field, a scoping review can help not only to identify, but to bridge, the array of diverse disciplines and publication venues involved. OBJECTIVE The goal of this systematic scoping review was to characterize the extent, range, and nature of research activity in consumer health informatics, focusing on the contributing disciplines of informatics, information science, and engineering. METHODS Four electronic databases (Compendex, LISTA, Library Literature, and INSPEC) were searched for published studies dating from January 1, 2008, to June 1, 2015. Our inclusion criteria specified that they be English-language articles describing empirical studies focusing on consumers; relate to human health; and feature technologies designed to interact directly with consumers. Clinical applications and technologies regulated by the FDA, as well as digital tools that do not provide individualized information, were excluded. RESULTS We identified 271 studies in 63 unique journals and 22 unique conference proceedings. Sixty-five percent of these studies were found in health informatics journals; 23% in information science and library science; 15% in computer science; 4% in medicine; and 5% in other fields, ranging from engineering to education. A single journal, the Journal of Medical Internet Research, was home to 36% of the studies. Sixty-two percent of these studies relied on quantitative methods, 55% on qualitative methods, and 17% were mixed-method studies. Seventy percent of studies used no specific theoretical framework; of those that did, Social Cognitive Theory appeared the most frequently, in 16 studies. 
Fifty-two studies identified problems with technology adoption, acceptance, or use, 38% of these barriers being machine-centered (for example, content or computer-based), and 62% user-centered, the most frequently mentioned being attitude and motivation toward technology. One hundred and twenty-six interventional studies investigated disparities or heterogeneity in treatment effects in specific populations. The most frequent disparity investigated was gender (13 studies), followed closely by race/ethnicity (11). Half the studies focused on a specific diagnosis, most commonly diabetes and cancer; 30% focused on a health behavior, usually information-seeking. Gaps were found in reporting of study design, with only 46% of studies reporting on specific methodological details. Missing details were response rates, since 59% of survey studies did not provide them; and participant retention rates, since 53% of interventional studies did not provide this information. Participant demographics were usually not reported beyond gender and age. Only 17% of studies informed the reader of their theoretical basis, and only 4 studies focused on theory at the group, network, organizational or ecological levels, with the majority being either health behavior or interpersonal theories. Finally, of the 131 studies describing the design of a new technology, 81% did not involve either patients or consumers in their design. In fact, while consumer and patient were necessarily core concepts in this literature, these terms were often used interchangeably. The research literature of consumer health informatics at present is scattered across research fields; only 49% of studies from these disciplines are indexed by MEDLINE and studies in computer science are siloed in a user interface that makes exploration of that literature difficult. 
CONCLUSIONS Few studies analyzed in this scoping review were based in theory, and very little was presented in this literature about the life context, motives for technology use, and personal characteristics of study participants.


Author(s):  
Annabelle Cumyn ◽  
Roxanne Dault ◽  
Adrien Barton ◽  
Anne-Marie Cloutier ◽  
Jean-François Ethier

A survey was conducted to assess the attitudes of citizens, research ethics committee members, and researchers toward information and consent for the secondary use of health data for research within learning health systems (LHSs). Results show that the reuse of health data for research to advance knowledge and improve care is valued by all parties; consent regarding health data reuse for research has fundamental importance, particularly to citizens; and all respondents deemed important the existence of a secure website to support the information and consent processes. This survey was part of a larger project that aims at exploring public perspectives on alternate approaches to the current consent models for health data reuse to take into consideration the unique features of LHSs. The revised model will need to ensure that citizens are given the opportunity to be better informed about upcoming research and have their say, when possible, in the use of their data.


2021 ◽  
Vol 10 (4) ◽  
pp. 1-28
Author(s):  
John Edison Muñoz ◽  
Kerstin Dautenhahn

The use of games as vehicles to study human-robot interaction (HRI) has been established as a suitable solution to create more realistic and naturalistic opportunities to investigate human behavior. In particular, multiplayer games that involve at least two human players and one or more robots have raised the attention of the research community. This article proposes a scoping review to qualitatively examine the literature on the use of multiplayer games in HRI scenarios employing embodied robots aiming to find experimental patterns and common game design elements. We find that researchers have been using multiplayer games in a wide variety of applications in HRI, including training, entertainment and education, allowing robots to take different roles. Moreover, robots have included different capabilities and sensing technologies, and elements such as external screens or motion controllers were used to foster gameplay. Based on our findings, we propose a design taxonomy called Robo Ludens, which identifies HRI elements and game design fundamentals and classifies important components used in multiplayer HRI scenarios. The Robo Ludens taxonomy covers considerations from a robot-oriented perspective as well as game design aspects to provide a comprehensive list of elements that can foster gameplay and bring enjoyable experiences in HRI scenarios.


2021 ◽  
pp. 019394592110292
Author(s):  
Elizabeth E. Umberfield ◽  
Sharon L. R. Kardia ◽  
Yun Jiang ◽  
Andrea K. Thomer ◽  
Marcelline R. Harris

Nurse scientists are increasingly interested in conducting secondary research using real world collections of biospecimens and health data. The purposes of this scoping review are to (a) identify federal regulations and norms that bear authority or give guidance over reuse of residual clinical biospecimens and health data, (b) summarize domain experts’ interpretations of permissions of such reuse, and (c) summarize key issues for interpreting regulations and norms. Final analysis included 25 manuscripts and 23 regulations and norms. This review illustrates contextual complexity for reusing residual clinical biospecimens and health data, and explores issues such as privacy, confidentiality, and deriving genetic information from biospecimens. Inconsistencies make it difficult to interpret which regulations or norms apply, or whether applicable regulations or norms are congruent. Tools are necessary to support consistent, expert-informed consent processes and downstream reuse of residual clinical biospecimens and health data by nurse scientists.


Animals ◽  
2021 ◽  
Vol 11 (3) ◽  
pp. 673
Author(s):  
Alexandra L. Whittaker ◽  
Yifan Liu ◽  
Timothy H. Barker

The Mouse Grimace Scale (MGS) was developed 10 years ago as a method for assessing pain through the characterisation of changes in five facial features or action units. The strength of the technique is that it is proposed to be a measure of spontaneous or non-evoked pain. The time is opportune to map all of the research into the MGS, with a particular focus on the methods used and the technique’s utility across a range of mouse models. A comprehensive scoping review of the academic literature was performed. A total of 48 articles met our inclusion criteria and were included in this review. The MGS has been employed mainly in the evaluation of acute pain, particularly in the pain and neuroscience research fields. There has, however, been use of the technique in a wide range of fields, and based on limited study it does appear to have utility for pain assessment across a spectrum of animal models. Use of the method allows the detection of pain of a longer duration, up to a month post initial insult. There has been less use of the technique using real-time methods and this is an area in need of further research.


2006 ◽  
Vol 34 (3) ◽  
pp. 520-525 ◽  
Author(s):  
Margaret A. Winker

Race and ethnicity are commonly reported variables in biomedical research, but how they were initially determined is often not described and the rationale for analyzing them is often not provided. JAMA improved the reporting of these factors by implementing a policy and procedure for doing so. However, still lacking are careful consideration of what is actually being measured when race/ethnicity is described, consistent terminology, hypothesis-driven justification for analyzing race/ethnicity, and a consistent and generalizable measurement of socioeconomic status. Furthermore, some studies continue to use race/ethnicity as a proxy for genetics. Research into appropriate measures of race/ethnicity and socioeconomic factors, as well as education of researchers regarding issues of race/ethnicity, is necessary to clarify the meaning of race/ethnicity in the biomedical literature.


2021 ◽  
Author(s):  
Meghan Shyama Nagpal ◽  
Antonia Barbaric ◽  
Diana Sherifali ◽  
Plinio P Morita ◽  
Joseph A Cafazzo

BACKGROUND Complications due to Type 2 Diabetes (T2D) can be mitigated through proper self-management, which can positively change health behaviours. Technological tools are available to help people living with T2D manage their condition and such tools provide a large repository for patient-generated health data (PGHD). Analytics can provide insights about the ambulatory behaviours of people living with T2D. OBJECTIVE The objective of this review was to investigate what analytical insights can be derived through PGHD with respect to ambulatory behaviours of people living with T2D. METHODS A scoping review using the Arksey & O’Malley framework was conducted in which a comprehensive search of the literature was conducted by two reviewers. Three electronic databases (PubMed, IEEE, ACM) were searched using keywords associated with diabetes, behaviours, and analytics. Several rounds of screening using predetermined inclusion and exclusion criteria were conducted and studies were selected. Critical examination took place through a descriptive-analytical narrative method and data extracted from the studies was classified into thematic categories. These categories reflect the findings of this study as per our objective. RESULTS We identified 43 studies that met the inclusion criteria for this review. While 70% of the studies examined PGHD independently, 30% of the studies combined PGHD with other data sources. The majority of these studies used machine learning algorithms to perform their analysis. Themes identified through this review include 1) predicting diabetes / obesity, 2) factors that contribute to diabetes / obesity, 3) insights from social media & online forums, 4) predicting glycemia, 5) improved adherence / outcomes, 6) analysis of sedentary behaviours, 7) deriving behavioural patterns, 8) discovering clinical findings, and 9) developing design principles. 
CONCLUSIONS The increased volume and availability of PGHD has the potential to derive analytical insights regarding the ambulatory behaviours of people living with T2D. From the literature, we determined that analytics can predict outcomes and identify granular behavioural patterns from PGHD. This review determined the broad range of insights that can be examined through PGHD, that would not be available through other data sources.


2014 ◽  
Vol 23 (01) ◽  
pp. 167-169 ◽  
Author(s):  
N. Griffon ◽  
J. Charlet ◽  
S. J. Darmoni ◽  

Summary Objective: To summarize the best papers in the field of Knowledge Representation and Management (KRM). Methods: A comprehensive review of medical informatics literature was performed to select some of the most interesting papers on KRM and natural language processing (NLP) published in 2013. Results: Four articles were selected; one focuses on Electronic Health Record (EHR) interoperability for clinical pathway personalization based on structured data. The other three focus on NLP (corpus creation, de-identification, and co-reference resolution) and highlight the increase in NLP tool performance. Conclusion: NLP tools are close to seriously competing with humans in some annotation tasks. Their use could drastically increase the amount of data usable for meaningful use of EHR.


2021 ◽  
Author(s):  
Jianxia Gong ◽  
Vikrant Sihag ◽  
Qingxia Kong ◽  
Lindu Zhao

BACKGROUND The recent surge in clinical and nonclinical health-related data has been accompanied by a concomitant increase in personal health data (PHD) research across multiple disciplines such as medicine, computer science, and management. There is now a need to synthesize the dynamic knowledge of PHD in various disciplines to spot potential research hotspots. OBJECTIVE The aim of this study was to reveal the knowledge evolutionary trends in PHD and detect potential research hotspots using bibliometric analysis. METHODS We collected 8281 articles published between 2009 and 2018 from the Web of Science database. The knowledge evolution analysis (KEA) framework was used to analyze the evolution of PHD research. The KEA framework is a bibliometric approach that is based on 3 knowledge networks: reference co-citation, keyword co-occurrence, and discipline co-occurrence. RESULTS The findings show that the focus of PHD research has evolved from medicine centric to technology centric to human centric since 2009. The most active PHD knowledge cluster is developing knowledge resources and allocating scarce resources. The field of computer science, especially the topic of artificial intelligence (AI), has been the focal point of recent empirical studies on PHD. Topics related to psychology and human factors (eg, attitude, satisfaction, education) are also receiving more attention. CONCLUSIONS Our analysis shows that PHD research has the potential to provide value-based health care in the future. All stakeholders should be educated about AI technology to promote value generation through PHD. Moreover, technology developers and health care institutions should consider human factors to facilitate the effective adoption of PHD-related technology. 
These findings indicate opportunities for interdisciplinary cooperation in several PHD research areas: (1) AI applications for PHD; (2) regulatory issues and governance of PHD; (3) education of all stakeholders about AI technology; and (4) value-based health care including “allocative value,” “technology value,” and “personalized value.”


Author(s):  
Jackie Street ◽  
Belinda Fabrianesi ◽  
Rebecca Bosward ◽  
Stacy Carter ◽  
Annette Braunack-Mayer

Introduction: Large volumes of health data are generated through the interaction of individuals with hospitals, government agencies and health care providers. There is potential in the linkage and sharing of administrative data with private industry to support improved drug and device provision but data sharing is highly contentious. Objectives and Approach: We conducted a scoping review of quantitative and qualitative studies examining public attitudes towards the sharing of health data, held by government, with private industry for research and development. We searched four databases (PubMed, Scopus, CINAHL, and Web of Science) as well as Google Scholar and Google Advanced. The search was confined to English-only publications since January 2014 but was not geographically limited. We thematically coded included papers. Results: We screened 6788 articles. Thirty-six studies were included, primarily from the UK and North America. No Australian studies were identified. Across studies, willingness to share non-identified data was generally high with the participant’s own health provider (84-91%) and academic researchers (64-93%) but fell if the data was to be shared with private industry (14-53%). There was widespread misunderstanding of the benefits of sharing data for health research. Publics expressed concern about a range of issues including data security, misuse of data and use of data to generate profit. Conditions which would increase public confidence in sharing of data included: strict safeguards on data collection and use including secure storage, opt-in or opt-out consent mechanisms, and good communication through trusted agents. Conclusion / Implications: We identified a research gap: Australian views on sharing government health data with private industry. 
The international experience suggests that public scepticism about data sharing with private industry will need to be addressed by good communication about public benefit of data sharing, a strong program of public engagement and information sharing conducted through trusted entities.

