Transformation of Electronic Health Records and Questionnaire Data to OMOP CDM: A Feasibility Study Using SG_T2DM Dataset

2021, Vol. 12 (04), pp. 757-767
Author(s): Selva Muthu Kumaran Sathappan, Young Seok Jeon, Trung Kien Dang, Su Chi Lim, Yi-Ming Shao, et al.

Abstract
Background: Diabetes mellitus (DM) is an important public health concern in Singapore and places a massive burden on health care spending. Tackling chronic diseases such as DM requires innovative strategies to integrate patients' data from diverse sources and to use scientific discovery to inform clinical practice that can help better manage the disease. The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) was chosen as the framework for integrating data with disparate formats.
Objective: The study aimed to evaluate the feasibility of converting a Singapore-based data source, comprising electronic health records (EHR) and cognitive and depression assessment questionnaire data, to the OMOP CDM standard. Additionally, we validated whether our OMOP CDM instance is fit for research purposes by executing a simple treatment-pathways study, as a proof of concept, in Atlas, a graphical user interface tool for analyzing OMOP CDM data.
Methods: We converted de-identified EHR and cognitive and depression assessment questionnaire data from a tertiary care hospital in Singapore to version 5.3.1 of the OMOP CDM standard. We evaluated the conversion by (1) assessing the mapping coverage (the percentage of source terms mapped to the OMOP CDM standard); (2) comparing analyses of the local raw dataset against the CDM dataset; and (3) implementing the Harmonized Intrinsic Data Quality Framework using an open-source R package, the Data Quality Dashboard.
Results: The content coverage of the OMOP CDM vocabularies is more than 90% for clinical data but only around 11% for questionnaire data. The comparison of characteristics between source and target data returned consistent results, and our transformed data failed only 38 (1.4%) of 2,622 quality checks.
Conclusion: Adoption of the OMOP CDM at our site demonstrated that EHR data can be standardized with minimal information loss, whereas standardizing cognitive and depression assessment questionnaire data remains challenging and requires further work.
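The mapping-coverage metric in step (1) reduces to counting the share of distinct source terms that resolve to a standard OMOP concept. Below is a minimal Python sketch of that calculation; the file names, column names, and domain labels are hypothetical placeholders, not the study's actual schema.

```python
# Sketch of a mapping-coverage check: percentage of source terms mapped
# to a standard OMOP concept, broken down by data domain.
# Input files and columns are illustrative assumptions.
import pandas as pd

source_terms = pd.read_csv("source_terms.csv")          # columns: source_code, domain
concept_map = pd.read_csv("source_to_concept_map.csv")  # columns: source_code, target_concept_id

merged = source_terms.merge(concept_map, on="source_code", how="left")
# A term counts as mapped when it resolves to a non-null, non-zero standard concept id.
merged["mapped"] = merged["target_concept_id"].notna() & (merged["target_concept_id"] != 0)

coverage = merged.groupby("domain")["mapped"].mean().mul(100).round(1)
print(coverage)  # e.g. clinical domains > 90%, questionnaire items around 11%
```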

2021, Vol. 12 (04), pp. 816-825
Author(s): Yingcheng Sun, Alex Butler, Ibrahim Diallo, Jae Hyun Kim, Casey Ta, et al.

Abstract
Background: Clinical trials are the gold standard for generating robust medical evidence, but clinical trial results often raise generalizability concerns that can be attributed to a lack of population representativeness. Electronic health record (EHR) data are useful for estimating the population representativeness of a clinical trial's study population.
Objectives: This research aims to systematically estimate the population representativeness of clinical trials using EHR data during the early design stage.
Methods: We present an end-to-end analytical framework for transforming free-text clinical trial eligibility criteria into executable database queries conformant with the Observational Medical Outcomes Partnership Common Data Model and for systematically quantifying the population representativeness of each clinical trial.
Results: Using this framework, we calculated the population representativeness of 782 novel coronavirus disease 2019 (COVID-19) trials and 3,827 type 2 diabetes mellitus (T2DM) trials in the United States. Owing to overly restrictive eligibility criteria, 85.7% of the COVID-19 trials and 30.1% of the T2DM trials had poor population representativeness.
Conclusion: This research demonstrates the potential of using EHR data to assess the population representativeness of clinical trials, providing data-driven metrics to inform the selection and optimization of eligibility criteria.
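Once eligibility criteria are parsed into structured rules, quantifying representativeness amounts to computing the fraction of the EHR population that satisfies them. The sketch below illustrates only that final step with made-up criteria and a toy patient table; the paper's free-text parsing and OMOP CDM query generation are far richer than shown here.

```python
# Simplified sketch: apply structured eligibility rules to an EHR cohort and
# report the eligible fraction as a representativeness metric.
# Patient values and the T2DM-style criteria are illustrative assumptions.
import pandas as pd

patients = pd.DataFrame({
    "age": [45, 72, 58, 81, 34],
    "hba1c": [7.2, 8.9, 6.8, 9.4, 7.7],
    "egfr": [88, 42, 95, 30, 101],
})

# Example criteria: age 18-75, HbA1c 7-10%, eGFR >= 45.
eligible = (
    patients["age"].between(18, 75)
    & patients["hba1c"].between(7.0, 10.0)
    & (patients["egfr"] >= 45)
)

representativeness = eligible.mean()  # eligible fraction of the source population
print(f"{representativeness:.1%} of the source population would be eligible")
```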


2015, Vol. 22 (6), pp. 1220-1230
Author(s): Huan Mo, William K Thompson, Luke V Rasmussen, Jennifer A Pacheco, Guoqian Jiang, et al.

Abstract
Background: Electronic health records (EHRs) are increasingly used for clinical and translational research through the creation of phenotype algorithms. Currently, phenotype algorithms are most commonly represented as noncomputable descriptive documents and knowledge artifacts that detail the protocols for querying diagnoses, symptoms, procedures, medications, and/or text-driven medical concepts, and are primarily meant for human comprehension. We present desiderata for developing a computable phenotype representation model (PheRM).
Methods: A team of clinicians and informaticians reviewed common features of multisite phenotype algorithms published in PheKB.org and existing phenotype representation platforms. We also evaluated well-known diagnostic criteria and clinical decision-making guidelines to encompass a broader category of algorithms.
Results: We propose 10 desired characteristics for a flexible, computable PheRM: (1) structure clinical data into queryable forms; (2) recommend use of a common data model, but also support customization for the variability and availability of EHR data among sites; (3) support both human-readable and computable representations of phenotype algorithms; (4) implement set operations and relational algebra for modeling phenotype algorithms; (5) represent phenotype criteria with structured rules; (6) support defining temporal relations between events; (7) use standardized terminologies and ontologies, and facilitate reuse of value sets; (8) define representations for text searching and natural language processing; (9) provide interfaces for external software algorithms; and (10) maintain backward compatibility.
Conclusion: A computable PheRM is needed for true phenotype portability and reliability across different EHR products and healthcare systems. These desiderata are a guide to inform the establishment and evolution of EHR phenotype algorithm authoring platforms and languages.
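Desiderata (4)-(6) are the most directly executable: combine per-criterion patient sets with set operations and constrain them with temporal relations. The toy sketch below shows what that might look like in Python; a real PheRM would be far more expressive, and all identifiers and dates here are hypothetical.

```python
# Toy illustration of desiderata 4-6: per-criterion query results combined via
# set operations, then filtered by a temporal relation between events.
from datetime import date

# Hypothetical per-criterion results: patient_id -> earliest qualifying event date.
t2dm_dx = {"p1": date(2014, 3, 1), "p2": date(2015, 6, 9), "p3": date(2013, 1, 5)}
metformin_rx = {"p1": date(2014, 4, 2), "p3": date(2012, 11, 20), "p4": date(2015, 2, 2)}

# Set operation: patients satisfying both criteria.
both = t2dm_dx.keys() & metformin_rx.keys()

# Temporal relation: prescription within 180 days AFTER the diagnosis.
case_patients = {
    pid for pid in both
    if 0 <= (metformin_rx[pid] - t2dm_dx[pid]).days <= 180
}
print(case_patients)  # {'p1'}
```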


Circulation, 2018, Vol. 137 (suppl_1)
Author(s): Tekeda F Ferguson, Sunayana Kumar, Denise Danos

Purpose: In conjunction with women being diagnosed with breast cancer earlier in life and a rapidly aging population, advances in cancer therapies have swiftly propelled cardiotoxicity to a major health concern for breast cancer patients. Frequent cardiotoxicity outcomes include reduced left ventricular ejection fraction (LVEF), myocardial infarction, asymptomatic or hospitalized heart failure, arrhythmias, hypertension, and thromboembolism. The purpose of this study was to use an electronic health records system to determine whether women with breast cancer had increased odds of heart disease. Methods: Data from the Research Action for Health Network (REACHnet) were used for the analysis. REACHnet is a clinical data research network that uses the common data model to extract electronic health records (EHR) from health networks in Louisiana (n=100,000). Women over the age of 30 with available data (n=35,455) were included in the analysis. ICD-9 diagnosis codes were used to classify heart disease (HD) (hypertensive HD, ischemic HD, pulmonary HD, and other HD) and to identify breast cancer patients. Additional EHR variables considered were smoking status and patient vitals. Chi-square tests and crude and adjusted logistic regression models were computed using SAS 9.4. Results: Using diagnosis codes, our study team estimated that 28.6% of women over the age of 30 with a breast cancer diagnosis (n=816) also had a heart disease diagnosis, compared with 15.6% of women without a breast cancer diagnosis. Among patients with heart disease, there was no significant difference in the distribution of heart disease diagnosis types by breast cancer status (p=0.87). The crude odds ratio for a CVD diagnosis among breast cancer cases, referenced to cancer-free women, was 2.21 (1.89, 2.58). After adjusting for age (30-49, 50-64, 65+), race (black/white), and comorbidities (obesity/overweight, diabetes, current smoking), the odds of heart disease remained elevated (OR: 1.24 (1.05, 1.47)). Conclusion: The short-term and long-term consequences of cardiotoxicity for the risk-to-benefit ratio of cancer treatment, survivorship issues, and competing causes of mortality are increasingly being acknowledged. Our next efforts will include advances in predictive risk modeling. Maximizing benefits while reducing cardiac risks needs to become a priority in oncologic management, along with monitoring for late-term toxic effects.
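The crude odds ratio reported above follows from a standard 2x2 table calculation. The study used SAS 9.4; the Python sketch below shows the equivalent arithmetic with a Wald confidence interval. The cell counts are reconstructed from the reported percentages under the assumption that n=816 is the size of the breast cancer group, so the result will not reproduce the published 2.21 exactly.

```python
# Crude odds ratio with 95% Wald CI from a 2x2 table.
# Counts are approximate reconstructions, not the study's actual data.
import math

# Rows: breast cancer yes/no; columns: heart disease yes/no.
a, b = 233, 583      # BC+: ~28.6% of 816 with HD, remainder without
c, d = 5404, 33235   # BC-: ~15.6% of 34,639 with HD, remainder without

or_crude = (a * d) / (b * c)
se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)
lo = math.exp(math.log(or_crude) - 1.96 * se_log_or)
hi = math.exp(math.log(or_crude) + 1.96 * se_log_or)
print(f"OR = {or_crude:.2f} (95% CI {lo:.2f}, {hi:.2f})")
```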


2018, Vol. 2 (11), pp. 1172-1179
Author(s): Ashima Singh, Javier Mora, Julie A. Panepinto

Key Points: The algorithms have high sensitivity and specificity for identifying patients with hemoglobin SS/Sβ0 thalassemia and acute care pain encounters. Codes conforming to the common data model are provided to facilitate adoption of the algorithms and to standardize definitions for EHR-based research.
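The core of such an algorithm is applying a CDM-conformant code list to patient encounters. The sketch below illustrates that pattern; the code list and records are hypothetical placeholders, not the published algorithm.

```python
# Minimal sketch: flag patients whose encounters carry any code from a
# phenotype code list. Codes and encounter records are illustrative only.
SICKLE_CELL_CODES = {"D57.00", "D57.01", "D57.1"}  # illustrative ICD-10-CM codes

encounters = [
    {"patient_id": "p1", "dx_codes": {"D57.00", "R07.9"}},
    {"patient_id": "p2", "dx_codes": {"E11.9"}},
    {"patient_id": "p3", "dx_codes": {"D57.1"}},
]

flagged = {e["patient_id"] for e in encounters if e["dx_codes"] & SICKLE_CELL_CODES}
print(flagged)  # {'p1', 'p3'}
```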


BMJ Open, 2019, Vol. 9 (7), pp. e029314
Author(s): Kaiwen Ni, Hongling Chu, Lin Zeng, Nan Li, Yiming Zhao

Objectives: There is an increasing trend in the use of electronic health records (EHRs) for clinical research. However, more knowledge is needed on how to assure and improve data quality. This study aimed to explore healthcare professionals' experiences and perceptions of barriers to and facilitators of data quality in EHR-based studies in the Chinese context. Setting: Four tertiary hospitals in Beijing, China. Participants: Nineteen healthcare professionals with experience in using EHR data for clinical research participated in the study. Methods: A qualitative study based on face-to-face semistructured interviews was conducted from March to July 2018. The interviews were audiorecorded and transcribed verbatim. Data analysis was performed using the inductive thematic analysis approach. Results: The main themes included factors related to healthcare systems, clinical documentation, EHR systems and researchers. The perceived barriers to data quality included heavy workload, staff rotations, lack of detailed information for specific research, variations in terminology, limited retrieval capabilities, large amounts of unstructured data, challenges with patient identification and matching, problems with data extraction, and unfamiliarity with data quality assessment. To improve data quality, participants suggested better staff training, monetary incentives, daily data verification, improved software functionality and coding structures, and enhanced multidisciplinary cooperation. Conclusions: These results provide a basis for beginning to address current barriers and, ultimately, for improving the validity and generalisability of research findings in China.


2016, Vol. 22 (4), pp. 1017-1029
Author(s): Lua Perimal-Lewis, David Teubner, Paul Hakendorf, Chris Horwood

Effective and accurate use of routinely collected health data for key performance indicator (KPI) reporting depends on the underlying data quality. In this research, process mining methodology and tools were leveraged to assess the quality of time-based Emergency Department data sourced from electronic health records. The work was done in close collaboration with domain experts to validate the process models. The hospital patient-journey model was used to identify flow abnormalities resulting from incorrect timestamp data used in time-based performance metrics. The research demonstrated that process mining is a feasible methodology for assessing the data quality of time-based hospital performance metrics. The insight gained from this research enabled appropriate corrective actions to be put in place to address the data quality issues.
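One concrete check implied by this approach is flagging visits whose event timestamps violate the expected order of the patient journey. The sketch below shows a minimal version in Python; real process-mining tools discover the journey model from an event log, and the event sequence and field names here are assumptions.

```python
# Simplified timestamp data-quality check: flag ED visits whose events occur
# out of the expected patient-journey order. Data and columns are illustrative.
import pandas as pd

visits = pd.DataFrame({
    "visit_id": [1, 2, 3],
    "arrival":   pd.to_datetime(["2015-01-01 10:00", "2015-01-01 11:00", "2015-01-01 12:00"]),
    "triage":    pd.to_datetime(["2015-01-01 10:05", "2015-01-01 10:50", "2015-01-01 12:10"]),
    "departure": pd.to_datetime(["2015-01-01 12:00", "2015-01-01 13:00", "2015-01-01 11:55"]),
})

# Expected order: arrival <= triage <= departure.
ok = (visits["arrival"] <= visits["triage"]) & (visits["triage"] <= visits["departure"])
print(visits.loc[~ok, "visit_id"].tolist())  # visits with impossible timestamps: [2, 3]
```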


Author(s): Diana Walther, Patricia Halfon, David Desseauve, Yvan Vial, Bernard Burnand, et al.

Introduction: Postpartum hemorrhage (PPH) remains a major cause of morbidity and mortality worldwide. Geo-temporal comparisons of in-hospital PPH incidence remain a challenge due to differences in definitions, data quality, and the absence of accurate, validated indicators. Objectives and Approach: To compare the incidence of PPH using different definitions and assess the need for a validated indicator. Singleton births from 2014-2016 at Lausanne University Hospital, Switzerland, were included. PPH was defined based on (1) clinical diagnosis using International Classification of Diseases (ICD-10-GM) PPH diagnostic codes; (2) blood loss ≥500 ml for vaginal births and ≥1000 ml for cesareans; (3) peripartum hemoglobin (Hb) change >2 g/dl in vaginal births and ≥4 g/dl in cesareans; and (4) fulfillment of definition one, two, or three. Data were extracted from hospital discharge data and linked with electronic health records. Results: There were 2,529, 2,660 and 2,715 singleton births in 2014, 2015 and 2016, respectively; 28.8% were cesareans. Peripartum change in Hb was available for 17% of births. The incidence (95% CI) of PPH in 2014, 2015 and 2016 was, respectively: (1) 6.0% (5.1, 7.0), 6.3% (5.4, 7.3) and 7.9% (6.9, 9.0) based on diagnostic codes; (2) 7.9% (6.8, 9.0), 7.1% (6.2, 8.2) and 7.2% (6.3, 8.3) based on blood loss volumes; (3) 2.4% (1.8, 3.1), 2.7% (2.1, 3.4) and 3.5% (2.9, 4.3) based on change in Hb; and (4) 11.3% (10.1, 12.6), 10.4% (9.3, 11.6) and 11.0% (9.9, 12.3) based on the combined definition. Differences in PPH incidence by year between definitions one and four, two and four, and three and four were all statistically significant (McNemar tests). Conclusion/Implications: Incidence varied widely according to definition and data availability, not to mention data quality. Our results highlight the need for a validated PPH indicator to enable monitoring. Future prospects include the validation of a diagnostic-code-based PPH indicator aided by text mining in electronic health records.
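The four definitions translate directly into code, which also makes the source of the discrepancies concrete: definition 3 can only fire when peripartum Hb is recorded (about 17% of births), and definition 4 is the union of the other three. Below is a Python sketch using the thresholds stated above; the record fields are hypothetical, and definition 1 (ICD-10-GM codes) is reduced to a boolean flag for brevity.

```python
# Classify one birth record under the four PPH definitions from the abstract.
# Field names are illustrative assumptions.
def pph_flags(rec):
    d1 = rec["has_pph_icd_code"]                         # definition 1: diagnostic code
    d2 = rec["blood_loss_ml"] >= (1000 if rec["cesarean"] else 500)  # definition 2
    if rec["hb_change_gdl"] is None:                     # Hb recorded for only ~17% of births
        d3 = False
    else:                                                # definition 3: >2 g/dl vaginal, >=4 g/dl cesarean
        d3 = rec["hb_change_gdl"] >= 4 if rec["cesarean"] else rec["hb_change_gdl"] > 2
    d4 = d1 or d2 or d3                                  # definition 4: combined (union)
    return d1, d2, d3, d4

birth = {"has_pph_icd_code": False, "cesarean": True,
         "blood_loss_ml": 1200, "hb_change_gdl": None}
print(pph_flags(birth))  # (False, True, False, True)
```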

