International Electronic Health Record-Derived COVID-19 Clinical Course Profiles: The 4CE Consortium

Author(s):  
Gabriel A Brat ◽  
Griffin M Weber ◽  
Nils Gehlenborg ◽  
Paul Avillach ◽  
Nathan P Palmer ◽  
...  

ABSTRACT We leveraged the largely untapped resource of electronic health record data to address critical clinical and epidemiological questions about Coronavirus Disease 2019 (COVID-19). To do this, we formed an international consortium (4CE) of 96 hospitals across 5 countries (www.covidclinical.net). Contributors utilized the Informatics for Integrating Biology and the Bedside (i2b2) or Observational Medical Outcomes Partnership (OMOP) platforms to map to a common data model. The group focused on comorbidities and temporal changes in key laboratory test values. Harmonized data were analyzed locally and converted to a shared aggregate form for rapid analysis and visualization of regional differences and global commonalities. Data covered 27,584 COVID-19 cases with 187,802 laboratory tests. Case counts and laboratory trajectories were concordant with existing literature. Laboratory tests at the time of diagnosis showed hospital-level differences equivalent to country-level variation across the consortium partners. Despite the limitations of decentralized data generation, we established a framework to capture the trajectory of COVID-19 disease in patients and their response to interventions.

JAMIA Open ◽  
2019 ◽  
Vol 2 (1) ◽  
pp. 10-14 ◽  
Author(s):  
Benjamin S Glicksberg ◽  
Boris Oskotsky ◽  
Nicholas Giangreco ◽  
Phyllis M Thangaraj ◽  
Vivek Rudrapatna ◽  
...  

Abstract Objectives Electronic health record (EHR) data are increasingly used for biomedical discoveries. The nature of the data, however, requires expertise in both data science and EHR structure. The Observational Medical Outcomes Partnership (OMOP) common data model (CDM) standardizes the language and structure of EHR data to promote interoperability of EHR data for research. While the OMOP CDM is valuable and more attuned to research purposes, it still requires extensive domain knowledge to utilize effectively, potentially limiting more widespread adoption of EHR data for research and quality improvement. Materials and methods We have created ROMOP: an R package for direct interfacing with EHR data in the OMOP CDM format. Results ROMOP streamlines typical EHR-related data processes. Its functions include exploration of data types, extraction and summarization of patient clinical and demographic data, and patient searches using any CDM vocabulary concept. Conclusion ROMOP is freely available under the Massachusetts Institute of Technology (MIT) license and can be obtained from GitHub (http://github.com/BenGlicksberg/ROMOP). We detail instructions for setup and use in the Supplementary Materials. Additionally, we provide a public sandbox server containing synthesized clinical data for users to explore OMOP data and ROMOP (http://romop.ucsf.edu).


10.2196/28620 ◽  
2021 ◽  
Vol 5 (11) ◽  
pp. e28620
Author(s):  
Sarah B May ◽  
Thomas P Giordano ◽  
Assaf Gottlieb

Background Identification of people with HIV from electronic health record (EHR) data is an essential first step in the study of important HIV outcomes, such as risk assessment. This task has been historically performed via manual chart review, but the increased availability of large clinical data sets has led to the emergence of phenotyping algorithms to automate this process. Existing algorithms for identifying people with HIV rely on a combination of International Classification of Disease codes and laboratory tests or closely mimic clinical testing guidelines for HIV diagnosis. However, we found that existing algorithms in the literature missed a significant proportion of people with HIV in our data. Objective The aim of this study is to develop and evaluate HIV-Phen, an updated criteria-based HIV phenotyping algorithm. Methods We developed an algorithm using HIV-specific laboratory tests and medications and compared it with previously published algorithms in national and local data sets to identify cohorts of people with HIV. Cohort demographics were compared with those reported in the national and local surveillance data. Chart reviews were performed on a subsample of patients from the local database to calculate the sensitivity, specificity, positive predictive value, negative predictive value, and accuracy of the algorithm. Results Our new algorithm identified substantially more people with HIV in both national (up to an 85.75% increase) and local (up to an 83.20% increase) EHR databases than the previously published algorithms. The demographic characteristics of people with HIV identified using our algorithm were similar to those reported in national and local HIV surveillance data. Our algorithm demonstrated improved sensitivity over existing algorithms (98% vs 56%-92%) while maintaining a similar overall accuracy (96% vs 80%-96%). 
Conclusions We developed and evaluated an updated criteria-based phenotyping algorithm for identifying people with HIV in EHR data that demonstrates improved sensitivity over existing algorithms.
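The validation metrics named in the Methods above all derive from a 2x2 comparison of the algorithm's output against chart review. The following is a minimal Python sketch of those standard definitions, not the authors' code. The class name `ConfusionCounts`, the function name `validation_metrics`, and the example counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Chart-review outcome counts (hypothetical container, not from the paper)."""
    tp: int  # algorithm positive, chart review positive
    fp: int  # algorithm positive, chart review negative
    tn: int  # algorithm negative, chart review negative
    fn: int  # algorithm negative, chart review positive


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def validation_metrics(c: ConfusionCounts) -> dict:
    """Compute the five metrics listed in the abstract from a 2x2 table."""
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
    }


# Illustrative counts only; these are NOT the study's chart-review results.
example = ConfusionCounts(tp=45, fp=5, tn=40, fn=10)
for name, value in validation_metrics(example).items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and accuracy can move independently, which is why the abstract reports both: an algorithm can find more true cases while its overall agreement with chart review stays about the same.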


2020 ◽  
Author(s):  
Wade L. Schulz ◽  
H. Patrick Young ◽  
Andreas Coppi ◽  
Bobak J. Mortazavi ◽  
Zhenqiu Lin ◽  
...  

Abstract The electronic health record (EHR) holds the prospect of providing more complete and timely access to clinical information for studies, quality assessments, and quality improvement compared to other data sources, such as administrative claims. Our goal was to assess the completeness and timeliness of structured diagnoses in the EHR compared to computed diagnoses for hypertension (HTN), hyperlipidemia (HLD), and diabetes mellitus (DM). We determined the amount of time for a structured diagnosis to be recorded in the EHR from when an equivalent diagnosis could be computed from other structured data elements, such as vital signs and laboratory results. Using our local instance of EHR data in the PCORnet common data model (CDM) with encounters from January 1, 2012 through February 10, 2019, we identified patients with at least two observations above threshold separated by at least 30 days. The thresholds were outpatient blood pressure of ≥ 140/90 mmHg, any low-density lipoprotein ≥ 130 mg/dl, or any hemoglobin A1c ≥ 7%, respectively. The primary measure was the length of time between the computed diagnosis and the time at which a structured diagnosis could be identified within the EHR history or problem list. We found that 39.8% of those with HTN, 21.6% with HLD, and 1.0% with DM did not receive a corresponding structured diagnosis recorded in the EHR. For those who received a structured diagnosis, a mean of 389, 198, and 106 days elapsed before the patient had the corresponding diagnosis of HTN, HLD, or DM, respectively, recorded in the EHR. We identified a marked temporal delay between when a diagnosis can be computed or inferred and when an equivalent structured diagnosis is recorded within the EHR. These findings demonstrate the continued need for additional study of the EHR to avoid bias when using observational data and reinforce the need for computational approaches to identify clinical phenotypes.
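The computed-diagnosis rule described above (at least two observations at or above threshold, separated by at least 30 days) can be sketched in Python as follows. This is one plausible reading of the abstract rather than the authors' implementation. The function names, the `(date, value)` input shape, and the example values are assumptions. The sketch handles single-valued measurements such as hemoglobin A1c or low-density lipoprotein; blood pressure has two components, and the abstract does not say how they were combined, so it is left out.

```python
from datetime import date
from typing import Iterable, Optional, Tuple


def first_computed_diagnosis(
    observations: Iterable[Tuple[date, float]],
    threshold: float,
    min_gap_days: int = 30,
) -> Optional[date]:
    """Return the earliest date on which the rule is satisfied, or None.

    The rule holds once two observations at or above `threshold`
    are at least `min_gap_days` apart.
    """
    elevated = sorted(d for d, value in observations if value >= threshold)
    if not elevated:
        return None
    earliest = elevated[0]
    # Comparing against the earliest elevated date is enough: any later
    # qualifying pair is at least as far from `earliest` as from each other.
    for d in elevated[1:]:
        if (d - earliest).days >= min_gap_days:
            return d
    return None


def recording_delay_days(computed: date, structured: Optional[date]) -> Optional[int]:
    """Days from the computed diagnosis to the structured diagnosis, if one exists."""
    return None if structured is None else (structured - computed).days


# Hypothetical hemoglobin A1c series (percent); not data from the study.
hba1c = [
    (date(2015, 1, 10), 7.4),
    (date(2015, 1, 25), 7.8),  # elevated, but only 15 days after the first
    (date(2015, 3, 1), 7.1),   # elevated and 50 days after the first
    (date(2015, 6, 1), 6.5),
]
computed = first_computed_diagnosis(hba1c, threshold=7.0)
print(computed)                                           # 2015-03-01
print(recording_delay_days(computed, date(2015, 4, 30)))  # 60
```

A patient for whom `first_computed_diagnosis` returns a date but no structured diagnosis exists corresponds to the "did not receive a corresponding structured diagnosis" group reported in the abstract.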


2019 ◽  
Author(s):  
Premanand Tiwari ◽  
Katie Colborn ◽  
Derek E. Smith ◽  
Fuyong Xing ◽  
Debashis Ghosh ◽  
...  

Abstract Atrial fibrillation (AF) is the most common sustained cardiac arrhythmia, whose early detection could lead to significant improvements in outcomes through appropriate prescription of anticoagulation. Although a variety of methods exist for screening for AF, there is general agreement that a targeted approach would be preferred. Implicit within this approach is the need for an efficient method for identification of patients at risk. In this investigation, we examined the strengths and weaknesses of an approach based on application of machine-learning algorithms to electronic health record (EHR) data that has been harmonized to the Observational Medical Outcomes Partnership (OMOP) common data model. We examined data from a total of 2.3M individuals, of whom 1.16% developed incident AF over designated 6-month time intervals. We examined and compared several approaches for data reduction, sample balancing (re-sampling) and predictive modeling using cross-validation for hyperparameter selection, and out-of-sample testing for validation. Although no approach provided outstanding classification accuracy, we found that the optimal approach for prediction of 6-month incident AF used a random forest classifier, raw features (no data reduction), and synthetic minority oversampling technique (SMOTE) resampling (F1 statistic 0.12, AUC 0.65). This model performed better than a predictive model based only on known AF risk factors, and highlighted the importance of using resampling methods to optimize ML approaches to imbalanced data as exists in EHRs. Further studies using EHR data in other medical systems are needed to validate the clinical applicability of these findings.
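The core idea behind SMOTE resampling, mentioned above, is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority-class neighbours. The Python sketch below illustrates that idea using only the standard library. It is a simplified variant (base samples are drawn at random rather than visited in turn) and is not the pipeline used in the study. The function name `smote_sketch`, its parameters, and the toy data are assumptions.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


def smote_sketch(
    minority: Sequence[Point],
    n_synthetic: int,
    k: int = 5,
    rng: Optional[random.Random] = None,
) -> List[Point]:
    """Generate synthetic minority samples by nearest-neighbour interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = rng or random.Random(0)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic: List[Point] = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority samples to `base`, excluding itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy two-feature minority class; values are illustrative only.
minority_class = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(minority_class, n_synthetic=6, k=2)
print(len(new_points))
```

With an outcome as rare as the 1.16% incidence reported here, oversampling is normally applied to the training data only, so that out-of-sample test sets keep the true class balance.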


2011 ◽  
Vol 4 (0) ◽  
Author(s):  
Michael Klompas ◽  
Chaim Kirby ◽  
Jason McVetta ◽  
Paul Oppedisano ◽  
John Brownstein ◽  
...  

Author(s):  
José Carlos Ferrão ◽  
Mónica Duarte Oliveira ◽  
Daniel Gartner ◽  
Filipe Janela ◽  
Henrique M. G. Martins
