Fast Healthcare Interoperability Resources (FHIR) as a Meta Model to Integrate Common Data Models: Development of a Tool and Quantitative Validation Study

Background: In a multisite clinical research collaboration, institutions may or may not use the same common data model (CDM) to store clinical data. To overcome this challenge, we proposed to use Health Level 7’s Fast Healthcare Interoperability Resources (FHIR) as a meta-CDM, a single standard to represent clinical data. Objective: In this study, we aimed to create an open-source application termed the Clinical Asset Mapping Program for FHIR (CAMP FHIR) to efficiently transform clinical data to FHIR for supporting source-agnostic CDM-to-FHIR mapping. Methods: Mapping with CAMP FHIR involves (1) mapping each source variable to its corresponding FHIR element and (2) mapping each item in the source data’s value sets to the corresponding FHIR value set item for variables with strict value sets. To date, CAMP FHIR has been used to transform 108 variables from the Informatics for Integrating Biology & the Bedside (i2b2) and Patient-Centered Outcomes Research Network data models to fields across 7 FHIR resources. It is designed to allow input from any source data model and will support additional FHIR resources in the future. Results: We have used CAMP FHIR to transform data on approximately 23,000 patients with asthma from our institution’s i2b2 database. Data quality and integrity were validated against the origin point of the data, our enterprise clinical data warehouse. Conclusions: We believe that CAMP FHIR can serve as an alternative to implementing new CDMs on a project-by-project basis. Moreover, the use of FHIR as a CDM could support rare data sharing opportunities, such as collaborations between academic medical centers and community hospitals. We anticipate adoption and use of CAMP FHIR to foster sharing of clinical data across institutions for downstream applications in translational research.
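
The two mapping steps named in the Methods can be illustrated with a minimal Python sketch. This is not CAMP FHIR's actual code: the source variable names (`sex_cd`, `birth_date`), both lookup tables, and the function `map_row` are hypothetical and chosen only to show a variable-to-element map combined with a value-set map.

```python
# Hypothetical sketch of the two mapping steps; not CAMP FHIR's implementation.

# Step 1: source variable -> (FHIR resource type, FHIR element).
VARIABLE_MAP = {
    "sex_cd": ("Patient", "gender"),
    "birth_date": ("Patient", "birthDate"),
}

# Step 2: for variables with strict value sets, source value -> FHIR value.
# FHIR administrative gender allows: male, female, other, unknown.
VALUE_SET_MAP = {
    "sex_cd": {"M": "male", "F": "female", "O": "other", "U": "unknown"},
}


def map_row(row):
    """Turn one source row (a dict) into FHIR-like dicts keyed by resource type."""
    resources = {}
    for variable, value in row.items():
        if variable not in VARIABLE_MAP:
            continue  # unmapped source variables are skipped in this sketch
        resource_type, element = VARIABLE_MAP[variable]
        value_map = VALUE_SET_MAP.get(variable)
        if value_map is not None:
            # Assumed policy: codes missing from the value-set map become "unknown".
            value = value_map.get(value, "unknown")
        resource = resources.setdefault(
            resource_type, {"resourceType": resource_type}
        )
        resource[element] = value
    return resources


example = map_row({"sex_cd": "F", "birth_date": "1980-04-02", "zip": "00000"})
```

Keeping both maps as data rather than code is what would let such a tool accept a new source model by adding table entries only; whether CAMP FHIR is organized this way is an assumption.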


Representing Knowledge Consistently Across Health Systems

Yearbook of Medical Informatics ◽

10.15265/iy-2017-018 ◽

2017 ◽

Vol 26 (01) ◽

pp. 139-147 ◽

Cited By ~ 19

Author(s):

S. T. Rosenbloom ◽

R. J. Carroll ◽

J. L. Warner ◽

M. E. Matheny ◽

J. C. Denny

Keyword(s):

Knowledge Representation ◽

Clinical Data ◽

Information Technologies ◽

The United States ◽

Data Models ◽

Research Network ◽

Patient Centered ◽

Health Information Technologies ◽

Health Level 7 ◽

Support Of Research

Summary Objectives: Electronic health records (EHRs) have increasingly emerged as a powerful source of clinical data that can be leveraged for reuse in research and in modular health apps that integrate into diverse health information technologies. A key challenge to these use cases is representing the knowledge contained within data from different EHR systems in a uniform fashion. Method: We reviewed several recent studies covering the knowledge representation in the common data models for the Observational Medical Outcomes Partnership (OMOP) and its Observational Health Data Sciences and Informatics program, and the United States Patient Centered Outcomes Research Network (PCORNet). We also reviewed the Health Level 7 Fast Healthcare Interoperability Resource standard supporting app-like programs that can be used across multiple EHR and research systems. Results: There has been a recent growth in high-impact efforts to support quality-assured and standardized clinical data sharing across different institutions and EHR systems. We focused on three major efforts as part of a larger landscape moving towards shareable, transportable, and computable clinical data. Conclusion: The growth in approaches to developing common data models to support interoperable knowledge representation portends an increasing availability of high-quality clinical data in support of research. Building on these efforts will allow a future whereby significant portions of the populations in the world may be able to share their data for research.


Representing Knowledge Consistently Across Health Systems

Yearbook of Medical Informatics ◽

10.1055/s-0037-1606495 ◽

2017 ◽

Vol 26 (01) ◽

pp. 139-147

Author(s):

S. T. Rosenbloom ◽

R. J. Carroll ◽

J. L. Warner ◽

M. E. Matheny ◽

J. C. Denny

Keyword(s):

Knowledge Representation ◽

Clinical Data ◽

Information Technologies ◽

The United States ◽

Data Models ◽

Research Network ◽

Patient Centered ◽

Health Information Technologies ◽

Health Level 7 ◽

Support Of Research


Automated Production of Research Data Marts from a Canonical Fast Healthcare Interoperability Resource (FHIR) Data Repository: Applications to COVID-19 Research

10.1101/2021.03.11.21253384 ◽

2021 ◽

Author(s):

Leslie A Lenert ◽

Andrey V. Ilatovskiy ◽

James Agnew ◽

Patricia Rudsill ◽

Jeff Jacobs ◽

...

Keyword(s):

Clinical Data ◽

Data Model ◽

Assessment Tools ◽

Research Network ◽

Data Repository ◽

Common Data Model ◽

Real World Data ◽

Data Repositories ◽

Automated Production ◽

Healthcare Data

Objective: The COVID-19 pandemic has enhanced the need for timely real-world data (RWD) for research. To meet this need, several large clinical consortia have developed networks for access to RWD from electronic health records (EHR), each with its own common data model (CDM) and custom pipeline for extraction, transformation, and load operations for production and incremental updating. However, the demands of COVID-19 research for timely RWD (e.g., 2-week delay) make this less feasible. Methods and Materials: We describe the use of the Fast Healthcare Interoperability Resource (FHIR) data model as a canonical model for representation of clinical data for automated transformation to the Patient-Centered Outcomes Research Network (PCORnet) and Observational Medical Outcomes Partnership (OMOP) CDMs and the near automated production of linked clinical data repositories (CDRs) for COVID-19 research using the FHIR subscription standard. The approach was applied to healthcare data from a large academic institution and was evaluated using published quality assessment tools. Results: Six years of data (1.07M patients, 10.1M encounters, 137M laboratory results) were loaded into the FHIR CDR, producing 3 linked real-time repositories: FHIR, PCORnet, and OMOP. PCORnet and OMOP databases were refined in subsequent post processing steps into production releases and met published quality standards. The approach greatly reduced CDM production efforts. Conclusions: FHIR and FHIR CDRs can play an important role in enhancing the availability of RWD from EHR systems. The above approach leverages 21st Century Cures Act mandated standards and could greatly enhance the availability of datasets for research.
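
The idea of one canonical FHIR record feeding both CDMs can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the OMOP gender concept IDs (8507, 8532) and the PCORnet `SEX` codes are quoted from memory of those specifications and should be checked against the current versions, and `birthDate` is assumed to be a full `YYYY-MM-DD` date.

```python
# Hypothetical sketch: one canonical FHIR Patient fanned out to two CDM rows.
from datetime import date

# Assumed lookups; a real pipeline would use full vocabulary tables.
OMOP_GENDER_CONCEPT = {"male": 8507, "female": 8532}
PCORNET_SEX = {"male": "M", "female": "F"}


def to_omop_person(patient):
    """Build an OMOP-style person row from a FHIR Patient dict."""
    birth = date.fromisoformat(patient["birthDate"])  # assumes a full date
    return {
        # OMOP expects an integer key; the FHIR id is reused here for brevity.
        "person_id": patient["id"],
        "gender_concept_id": OMOP_GENDER_CONCEPT.get(patient.get("gender"), 0),
        "year_of_birth": birth.year,
        "month_of_birth": birth.month,
        "day_of_birth": birth.day,
    }


def to_pcornet_demographic(patient):
    """Build a PCORnet-style DEMOGRAPHIC row from the same FHIR Patient dict."""
    return {
        "PATID": patient["id"],
        "BIRTH_DATE": patient["birthDate"],
        "SEX": PCORNET_SEX.get(patient.get("gender"), "UN"),
    }


fhir_patient = {
    "resourceType": "Patient",
    "id": "example-1",
    "gender": "female",
    "birthDate": "1975-11-30",
}
omop_row = to_omop_person(fhir_patient)
pcornet_row = to_pcornet_demographic(fhir_patient)
```

Both target rows are derived from the same stored resource, which is the property that would let the repositories stay linked as new FHIR data arrive.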


Data interchange using i2b2

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocv188 ◽

2016 ◽

Vol 23 (5) ◽

pp. 909-915 ◽

Cited By ~ 29

Author(s):

Jeffrey G Klann ◽

Aaron Abend ◽

Vijay A Raghavan ◽

Kenneth D Mandl ◽

Shawn N Murphy

Keyword(s):

Data Model ◽

Data Extraction ◽

Information Model ◽

Research Network ◽

Common Data Model ◽

Data Types ◽

Patient Centered ◽

Information Models ◽

National Patient ◽

Analytical Requirements

Objective: Reinventing data extraction from electronic health records (EHRs) to meet new analytical needs is slow and expensive. However, each new data research network that wishes to support its own analytics tends to develop its own data model. Joining these different networks without new data extraction, transform, and load (ETL) processes can reduce the time and expense needed to participate. The Informatics for Integrating Biology and the Bedside (i2b2) project supports data network interoperability through an ontology-driven approach. We use i2b2 as a hub, to rapidly reconfigure data to meet new analytical requirements without new ETL programming. Materials and Methods: Our 12-site National Patient-Centered Clinical Research Network (PCORnet) Clinical Data Research Network (CDRN) uses i2b2 to query data. We developed a process to generate a PCORnet Common Data Model (CDM) physical database directly from existing i2b2 systems, thereby supporting PCORnet analytic queries without new ETL programming. This involved: a formalized process for representing i2b2 information models (the specification of data types and formats); an information model that represents CDM Version 1.0; and a program that generates CDM tables, driven by this information model. This approach is generalizable to any logical information model. Results: Eight PCORnet CDRN sites have implemented this approach and generated a CDM database without a new ETL process from the EHR. This enables federated querying within the CDRN and compatibility with the national PCORnet Distributed Research Network. Discussion: We have established a way to adapt i2b2 to new information models without requiring changes to the underlying data. Eight Scalable Collaborative Infrastructure for a Learning Health System sites vetted this methodology, resulting in a network that, at present, supports research on 10 million patients’ data. Conclusion: New analytical requirements can be quickly and cost-effectively supported by i2b2 without creating new data extraction processes from the EHR.
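
A program that generates tables "driven by an information model" can be pictured with the Python sketch below. It is a simplified illustration, not the published i2b2 tooling: the concept paths, the `INFORMATION_MODEL` layout, and `build_table` are all hypothetical, and real i2b2 ontology paths are backslash-delimited and considerably longer.

```python
# Hypothetical sketch of information-model-driven table generation.

# Target table -> target column -> {source concept path: coded value}.
INFORMATION_MODEL = {
    "DEMOGRAPHIC": {
        "SEX": {"/DEMOGRAPHIC/SEX/F/": "F", "/DEMOGRAPHIC/SEX/M/": "M"},
        "HISPANIC": {"/DEMOGRAPHIC/HISPANIC/Y/": "Y", "/DEMOGRAPHIC/HISPANIC/N/": "N"},
    },
}


def build_table(table_name, facts):
    """Build rows for one target table.

    facts is an iterable of (patient_id, concept_path) pairs, standing in
    for rows of an i2b2-style fact table.
    """
    columns = INFORMATION_MODEL[table_name]
    rows = {}
    for patient_id, concept_path in facts:
        for column, path_to_value in columns.items():
            if concept_path in path_to_value:
                row = rows.setdefault(patient_id, {"PATID": patient_id})
                row[column] = path_to_value[concept_path]
    return list(rows.values())


facts = [
    (1, "/DEMOGRAPHIC/SEX/F/"),
    (1, "/DEMOGRAPHIC/HISPANIC/N/"),
    (2, "/DEMOGRAPHIC/SEX/M/"),
    (2, "/LAB/UNMAPPED/"),  # ignored: not part of the DEMOGRAPHIC model
]
demographic_rows = build_table("DEMOGRAPHIC", facts)
```

In this arrangement, supporting a new target model means adding entries to `INFORMATION_MODEL` rather than writing new extraction code, which is the property the abstract emphasizes.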


International Comparison of Approaches to Common Data Models for Comparative Effectiveness Research

International Journal for Population Data Science ◽

10.23889/ijpds.v3i4.737 ◽

2018 ◽

Vol 3 (4) ◽

Author(s):

Adrian Levy ◽

Robert Platt ◽

Soko Setoguchi ◽

Jeffrey Brown ◽

Michael Paterson

Keyword(s):

Comparative Effectiveness Research ◽

Data Model ◽

Comparative Effectiveness ◽

Distributed Networks ◽

Network Governance ◽

Data Models ◽

Research Network ◽

Common Data Model ◽

Effectiveness Research ◽

The Us

Over the past decade, characterizing the safety and effectiveness of drugs has advanced through distributed networks of data repositories where investigators implement the same procedures to address the same topic using a common data model. Distributed networks for pharmacoepidemiology have now been established in the United States (US), globally/Europe, Canada, and Asian countries. Sentinel in the US was developed in response to legislation and is funded by the US Food and Drug Administration to address their safety queries. The Observational Medical Outcomes Partnership (OMOP) is an international collaborative with a growing European data network that developed a common data model through a public-private partnership. The Canadian Network of Observational Drug Effect Studies (CNODES) receives funding and study queries from Health Canada and dissemination is directly back to the regulator as well as through the peer-reviewed literature. The Asian Pharmacoepidemiology Network (AsPEN) is an investigator-initiated multi-national research network formed to support the safety and effectiveness assessment of medications and other therapeutics and to facilitate the prompt identification and validation of emerging safety issues among the countries in Asia and Pacific regions. While these networks have implemented two different common data models (CNODES with Sentinel, AsPEN with OMOP), each network differs from the others in the aims, stage and implementation, operational approach, data quality assurance mechanisms, funding, and dissemination. The objectives of this session are to compare and contrast the role and goals, design principles, implementation approaches, and analytic conventions and procedures between common data models implemented by Sentinel, OMOP, CNODES, and AsPEN. Divided into seven 15-minute segments, the session begins with an overview of distributed networks of common data models for pharmacoepidemiology. In four slides, each presenter then characterizes their network by describing the following: (1) number of data holders, lives covered, and records, data holdings, data access model, and network governance; (2) process for transforming a repository’s data into the common data model; (3) target audience(s), process of identifying queries, and knowledge dissemination plan; and (4) two key challenges faced by the network and the lessons learned. In the next segment, the discussant will identify similarities and meaningful differences between the networks and articulate the relative strengths of the different approaches taken. This will lead into the last segment in which the floor will be opened for questions and comments from the audience. The session would be of benefit to researchers seeking to better understand or join an existing distributed network as well as researchers interested in broadening their understanding of global comparative effectiveness research.


Abstract P059: Utilizing Electronic Health Records to Evaluate Racial Disparities in Metabolic Syndrome

Circulation ◽

10.1161/circ.137.suppl_1.p059 ◽

2018 ◽

Vol 137 (suppl_1) ◽

Author(s):

Denise Danos ◽

Maura Kepper ◽

Tekeda Ferguson ◽

Claudia Leonardi ◽

Richard Scribner

Keyword(s):

Cardiovascular Disease ◽

Metabolic Syndrome ◽

Racial Disparities ◽

White Women ◽

Research Network ◽

Common Data Model ◽

Patient Centered ◽

Phenotype Definition ◽

Increased Risk ◽

Metabolic Conditions

Purpose: Metabolic syndrome is defined as a clustering of clinical metabolic conditions (increased blood pressure, high blood sugar, increased body fat, abnormal cholesterol or triglycerides) and has been associated with an increased risk for several chronic diseases, such as cardiovascular disease. The aim of this project was to identify individuals presenting with metabolic syndrome using a computational patient phenotype definition derived from electronic health record (EHR) clinical outcomes data. Secondly, this project evaluated racial disparities in metabolic syndrome across Southeast Louisiana. Methods: Data were obtained through Research Action for Health Network (REACHnet). Using the National Patient-Centered Clinical Research Network Common Data Model, REACHnet has standardized and made usable EHR data for patient-centered research across Louisiana and Texas. The computational patient phenotype definition for metabolic syndrome was developed based on the National Cholesterol Education Program Expert Panel in Adult Treatment Panel III (NCEP III) guidelines. The presence of metabolic conditions was established using ICD-9 diagnosis codes, patient vitals, and lab results that are routinely available in EHR data. Logistic regression models to assess racial disparities were executed using SAS 9.4. Results: We analyzed 18,664 patient EHRs for individuals 18 years or older with complete clinical data spanning the years 2013 to 2014. The sample was 43.28% male (n=8,077) and 29.35% black (n=5,477). Based on the patient phenotype definition, the prevalence of metabolic syndrome in the sample was 39.09%. Controlling for age, the odds of metabolic syndrome were twice as high for black women as for white women (OR: 2 (1.83, 2.18)), while the odds were 15% greater for black men than for white men (OR: 1.15 (1.04, 1.28)). Conclusion: We observed significant disparities in the prevalence of clinically evident metabolic syndrome in southeast Louisiana. 
Racial disparities were greatest among women. It has been increasingly recognized that differential exposure to chronic social and nutritive stress from living in a disadvantaged neighborhood may be contributing to racial health disparities. Further research in this sample will link ancillary sources of neighborhood data to the successfully developed metabolic syndrome phenotype to explore potential mechanisms for racial disparities in cardiovascular disease among a clinically-rich, state-wide sample.


Association between Use of Hydrochlorothiazide and Nonmelanoma Skin Cancer: Common Data Model Cohort Study in Asian Population

Journal of Clinical Medicine ◽

10.3390/jcm9092910 ◽

2020 ◽

Vol 9 (9) ◽

pp. 2910

Author(s):

Seung Min Lee ◽

Kwangsoo Kim ◽

Jihoon Yoon ◽

Sue K. Park ◽

Sungji Moon ◽

...

Keyword(s):

Skin Cancer ◽

Cumulative Dose ◽

Data Model ◽

Cox Regression ◽

Antihypertensive Drugs ◽

Univariate Analysis ◽

Asian Population ◽

Research Network ◽

Common Data Model ◽

High Cumulative Dose

Although hydrochlorothiazide (HCTZ) has been suggested to increase skin cancer risk in white Westerners, there is scant evidence for the same in Asians. We analyzed the association between the use of HCTZ and non-melanoma skin cancer (NMSC) in the Asian population using the common data model. Methods: A retrospective multicenter observational study was conducted using a distributed research network to analyze the effect of HCTZ on skin cancer from 2004 to 2018. We performed Cox regression to evaluate the effects by comparing the use of HCTZ with other antihypertensive drugs. All analyses were re-evaluated on data matched using propensity score matching (PSM). Then, the overall effects were evaluated by combining the results in a meta-analysis. Results: Positive associations were observed in the use of HCTZ with high cumulative dose for NMSC in univariate analysis prior to the use of PSM. Some negative associations were observed in the use of low and medium cumulative doses. Conclusion: Although many findings in our study were inconclusive, there was a non-significant dose-response pattern, with estimates increasing with the cumulative dose of HCTZ. In particular, a trend with a non-significant positive association was observed with the high cumulative dose of HCTZ.


Validation of Cardiovascular End Points Ascertainment Leveraging Multisource Electronic Health Records Harmonized Into a Common Data Model in the ADAPTABLE Randomized Clinical Trial

Circulation Cardiovascular Quality and Outcomes ◽

10.1161/circoutcomes.121.008190 ◽

2021 ◽

Cited By ~ 1

Author(s):

Guillaume Marquis-Gravel ◽

Bradley G. Hammill ◽

Hillary Mulder ◽

Matthew T. Roe ◽

Holly R. Robertson ◽

...

Keyword(s):

Myocardial Infarction ◽

Positive Predictive Value ◽

Clinical Research ◽

Predictive Value ◽

Research Network ◽

Common Data Model ◽

Patient Centered ◽

End Points ◽

Clinical Research Network ◽

National Patient

Background: The ADAPTABLE trial (Aspirin Dosing: A Patient-Centric Trial Assessing Benefits and Long-Term Effectiveness) is the first randomized trial conducted within the National Patient-Centered Clinical Research Network to use the electronic health record data formatted into a common data model as the primary source of end point ascertainment, without confirmation by standard adjudication. The objective of this prespecified study is to assess the validity of nonfatal end points captured from the National Patient-Centered Clinical Research Network, using traditional blinded adjudication as the gold standard. Methods: A total of 15 076 participants with established atherosclerotic cardiovascular disease were randomized to two doses of aspirin (81 mg and 325 mg once daily). Nonfatal end points (hospitalization for nonfatal myocardial infarction, nonfatal stroke, and major bleeding requiring transfusion of blood products) were captured with the use of programming algorithms applied to National Patient-Centered Clinical Research Network data. A random subset of end points was independently reviewed by a disease-specific expert adjudicator. The positive predictive values of the programming algorithms were calculated separately for end points listed as primary and as nonprimary diagnoses. Results: A total of 225 end points were identified (91 myocardial infarction events, 89 stroke events, and 45 bleeding events), including 142 (63%) that were listed as primary diagnoses. Complete source documents were missing for 14% of events. The positive predictive values were 90%, 72%, and 93% for hospitalizations for myocardial infarction, stroke, and major bleeding, respectively, as compared to adjudication. When only primary diagnoses were considered, positive predictive values were 93%, 91%, and 97%, respectively. When only nonprimary diagnoses were considered, positive predictive values were 82%, 36%, and 71%. 
Conclusions: As compared with blinded adjudication, clinical end point ascertainment from queries of the National Patient-Centered Clinical Research Network distributed harmonized data was valid to identify hospitalizations for myocardial infarction in ADAPTABLE. The proportion of contradicted events was high for hospitalizations for bleeding and strokes when nonprimary diagnoses were analyzed, but not when only primary diagnoses were considered.
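
For readers unfamiliar with the metric, positive predictive value here is the share of algorithm-identified end points that adjudication confirmed. The Python sketch below shows the calculation; the function name is hypothetical and the counts in the example are illustrative, not trial data.

```python
# Minimal sketch of a positive predictive value (PPV) calculation.
# The counts used below are illustrative and are NOT taken from ADAPTABLE.


def positive_predictive_value(confirmed, flagged):
    """Return confirmed / flagged.

    flagged:   events identified by the algorithm and sent for review
    confirmed: the subset of those events that adjudication confirmed
    """
    if flagged <= 0:
        raise ValueError("flagged must be positive")
    if not 0 <= confirmed <= flagged:
        raise ValueError("confirmed must be between 0 and flagged")
    return confirmed / flagged


# Illustrative only: 9 of 10 flagged events confirmed by adjudication.
example_ppv = positive_predictive_value(confirmed=9, flagged=10)
```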


Abstract 56: Blood Pressure Control in the Real World - Early Evidence From a National Blood Pressure Surveillance System

Circulation ◽

10.1161/circ.141.suppl_1.56 ◽

2020 ◽

Vol 141 (Suppl_1) ◽

Cited By ~ 1

Author(s):

Rhonda M Cooper-DeHoff ◽

Valy Fontil ◽

Thomas Carton ◽

Kathryn McAuliffe ◽

Myra Smith ◽

...

Keyword(s):

Blood Pressure ◽

Surveillance System ◽

Control Measure ◽

Quality Metrics ◽

Research Network ◽

Fixed Dose ◽

Common Data Model ◽

Patient Centered ◽

Eligibility Criteria ◽

Process Measures

Background: The Patient-Centered Outcomes Research Institute (PCORI)-funded National Blood Pressure Surveillance System (BP Track) is a new national system that generates quarterly metrics of blood pressure (BP) control and BP-related quality metrics for participating healthcare systems from electronic health record (EHR) data. Methods: Queries against standardized EHR data in the national Patient-Centered Clinical Research Network (PCORnet) Common Data Model format produce a set of quality metrics relevant to improving BP control, including Controlling High BP (NQF 0018) and Improvement in BP (CMS65v7), and eight process measures specific to clinical management and treatment practices for improving BP control (Table). The metrics, aggregated overall and by health system, are reported back to health systems via user-friendly Tableau dashboards, and allow for observation of metric trends. Results: To date, 19 datamarts have contributed EHR data from 1,177,232 patients who met the eligibility criteria for the BP Control measure and 4,454,729 encounters that included a BP measurement during the measurement period. Average age was 62 years; 10% were young adults (<45 years), 17% were African American, 52% female, 28% had diabetes, 15% had coronary heart disease, and 14% had depression. Results demonstrate substantial opportunity for improvement in overall BP control (60% with BP<140/<90 mmHg, range: 42-72%), and many healthcare processes, including medication intensification (12%, 0.6-22%) and use of fixed dose combination medications (24%, 0-88%, Table). Conclusion: Major opportunities exist for improving BP control, especially in improving the frequency and quality of BP medication prescribing for patients with high BP. The BP Track National BP Surveillance System will track these metrics, by demographic subgroup and over time, and will generate data that can guide and focus quality improvement initiatives aimed at effective BP management.
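
The headline BP control figure (share of patients with BP <140/<90 mmHg) can be sketched in Python as below. This is a simplified illustration, not the BP Track query logic: the function names are hypothetical, the use of each patient's latest reading in the measurement period is an assumption, and the readings are invented.

```python
# Hypothetical sketch of a BP control rate; not the BP Track queries.


def is_controlled(systolic, diastolic):
    """Controlled means systolic < 140 AND diastolic < 90 (mmHg)."""
    return systolic < 140 and diastolic < 90


def control_rate(latest_readings):
    """Fraction of patients whose reading is controlled.

    latest_readings maps patient id -> (systolic, diastolic); choosing the
    latest reading per patient is an assumption of this sketch.
    """
    if not latest_readings:
        raise ValueError("no eligible patients")
    controlled = sum(
        1 for systolic, diastolic in latest_readings.values()
        if is_controlled(systolic, diastolic)
    )
    return controlled / len(latest_readings)


# Invented readings for four patients.
readings = {
    "a": (128, 78),  # controlled
    "b": (142, 80),  # systolic too high
    "c": (135, 92),  # diastolic too high
    "d": (139, 89),  # controlled
}
rate = control_rate(readings)
```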


The IeDEA Data Exchange Standard: a common data model for global HIV cohort collaboration

10.1101/2020.07.22.20159921 ◽

2020 ◽

Author(s):

Stephany N Duda ◽

Beverly S Musick ◽

Mary-Ann Davies ◽

Annette H Sohn ◽

Bruno Ledergerber ◽

...

Keyword(s):

Data Model ◽

Data Exchange ◽

Governance Structure ◽

Data Models ◽

Hiv Care ◽

Common Data Model ◽

Resource Limited Settings ◽

Resource Limited ◽

Hiv Research ◽

Data Exchange Standard

Objective: To describe content domains and applications of the IeDEA Data Exchange Standard, its development history, governance structure, and relationships to other established data models, as well as to share open source, reusable, scalable, and adaptable implementation tools with the informatics community. Methods: In 2012, the International Epidemiology Databases to Evaluate AIDS (IeDEA) collaboration began development of a data exchange standard, the IeDEA DES, to support collaborative global HIV epidemiology research. With the HIV Cohorts Data Exchange Protocol as a template, a global group of data managers, statisticians, clinicians, informaticians, and epidemiologists reviewed existing data schemas and clinic data procedures to develop the HIV data exchange model. The model received a substantial update in 2017, with annual updates thereafter. Findings: The resulting IeDEA DES is a patient-centric common data model designed for HIV research that has been informed by established data models from US-based electronic health records, broad experience in data collection in resource-limited settings, and informatics best practices. The IeDEA DES is inherently flexible and continues to grow based on the ongoing stewardship of the IeDEA Data Harmonization Working Group with input from external collaborators. Use of the IeDEA DES has improved multiregional collaboration within and beyond IeDEA, expediting over 95 multiregional research projects using data from more than 400 HIV care and treatment sites across seven global regions. A detailed data model specification and REDCap data entry templates that implement the IeDEA DES are publicly available on GitHub. Conclusions: The IeDEA common data model and related resources are powerful tools to foster collaboration and accelerate science across research networks. 
While currently directed towards observational HIV research and data from resource-limited settings, this model is flexible and extendable to other areas of health research.
