OSIRIS: A Minimum Data Set for Data Sharing and Interoperability in Oncology

2021
pp. 256-265
Author(s):
Julien Guérin
Yec'han Laizet
Vincent Le Texier
Laetitia Chanas
Bastien Rance
...

PURPOSE Many institutions throughout the world have launched precision medicine initiatives in oncology, and a large amount of clinical and genomic data is being produced. Although there have been attempts at data sharing with the community, initiatives are still limited. In this context, a French task force composed of Integrated Cancer Research Sites (SIRICs), comprehensive cancer centers from the Unicancer network (one of Europe's largest cancer research organizations), and university hospitals launched an initiative to improve and accelerate retrospective and prospective clinical and genomic data sharing in oncology. MATERIALS AND METHODS For 5 years, the OSIRIS group has worked on structuring data and identifying technical solutions for collecting and sharing them. The group used a multidisciplinary approach that included weekly scientific and technical meetings over several months to foster a national consensus on a minimal data set. RESULTS The resulting OSIRIS set and event-based data model, which is able to capture the disease course, was built with 67 clinical and 65 omics items. The group made it compatible with the HL7 Fast Healthcare Interoperability Resources (FHIR) format to maximize interoperability. The OSIRIS set was reviewed, approved by a National Plan Strategic Committee, and freely released to the community. A proof-of-concept study was carried out to put the OSIRIS set and Common Data Model into practice using a cohort of 300 patients. CONCLUSION Using a national and bottom-up approach, the OSIRIS group has defined a model including a minimal set of clinical and genomic data that can be used to accelerate the sharing of data produced in oncology. The model relies on clear and formally defined terminologies and, as such, may also benefit the larger international community.

Author(s):  
Eugenia Rinaldi
Sylvia Thun

HiGHmed is a German consortium in which eight university hospitals have agreed to cross-institutional data exchange through novel medical informatics solutions. The HiGHmed Use Case Infection Control group has modelled a set of infection-related data in the openEHR format. In order to establish interoperability with the other German consortia belonging to the same national initiative, we mapped the openEHR information to the Fast Healthcare Interoperability Resources (FHIR) format recommended within the initiative. FHIR enables fast exchange of data thanks to the discrete and independent data elements into which information is organized. Furthermore, to explore the possibility of maximizing analysis capabilities for our data set, we subsequently mapped the FHIR elements to the Observational Medical Outcomes Partnership Common Data Model (OMOP CDM). The OMOP data model is designed to support the conduct of research to identify and evaluate associations between interventions and outcomes caused by these interventions. Mapping across standards makes it possible to exploit the particular strengths of each while establishing and/or maintaining interoperability. This article provides an overview of our experience in mapping infection control related data across three different standards: openEHR, FHIR, and OMOP CDM.
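The FHIR-to-OMOP step described above can be illustrated with a minimal Python sketch. It is not the authors' mapping: the function name `observation_to_measurement`, the lookup table `CODE_TO_CONCEPT_ID`, the code `EXAMPLE-1`, and the concept id `900001` are all hypothetical placeholders. Only the FHIR Observation field names and the OMOP MEASUREMENT column names follow the respective standards, and unmapped codes fall back to concept id 0, as is the OMOP convention.

```python
from datetime import date

# Illustrative code-to-concept lookup. Real concept ids come from the OMOP
# vocabulary tables; the entry below is a placeholder, not a real id.
CODE_TO_CONCEPT_ID = {("http://loinc.org", "EXAMPLE-1"): 900001}


def observation_to_measurement(obs: dict) -> dict:
    """Flatten one FHIR-style Observation into an OMOP-style MEASUREMENT row."""
    coding = obs["code"]["coding"][0]
    quantity = obs.get("valueQuantity", {})
    return {
        # "Patient/42" -> 42 (assumes numeric patient ids for simplicity)
        "person_id": int(obs["subject"]["reference"].split("/")[-1]),
        # Unmapped source codes get concept id 0
        "measurement_concept_id": CODE_TO_CONCEPT_ID.get(
            (coding["system"], coding["code"]), 0
        ),
        "measurement_date": date.fromisoformat(obs["effectiveDateTime"][:10]),
        "value_as_number": quantity.get("value"),
        "unit_source_value": quantity.get("unit"),
        "measurement_source_value": coding["code"],
    }


example_observation = {
    "resourceType": "Observation",
    "subject": {"reference": "Patient/42"},
    "code": {"coding": [{"system": "http://loinc.org", "code": "EXAMPLE-1"}]},
    "effectiveDateTime": "2020-03-15T08:30:00+01:00",
    "valueQuantity": {"value": 12.5, "unit": "mg/L"},
}

row = observation_to_measurement(example_observation)
print(row)
```

Keeping the source code in `measurement_source_value` alongside the mapped concept id preserves the original information even when no standard concept is found.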


2018
pp. 1-14
Author(s):
Christine M. Micheel
Shawn M. Sweeney
Michele L. LeNoue-Newton
Fabrice André
Philippe L. Bedard
...

The American Association for Cancer Research (AACR) Project Genomics Evidence Neoplasia Information Exchange (GENIE) is an international data-sharing consortium focused on enabling advances in precision oncology through the gathering and sharing of tumor genetic sequencing data linked with clinical data. The project’s history, operational structure, lessons learned, and institutional perspectives on participation in the data-sharing consortium are reviewed. Individuals involved with the inception and execution of AACR Project GENIE from each member institution described their experiences and lessons learned. The consortium was conceived in January 2014 and publicly released its first data set in January 2017, which consisted of 18,804 samples from 18,324 patients contributed by the eight founding institutions. Commitment and contributions from many individuals at AACR and the member institutions were crucial to the consortium’s success. These individuals filled leadership, project management, informatics, data curation, contracts, ethics, and security roles. Many lessons were learned during the first 3 years of the consortium, including on how to gather, harmonize, and share data; how to make decisions and foster collaboration; and how to set the stage for continued participation and expansion of the consortium. We hope that the lessons shared here will assist new GENIE members as well as others who embark on the journey of forming a genomic data–sharing consortium.


2021
pp. 12-20
Author(s):
Rimma Belenkaya
Michael J. Gurley
Asieh Golozar
Dmitry Dymshyts
Robert T. Miller
...

2015
Vol 06 (03)
pp. 536-547
Author(s):
F.S. Resnic
S.L. Robbins
J. Denton
L. Nookala
D. Meeker
...

Summary Background: Adoption of a common data model across health systems is a key infrastructure requirement to allow large-scale distributed comparative effectiveness analyses. There is a growing number of common data models (CDMs), such as the Mini-Sentinel and the Observational Medical Outcomes Partnership (OMOP) CDMs. Objective: In this case study, we describe the challenges and opportunities of a study-specific use of the OMOP CDM by two health systems and describe three comparative effectiveness use cases developed from the CDM. Methods: The project transformed two health system databases (using crosswalks provided) into the OMOP CDM. Cohorts were developed from the transformed CDMs for three comparative effectiveness use case examples. Administrative/billing, demographic, order history, medication, and laboratory data were included in the CDM transformation and cohort development rules. Results: Record counts per person month are presented for the eligible cohorts, highlighting differences between the civilian and federal datasets, e.g. the federal data set had more outpatient visits per person month (6.44 vs. 2.05 per person month). The count of medications per person month reflected the fact that one system's medications were extracted from orders while the other system had pharmacy fills and medication administration records. The federal system also had a higher prevalence of the conditions in all three use cases. Both systems required manual coding of some types of data to convert to the CDM. Conclusion: The data transformation to the CDM was time-consuming, and the resources required were substantial, beyond the requirements for collecting native source data. The need to manually code subsets of data limited the conversion. However, once the native data was converted to the CDM, both systems were then able to use the same queries to identify cohorts. Thus, the CDM minimized the effort to develop cohorts and analyze the results across the sites.

FitzHenry F, Resnic FS, Robbins SL, Denton J, Nookala L, Meeker D, Ohno-Machado L, Matheny ME. A Case Report on Creating a Common Data Model for Comparative Effectiveness with the Observational Medical Outcomes Partnership. Appl Clin Inform 2015; 6: 536–547. http://dx.doi.org/10.4338/ACI-2014-12-CR-0121
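The point that both systems could reuse the same cohort queries after conversion can be sketched as follows. This is only an illustration under stated assumptions: the two in-memory SQLite databases, the helper names `make_site` and `find_cohort`, the concept id `500`, and the toy rows are all hypothetical, and the `condition_occurrence` table is reduced to two of its OMOP columns.

```python
import sqlite3

# One cohort query, written once against the shared (OMOP-style) schema.
COHORT_QUERY = """
SELECT DISTINCT person_id
FROM condition_occurrence
WHERE condition_concept_id = ?
ORDER BY person_id
"""


def make_site(rows):
    """Create an in-memory database holding a minimal condition_occurrence table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE condition_occurrence ("
        "person_id INTEGER, condition_concept_id INTEGER)"
    )
    conn.executemany("INSERT INTO condition_occurrence VALUES (?, ?)", rows)
    return conn


def find_cohort(conn, concept_id):
    """Run the shared query against one site's database."""
    return [r[0] for r in conn.execute(COHORT_QUERY, (concept_id,))]


# Two hypothetical sites whose native data were already converted to the CDM.
sites = {
    "site_a": make_site([(1, 500), (2, 600), (3, 500)]),
    "site_b": make_site([(10, 500), (11, 500), (11, 500)]),
}

cohorts = {name: find_cohort(conn, 500) for name, conn in sites.items()}
print(cohorts)
```

The query text never changes between sites; only the connection does, which is the effort saving the conclusion describes.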


Author(s):  
Hanning Wang
Weixiang Xu
Chaolong Jia

Railway distributed system integration needs to realize information exchange, resource sharing, and process coordination across fields, departments, and application systems. Railway data integration is essential to implementing this integration. In order to resolve the problem of heterogeneous data models among the data sources of different railway operation systems, this paper presents a novel integration data model with a spatial structure: an XML-oriented three-dimensional common data model. The proposed model accommodates flexibility in both level relationships and syntax expression in data integration. In this model, a spatial data pattern is used to describe and express the characteristic relationships of data items among all types of data. Based on a rooted directed graph, a level-based organization, and a flexible means of expression, the model can represent the mapping between different data models, including the relational model and the object-oriented model. A consistent concept and algebraic description of the data set is given to function as the metadata in data integration, so that the algebraic manipulation of data integration is standardized to support the data integration of distributed systems.


2021
Vol 12 (01)
pp. 057-064
Author(s):
Christian Maier
Lorenz A. Kapsner
Sebastian Mate
Hans-Ulrich Prokosch
Stefan Kraus

Abstract Background The identification of patient cohorts for recruiting patients into clinical trials requires an evaluation of study-specific inclusion and exclusion criteria. These criteria are specified depending on corresponding clinical facts. Some of these facts may not be present in the clinical source systems and need to be calculated either in advance or at cohort query runtime (so-called feasibility query). Objectives We use the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) as the repository for our clinical data. However, Atlas, the graphical user interface of OMOP, does not offer the functionality to perform calculations on fact data. Therefore, we were in search of a different approach. The objective of this study is to investigate whether the Arden Syntax can be used for feasibility queries on the OMOP CDM to enable on-the-fly calculations at query runtime and thereby eliminate the need to precalculate the data elements involved in researchers' criteria specifications. Methods We implemented a service that reads the facts from the OMOP repository and provides them in a form that an Arden Syntax Medical Logic Module (MLM) can process. Then, we implemented an MLM that applies the eligibility criteria to every patient data set and outputs the list of eligible cases (i.e., performs the feasibility query). Results The study resulted in an MLM-based feasibility query that identifies cases of overventilation as an example of how an on-the-fly calculation can be realized. The algorithm is split into two MLMs to make the approach reusable. Conclusion We found that MLMs are a suitable technology for feasibility queries on the OMOP CDM. Our method of performing on-the-fly calculations can be employed with any OMOP instance and without touching existing infrastructure like the Extract, Transform and Load pipeline. Therefore, we think that it is a well-suited method to perform on-the-fly calculations on OMOP.
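The idea of computing a derived fact at query runtime, split into a reusable calculation step and a separate eligibility step, can be sketched in Python as below. This is not Arden Syntax and not the authors' MLMs. The abstract does not state how overventilation was defined, so the criterion used here (tidal volume above 8 mL per kg of predicted body weight) is an assumption chosen for illustration, as are the function names, the record layout, and the sample facts. The predicted body weight formula is the commonly used height- and sex-based one.

```python
def predicted_body_weight_kg(height_cm: float, sex: str) -> float:
    """Predicted body weight from height and sex (commonly used formula)."""
    base = 50.0 if sex == "M" else 45.5
    return base + 0.91 * (height_cm - 152.4)


def tidal_volume_per_kg(fact: dict) -> float:
    """Calculation step: derive a value that is not stored in the repository."""
    pbw = predicted_body_weight_kg(fact["height_cm"], fact["sex"])
    return fact["tidal_volume_ml"] / pbw


def eligible_cases(facts: list, threshold_ml_per_kg: float = 8.0) -> list:
    """Eligibility step: apply the criterion to every patient record.

    The 8 mL/kg threshold is an assumed example value, not taken from the study.
    """
    return [
        f["person_id"]
        for f in facts
        if tidal_volume_per_kg(f) > threshold_ml_per_kg
    ]


# Hypothetical facts as a reading service might hand them over.
facts = [
    {"person_id": 1, "sex": "M", "height_cm": 180.0, "tidal_volume_ml": 700.0},
    {"person_id": 2, "sex": "F", "height_cm": 165.0, "tidal_volume_ml": 350.0},
]

cases = eligible_cases(facts)
print(cases)
```

Because the calculation lives in its own function, a different eligibility rule can reuse it without any change to the stored data, which mirrors the motivation for splitting the logic into two modules.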


2020
Author(s):
Nicolas Paris
Adrien Parrot

Objectives: In the era of big data, the intensive care unit (ICU) is very likely to benefit from real-time computer analysis and modeling based on close patient monitoring and Electronic Health Record data. MIMIC is the first open access database in the ICU domain. Many studies have shown that common data models (CDMs) improve database searching by allowing code, tools and experience to be shared. OMOP-CDM is spreading all over the world. The objective was to evaluate the difficulty of transforming MIMIC into an OMOP (MIMIC-OMOP) database and the benefits of this transformation for analysts. Material & Method: A documented, tested, versioned, exemplified and open repository has been set up to support the transformation and improvement of the MIMIC community's source code. The resulting data set was evaluated over a 48-hour datathon. Result: With an investment of 2 people for 500 hours, 64% of the data items of the 26 MIMIC tables have been standardized into the OMOP CDM and 78% of the source concepts mapped to reference terminologies. The model proved its ability to support community contributions and was well received during the datathon with 160 participants and 15,000 requests executed with a maximum duration of one minute. Conclusion: The resulting MIMIC-OMOP dataset is the first MIMIC-OMOP dataset available free of charge with real de-identified data ready for replicable intensive care research. This approach can be generalized to any medical field.
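A figure such as the share of source concepts mapped to reference terminologies can be computed along the following lines. This is a minimal sketch, not the project's code: the function name `mapping_coverage`, the source codes, and the concept ids are hypothetical, and the sketch assumes that a missing entry or a concept id of 0 means "unmapped".

```python
def mapping_coverage(source_codes, source_to_standard) -> float:
    """Percentage of distinct source codes that map to a standard concept."""
    distinct = set(source_codes)
    if not distinct:
        return 0.0
    # A missing entry or a concept id of 0 is treated as unmapped.
    mapped = sum(1 for code in distinct if source_to_standard.get(code))
    return 100.0 * mapped / len(distinct)


# Hypothetical local codes and placeholder concept ids.
observed_codes = ["hr", "hr", "sbp", "local_x", "temp"]
concept_map = {"hr": 1, "sbp": 2, "temp": 3, "local_x": 0}

coverage = mapping_coverage(observed_codes, concept_map)
print(f"{coverage:.1f}% of distinct source codes are mapped")
```

Counting distinct codes rather than rows keeps frequently used codes from dominating the figure; a row-weighted variant would answer a different question.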


2021
Author(s):
Samer Alkarkoukly
Abdul-Mateen Rajput

openEHR is an open-source technology for e-health that aims to build data models for interoperable Electronic Health Records (EHRs) and to enhance semantic interoperability. The openEHR architecture consists of different building blocks, among them the “template”, which consists of different archetypes and aims to collect the data for a specific use case. In this paper, we created a generic data model for a virtual pancreatic cancer patient, using the openEHR approach and tools, to be used for testing and virtual environments. The data elements for this template were derived from the “Oncology minimal data set” of the HiGHmed project. In addition, we generated virtual data profiles for 10 patients using the template. The objective of this exercise is to provide a data model and virtual data profiles for testing and experimentation scenarios within the openEHR environment. Both the template and the 10 virtual patient profiles are publicly available.

