Towards a Semantic Data Harmonization Federated Infrastructure

Author(s):  
Catalina Martinez-Costa ◽  
Francisco Abad-Navarro

Data integration is an increasing need in medical informatics projects like the EU Precise4Q project, in which multidisciplinary data that are semantically and syntactically heterogeneous across several institutions need to be integrated. Moreover, data sharing agreements often permit only virtual data integration, because data cannot leave the source repository. We propose a data harmonization infrastructure in which data are virtually integrated by sharing a semantically rich common data representation that allows homogeneous querying. This common data model integrates content from well-known biomedical ontologies such as SNOMED CT by using the BTL2 upper-level ontology, and is imported into a graph database. We successfully integrated three datasets and ran test queries demonstrating the feasibility of the approach.
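As an illustration of the idea, the minimal sketch below shows how records from two different source repositories, once mapped to a shared semantic model, can be queried homogeneously. It uses Python's rdflib; the namespace, class names, and diagnosis code are hypothetical, and the project's actual model builds on BTL2 and SNOMED CT rather than this toy vocabulary.

```python
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/cdm#")  # hypothetical common-data-model namespace

g = Graph()
g.bind("ex", EX)

# Two records from different source repositories, mapped to the same CDM class.
g.add((EX.patient1, RDF.type, EX.Patient))
g.add((EX.patient1, EX.hasDiagnosisCode, Literal("I63.9")))  # illustrative stroke code
g.add((EX.patient2, RDF.type, EX.Patient))
g.add((EX.patient2, EX.hasDiagnosisCode, Literal("I63.9")))

# One homogeneous query works regardless of which source each record came from.
q = """
PREFIX ex: <http://example.org/cdm#>
SELECT ?p WHERE { ?p a ex:Patient ; ex:hasDiagnosisCode "I63.9" . }
"""
for row in g.query(q):
    print(row.p)
```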

2011 ◽  
Vol 268-270 ◽  
pp. 1868-1873
Author(s):  
Li Jun Yang

The existence of heterogeneous data sources makes it difficult for different information systems to exchange and access each other's data. Enabling convenient and flexible data exchange has therefore become a meaningful research topic. This paper combines XML, the data representation format generally used on today's networks, with the Web Service interaction technique to construct a UDM data model, which can represent structured relational data as well as describe unstructured and self-describing semi-structured data. The UDM data model can therefore be used as a common data model into which heterogeneous data are integrated.
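The abstract does not spell out the UDM structure, but the general pattern it describes, wrapping both relational rows and semi-structured fragments in one common XML representation, might look like the following Python sketch; the element and attribute names are hypothetical.

```python
import xml.etree.ElementTree as ET

def relational_row_to_udm(table, row):
    """Wrap one relational row in a hypothetical UDM-style XML envelope."""
    rec = ET.Element("record", {"source": table, "kind": "structured"})
    for column, value in row.items():
        field = ET.SubElement(rec, "field", {"name": column})
        field.text = str(value)
    return rec

def semistructured_to_udm(doc_id, fragment):
    """Carry an already-XML (semi-structured) fragment along unchanged."""
    rec = ET.Element("record", {"source": doc_id, "kind": "semi-structured"})
    rec.append(fragment)
    return rec

root = ET.Element("udm")
root.append(relational_row_to_udm("patients", {"id": 1, "name": "Doe"}))
root.append(semistructured_to_udm("note-7", ET.fromstring("<note>follow-up</note>")))
print(ET.tostring(root, encoding="unicode"))
```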


Author(s):  
Yaoling Zhu ◽  
Claus Pahl

A major aim of the Web service platform is the integration of existing software and information systems. Data integration is a central aspect in this context. Traditional techniques for information and data transformation are, however, not sufficient to provide flexible and automatable data integration solutions for Web service-enabled information systems. The difficulties arise from a high degree of complexity in data structures in many applications and from the additional problem of heterogeneity of data representation in applications that often cross organisational boundaries. The authors present an integration technique that embeds a declarative data transformation technique based on semantic data models as a mediator service into a Web service-oriented information system architecture. Automation through consistency-oriented semantic data models and flexibility through modular declarative data transformations are the key enablers of the approach.
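As a rough illustration of the declarative style described here (not the authors' actual rule language, which operates on semantic data models), a mediator might apply transformation rules that are expressed as data rather than code; all field names below are invented.

```python
from typing import Any, Dict

# Declarative mapping: target field -> (source field, optional converter).
Rules = Dict[str, tuple]

CUSTOMER_RULES: Rules = {
    "fullName": ("name", None),
    "country":  ("addr_country", str.upper),
}

def mediate(record: Dict[str, Any], rules: Rules) -> Dict[str, Any]:
    """Apply declarative transformation rules to one source record."""
    out = {}
    for target, (source, convert) in rules.items():
        value = record.get(source)
        out[target] = convert(value) if convert and value is not None else value
    return out

print(mediate({"name": "Ada Lovelace", "addr_country": "uk"}, CUSTOMER_RULES))
```

Because the rules are data, new source formats can be supported by adding rule sets instead of writing new transformation code, which is what makes the approach flexible and automatable.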




2018 ◽  
Vol 57 (S 01) ◽  
pp. e82-e91 ◽  
Author(s):  
Hans-Ulrich Prokosch ◽  
Till Acker ◽  
Johannes Bernarding ◽  
Harald Binder ◽  
Martin Boeker ◽  
...  

Summary

Introduction: This article is part of the Focus Theme of Methods of Information in Medicine on the German Medical Informatics Initiative. Similar to other large international data sharing networks (e.g., OHDSI, PCORnet, eMERGE, RD-Connect), MIRACUM is a consortium of academic and hospital partners, as well as one industrial partner, in eight German cities which have joined forces to create interoperable data integration centres (DIC) and to make the data within those DIC available for innovative new IT solutions in patient care and medical research.

Objectives: Data sharing shall be supported by common interoperable tools and services, in order to leverage the power of such data for biomedical discovery and the move towards a learning health system. This paper illustrates the major building blocks and concepts which MIRACUM will apply to achieve this goal.

Governance and Policies: Major building blocks for the success of such an endeavor are an efficient governance structure within the MIRACUM consortium (based on the steering board, a central administrative office, the general MIRACUM assembly, six working groups, and the international scientific advisory board), defined DIC governance rules and data sharing policies, and use and access committees established at each MIRACUM DIC site as well as for MIRACUM as a whole.

Architectural Framework and Methodology: The MIRACUM DIC architecture builds on a comprehensive ecosystem of reusable open source tools (MIRACOLIX), which are linkable and interoperable with each other and with the existing software environment of the MIRACUM hospitals. Efficient data protection measures, consideration of patient consent, data harmonization, a MIRACUM metadata repository, and a common data model are major pillars of this framework. The methodological approach to shared data usage relies on a federated querying and analysis concept.

Use Cases: MIRACUM aims to prove the value of its DIC with three use cases: IT support for patient recruitment into clinical trials, the development and routine-care implementation of a clinico-molecular predictive knowledge tool, and molecular-guided therapy recommendations in molecular tumor boards.

Results: Based on the MIRACUM DIC release during the nine-month conceptual phase, first large-scale analyses for stroke and colorectal cancer cohorts have been pursued.

Discussion: Beyond all technological challenges, successfully applying the MIRACUM tools to enrich our knowledge of diagnostic and therapeutic concepts, and thus to support the concept of a Learning Health System, will be crucial for acceptance and sustainability in the medical community and the MIRACUM university hospitals.
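The federated querying concept mentioned under Architectural Framework and Methodology can be illustrated with a minimal Python sketch: each site answers a query against its local repository, and only aggregate results cross institutional boundaries. The site names, cohort counts, and query interface below are invented for illustration and are not the MIRACUM tooling.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-site evaluation; in a federated setup each site would run
# the query against its own repository and return only an aggregate count.
def count_at_site(site: str, criterion: str) -> int:
    local_cohorts = {
        "site-A": {"stroke": 120, "colorectal": 80},
        "site-B": {"stroke": 95,  "colorectal": 60},
    }
    return local_cohorts[site][criterion]

def federated_count(sites, criterion):
    """Fan the query out to all sites; only counts leave each site."""
    with ThreadPoolExecutor() as pool:
        return sum(pool.map(lambda s: count_at_site(s, criterion), sites))

print(federated_count(["site-A", "site-B"], "stroke"))  # 215
```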


2010 ◽  
Vol 11 (3) ◽  
pp. 292-298
Author(s):  
Hongjun SU ◽  
Yehua SHENG ◽  
Yongning WEN ◽  
Min CHEN

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Mario Zanfardino ◽  
Rossana Castaldo ◽  
Katia Pane ◽  
Ornella Affinito ◽  
Marco Aiello ◽  
...  

Abstract

Analysis of large-scale omics data along with biomedical images has gained huge interest in predicting phenotypic conditions towards personalized medicine. Multiple layers of investigation, such as genomics, transcriptomics, and proteomics, have led to high dimensionality and heterogeneity of data. Multi-omics data integration can make a meaningful contribution to early diagnosis and to accurate estimates of prognosis and treatment in cancer. Some multi-layer data structures have been developed to integrate multi-omics biological information, but none of these has been developed and evaluated to include radiomic data. We propose using MultiAssayExperiment (MAE) as an integrated data structure to combine multi-omics data, facilitating the exploration of heterogeneous data. We improved the usability of the MAE by developing a Multi-omics Statistical Approaches (MuSA) tool that uses a Shiny graphical user interface and simplifies the management and analysis of radiogenomic datasets. The capabilities of MuSA are shown using public breast cancer datasets from the TCGA-TCIA databases. The MuSA architecture is modular and can be divided into pre-processing and downstream analysis. The pre-processing section allows data filtering and normalization. The downstream analysis section contains modules for data science methods such as correlation, clustering (i.e., heatmaps), and feature selection. The results are shown dynamically in MuSA. The MuSA tool provides an easy-to-use way to create, manage, and analyze radiogenomic data. The application is specifically designed to guide non-programmer researchers through the different computational steps. Integration analysis is implemented in a modular structure, making MuSA easily extensible open-source software.
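MAE and MuSA are R/Bioconductor and Shiny software; purely to illustrate the underlying container idea (assays aligned on shared sample identifiers so they can be subset together), here is a loose Python/pandas analogue with invented sample and feature names.

```python
import pandas as pd

# Assays share sample identifiers, so they can be filtered together; this is
# the core idea behind MultiAssayExperiment-style containers.
samples = ["s1", "s2", "s3"]
assays = {
    "expression": pd.DataFrame({"GENE1": [5.2, 6.1, 4.8]}, index=samples),
    "radiomics":  pd.DataFrame({"tumor_volume": [13.4, 20.1, 9.7]}, index=samples),
}
clinical = pd.DataFrame({"stage": ["II", "III", "I"]}, index=samples)

# Select one phenotype group across every assay at once.
keep = clinical.index[clinical["stage"] != "I"]
subset = {name: table.loc[keep] for name, table in assays.items()}
for name, table in subset.items():
    print(name, table.shape)
```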


Author(s):  
Uwe Weissflog

Abstract

This paper provides an overview of methods and ideas to achieve data integration in CIM. It describes a dictionary approach that allows participating applications to define their common constructs gradually, as an additional service across application systems. Because of the importance of product definition data, the role of PDES/STEP as part of this dictionary approach is also described. The technical concepts of the dictionary, such as schema mapping, the semantic data model, user methods, and the required additions within participating applications, are explained. Problems related to data integrity, data redundancy, performance, and the binding of dissimilar software components are discussed, as well as the deficiencies of today's data modelling capabilities. Finally, the paper explains the added value an active dictionary can provide to a CIM environment consisting of established applications in heterogeneous environments, where migration to one standardized, homogeneous set of CIM applications is not likely.
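A minimal sketch of the dictionary idea, with invented application and schema names: each application registers how its local schema realises a common construct, and the dictionary resolves the mapping when data must cross system boundaries.

```python
# Hypothetical dictionary entries: each application registers how its local
# schema realises a shared construct, e.g. "part number".
DICTIONARY = {
    "part_number": {
        "CAD": ("components", "comp_id"),
        "ERP": ("items", "item_no"),
    }
}

def resolve(construct: str, application: str) -> str:
    """Map a common construct to an application's local table and column."""
    table, column = DICTIONARY[construct][application]
    return f"{application}: {table}.{column}"

print(resolve("part_number", "ERP"))  # ERP: items.item_no
```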

