The Biomedical Research Hub: A Federated Platform for Patient Research Data

Author(s):  
Craig Barnes ◽  
Binam Bajracharya ◽  
Matthew Cannalte ◽  
Zakir Gowani ◽  
Will Haley ◽  
...  

Objective. The objective was to develop and operate a cloud-based federated system for managing, analyzing, and sharing patient data for research purposes, while allowing each resource sharing patient data to operate its component under its own governance rules. The federated system is called the Biomedical Research Hub (BRH). Methods. The BRH is a cloud-based federated system built over a core set of software services called framework services. BRH framework services include authentication and authorization, services for generating and assessing FAIR data, and services for importing and exporting bulk clinical data. The BRH includes data resources operated by different entities and workspaces that can access and analyze data from one or more of those data resources. Results. The BRH contains multiple data commons that in aggregate provide access to over 6 PB of research data from over 400,000 research participants. Discussion and conclusion. With the growing acceptance of public cloud computing platforms for biomedical research, and the growing use of opaque persistent digital identifiers for datasets, data objects, and other entities, there is now a foundation for systems that federate data from multiple independently operated data resources, each exposing FAIR APIs over its own data model. Applications can then be built that access data from one or more of these resources.
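The discussion above rests on opaque persistent identifiers that resolve through FAIR APIs to data objects. A minimal sketch of how a client might translate a GA4GH DRS-style identifier into the HTTPS URL of its metadata record, following the hostname-based DRS convention (the host and object ID below are invented for illustration, and this is not the BRH's actual API):

```python
from urllib.parse import urlparse

def drs_to_https(drs_uri: str) -> str:
    """Translate a GA4GH DRS-style URI into its HTTPS metadata URL.

    Hostname-based DRS convention:
    drs://<host>/<object_id> -> https://<host>/ga4gh/drs/v1/objects/<object_id>
    """
    parsed = urlparse(drs_uri)
    if parsed.scheme != "drs":
        raise ValueError(f"not a DRS URI: {drs_uri}")
    object_id = parsed.path.lstrip("/")
    return f"https://{parsed.netloc}/ga4gh/drs/v1/objects/{object_id}"

# Hypothetical identifier, for illustration only:
print(drs_to_https("drs://data.example.org/dg.1234/abcd-ef01"))
# https://data.example.org/ga4gh/drs/v1/objects/dg.1234/abcd-ef01
```

Because the identifier is opaque, a client needs no knowledge of which entity operates the resource; the resolved metadata record then points at the actual bytes.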

2019 ◽  
Vol 11 (2) ◽  
pp. 189-207
Author(s):  
Erwin Tantoso ◽  
Wing-Cheong Wong ◽  
Wei Hong Tay ◽  
Joanne Lee ◽  
Swati Sinha ◽  
...  

2020 ◽  
Author(s):  
Oliver Maassen ◽  
Sebastian Fritsch ◽  
Julia Gantner ◽  
Saskia Deffge ◽  
Julian Kunze ◽  
...  

BACKGROUND The increasing development of artificial intelligence (AI) systems in medicine, driven by researchers and entrepreneurs, is accompanied by high expectations for advances in medical care. AI might change the clinical practice of physicians in almost all medical disciplines and in most areas of healthcare. While expectations for AI in medicine are high, practical implementations of AI in clinical practice are still scarce in Germany. Moreover, physicians’ requirements and expectations of AI in medicine, and their opinion on the use of anonymized patient data for clinical and biomedical research, have not been widely investigated in German university hospitals. OBJECTIVE To evaluate physicians’ requirements and expectations of AI in medicine and their opinion on the secondary use of patient data for (bio)medical research, e.g. for the development of machine learning (ML) algorithms, in university hospitals in Germany. METHODS A web-based survey was conducted addressing physicians of all medical disciplines in 8 German university hospitals. Answers were collected on Likert scales, along with general demographic information. Physicians were invited to participate via email at their respective hospitals. RESULTS 121 (39.9%) female and 173 (57.1%) male physicians (N=303) from a wide range of medical disciplines and work experience levels completed the online survey. The majority of respondents had either a positive (130/303, 42.9%) or a very positive attitude (82/303, 27.1%) towards AI in medicine. A vast majority of physicians expected the future of medicine to be a mix of human and artificial intelligence (273/303, 90.1%) but also requested a scientific evaluation before the routine implementation of AI-based systems (276/303, 91.1%). Physicians were most optimistic that AI applications would identify drug interactions (280/303, 92.4%) and thereby substantially improve patient care, but were quite reserved regarding AI-supported diagnosis of psychiatric diseases (62/303, 20.5%). 82.5% of respondents (250/303) agreed that there should be open access to anonymized patient databases for medical and biomedical research. CONCLUSIONS Physicians in inpatient care at German university hospitals show a generally positive attitude towards using most AI applications in medicine. Along with this optimism come several expectations and hopes that AI will assist physicians in clinical decision making. Especially in fields of medicine where large amounts of data are processed (e.g., imaging procedures in radiology and pathology) or data are collected continuously (e.g., cardiology and intensive care medicine), physicians’ expectations of substantially improved future patient care are high. However, regulatory and organizational challenges must still be overcome before AI can be used routinely in healthcare.
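Each percentage in the abstract follows directly from the reported count and N=303; a quick check reproduces every figure:

```python
# Counts reported in the survey abstract (N=303 respondents).
N = 303
counts = {
    "positive attitude": 130,
    "very positive attitude": 82,
    "expect a human/AI mix": 273,
    "request scientific evaluation": 276,
    "AI will identify drug interactions": 280,
    "open access to anonymized databases": 250,
}

for label, k in counts.items():
    # Round to one decimal place, as the abstract does.
    print(f"{label}: {k}/{N} = {100 * k / N:.1f}%")
```

All six rounded values match the percentages given in the text (42.9, 27.1, 90.1, 91.1, 92.4, and 82.5).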


2021 ◽  
Author(s):  
Anita Bandrowski ◽  
Jeffrey S. Grethe ◽  
Anna Pilko ◽  
Tom Gillespie ◽  
Gabi Pine ◽  
...  

Abstract. The NIH Common Fund’s Stimulating Peripheral Activity to Relieve Conditions (SPARC) initiative is a large-scale program that seeks to accelerate the development of therapeutic devices that modulate electrical activity in nerves to improve organ function. Integral to the SPARC program are the rich anatomical and functional datasets produced by investigators across the SPARC consortium that provide key details about organ-specific circuitry, including structural and functional connectivity, mapping of cell types, and molecular profiling. These datasets are provided to the research community through an open data platform, the SPARC Portal. To ensure SPARC datasets are Findable, Accessible, Interoperable and Reusable (FAIR), they are all submitted to the SPARC Portal following a standard scheme established by the SPARC Curation Team, called the SPARC Data Structure (SDS). Inspired by the Brain Imaging Data Structure (BIDS), the SDS has been designed to capture the large variety of data generated by SPARC investigators, who come from all fields of biomedical research. Here we present the rationale and design of the SDS, including a description of the SPARC curation process and the automated tools for complying with the SDS, such as the SDS validator and Software to Organize Data Automatically (SODA) for SPARC. The objective is to provide detailed guidelines for anyone seeking to comply with the SDS. Since the SDS is suitable for any type of biomedical research data, it can be adopted by any group that wants to follow the FAIR data principles for managing its data, even outside of the SPARC consortium. Finally, this manuscript provides a foundational framework that can be used by any organization wishing either to adapt the SDS to the specific needs of its data or to design its own FAIR data-sharing scheme from scratch.
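For readers curious what a structure check of this kind might look like, here is a deliberately simplified sketch. The required file and folder names below are illustrative assumptions about an SDS-like layout; the official SDS specification and validator define the authoritative rules:

```python
from pathlib import Path

# Hypothetical, simplified subset of an SDS-like dataset layout.
# The real SPARC Data Structure defines the authoritative file list.
REQUIRED_FILES = ("dataset_description.xlsx", "subjects.xlsx")
REQUIRED_DIRS = ("primary",)

def validate_sds_layout(root: Path) -> list:
    """Return human-readable problems; an empty list means the layout passes."""
    problems = []
    for name in REQUIRED_FILES:
        if not (root / name).is_file():
            problems.append(f"missing required file: {name}")
    for name in REQUIRED_DIRS:
        if not (root / name).is_dir():
            problems.append(f"missing required folder: {name}")
    return problems

# An empty or nonexistent dataset root fails every check:
print(validate_sds_layout(Path("empty-dataset")))
```

A standard layout makes this kind of automated validation cheap, which is what lets tools like SODA organize and check submissions before curation.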


2021 ◽  
Author(s):  
Shyam Visweswaran ◽  
Brian McLay ◽  
Nickie Capella ◽  
Michele Morris ◽  
John T Milnes ◽  
...  

Objective As a long-standing Clinical and Translational Science Awards (CTSA) Program hub, the University of Pittsburgh and the University of Pittsburgh Medical Center (UPMC) developed and implemented a modern research data warehouse (RDW) to efficiently provision electronic patient data for clinical and translational research. Methods Because UPMC is one of the largest health care systems in the US, with electronic health record (EHR) systems from multiple vendors, we designed and implemented an RDW named Neptune to serve the specific needs of our CTSA. Neptune uses an atomic design in which data are stored at a high level of granularity, as represented in source systems. Neptune contains robust patient identity management tailored for research; integrates patient data from multiple sources, including EHRs, health plans, and research studies; and includes knowledge for mapping to standard terminologies. Neptune enables efficient provisioning of data to large analytics-oriented data models and to individual investigators. Results Neptune contains data for more than 5 million patients, longitudinally organized as a HIPAA Limited Data Set with dates, and includes structured EHR data, clinical documents, health insurance claims, and research data. Neptune is used as a source of patient data for hundreds of IRB-approved research projects by local investigators and for national projects such as the Accrual to Clinical Trials (ACT) network, the All of Us Research Program, and the National Patient-Centered Clinical Research Network. Discussion The design of Neptune was heavily influenced by the large size of UPMC, the varied data sources, and the close partnership between the University and the healthcare system. It features several desiderata of an RDW, including robust protected health information management, an extensible information storage model, and binding to standard terminologies at the time of data delivery. It also has several unique aspects, including a physical warehouse that straddles the University of Pittsburgh and UPMC networks and is managed under a HIPAA Business Associate Agreement.
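"Binding to standard terminologies at the time of data delivery" suggests keeping source-system codes in atomic storage and applying a mapping table only when a dataset is provisioned. A toy sketch of that pattern; the source codes and mapping table here are invented for illustration, and Neptune's actual mechanism is not described at this level of detail:

```python
# Atomic storage keeps the codes exactly as the source systems emit them;
# a curated mapping table is consulted only at delivery time. All source
# system names and codes below are hypothetical.
SOURCE_TO_LOINC = {
    ("vendor_a", "GLU-SER"): "2345-7",   # serum glucose
    ("vendor_b", "LAB1234"): "2345-7",   # same concept, different source code
}

def deliver(rows):
    """Attach a standard code to each atomic row at delivery time.

    Rows whose source code has no curated mapping get None, so gaps in
    the terminology knowledge are visible rather than silently dropped.
    """
    for row in rows:
        key = (row["source_system"], row["source_code"])
        yield {**row, "standard_code": SOURCE_TO_LOINC.get(key)}

rows = [{"source_system": "vendor_b", "source_code": "LAB1234", "value": 5.4}]
print(list(deliver(rows)))
```

Deferring the binding this way lets the warehouse re-deliver the same atomic data against updated terminology mappings without rewriting stored records.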


2021 ◽  
Author(s):  
Núria Queralt-Rosinach ◽  
Rajaram Kaliyaperumal ◽  
César H. Bernabé ◽  
Qinqin Long ◽  
Simone A. Joosten ◽  
...  

Abstract. Background. The COVID-19 pandemic has challenged healthcare systems and research worldwide. Data are collected all over the world and need to be integrated and made available to other researchers quickly. However, the heterogeneous information systems used in hospitals can fragment health data over multiple data ‘silos’ that are not interoperable for analysis. Consequently, clinical observations in hospitalised patients cannot be reused efficiently or in a timely manner. Research data management in hospitals needs to be adapted to make COVID-19 observational patient data machine actionable, i.e. more Findable, Accessible, Interoperable and Reusable (FAIR) for humans and machines. We therefore applied the FAIR principles in the hospital to make patient data more FAIR. Results. In this paper, we present our FAIR approach to transform COVID-19 observational patient data collected in the hospital into machine-actionable digital objects that answer medical doctors’ research questions. To this end, we conducted a coordinated FAIRification among stakeholders based on ontological models for data and metadata, and a FAIR-based architecture that complements the existing data management. We applied FAIR Data Points for metadata exposure, turning investigational parameters into a FAIR dataset. We demonstrated that this dataset is machine actionable by means of three different computational activities: federated querying of patient data along open existing knowledge sources across the world through the Semantic Web, implementing Web APIs for data query interoperability, and building applications on top of these FAIR patient data for FAIR data analytics in the hospital. Conclusions. Our work demonstrates that a FAIR research data management plan based on ontological models for data and metadata, open science, Semantic Web technologies, and FAIR Data Points provides a hospital data infrastructure for machine-actionable FAIR digital objects. These FAIR data are ready to be reused for federated analysis, linkable to other FAIR data such as Linked Open Data, and reusable for developing software applications for hypothesis generation and knowledge discovery.
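Federated querying of patient data alongside open knowledge sources through the Semantic Web typically relies on SPARQL's SERVICE clause, which lets one endpoint pull matching triples from another. A sketch of composing such a query in Python; the graph name, endpoint URL, and property IRIs are placeholders, not the paper's actual resources:

```python
# Compose a federated SPARQL query: the local hospital graph is joined
# against a remote open knowledge source via the SERVICE keyword.
# All IRIs below are hypothetical placeholders.
def federated_query(local_graph: str, remote_endpoint: str) -> str:
    return f"""
PREFIX ex: <https://example.org/terms/>
SELECT ?patient ?observation ?concept WHERE {{
  GRAPH <{local_graph}> {{
    ?patient ex:hasObservation ?observation .
    ?observation ex:aboutConcept ?concept .
  }}
  SERVICE <{remote_endpoint}> {{
    ?concept ex:associatedWith ?relatedConcept .
  }}
}}
"""

q = federated_query("https://hospital.example/covid19",
                    "https://open-kb.example/sparql")
print(q)
```

Because the query only names graphs and endpoints, the patient data can stay inside the hospital while the analysis still draws on external Linked Open Data.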


2019 ◽  
Vol 36 (1) ◽  
pp. 1-9 ◽  
Author(s):  
Vahid Jalili ◽  
Enis Afgan ◽  
James Taylor ◽  
Jeremy Goecks

Abstract Motivation Large biomedical datasets, such as those from genomics and imaging, are increasingly being stored on commercial and institutional cloud computing platforms. This is because cloud-scale computing resources, from robust backup to high-speed data transfer to scalable compute and storage, are needed to make these large datasets usable. However, one challenge for large-scale biomedical data on the cloud is providing secure access, especially when datasets are distributed across platforms. While there are open Web protocols for secure authentication and authorization, these protocols are not in wide use in bioinformatics and are difficult to use for even technologically sophisticated users. Results We have developed a generic and extensible approach for securely accessing biomedical datasets distributed across cloud computing platforms. Our approach combines OpenID Connect and OAuth2, best-practice Web protocols for authentication and authorization, together with Galaxy (https://galaxyproject.org), a web-based computational workbench used by thousands of scientists across the world. With our enhanced version of Galaxy, users can access and analyze data distributed across multiple cloud computing providers without any special knowledge of access/authorization protocols. Our approach does not require users to share permanent credentials (e.g. username, password, API key), instead relying on automatically generated temporary tokens that refresh as needed. Our approach is generalizable to most identity providers and cloud computing platforms. To the best of our knowledge, Galaxy is the only computational workbench where users can access biomedical datasets across multiple cloud computing platforms using best-practice Web security approaches and thereby minimize risks of unauthorized data access and credential use. 
Availability and implementation Freely available for academic and commercial use under the open-source Academic Free License (https://opensource.org/licenses/AFL-3.0) from the following GitHub repositories: https://github.com/galaxyproject/galaxy and https://github.com/galaxyproject/cloudauthz.
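The temporary-token behaviour described above corresponds to the OAuth2 refresh-token grant: a short-lived access token is renewed before expiry using a longer-lived refresh token, so permanent credentials never leave the identity provider. A sketch of the two pieces involved; the client ID is a placeholder, and Galaxy's actual CloudAuthz implementation differs in detail:

```python
import time
from urllib.parse import urlencode

def needs_refresh(expires_at: float, skew: float = 60.0) -> bool:
    """Refresh slightly before expiry so in-flight requests never fail."""
    return time.time() >= expires_at - skew

def refresh_request_body(refresh_token: str, client_id: str) -> str:
    """Form-encoded body for the standard OAuth2 refresh-token grant."""
    return urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
    })

# Placeholder values for illustration:
print(refresh_request_body("abc123", "galaxy-client"))
# grant_type=refresh_token&refresh_token=abc123&client_id=galaxy-client
```

The body is POSTed to the identity provider's token endpoint; the response carries a fresh access token and its new expiry, so the cycle repeats without any stored password or API key.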

