Gap between real-world data and clinical research within hospitals in China: a qualitative study

BMJ Open ◽  
2020 ◽  
Vol 10 (12) ◽  
pp. e038375
Author(s):  
Feifei Jin ◽  
Chen Yao ◽  
Xiaoyan Yan ◽  
Chongya Dong ◽  
Junkai Lai ◽  
...  

Objective: To investigate the gap between real-world data and clinical research initiated by doctors in China, explore the potential reasons for this gap and collect different stakeholders’ suggestions. Design: This qualitative study involved three types of hospital personnel based on three interview outlines. The data analysis was performed using the constructivist grounded theory analysis process. Setting: Six tertiary hospitals (three general hospitals and three specialised hospitals) in Beijing, China, were included. Participants: In total, 42 doctors from 12 departments, 5 information technology managers and 4 clinical managers were interviewed through stratified purposive sampling. Results: Electronic medical record data cannot be directly downloaded into clinical research files, which is a major problem in China. The lack of data interoperability, unstructured electronic medical record data and concerns regarding data security create a gap between real-world data and research data. Updating hospital information systems, promoting data standards and establishing an independent clinical research platform may be feasible suggestions for solving the current problems. Conclusions: Determining the causes of gaps and targeted solutions could contribute to the development of clinical research in China. This research suggests that updating the hospital information system, promoting data standards and establishing a clinical research platform could promote the use of real-world data in the future.

2021 ◽  
Author(s):  
Rhonda Facile ◽  
Erin Elizabeth Muhlbradt ◽  
Mengchun Gong ◽  
Qing-Na Li ◽  
Vaishali B. Popat ◽  
...  

BACKGROUND Real World Data (RWD) and Real World Evidence (RWE) have an increasingly important role in clinical research and health care decision making in many countries. In order to leverage RWD and generate reliable RWE, a framework must be in place to ensure that the data is well-defined and structured in a way that is semantically interoperable and consistent across stakeholders. The adoption of data standards is one of the cornerstones supporting high-quality evidence for clinical medicine and therapeutics development. CDISC data standards are mature, globally recognized and heavily utilized by the pharmaceutical industry for regulatory submission in the US and Japan and are recommended in Europe and China. Against this backdrop, the CDISC RWD Connect Initiative was initiated to better understand the barriers to implementing CDISC standards for RWD and to identify the tools and guidance needed to more easily implement CDISC standards for this purpose. We believe that bridging the gap between RWD and clinical trial generated data will benefit all stakeholders. OBJECTIVE The aim of this project was to understand the barriers to implementing CDISC standards for Real World Data (RWD) and to identify what tools and guidance may be needed to more easily implement CDISC standards for this purpose. METHODS We conducted a qualitative Delphi survey involving an Expert Advisory Board (EAB) with multiple key stakeholders, with three rounds of input and review. RESULTS In total, 66 experts participated in round 1, 56 participated in round 2 and 49 participated in round 3 of the Delphi Survey. Their input was collected and analyzed culminating in group statements. It was widely agreed that the standardization of RWD is highly necessary, and the primary focus should be on its ability to improve data-sharing and the quality of RWE. 
The priorities for RWD standardization include electronic health records, such as data shared using HL7 FHIR, and data stemming from observational studies. With different standardization efforts already underway in these areas, a gap analysis should be performed to identify areas where synergies and efficiencies are possible, followed by collaboration with stakeholders to create, or extend existing, mappings between CDISC and other standards, controlled terminologies and models to represent data originating across different sources. CONCLUSIONS There are many ongoing data standardization efforts that span the spectrum of human health data related activities including, but not limited to, those related to healthcare, public health, product or disease registries and clinical research, each with different definitions, levels of granularity and purpose. Amongst these standardization efforts, CDISC has been successful in standardizing clinical trial-based data for regulation worldwide. However, the complexity of the CDISC standards and the fact that they were developed for different purposes, combined with a lack of awareness of and incentives for using a new standard and insufficient training and implementation support, are significant barriers to adopting CDISC standards for RWD. The collection and dissemination of use cases showing in detail how to effectively implement CDISC standards for RWD, the development of tools and support systems specifically for the RWD community, and collaboration with other standards development organizations and initiatives are potential steps towards connecting RWD to research. The integrity of RWE is dependent on the quality of the RWD and the data standards utilized in its collection, integration, processing, exchange and reporting. Using CDISC as part of the database schema will help to link clinical trial data and RWD and promote innovation in health data science. 
The authors believe that CDISC standards, if adapted carefully and presented appropriately to the RWD community, can provide “FAIR” structure and semantics for common clinical concepts and domains and help to bridge the gap between RWD and clinical trial generated data. CLINICALTRIAL Not Applicable


2020 ◽  
Vol 29 (01) ◽  
pp. 203-207
Author(s):  
Christel Daniel ◽  
Dipak Kalra ◽  

Objectives: To summarize key contributions to current research in the field of Clinical Research Informatics (CRI) and to select best papers published in 2019. Method: A bibliographic search using a combination of MeSH descriptors and free-text terms on CRI was performed using PubMed, followed by a double-blind review in order to select a list of candidate best papers to be then peer-reviewed by external reviewers. After peer-review ranking, a consensus meeting between the two section editors and the editorial team was organized to finally conclude on the selected three best papers. Results: Among the 517 papers, published in 2019, returned by the search, that were in the scope of the various areas of CRI, the full review process selected three best papers. The first best paper describes the use of a homomorphic encryption technique to enable federated analysis of real-world data while complying more easily with data protection requirements. The authors of the second best paper demonstrate the evidence value of federated data networks reporting a large real world data study related to the first line treatment for hypertension. The third best paper reports the migration of the US Food and Drug Administration (FDA) adverse event reporting system database to the OMOP common data model. This work opens the combined analysis of both spontaneous reporting system and electronic health record (EHR) data for pharmacovigilance. Conclusions: The most significant research efforts in the CRI field are currently focusing on real world evidence generation and especially the reuse of EHR data. With the progress achieved this year in the areas of phenotyping, data integration, semantic interoperability, and data quality assessment, real world data is becoming more accessible and reusable. 
High quality data sets are key assets not only for large scale observational studies or for changing the way clinical trials are conducted but also for developing or evaluating artificial intelligence algorithms guiding clinical decision for more personalized care. And lastly, security and confidentiality, ethical and regulatory issues, and more generally speaking data governance are still active research areas this year.


10.2196/27172 ◽  
2021 ◽  
Author(s):  
Vendula Churová ◽  
Roman Vyškovský ◽  
Kateřina Maršálová ◽  
David Kudláček ◽  
Daniel Schwarz

2018 ◽  
Author(s):  
Sujay S Kakarmath ◽  
Neda Derakhshani ◽  
Sara B. Golas ◽  
Jennifer Felsted ◽  
Takuma Shibahara ◽  
...  

BACKGROUND Heart failure (HF) patients have a high readmission rate, with approximately 20% of patients being readmitted within 30 days after discharge. Hospital interventions to reduce HF readmissions are resource- and effort-intensive. Widespread availability of electronic medical record data has spurred interest in using machine learning-based techniques for risk stratification of heart failure patients. The predictive performance of machine learning-based predictive models is often evaluated solely using the Area Under the Receiver Operating Characteristic (AUROC) curve. However, the AUROC is independent of prevalence; therefore, predictive models with the same AUROC can have differential clinical utility. Furthermore, the AUROC does not provide any insight into the presence of overfitting or decay in predictive performance of a model over time, both of which can affect its real-world performance. OBJECTIVE Our primary objective is to assess real-world performance of a 30-day readmission risk prediction model for HF patients, which had an AUROC of 0.71 in the training dataset. METHODS Predictions for risk of 30-day readmissions in HF patients in the Partners Healthcare System were prospectively obtained from the model. We assessed the positive (PPV) and negative predictive value (NPV), in addition to sensitivity, specificity, accuracy, model calibration and Brier score. RESULTS Four hundred twenty index admissions that were not part of the training dataset were included in this prospective evaluation. The readmission rate was 24% (101 readmissions within 30 days). The AUROC of the predictive model was 0.57. At a discrimination threshold of 0.2 for flagging high-risk index admissions, the sensitivity and specificity of the model were 53.46% and 63.32%, respectively. The PPV and NPV were 31.57% and 81.12%, respectively. The Brier score was 0.19. CONCLUSIONS Our analysis offers important insights about the real-world performance of this predictive model. 
The NPV suggests that the model’s predictions about patients at low risk for readmission are reliable. This insight can be useful in optimizing resource allocation for patients with heart failure.
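
The threshold-based metrics reported in this abstract can be illustrated with a short Python sketch. The abstract does not give the confusion-matrix counts; the values below (54 true positives, 47 false negatives, 202 true negatives, 117 false positives) are back-calculated here from the 420 index admissions, the 101 readmissions and the reported percentages, so they are an assumption rather than figures from the authors. The class `ConfusionMatrix` and the helper `ppv_at_prevalence` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts of predictions at a fixed decision threshold."""
    tp: int  # flagged high-risk and readmitted
    fp: int  # flagged high-risk but not readmitted
    tn: int  # not flagged and not readmitted
    fn: int  # not flagged but readmitted

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    @property
    def accuracy(self) -> float:
        total = self.tp + self.fp + self.tn + self.fn
        return (self.tp + self.tn) / total


def ppv_at_prevalence(sensitivity: float, specificity: float,
                      prevalence: float) -> float:
    """PPV via Bayes' rule for a given outcome prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Assumed counts, inferred from the reported percentages (not stated
# in the abstract): 54 + 47 = 101 readmissions, 420 admissions total.
cm = ConfusionMatrix(tp=54, fp=117, tn=202, fn=47)

if __name__ == "__main__":
    print(f"sensitivity = {cm.sensitivity:.4f}")
    print(f"specificity = {cm.specificity:.4f}")
    print(f"PPV         = {cm.ppv:.4f}")
    print(f"NPV         = {cm.npv:.4f}")
    # Same sensitivity and specificity, lower prevalence: PPV drops.
    print(f"PPV at 10% prevalence = "
          f"{ppv_at_prevalence(cm.sensitivity, cm.specificity, 0.10):.4f}")
```

The `ppv_at_prevalence` helper shows why two models with identical sensitivity and specificity (and hence the same ROC operating point) can differ in clinical utility: holding those two quantities fixed, PPV falls as the outcome becomes rarer.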


Author(s):  
Yuan Fang ◽  
Ming-Wei Chang

Microblogs present an excellent opportunity for monitoring and analyzing world happenings. Given that words are often ambiguous, entity linking becomes a crucial step towards understanding microblogs. In this paper, we re-examine the problem of entity linking on microblogs. We first observe that spatiotemporal ( i.e., spatial and temporal) signals play a key role, but they are not utilized in existing approaches. Thus, we propose a novel entity linking framework that incorporates spatiotemporal signals through a weakly supervised process. Using entity annotations on real-world data, our experiments show that the spatiotemporal model improves F1 by more than 10 points over existing systems. Finally, we present a qualitative study to visualize the effectiveness of our approach.


2021 ◽  
Author(s):  
Vendula Churová ◽  
Roman Vyškovský ◽  
Kateřina Maršálová ◽  
David Kudláček ◽  
Daniel Schwarz

BACKGROUND Statistical analysis, which has become an integral part of evidence-based medicine, relies heavily on data quality that is of critical importance in modern clinical research. Input data are not only at risk of being falsified or fabricated, but also of being mishandled by investigators. OBJECTIVE The urgent need to assure the highest data quality possible has led to implementation of various auditing strategies designed to monitor clinical trials and detect errors of different origin that frequently occur in the field. METHODS An automatic anomaly detection algorithm based on machine learning that combines clustering with a series of distance metrics is presented. RESULTS The algorithm is built in a particular electronic data capture (EDC) system that stores real-world data in clinical registries. These data, together with newly generated, simulated anomalous data were utilized to evaluate the detection performance of this algorithm. CONCLUSIONS The experimental results demonstrate that the algorithm, which is universal, and as such may be implemented in other EDC systems, is capable of anomalous data detection with sensitivity exceeding 85%.

