Increasing Trust in Real-World Evidence Through Evaluation of Observational Data Quality

Author(s):  
Clair Blacketer ◽  
Frank J Defalco ◽  
Patrick B Ryan ◽  
Peter R Rijnbeek

Advances in standardization of observational healthcare data have enabled methodological breakthroughs, rapid global collaboration, and generation of real-world evidence to improve patient outcomes. Standardizations in data structure, such as use of Common Data Models (CDM), need to be coupled with standardized approaches for data quality assessment. To ensure confidence in real-world evidence generated from the analysis of real-world data, one must first have confidence in the data itself. The Data Quality Dashboard is an open-source R package that reports potential quality issues in an OMOP CDM instance through the systematic execution and summarization of over 3,300 configurable data quality checks. We describe the implementation of check types across a data quality framework of conformance, completeness, and plausibility, with both verification and validation. We illustrate how data quality checks, paired with decision thresholds, can be configured to customize data quality reporting across a range of observational health data sources. We discuss how data quality reporting can become part of the overall real-world evidence generation and dissemination process to promote transparency and build confidence in the resulting output. Transparently communicating how well CDM standardized databases adhere to a set of quality measures adds a crucial piece that is currently missing from observational research. Assessing and improving the quality of our data will inherently improve the quality of the evidence we generate.
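The idea of pairing a data quality check with a decision threshold can be illustrated with a small sketch. This is not the Data Quality Dashboard's API (that package is written in R); the `CheckResult` class, the `completeness_check` function, and the toy records are hypothetical names introduced only for illustration, and the threshold values are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    pct_violated: float   # percentage of rows that violate the check
    threshold_pct: float  # configured failure threshold, in percent
    failed: bool          # True when pct_violated exceeds the threshold


def completeness_check(rows, field, threshold_pct):
    """Flag a field whose share of missing values exceeds the threshold."""
    total = len(rows)
    missing = sum(1 for row in rows if row.get(field) is None)
    pct = 100.0 * missing / total if total else 0.0
    return CheckResult(
        name=f"completeness:{field}",
        pct_violated=pct,
        threshold_pct=threshold_pct,
        failed=pct > threshold_pct,
    )


# Hypothetical toy records; the field name is borrowed from the OMOP PERSON table.
persons = [
    {"person_id": 1, "year_of_birth": 1980},
    {"person_id": 2, "year_of_birth": None},
    {"person_id": 3, "year_of_birth": 1975},
    {"person_id": 4, "year_of_birth": 1990},
]

# The same check evaluated against two different (arbitrary) thresholds.
strict = completeness_check(persons, "year_of_birth", threshold_pct=5.0)
lenient = completeness_check(persons, "year_of_birth", threshold_pct=30.0)
print(strict.failed, lenient.failed)
```

The same data can pass or fail depending on the configured threshold, which is why a threshold suited to one data source may need adjusting for another.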

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Yiqing Zhao ◽  
Saravut J. Weroha ◽  
Ellen L. Goode ◽  
Hongfang Liu ◽  
Chen Wang

Background: Next-generation sequencing provides comprehensive information about individuals’ genetic makeup and is commonplace in oncology clinical practice. However, the utility of genetic information in the clinical decision-making process has not been examined extensively from a real-world, data-driven perspective. Through mining real-world data (RWD) from clinical notes, we could extract patients’ genetic information and further associate treatment decisions with genetic information.
Methods: We proposed a real-world evidence (RWE) study framework that incorporates context-based natural language processing (NLP) methods and data quality examination before final association analysis. The framework was demonstrated in a Foundation-tested women’s cancer cohort (N = 196). Upon retrieval of patients’ genetic information using the NLP system, we assessed the completeness of genetic data captured in unstructured clinical notes according to a genetic data model. We examined the distribution of different topics regarding BRCA1/2 throughout patients’ treatment process, and then analyzed the association between BRCA1/2 mutation status and the discussion/prescription of targeted therapy.
Results: We identified seven topics in the clinical context of genetic mentions: Information, Evaluation, Insurance, Order, Negative, Positive, and Variants of unknown significance. Our rule-based system achieved a precision of 0.87, recall of 0.93 and F-measure of 0.91. Our machine learning system achieved a precision of 0.901, recall of 0.899 and F-measure of 0.9 for four-topic classification and a precision of 0.833, recall of 0.823 and F-measure of 0.82 for seven-topic classification. We found that, in result-containing sentences, the capture of BRCA1/2 mutation information was 75%, but detailed variant information (e.g. variant types) was largely missing. Using cleaned RWD, significant associations were found between BRCA1/2 positive mutation and targeted therapies.
Conclusions: We demonstrated a framework to generate RWE using RWD from different clinical sources. The rule-based NLP system achieved the best performance for resolving contextual variability when extracting RWD from unstructured clinical notes. Data quality issues such as incompleteness and discrepancies exist; thus, manual data cleaning is needed before further analysis can be performed. Finally, we were able to use cleaned RWD to evaluate the real-world utility of genetic information to initiate a prescription of targeted therapy.
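For readers who want to relate the precision, recall and F-measure figures above, the F-measure is conventionally the (weighted) harmonic mean of precision and recall. The sketch below assumes that standard definition with beta = 1; the abstract does not say how the reported values were averaged across topics, so recomputing from the rounded precision and recall need not reproduce the reported F-measures exactly. The function name `f_measure` is introduced here for illustration.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was retrieved correctly
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Precision/recall pairs as quoted in the abstract (already rounded there).
reported = {
    "rule-based": (0.87, 0.93),
    "machine learning, four topics": (0.901, 0.899),
    "machine learning, seven topics": (0.833, 0.823),
}

for system, (p, r) in reported.items():
    print(f"{system}: F1 recomputed from rounded inputs = {f_measure(p, r):.3f}")
```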


2020 ◽  
Author(s):  
Yiqing ZHAO ◽  
Saravut J Weroha ◽  
Ellen Goode ◽  
Hongfang Liu ◽  
Chen Wang

Background: Next-generation sequencing provides comprehensive information about individuals’ genetic makeup and is commonplace in oncology clinical practice. However, the utility of genetic information in the clinical decision-making process has not been examined extensively from a real-world, data-driven perspective. Through mining real-world data (RWD) from clinical notes, we could extract patients’ genetic information and further associate treatment decisions with genetic information.
Methods: We proposed a real-world evidence (RWE) study framework that incorporates context-based natural language processing (NLP) methods and data quality examination before final association analysis. The framework was demonstrated on a Foundation-tested women’s cancer cohort (N=196). Upon retrieval of patients’ genetic information using the NLP system, we assessed completeness of genetic data captured in unstructured clinical notes according to a genetic data model. We examined the distribution of different topics regarding BRCA1/2 throughout patients’ treatment process, and then analyzed the association between BRCA1/2 mutation status and the discussion/prescription of targeted therapy.
Results: We identified seven topics in the clinical context of genetic mentions: Information, Evaluation, Insurance, Order, Negative, Positive, and Variants of unknown significance (VUS). Our rule-based system achieved a precision of 0.87, recall of 0.93 and F-measure of 0.91. Our machine learning system achieved a precision of 0.901, recall of 0.899 and F-measure of 0.9 for four-topic classification and a precision of 0.833, recall of 0.823 and F-measure of 0.82 for seven-topic classification. We found that, in result-containing sentences, capture of BRCA1/2 mutation information was 75%, but detailed variant information (e.g. variant types) was largely missing. Using cleaned RWD, significant associations were found between BRCA1/2 positive mutation and targeted therapies.
Conclusions: We demonstrated a framework to generate RWE using RWD from different clinical sources. The rule-based NLP system achieved the best performance for resolving contextual variability when extracting RWD from unstructured clinical notes. Data quality issues such as incompleteness and discrepancies exist; thus, manual data cleaning is needed before further analysis can be performed. Finally, we were able to use cleaned RWD to evaluate the real-world utility of genetic information to initiate prescription of targeted therapy.


2019 ◽  
Vol 14 (1) ◽  
pp. 174-179 ◽  
Author(s):  
David C. Klonoff

Real-world evidence (RWE) is the clinical evidence about benefits or risks of medical products derived from analyzing real-world data (RWD), which are data collected through routine clinical practice. This article discusses the advantages and disadvantages of RWE studies, how these studies differ from randomized controlled trials (RCTs), how to overcome barriers to current skepticism about RWE, how the FDA is using RWE, how to improve the quality of RWE, and finally the future of RWE trials.


Oncology ◽  
2021 ◽  
Vol 99 (Suppl. 1) ◽  
pp. 3-7
Author(s):  
George D. Demetri ◽  
Silvia Stacchiotti

Real-world data are defined as data relating to any aspect of a patient’s health status collected in the context of routine health surveillance and medical care delivery. Sources range from insurance billing claims through to electronic surveillance data (e.g., activity trackers). Real-world data derive from large populations in diverse clinical settings and thus can be extrapolated more readily than clinical trial data to patients in different clinical settings or with a variety of comorbidities. Real-world data are used to generate real-world evidence, which might be regarded as a “meta-analysis” of accumulated real-world data. Increasingly, regulatory authorities are recognizing the value of real-world data and real-world evidence, especially for rare diseases where it may be practically unfeasible to conduct randomized controlled trials. However, the quality of real-world evidence depends on the quality of the data collected which, in turn, depends on a correct pathological diagnosis and the homogeneous behaviour of a reliably defined and consistent disease entity. As each of the more than 80 varieties of soft tissue sarcoma (STS) types represents a distinct disease entity, the situation is exceedingly complicated. Discordant diagnoses, which affect data quality, present a major challenge for use of real-world data. As real-world data are difficult to collect, collaboration across sarcoma reference institutions and sophisticated information technology solutions are required before the potential of real-world evidence to inform decision-making in the management of STS can be fully exploited.


2021 ◽  
Author(s):  
Qian Li ◽  
Hansi Zhang ◽  
Zhaoyi Chen ◽  
Yi Guo ◽  
Thomas George ◽  
...  

Recently, there has been growing interest in using real-world data (RWD) to generate real-world evidence (RWE) that complements clinical trials. To quantify treatment effects, however, it is important to develop meaningful RWD-based endpoints. In cancer trials, two real-world endpoints are of particular interest: real-world overall survival (rwOS) and real-world time to next treatment (rwTTNT). In this work, we identified ways to calculate these real-world endpoints with structured EHR data, and validated them against the gold-standard measurements derived from linked EHR and TR data. In addition, we examined and reported data quality issues, especially inconsistencies between the EHR and TR data. Using a survival model, our results showed that patients (1) without subsequent chemotherapy or (2) with subsequent chemotherapy and longer rwTTNT would have longer rwOS, supporting the validity of rwTTNT as a real-world surrogate marker for measuring cancer endpoints.
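A minimal sketch of how such endpoints might be derived from structured dates follows. The abstract does not give the authors' operational definitions, so everything here is an assumption made for illustration: rwOS is taken as days from the start of first treatment to death (censored at last contact), rwTTNT as days from the start of first treatment to the start of the next treatment (censored at death or last contact), and the `Patient` fields are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class Patient:
    first_treatment_start: date
    next_treatment_start: Optional[date]  # None if no subsequent treatment
    death_date: Optional[date]            # None if not known to have died
    last_contact: date


def rw_os(p: Patient) -> Tuple[int, bool]:
    """Return (days, event_observed) under the assumed rwOS definition."""
    end = p.death_date or p.last_contact
    return (end - p.first_treatment_start).days, p.death_date is not None


def rw_ttnt(p: Patient) -> Tuple[int, bool]:
    """Return (days, event_observed) under the assumed rwTTNT definition."""
    if p.next_treatment_start is not None:
        return (p.next_treatment_start - p.first_treatment_start).days, True
    # No next treatment recorded: censor at death or last contact (an assumption).
    end = p.death_date or p.last_contact
    return (end - p.first_treatment_start).days, False


# Hypothetical patients, not data from the study.
treated_again = Patient(date(2020, 1, 1), date(2020, 7, 1), date(2021, 1, 1), date(2021, 1, 1))
censored = Patient(date(2021, 3, 1), None, None, date(2021, 3, 31))

print(rw_os(treated_again), rw_ttnt(treated_again))
print(rw_os(censored), rw_ttnt(censored))
```

The (days, event_observed) pairs are the usual input shape for a survival model, where the flag distinguishes observed events from censored follow-up.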


Author(s):  
Giovanni Paoletti ◽  
Danilo Di Bona ◽  
Derek Chu ◽  
Davide Firinu ◽  
Enrico Heffler ◽  
...  

Although there is a considerable body of knowledge about allergen immunotherapy (AIT), there is a lack of data on the reliability of real-world evidence (RWE) in AIT and, consequently, a lack of information on how effectively AIT works in real life. To address the current unmet need for an appraisal of the quality of RWE in AIT, the European Academy of Allergy and Clinical Immunology Methodology Committee recently initiated a systematic review of observational studies of AIT, which will use the RELEVANT tool and the Grading of Recommendations Assessment, Development and Evaluation approach (GRADE) to rate the quality of the evidence base as a whole. The next step will be to develop a broadly applicable, pragmatic “real-world” database using systematic data collection. Based on the current RWE base, and perspectives and recommendations of authorities and scientific societies, a hierarchy of RWE in AIT is proposed, which places pragmatic trials and registry data at the highest level of evidence. There is a need to establish more AIT registries that collect data in a cohesive way, using standardised protocols. This will provide an essential source of real-world data that can be easily shared, promoting evidence-based research and quality improvement in study design and clinical decision-making.


2020 ◽  
Vol 36 (4) ◽  
pp. 459-468 ◽  
Author(s):  
Karen M. Facey ◽  
Piia Rannanheimo ◽  
Laura Batchelor ◽  
Marine Borchardt ◽  
Jo de Cock

Objectives: There are divergent views on the potential of real-world data (RWD) to inform decisions made by regulators, health technology assessment (HTA) bodies, payers, clinicians, and patients. This RWE4Decisions initiative explored the particularly challenging setting of highly innovative technologies, which require Payers/HTAs to make decisions on a small evidence base with major uncertainties. The aim was to go beyond strategic intent to consider actions that each stakeholder could take to improve use of RWD in this setting.
Results: Case studies of recent Payer/HTA decisions about highly innovative technologies were considered in light of recent international initiatives about RWD. This showed a lack of clarity about the Payer/HTA questions that could be answered by RWD and how the quality of real-world evidence (RWE) could be assessed. All stakeholders worked together to create a vision whereby stakeholders agree what RWD can be collected for highly innovative technologies based on principles of collaboration and transparency. For each stakeholder group, recommended actions to support the generation, analysis, and interpretation of RWD to inform decision making were developed. For HTA bodies, this includes cross border HTA/regulatory collaboration to agree RWD requirements over the technology life cycle to inform initial recommendations and reassessment, data analytics methods development for HTA, and promotion of transparency in RWE studies.
Recommendations: Stakeholders need to collaborate on demonstration projects to consider how RWE can be developed to inform healthcare decisions and contribute to a learning network that can develop systems to support a learning health system and improve patient outcomes through best use of RWD.

