Anomaly Detection Algorithm for Real-World Data and Evidence in Clinical Research (Preprint)

10.2196/27172 ◽  
2021 ◽  
Author(s):  
Vendula Churová ◽  
Roman Vyškovský ◽  
Kateřina Maršálová ◽  
David Kudláček ◽  
Daniel Schwarz

BACKGROUND Statistical analysis, which has become an integral part of evidence-based medicine, relies heavily on data quality that is of critical importance in modern clinical research. Input data are not only at risk of being falsified or fabricated, but also of being mishandled by investigators. OBJECTIVE The urgent need to assure the highest data quality possible has led to implementation of various auditing strategies designed to monitor clinical trials and detect errors of different origin that frequently occur in the field. METHODS An automatic anomaly detection algorithm based on machine learning that combines clustering with a series of distance metrics is presented. RESULTS The algorithm is built in a particular electronic data capture (EDC) system that stores real-world data in clinical registries. These data, together with newly generated, simulated anomalous data were utilized to evaluate the detection performance of this algorithm. CONCLUSIONS The experimental results demonstrate that the algorithm, which is universal, and as such may be implemented in other EDC systems, is capable of anomalous data detection with sensitivity exceeding 85%.
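The abstract names the ingredients (clustering combined with a series of distance metrics, evaluated on simulated anomalies) but gives no implementation details. The following is only a minimal sketch of that general idea, not the authors' algorithm. The choice of k-means, the three metrics, the mean-plus-three-sigma thresholds, the majority vote, and all function names are assumptions made for illustration.

```python
import math
import random

# Three common distance metrics (an assumed selection; the abstract lists none).
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

METRICS = (euclidean, manhattan, chebyshev)

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns the k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: euclidean(p, centroids[i]))
            groups[idx].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

def fit_thresholds(points, centroids, n_sigma=3.0):
    """Per metric: mean + n_sigma * std of the distance to the nearest centroid."""
    thresholds = []
    for metric in METRICS:
        d = [min(metric(p, c) for c in centroids) for p in points]
        mean = sum(d) / len(d)
        std = math.sqrt(sum((x - mean) ** 2 for x in d) / len(d))
        thresholds.append(mean + n_sigma * std)
    return thresholds

def is_anomaly(record, centroids, thresholds):
    """Flag a record when at least two of the three metrics exceed their threshold."""
    votes = sum(
        min(metric(record, c) for c in centroids) > t
        for metric, t in zip(METRICS, thresholds)
    )
    return votes >= 2

def sensitivity(flags):
    """Share of known-anomalous records that were flagged: TP / (TP + FN)."""
    return sum(flags) / len(flags)

# Synthetic stand-in for registry data: two well-separated groups of records.
rng = random.Random(1)
normal = [
    (rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
    for cx, cy in ((0.0, 0.0), (10.0, 10.0))
    for _ in range(100)
]
# Simulated anomalies placed far from both groups.
simulated = [(100.0, -100.0), (-80.0, 90.0), (200.0, 200.0)]

centroids = kmeans(normal, k=2)
thresholds = fit_thresholds(normal, centroids)
flags = [is_anomaly(r, centroids, thresholds) for r in simulated]
print("sensitivity on simulated anomalies:", sensitivity(flags))
```

Sensitivity here is computed only over records known to be anomalous, which mirrors the evaluation on simulated anomalous data described above. The synthetic anomalies are deliberately extreme, so the value printed by this toy example says nothing about the reported figure.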


BMJ Open ◽  
2020 ◽  
Vol 10 (12) ◽  
pp. e038375
Author(s):  
Feifei Jin ◽  
Chen Yao ◽  
Xiaoyan Yan ◽  
Chongya Dong ◽  
Junkai Lai ◽  
...  

Objective: To investigate the gap between real-world data and clinical research initiated by doctors in China, explore the potential reasons for this gap and collect different stakeholders’ suggestions. Design: This qualitative study involved three types of hospital personnel based on three interview outlines. The data analysis was performed using the constructivist grounded theory analysis process. Setting: Six tertiary hospitals (three general hospitals and three specialised hospitals) in Beijing, China, were included. Participants: In total, 42 doctors from 12 departments, 5 information technology managers and 4 clinical managers were interviewed through stratified purposive sampling. Results: Electronic medical record data cannot be directly downloaded into clinical research files, which is a major problem in China. The lack of data interoperability, unstructured electronic medical record data and concerns regarding data security create a gap between real-world data and research data. Updating hospital information systems, promoting data standards and establishing an independent clinical research platform may be feasible suggestions for solving the current problems. Conclusions: Determining the causes of gaps and targeted solutions could contribute to the development of clinical research in China. This research suggests that updating the hospital information system, promoting data standards and establishing a clinical research platform could promote the use of real-world data in the future.


Circulation ◽  
2020 ◽  
Vol 141 (Suppl_1) ◽  
Author(s):  
Åke Olsson ◽  
Magnus Samulesson

Background: Automatic ECG algorithms using only RR-variability in ECG to detect AF have shown high false positive rates. By including P-wave presence in the algorithm, research has shown that it can increase detection accuracy for AF. Methods: A novel RR- and P-wave based automatic detection algorithm implemented in the Coala Heart Monitor ("Coala", Coala Life AB, Sweden) was evaluated for detection accuracy by the comparison to blinded manual ECG interpretation based on real-world data. Evaluation was conducted on 100 consecutive anonymous printouts of chest- and thumb-ECG waveforms, where the algorithm had detected both irregular RR-rhythms and strong P-waves in either chest or thumb recording (non-AF episodes classified by algorithm as Category 12). The recordings, without exclusions, were generated from 5,512 real-world data recordings from actual Coala users in Sweden (both OTC and Rx users) during the period of March 5 to March 22, 2019, with no control or influence by the researchers or any other organization or individual. The prevalence of cardiac conditions in the user population was unknown. The blinded recordings were each manually interpreted by a trained cardiologist. The manual interpretation was compared with the automatic analysis performed by the detection algorithm to determine the number of additional false negative indications for AF as presented to the user. Results: The trained cardiologist manually interpreted 0 of the 100 recordings as AF. Manual interpretation showed that the novel automatic AF algorithm yielded 0% False Negative error and 100% Negative Predictive Value (NPV) for detection of AF. Irregular RR-rhythms were detected in 569 recordings (10% of a total of 5,512 recordings). The 100 non-AF recordings containing both irregular RR-rhythms and strong P-waves constituted 18% of all recordings with irregular RR-rhythms. 
Respiratory sinus arrhythmia was the single most prevalent condition and was found in 47% of irregular RR-rhythms with strong P-waves. Conclusion: The novel, P-wave based automatic ECG algorithm used in the Coala, showed a zero percent False Negative error rate for AF detection in ECG recordings with RR-variability but presence of P-waves, as compared to manual interpretation by a cardiologist.
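The percentages in this abstract follow directly from the counts it reports. The short check below recomputes them using the standard definition NPV = TN / (TN + FN). The variable names are illustrative, and treating all 100 reviewed recordings as algorithm negatives is the reading taken from the abstract.

```python
# Counts as reported in the abstract.
total_recordings = 5512   # all real-world recordings in the period
irregular_rr = 569        # recordings with irregular RR-rhythms
reviewed_non_af = 100     # algorithm-negative recordings sent for manual review
manual_af = 0             # of those, recordings the cardiologist read as AF

# Among algorithm negatives, a manual AF reading is a false negative.
false_negatives = manual_af
true_negatives = reviewed_non_af - manual_af

npv = true_negatives / (true_negatives + false_negatives)
false_negative_share = false_negatives / reviewed_non_af
share_irregular = irregular_rr / total_recordings
share_reviewed = reviewed_non_af / irregular_rr

print(f"NPV: {npv:.0%}")
print(f"False negatives among reviewed recordings: {false_negative_share:.0%}")
print(f"Irregular RR share of all recordings: {share_irregular:.1%}")
print(f"Reviewed share of irregular RR recordings: {share_reviewed:.1%}")
```

Because only algorithm-negative recordings were reviewed, these counts support an NPV estimate but not a sensitivity estimate, which would also require the recordings the algorithm classified as AF.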


2012 ◽  
Vol 2 (3) ◽  
Author(s):  
Jaroslav Zendulka ◽  
Martin Pešek

Abstract: Currently, many devices provide information about moving objects and location-based services that accumulate a huge volume of moving object data, including trajectories. This paper deals with two useful analysis tasks: mining moving object patterns and trajectory outlier detection. We also present our experience with the TOP-EYE trajectory outlier detection algorithm, which we applied to two real-world data sets.


2020 ◽  
Vol 29 (01) ◽  
pp. 203-207
Author(s):  
Christel Daniel ◽  
Dipak Kalra ◽  

Objectives: To summarize key contributions to current research in the field of Clinical Research Informatics (CRI) and to select best papers published in 2019. Method: A bibliographic search using a combination of MeSH descriptors and free-text terms on CRI was performed using PubMed, followed by a double-blind review in order to select a list of candidate best papers to be then peer-reviewed by external reviewers. After peer-review ranking, a consensus meeting between the two section editors and the editorial team was organized to finally conclude on the selected three best papers. Results: Among the 517 papers, published in 2019, returned by the search, that were in the scope of the various areas of CRI, the full review process selected three best papers. The first best paper describes the use of a homomorphic encryption technique to enable federated analysis of real-world data while complying more easily with data protection requirements. The authors of the second best paper demonstrate the evidence value of federated data networks reporting a large real world data study related to the first line treatment for hypertension. The third best paper reports the migration of the US Food and Drug Administration (FDA) adverse event reporting system database to the OMOP common data model. This work opens the combined analysis of both spontaneous reporting system and electronic health record (EHR) data for pharmacovigilance. Conclusions: The most significant research efforts in the CRI field are currently focusing on real world evidence generation and especially the reuse of EHR data. With the progress achieved this year in the areas of phenotyping, data integration, semantic interoperability, and data quality assessment, real world data is becoming more accessible and reusable. 
High quality data sets are key assets not only for large scale observational studies or for changing the way clinical trials are conducted but also for developing or evaluating artificial intelligence algorithms guiding clinical decision for more personalized care. And lastly, security and confidentiality, ethical and regulatory issues, and more generally speaking data governance are still active research areas this year.


2017 ◽  
Vol 2017 ◽  
pp. 1-13 ◽  
Author(s):  
Thitaree Tanprasert ◽  
Chalermpol Saiprasert ◽  
Suttipong Thajchayapong

This paper proposes an algorithm for real-time driver identification using the combination of unsupervised anomaly detection and neural networks. The proposed algorithm uses nonphysiological signals as input, namely, driving behavior signals from inertial sensors (e.g., accelerometers) and geolocation signals from GPS sensors. First, anomaly detection is performed to assess whether the current driver is who he/she claims to be. If an anomaly is detected, the algorithm proceeds to find relevant features in the input signals and uses neural networks to identify the driver. To assess the proposed algorithm, real-world data are collected from ten drivers who drive different vehicles on several routes in real-world traffic conditions. Driver identification is performed on each of the seven-second-long driving behavior signals and geolocation signals in a streaming manner. It is shown that the proposed algorithm can achieve relatively high accuracy and identify drivers within 13 seconds. The proposed algorithm also outperforms the previously proposed driver identification algorithms. Furthermore, to demonstrate how the proposed algorithm can be deployed in real-world applications, results from real-world data associated with each operation of the proposed algorithm are shown step-by-step.
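Processing seven-second signals "in a streaming manner" implies some form of windowing over the incoming samples. The sketch below shows one generic way to cut a sample stream into fixed-length windows. It is not taken from the paper: the sampling rate, the step between windows, and the function name `stream_windows` are all assumptions, and the abstract does not say whether consecutive windows overlap.

```python
from typing import Iterable, Iterator, List

def stream_windows(
    samples: Iterable[float],
    rate_hz: float,
    window_s: float = 7.0,
    step_s: float = 7.0,
) -> Iterator[List[float]]:
    """Yield fixed-length windows from a sample stream as soon as each is full.

    step_s == window_s gives non-overlapping windows; a smaller step_s gives
    overlapping ones. Assumes 0 < step_s <= window_s.
    """
    size = int(round(rate_hz * window_s))
    step = int(round(rate_hz * step_s))
    if not 0 < step <= size:
        raise ValueError("step_s must be positive and no larger than window_s")
    buf: List[float] = []
    for s in samples:
        buf.append(s)
        if len(buf) == size:
            yield list(buf)
            del buf[:step]  # slide the window forward by one step

# 15 seconds of a hypothetical 2 Hz signal (30 samples).
signal = [float(i) for i in range(30)]
non_overlapping = list(stream_windows(signal, rate_hz=2.0))
overlapping = list(stream_windows(signal, rate_hz=2.0, step_s=3.5))
print(len(non_overlapping), len(overlapping))
```

Each yielded window would then be passed to the anomaly detector and, if needed, to the identification network. Samples left over at the end of the stream that do not fill a window are simply not emitted.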


2021 ◽  
Author(s):  
Rhonda Facile ◽  
Erin Elizabeth Muhlbradt ◽  
Mengchun Gong ◽  
Qing-Na Li ◽  
Vaishali B. Popat ◽  
...  

BACKGROUND Real World Data (RWD) and Real World Evidence (RWE) have an increasingly important role in clinical research and health care decision making in many countries. In order to leverage RWD and generate reliable RWE, a framework must be in place to ensure that the data is well-defined and structured in a way that is semantically interoperable and consistent across stakeholders. The adoption of data standards is one of the cornerstones supporting high-quality evidence for clinical medicine and therapeutics development. CDISC data standards are mature, globally recognized and heavily utilized by the pharmaceutical industry for regulatory submission in the US and Japan and are recommended in Europe and China. Against this backdrop, the CDISC RWD Connect Initiative was initiated to better understand the barriers to implementing CDISC standards for RWD and to identify the tools and guidance needed to more easily implement CDISC standards for this purpose. We believe that bridging the gap between RWD and clinical trial generated data will benefit all stakeholders. OBJECTIVE The aim of this project was to understand the barriers to implementing CDISC standards for Real World Data (RWD) and to identify what tools and guidance may be needed to more easily implement CDISC standards for this purpose. METHODS We conducted a qualitative Delphi survey involving an Expert Advisory Board (EAB) with multiple key stakeholders, with three rounds of input and review. RESULTS In total, 66 experts participated in round 1, 56 participated in round 2 and 49 participated in round 3 of the Delphi Survey. Their input was collected and analyzed culminating in group statements. It was widely agreed that the standardization of RWD is highly necessary, and the primary focus should be on its ability to improve data-sharing and the quality of RWE. 
The priorities for RWD standardization include electronic health records, such as data shared using HL7 FHIR, and data stemming from observational studies. With different standardization efforts already underway in these areas, a gap analysis should be performed to identify areas where synergies and efficiencies are possible and then collaborate with stakeholders to create, or extend existing, mappings between CDISC and other standards, controlled terminologies and models to represent data originating across different sources. CONCLUSIONS There are many ongoing data standardization efforts that span the spectrum of human health data related activities including, but not limited to, those related to healthcare, public health, product or disease registries and clinical research, each with different definitions, levels of granularity and purpose. Amongst these standardization efforts, CDISC has been successful in standardizing clinical trial-based data for regulation worldwide. However, the complexity of the CDISC standards, and the fact that they were developed for different purposes, combined with the lack of awareness and incentives to using a new standard, insufficient training and implementation support are significant barriers for setting up the use of CDISC standards for RWD. The collection and dissemination of use cases showing in detail how to effectively implement CDISC standards for RWD, developing tools and support systems specifically for the RWD community, and collaboration with other standards development organizations and initiatives are potential steps towards connecting RWD to research. The integrity of RWE is dependent on the quality of the RWD and the data standards utilized in its collection, integration, processing, exchange and reporting. Using CDISC as part of the database schema will help to link clinical trial data and RWD and promote innovation in health data science. 
The authors believe that CDISC standards, if adapted carefully and presented appropriately to the RWD community, can provide “FAIR” structure and semantics for common clinical concepts and domains and help to bridge the gap between RWD and clinical trial generated data. CLINICALTRIAL Not Applicable


2016 ◽  
Vol 22 ◽  
pp. 219
Author(s):  
Roberto Salvatori ◽  
Olga Gambetti ◽  
Whitney Woodmansee ◽  
David Cox ◽  
Beloo Mirakhur ◽  
...  
