A Systematic Framework for Analyzing Observation Data in Patient-Centered Registries: Case Study for Patients With Depression
10.2196/18366 ◽  
2020 ◽  
Vol 9 (10) ◽  
pp. e18366
Author(s):  
Maryam Zolnoori ◽  
Mark D Williams ◽  
William B Leasure ◽  
Kurt B Angstman ◽  
Che Ngufor

Background Patient-centered registries are essential in population-based clinical care for patient identification and monitoring of outcomes. Although registry data may be used in real time for patient care, the same data may further be used for secondary analysis to assess disease burden, evaluate disease management and health care services, and support research. The design of a registry has major implications for the ability to effectively use these clinical data in research. Objective This study aims to develop a systematic framework to address the data and methodological issues involved in analyzing data in clinically designed patient-centered registries. Methods The systematic framework was composed of 3 major components: visualizing the multifaceted and heterogeneous patient-centered registries using a data flow diagram, assessing and managing data quality issues, and identifying patient cohorts for addressing specific research questions. Results Using a clinical registry designed as part of a collaborative care program for adults with depression at Mayo Clinic, we were able to demonstrate the impact of the proposed framework on data integrity. By following the data cleaning and refining procedures of the framework, we were able to generate high-quality data that were available for research questions about the coordination and management of depression in a primary care setting. We describe the steps involved in converting clinically collected data into a viable research data set using registry cohorts of depressed adults to assess the impact on high-cost service use. Conclusions The systematic framework discussed in this study sheds light on the inconsistencies and data quality issues in patient-centered registries. It provides a step-by-step procedure for addressing these challenges and for generating high-quality data for both quality improvement and research that may enhance care and outcomes for patients. International Registered Report Identifier (IRRID) DERR1-10.2196/18366
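To make the data-refinement step concrete, here is a minimal sketch assuming a tabular registry export with hypothetical columns (patient_id, age, phq9_score): duplicate rows are dropped, records without an identifier are excluded, scores outside the valid PHQ-9 range of 0–27 are treated as invalid, and a simple cohort filter is applied. The column names, thresholds, and pandas-based approach are illustrative assumptions, not the registry's actual schema or the authors' procedure.

```python
# Minimal sketch of generic registry cleaning and cohort selection.
# Column names (patient_id, age, phq9_score) are hypothetical; the actual
# registry schema is not specified in the abstract.
import pandas as pd

def clean_registry(df: pd.DataFrame) -> pd.DataFrame:
    """Apply generic data-quality rules: drop exact duplicates, remove rows
    missing a patient identifier, and keep PHQ-9 scores only inside the
    instrument's valid range (0-27)."""
    df = df.drop_duplicates()
    df = df.dropna(subset=["patient_id"])
    df = df[df["phq9_score"].between(0, 27)]
    return df

def depression_cohort(df: pd.DataFrame, min_phq9: int = 10) -> pd.DataFrame:
    """Select an illustrative research cohort: adults whose baseline PHQ-9
    meets a moderate-depression cutoff (10 is a common, assumed choice)."""
    return df[(df["age"] >= 18) & (df["phq9_score"] >= min_phq9)]

if __name__ == "__main__":
    toy = pd.DataFrame({
        "patient_id": [1, 1, 2, None],
        "age": [34, 34, 57, 41],
        "phq9_score": [12, 12, 31, 9],
    })
    print(depression_cohort(clean_registry(toy)))
```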


2020 ◽  
Vol 2020 ◽  
pp. 1-15
Author(s):  
Yu Qiao ◽  
Jun Wu ◽  
Hao Cheng ◽  
Zilan Huang ◽  
Qiangqiang He ◽  
...  

In the age of artificial intelligence, we face the challenge of obtaining high-quality data sets for learning systems effectively and efficiently. Crowdsensing is a powerful new tool that divides tasks among data contributors so that an outcome is achieved cumulatively. However, it raises several new challenges, such as incentivization. Incentive mechanisms are significant for crowdsensing applications, since a good incentive mechanism attracts more workers to participate. However, existing mechanisms fail to consider situations where the crowdsourcer has to hire capacitated workers or workers from multiple regions. We design two objectives for the proposed multiregion scenario, namely, weighted mean and maximin. The proposed mechanisms approximately maximize the utility of services provided by the selected data contributors under both constraints. Extensive simulations are also conducted to verify the effectiveness of the proposed methods.
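As a rough illustration of how the two objectives differ, the following toy sketch greedily selects budget-constrained workers from two regions under either a weighted-mean or a maximin region-level utility. The Worker fields, the greedy heuristic, and all numbers are invented for the example; the paper's actual incentive mechanisms are not reproduced here.

```python
# Toy comparison of the two region-level objectives (weighted mean vs. maximin)
# for selecting workers under a budget. Not the paper's mechanism; all values
# are made up for the example.
from dataclasses import dataclass

@dataclass
class Worker:
    region: int
    utility: float   # value of the data the worker can contribute
    cost: float      # payment the worker demands

def region_utilities(selected, num_regions):
    totals = [0.0] * num_regions
    for w in selected:
        totals[w.region] += w.utility
    return totals

def weighted_mean(selected, weights):
    return sum(w * u for w, u in zip(weights, region_utilities(selected, len(weights))))

def maximin(selected, num_regions):
    return min(region_utilities(selected, num_regions))

def greedy_select(workers, budget, objective):
    """Greedily add the affordable worker that yields the highest objective
    value until no affordable worker remains."""
    selected, spent = [], 0.0
    remaining = list(workers)
    while True:
        affordable = [w for w in remaining if spent + w.cost <= budget]
        if not affordable:
            return selected
        best = max(affordable, key=lambda w: objective(selected + [w]))
        selected.append(best)
        spent += best.cost
        remaining.remove(best)

workers = [Worker(0, 5.0, 2.0), Worker(0, 3.0, 1.0),
           Worker(1, 4.0, 2.0), Worker(1, 1.0, 0.5)]
weights = [0.6, 0.4]
print(greedy_select(workers, 4.0, lambda s: weighted_mean(s, weights)))
print(greedy_select(workers, 4.0, lambda s: maximin(s, 2)))
```

On the toy input, the weighted-mean objective concentrates selections in the heavily weighted region, whereas maximin spreads the budget so that the worst-served region is lifted first.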


2019 ◽  
Vol 623 ◽  
pp. L9 ◽  
Author(s):  
M. Fredslund Andersen ◽  
P. Pallé ◽  
J. Jessen-Hansen ◽  
K. Wang ◽  
F. Grundahl ◽  
...  

Context. We present the first high-cadence multiwavelength radial-velocity observations of the Sun-as-a-star, carried out during 57 consecutive days using the stellar échelle spectrograph at the Hertzsprung SONG Telescope operating at the Teide Observatory. Aims. Our aim was to produce a high-quality data set and reference values for the global helioseismic parameters νmax,⊙ and Δν⊙ of the solar p-modes using the SONG instrument. The obtained data set or the inferred values should then be used when the scaling relations are applied to other stars showing solar-like oscillations observed with SONG or similar instruments. Methods. We used different approaches to analyse the power spectrum of the time series to determine νmax,⊙: simple Gaussian fitting and heavy smoothing of the power spectrum. We determined Δν⊙ using the method of autocorrelation of the power spectrum. The amplitude per radial mode was determined using the method described in Kjeldsen et al. (2008, ApJ, 682, 1370). Results. We found the following values for the solar oscillations using the SONG spectrograph: νmax,⊙ = 3141 ± 12 μHz, Δν⊙ = 134.98 ± 0.04 μHz, and an average amplitude of the strongest radial modes of 16.6 ± 0.4 cm s−1. These values are consistent with previous measurements with other techniques.
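For readers applying these reference values, a minimal sketch of the standard asteroseismic scaling relations (in the commonly used Kjeldsen & Bedding form) is shown below, with the SONG values of νmax,⊙ and Δν⊙ plugged in as the solar reference. The form of the relations and the assumed Teff,⊙ = 5772 K are standard choices, not results of this paper.

```python
# Standard asteroseismic scaling relations with the SONG solar reference
# values quoted above. Teff_sun = 5772 K is the IAU nominal value, assumed here.
NU_MAX_SUN = 3141.0      # muHz (this work)
DELTA_NU_SUN = 134.98    # muHz (this work)
TEFF_SUN = 5772.0        # K (assumed)

def stellar_mass_radius(nu_max, delta_nu, teff):
    """Return (M/Msun, R/Rsun) from the scaling relations
       M/Msun ~ (nu_max/nu_max_sun)^3 (delta_nu/delta_nu_sun)^-4 (Teff/Teff_sun)^1.5
       R/Rsun ~ (nu_max/nu_max_sun)   (delta_nu/delta_nu_sun)^-2 (Teff/Teff_sun)^0.5"""
    f_nu = nu_max / NU_MAX_SUN
    f_dnu = delta_nu / DELTA_NU_SUN
    f_t = teff / TEFF_SUN
    mass = f_nu**3 * f_dnu**-4 * f_t**1.5
    radius = f_nu * f_dnu**-2 * f_t**0.5
    return mass, radius

# Sanity check: the Sun itself should come out close to (1, 1).
print(stellar_mass_radius(3141.0, 134.98, 5772.0))  # -> (1.0, 1.0)
```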


Trials ◽  
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Jessica E. Lockery ◽  
◽  
Taya A. Collyer ◽  
Christopher M. Reid ◽  
Michael E. Ernst ◽  
...  

Abstract Background Large-scale studies risk generating inaccurate and missing data due to the complexity of data collection. Technology has the potential to improve data quality by providing operational support to data collectors. However, this potential is under-explored in community-based trials. The ASPirin in Reducing Events in the Elderly (ASPREE) trial developed a data suite that was specifically designed to support data collectors: the ASPREE Web Accessible Relational Database (AWARD). This paper describes AWARD and the impact of system design on data quality. Methods AWARD’s operational requirements, conceptual design, key challenges and design solutions for data quality are presented. The impact of design features is assessed through comparison of baseline data collected prior to implementation of key functionality (n = 1000) with data collected post implementation (n = 18,114). Overall data quality is assessed according to data category. Results At baseline, implementation of user-driven functionality reduced staff error (from 0.3% to 0.01%), out-of-range data entry (from 0.14% to 0.04%) and protocol deviations (from 0.4% to 0.08%). In the longitudinal data set, which contained more than 39 million data values collected within AWARD, 96.6% of data values were entered within the specified query range or found to be accurate upon querying. The remaining data were missing (3.4%). Participant non-attendance at scheduled study activity was the most common cause of missing data. Costs associated with cleaning data in ASPREE were lower than expected compared with reports from other trials. Conclusions Clinical trials undertake complex operational activity in order to collect data, but technology rarely provides sufficient support. We find the AWARD suite provides proof of principle that designing technology to support data collectors can mitigate known causes of poor data quality and produce higher-quality data. Health information technology (IT) products that support the conduct of scheduled activity in addition to traditional data entry will enhance community-based clinical trials. A standardised framework for reporting data quality would aid comparisons across clinical trials. Trial registration International Standard Randomized Controlled Trial Number Register, ISRCTN83772183. Registered on 3 March 2005.
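A minimal sketch of the kind of range/query checking described above is given below; the field names, valid ranges, and the classification into missing, in-range, and out-of-range values are hypothetical illustrations, not the actual AWARD specifications.

```python
# Illustrative range/query checks over entered data values.
# Field names and valid ranges are hypothetical examples only.
QUERY_RANGES = {
    "systolic_bp": (70, 250),   # mmHg
    "weight": (30, 250),        # kg
}

def classify(field, value):
    """Classify a single entry as 'missing', 'in_range', or 'out_of_range'."""
    if value is None:
        return "missing"
    lo, hi = QUERY_RANGES[field]
    return "in_range" if lo <= value <= hi else "out_of_range"

def quality_summary(records):
    """Proportion of values in each quality category across all records."""
    counts = {"missing": 0, "in_range": 0, "out_of_range": 0}
    total = 0
    for rec in records:
        for field in QUERY_RANGES:
            counts[classify(field, rec.get(field))] += 1
            total += 1
    return {k: v / total for k, v in counts.items()}

records = [
    {"systolic_bp": 132, "weight": 81},
    {"systolic_bp": 300, "weight": None},   # out-of-range + missing
]
print(quality_summary(records))
```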


2020 ◽  
Vol 4 (4) ◽  
pp. 354-359
Author(s):  
Ari Ercole ◽  
Vibeke Brinck ◽  
Pradeep George ◽  
Ramona Hicks ◽  
Jilske Huijben ◽  
...  

Abstract Background: High-quality data are critical to the entire scientific enterprise, yet the complexity and effort involved in data curation are vastly under-appreciated. This is especially true for large observational, clinical studies because of the amount of multimodal data that is captured and the opportunity for addressing numerous research questions through analysis, either alone or in combination with other data sets. However, a lack of details concerning data curation methods can result in unresolved questions about the robustness of the data, its utility for addressing specific research questions or hypotheses and how to interpret the results. We aimed to develop a framework for the design, documentation and reporting of data curation methods in order to advance the scientific rigour, reproducibility and analysis of the data. Methods: Forty-six experts participated in a modified Delphi process to reach consensus on indicators of data curation that could be used in the design and reporting of studies. Results: We identified 46 indicators that are applicable to the design, training/testing, run time and post-collection phases of studies. Conclusion: The Data Acquisition, Quality and Curation for Observational Research Designs (DAQCORD) Guidelines are the first comprehensive set of data quality indicators for large observational studies. They were developed around the needs of neuroscience projects, but we believe they are relevant and generalisable, in whole or in part, to other fields of health research, and also to smaller observational studies and preclinical research. The DAQCORD Guidelines provide a framework for achieving high-quality data, a cornerstone of health research.


Metabolomics ◽  
2014 ◽  
Vol 10 (4) ◽  
pp. 539-540 ◽  
Author(s):  
Daniel W. Bearden ◽  
Richard D. Beger ◽  
David Broadhurst ◽  
Warwick Dunn ◽  
Arthur Edison ◽  
...  

2015 ◽  
Vol 21 (3) ◽  
pp. 358-374 ◽  
Author(s):  
Mustafa Aljumaili ◽  
Karina Wandt ◽  
Ramin Karim ◽  
Phillip Tretten

Purpose – The purpose of this paper is to explore the main ontologies related to eMaintenance solutions and to study their application areas. The advantages of using these ontologies to improve and control data quality are investigated. Design/methodology/approach – A literature study was conducted to explore eMaintenance ontologies in the different areas. These ontologies are mainly related to content structure and communication interfaces. The ontologies are then linked to each step of the data production process in maintenance. Findings – The findings suggest that eMaintenance ontologies can help to produce high-quality data in maintenance. The suggested maintenance data production process may help to control data quality, and using these ontologies at every step of the process may provide management tools for ensuring high-quality data. Research limitations/implications – Based on this study, it can be concluded that further research could broaden the investigation to identify more eMaintenance ontologies. Moreover, studying these ontologies in more technical detail may help to increase the understandability and use of these standards. Practical implications – This study concludes that applying eMaintenance ontologies requires additional cost and time from companies. The lack or ineffective use of eMaintenance tools in many enterprises is another limitation on the use of these ontologies. Originality/value – Investigating eMaintenance ontologies and connecting them to maintenance data production is important for controlling and managing data quality in maintenance.


2018 ◽  
Vol 10 (11) ◽  
pp. 1739 ◽  
Author(s):  
Xianxian Guo ◽  
Le Wang ◽  
Jinyan Tian ◽  
Dameng Yin ◽  
Chen Shi ◽  
...  

Accurate measurement of the field leaf area index (LAI) is crucial for assessing forest growth and health status. Three-dimensional (3-D) structural information on trees from terrestrial laser scanning (TLS) suffers information loss to varying extents because of occlusion by canopy parts. Data with higher loss, regarded as poor-quality data, heavily hamper the estimation accuracy of LAI. Multi-location scanning, which has proved effective in reducing occlusion effects in other forests, is hard to carry out in mangrove forests because of the difficulty of moving between mangrove trees. As a result, the quality of point cloud data (PCD) varies among plots in mangrove forests. To improve the retrieval accuracy of mangrove LAI, it is essential to select only the high-quality data. Several previous studies have evaluated the regions of occlusion by considering laser pulse trajectories. However, such a model is highly susceptible to the indeterminate profile of the complete vegetation object and is computationally intensive. Therefore, this study developed a new index (the vegetation horizontal occlusion index, VHOI) that combines unmanned aerial vehicle (UAV) imagery and TLS data to quantify TLS data quality. VHOI approaches 0.0 as data quality increases. To test the new index, the VHOI values of 102 plots with a radius of 5 m were calculated from TLS data and UAV imagery. The results showed that VHOI had a strong linear relationship with the estimation accuracy of LAI (R2 = 0.72, RMSE = 0.137). In addition, as TLS data were selected with VHOI below successively smaller thresholds (1.0, 0.9, …, 0.1), the number of remaining plots decreased while the agreement between TLS-derived LAI and field-measured LAI improved. When the VHOI threshold is 0.3, the optimal trade-off between the number of plots and LAI measurement accuracy is reached (R2 = 0.67). In summary, VHOI can be used as an index for selecting high-quality data for accurately measuring mangrove LAI, and the suggested threshold is 0.30.
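The threshold-screening step can be sketched as follows: plots are filtered by successively smaller VHOI thresholds and the agreement between TLS-derived and field-measured LAI is re-evaluated on the remaining plots. The arrays below are synthetic stand-ins for the 102-plot data set, and the simple correlation-based check is an assumption for illustration only.

```python
# Sketch of VHOI threshold screening with synthetic data standing in for
# the 102 plots; not the authors' actual data or processing chain.
import numpy as np

rng = np.random.default_rng(0)
n_plots = 102
vhoi = rng.uniform(0.0, 1.0, n_plots)          # occlusion index per plot
lai_field = rng.uniform(1.0, 6.0, n_plots)     # field-measured LAI
# Assume the TLS estimate degrades (more noise) as VHOI grows:
lai_tls = lai_field + rng.normal(0.0, 0.1 + vhoi, n_plots)

for threshold in np.arange(1.0, 0.0, -0.1):
    keep = vhoi < threshold
    if keep.sum() < 3:
        break
    r = np.corrcoef(lai_tls[keep], lai_field[keep])[0, 1]
    print(f"VHOI < {threshold:.1f}: {keep.sum():3d} plots, R^2 = {r**2:.2f}")
```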


2017 ◽  
Vol 6 (2) ◽  
pp. 505-521 ◽  
Author(s):  
Luděk Vecsey ◽  
Jaroslava Plomerová ◽  
Petr Jedlička ◽  
Helena Munzarová ◽  
Vladislav Babuška ◽  
...  

Abstract. This paper focuses on major issues related to the data reliability and network performance of 20 broadband (BB) stations of the Czech (CZ) MOBNET (MOBile NETwork) seismic pool within the AlpArray seismic experiments. Currently used high-resolution seismological applications require high-quality data recorded for a sufficiently long time interval at seismological observatories and during the entire time of operation of the temporary stations. In this paper we present new hardware and software tools we have been developing during the last two decades while analysing data from several international passive experiments. The new tools help to assure the high-quality standard of broadband seismic data and eliminate potential errors before supplying data to seismological centres. Special attention is paid to crucial issues like the detection of sensor misorientation, timing problems, interchange of record components and/or their polarity reversal, sensor mass centring, or anomalous channel amplitudes due to, for example, imperfect gain. Thorough data quality control should represent an integral constituent of seismic data recording, preprocessing, and archiving, especially for data from temporary stations in passive seismic experiments. Large international seismic experiments require enormous efforts from scientists from different countries and institutions to gather hundreds of stations to be deployed in the field during a limited time period. In this paper, we demonstrate the beneficial effects of the procedures we have developed for acquiring a reliable large set of high-quality data from each group participating in field experiments. The presented tools can be applied manually or automatically on data from any seismic network.
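One of the checks mentioned above, flagging anomalous channel amplitudes caused by an imperfect gain, can be illustrated with a simple RMS comparison against the network median. The synthetic traces and the 3× threshold below are arbitrary choices for the sketch and are not taken from the authors' tools.

```python
# Toy illustration of one quality-control check: flag channels whose RMS
# amplitude deviates strongly from the network median, as a wrong gain would.
# Synthetic traces stand in for real broadband records.
import numpy as np

rng = np.random.default_rng(1)
traces = {f"ST{i:02d}.HHZ": rng.normal(0.0, 1.0, 10_000) for i in range(10)}
traces["ST03.HHZ"] *= 8.0   # simulate an incorrect gain on one channel

rms = {name: float(np.sqrt(np.mean(tr**2))) for name, tr in traces.items()}
median_rms = float(np.median(list(rms.values())))

for name, value in rms.items():
    ratio = value / median_rms
    if ratio > 3.0 or ratio < 1.0 / 3.0:
        print(f"{name}: RMS ratio {ratio:.1f}x median -> check gain")
```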

