An Indicator Function for Insufficient Data Quality – A Contribution to Data Accuracy

Author(s):  
Quirin Görz ◽  
Marcus Kaiser
2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Michelle Amri ◽  
Christina Angelakis ◽  
Dilani Logan

Abstract Objective Through collating observations from various studies and complementing these findings with one author’s study, a detailed overview of the benefits and drawbacks of asynchronous email interviewing is provided. Through this overview, it is evident there is great potential for asynchronous email interviews in the broad field of health, particularly for studies drawing on expertise from participants in academia or professional settings, those across varied geographical settings (i.e. potential for global public health research), and/or in circumstances when face-to-face interactions are not possible (e.g. COVID-19). Results Benefits of asynchronous email interviewing and additional considerations for researchers are discussed around: (i) access transcending geographic location and during restricted face-to-face communications; (ii) feasibility and cost; (iii) sampling and inclusion of diverse participants; (iv) facilitating snowball sampling and increased transparency; (v) data collection with working professionals; (vi) anonymity; (vii) verification of participants; (viii) data quality and enhanced data accuracy; and (ix) overcoming language barriers. Similarly, potential drawbacks of asynchronous email interviews are also discussed with suggested remedies, which centre around: (i) time; (ii) participant verification and confidentiality; (iii) technology and sampling concerns; (iv) data quality and availability; and (v) need for enhanced clarity and precision.


Author(s):  
David J. Yates ◽  
Jennifer Xu

This research is motivated by data mining for wireless sensor network applications. The authors consider applications where data is acquired in real-time, and thus data mining is performed on live streams of data rather than on stored databases. One challenge in supporting such applications is that sensor node power is a precious resource that needs to be managed as such. To conserve energy in the sensor field, the authors propose and evaluate several approaches to acquiring, and then caching data in a sensor field data server. The authors show that for true real-time applications, for which response time dictates data quality, policies that emulate cache hits by computing and returning approximate values for sensor data yield a simultaneous quality improvement and cost saving. This “win-win” is because when data acquisition response time is sufficiently important, the decrease in resource consumption and increase in data quality achieved by using approximate values outweighs the negative impact on data accuracy due to the approximation. In contrast, when data accuracy drives quality, a linear trade-off between resource consumption and data accuracy emerges. The authors then identify caching and lookup policies for which the sensor field query rate is bounded when servicing an arbitrary workload of user queries. This upper bound is achieved by having multiple user queries share the cost of a sensor field query. Finally, the authors discuss the challenges facing sensor network data mining applications in terms of data collection, warehousing, and mining techniques.


2020 ◽  
Author(s):  
SUSAN F. RUMISHA ◽  
EMANUEL P. LYIMO ◽  
IRENE R. MREMI ◽  
PATRICK K. TUNGU ◽  
VICTOR S. MWINGIRA ◽  
...  

Abstract Background: Effective planning for disease prevention and control requires accurate, adequately-analysed, interpreted and communicated data. In recent years, efforts have been put in strengthening health management information systems (HMIS) in Sub-Saharan Africa to improve data accessibility to decision-makers. This study assessed the quality of routine HMIS data at primary healthcare facility (HF) and district levels in Tanzania. Methods: This cross-sectional study involved reviews of documents, systems and databases, and collection of primary data from facility registers, tally sheets and monthly summary reports. Thirty-four indicators from Outpatient, Inpatient, Antenatal care, Family Planning, Post-natal care, Labour and Delivery, and Provider-Initiated Testing and Counselling service areas were assessed. Indicator records were tracked and compared across the process of data collection, compilation and submission to the district office. Monthly report forms submitted by facilities to the district were also reviewed. The availability and utilization of HMIS tools were assessed, while completeness and data accuracy levels were quantified for each phase of the reporting system. Results: A total of 115 HFs (including hospitals, health centres, dispensaries) in 11 districts were involved. Registers (availability rate=91.1%; interquartile range (IQR):66.7%-100%) and report forms (86.9%; IQR:62.2%-100%) were the most utilized tools. There was a limited use of tally-sheets (77.8%; IQR:35.6%-100%). Tools availability at the dispensary was 91.1%, health centre 82.2% and hospital 77.8%. The availability rate at the district level was 65% (IQR:48%-75%). Wrongly filled or empty cells in registers and poor adherence to the coding procedures were observed. Reports were highly over-represented in comparison to registers’ records, with large differences observed at the HF phase of the reporting system. 
The OPD and IPD areas indicated the highest levels of mismatch between data source and district office. Indicators with large number of clients, multiple variables, disease categorization, or those linked with dispensing medicine performed poorly. Conclusion: There are high variations in the tool utilisation and data accuracy at facility and district levels. The routine HMIS is weak and data at district level inaccurately reflects what is available at the source. These results highlight the need to design tailored and inter-service strategies for improving data quality.


2021 ◽  
Vol 869 (1) ◽  
pp. 012020
Author(s):  
S A Raup ◽  
S Patmiarsih ◽  
R D Juniar ◽  
B Setyadji

Abstract Tuna and tuna-like fisheries play a vital role in Indonesian livelihoods, especially in archipelagic waters. Despite this importance, data collection for tuna is generally limited, and incomplete scientific knowledge and insufficient data have hampered the assessment. The purpose of this study was to analyse how a fisheries-dependent data system could transform data quality. The e-logbook has the best attributes for reaching this goal, especially for small-scale tuna fisheries. Given its low cost and vast spatial and temporal coverage, there is a convincing case for expanding the program and monitoring it carefully. Analysis of fisheries indicators showed promising results, especially for filling gaps that research could not cover.


2020 ◽  
Author(s):  
Charles Kuria Njuguna ◽  
Mohamed Vandi ◽  
Malimbo Mugagga ◽  
Joseph Kanu ◽  
Evans Liyosi ◽  
...  

Abstract Background Public health agencies require valid, timely and complete health information for early detection of outbreaks. Towards the end of the Ebola Virus Disease (EVD) outbreak in 2015, the Ministry of Health and Sanitation (MoHS), Sierra Leone revitalized the Integrated Disease Surveillance and Response System (IDSR). Data quality assessments were conducted to monitor accuracy of IDSR data. Methods Starting in 2016, data quality assessments (DQA) were conducted in randomly selected health facilities. A structured electronic checklist was used to interview district health management teams (DHMT) and health facility staff. We used malaria data to assess data accuracy as malaria was endemic in Sierra Leone. Verification factors (VF), calculated as the ratio of verified malaria cases in health facility registers to the number of malaria cases in the national health information database, were used to assess data accuracy. Allowing a 5% margin of error, a VF <95% was considered over-reporting while a VF >105% was considered under-reporting. Differences in the proportion of accurate reports at baseline and subsequent assessments were compared using a Z-test for two proportions. Results Between 2016 and 2018, four DQA were conducted in 444 health facilities where 1,729 IDSR reports were reviewed. Registers and IDSR technical guidelines were available in health facilities and health care workers were conversant with reporting requirements. Overall data accuracy improved from over-reporting of 4.7% (VF 95.3%) in 2016 to under-reporting of 0.2% (VF 100.2%) in 2018. Compared to 2016, the proportion of accurate IDSR reports increased by 14.8% (95% CI 7.2%-22.3%) in May 2017 and 19.5% (95% CI 12.5%-26.5%) by 2018. Over-reporting was more common in private clinics and not-for-profit facilities while under-reporting was more common in lower level government health facilities.
Leading reasons for data discrepancies included counting errors in 358 (80.6%) health facilities and missing source documents in 47 (10.6%) health facilities. Conclusion This is the first attempt to institutionalize routine monitoring of IDSR data quality in Sierra Leone. Regular data quality assessments may have contributed to improved data accuracy over time. Data compilation errors accounted for most discrepancies and should be minimized to improve accuracy of IDSR data.
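
The verification factor described in this abstract can be illustrated with a minimal Python sketch. It assumes the VF is expressed as a percentage and that the 5% margin maps to the 95% and 105% thresholds stated in the abstract. The function names `verification_factor` and `classify_vf`, and the example counts in the usage note, are hypothetical and not taken from the study.

```python
def verification_factor(register_cases: int, database_cases: int) -> float:
    """Return the VF as a percentage: register cases / database cases * 100."""
    if database_cases <= 0:
        raise ValueError("database_cases must be positive")
    return 100.0 * register_cases / database_cases


def classify_vf(vf: float, margin: float = 5.0) -> str:
    """Classify a VF (in percent) using a symmetric margin around 100%.

    Assumption: values below 100 - margin mean the database holds more
    cases than the register (over-reporting); values above 100 + margin
    mean it holds fewer (under-reporting).
    """
    if vf < 100.0 - margin:
        return "over-reporting"
    if vf > 100.0 + margin:
        return "under-reporting"
    return "accurate"
```

For example, a facility with 90 verified register cases and 100 cases in the database would have a VF of 90% and be classified as over-reporting under these assumptions.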


2019 ◽  
Author(s):  
Charles Kuria Njuguna ◽  
Mohamed Vandi ◽  
Malimbo Mugagga ◽  
Joseph Kanu ◽  
Evans Liyosi ◽  
...  

Abstract Background Public health agencies require valid, timely and complete health information for early detection of outbreaks. Towards the end of the Ebola Virus Disease (EVD) outbreak in 2015, the Ministry of Health and Sanitation (MoHS), Sierra Leone revitalized the Integrated Disease Surveillance and Response System (IDSR). Data quality assessments were conducted to monitor the accuracy of data generated through the IDSR system. Methods Starting in 2016, regular data quality assessments (DQA) were conducted in randomly selected health facilities. A structured electronic checklist was used to interview district health management team (DHMT) members and health facility staff. We used malaria data to assess data accuracy as malaria was endemic in Sierra Leone. Verification factors (VF), calculated as the ratio of verified malaria cases in the health facility register to the number of malaria cases recorded in the national health information database, were used to assess data accuracy. Allowing a 5% margin of error, a VF <95% was considered over-reporting while a VF >105% was considered under-reporting. Differences in the proportion of accurate reports in the first and fourth assessments were compared using a Z-test for two proportions. Results Between 2016 and 2018, four DQA were conducted in 444 health facilities where 1,729 IDSR reports were reviewed. Registers and IDSR technical guidelines were widely available in health facilities and health care workers were conversant with reporting requirements. Overall data accuracy improved from a VF of 95.3% in 2016 to 100.2% in 2018. Compared to the baseline in 2016, the proportion of accurate IDSR reports in 2018 increased by 19.5% (CI 12.5%-26.5%). Over-reporting was more common in private clinics and not-for-profit facilities while under-reporting was more common in lower level government health facilities.
Leading reasons for data discrepancies included counting errors in 358 (80.6%) health facilities, and missing source documents in 47 (10.6%) health facilities. Conclusion This is the first attempt to institutionalize routine monitoring of IDSR data quality in Sierra Leone. Regular data quality assessments may have contributed to improved data accuracy over time. Data compilation errors accounted for most discrepancies and should be minimized to improve accuracy of IDSR data.
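
The comparison of accurate-report proportions between two assessments can be sketched as a pooled two-proportion Z-test. This is a generic textbook formulation using only the Python standard library, not the authors' analysis code. The function name `two_proportion_z_test` is hypothetical, and the abstract does not give the underlying counts, so any inputs are illustrative.

```python
import math


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Pooled two-proportion Z-test.

    x1/n1: accurate reports / total reports at the first assessment.
    x2/n2: accurate reports / total reports at the later assessment.
    Returns (z statistic, two-sided p-value).
    """
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    if se == 0.0:
        raise ValueError("standard error is zero; proportions are degenerate")
    z = (p2 - p1) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

A positive `z` here indicates a higher proportion of accurate reports at the later assessment.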


2020 ◽  
Author(s):  
SUSAN F. RUMISHA ◽  
EMANUEL P. LYIMO ◽  
IRENE R. MREMI ◽  
PATRICK K. TUNGU ◽  
VICTOR S. MWINGIRA ◽  
...  

Abstract Background: Effective planning for disease prevention and control requires accurate, adequately-analysed, interpreted and communicated data. In recent years, efforts have been put in strengthening health management information systems (HMIS) in Sub-Saharan Africa to improve data accessibility to decision-makers. This study assessed the quality of routine HMIS data at primary healthcare facility (HF) and district levels in Tanzania. Methods: This cross-sectional study involved reviews of documents, information systems and databases, and collection of primary data from facility-level registers, tally sheets and monthly summary reports. Thirty-four indicators from Outpatient, Inpatient, Antenatal care, Family Planning, Post-natal care, Labour and Delivery, and Provider-Initiated Testing and Counselling service areas were assessed. Indicator records were tracked and compared across the process of data collection, compilation and submission to the district office. Copies of monthly report forms submitted by facilities to the district were also reviewed. The availability and utilization of HMIS tools were assessed, while completeness and data accuracy levels were quantified for each phase of the reporting system. Results: A total of 115 HFs (including hospitals, health centres, dispensaries) in 11 districts were involved. Registers (availability rate=91.1%; interquartile range (IQR):66.7%-100%) and report forms (86.9%; IQR:62.2%-100%) were the most utilized tools. There was a limited use of tally-sheets (77.8%; IQR:35.6%-100%). Tools availability at the dispensary was 91.1%, health centre 82.2% and hospital 77.8%, and was low in urban districts. The availability rate at the district level was 65% (IQR:48%-75%). Wrongly filled or empty cells in registers and poor adherence to the coding procedures were observed. Reports were highly over-represented in comparison to registers’ records, with large differences observed at the HF phase of the reporting system.
The OPD and IPD areas indicated the highest levels of mismatch between data source and district office. Indicators with large number of clients, multiple variables, disease categorization, or those linked with dispensing medicine performed poorly. Conclusion: There are high variations in the tool utilisation and data accuracy at facility and district levels. The routine HMIS is weak and data at district level inaccurately reflects what is available at the source. These results highlight the need to design tailored and inter-service strategies for improving data quality.


2020 ◽  
Vol 20 (1) ◽  
Author(s):  
Susan F. Rumisha ◽  
Emanuel P. Lyimo ◽  
Irene R. Mremi ◽  
Patrick K. Tungu ◽  
Victor S. Mwingira ◽  
...  

Abstract Background Effective planning for disease prevention and control requires accurate, adequately-analysed, interpreted and communicated data. In recent years, efforts have been put in strengthening health management information systems (HMIS) in Sub-Saharan Africa to improve data accessibility to decision-makers. This study assessed the quality of routine HMIS data at primary healthcare facility (HF) and district levels in Tanzania. Methods This cross-sectional study involved reviews of documents, information systems and databases, and collection of primary data from facility-level registers, tally sheets and monthly summary reports. Thirty-four indicators from Outpatient, Inpatient, Antenatal care, Family Planning, Post-natal care, Labour and Delivery, and Provider-Initiated Testing and Counselling service areas were assessed. Indicator records were tracked and compared across the process of data collection, compilation and submission to the district office. Copies of monthly report forms submitted by facilities to the district were also reviewed. The availability and utilization of HMIS tools were assessed, while completeness and data accuracy levels were quantified for each phase of the reporting system. Results A total of 115 HFs (including hospitals, health centres, dispensaries) in 11 districts were involved. Registers (availability rate = 91.1%; interquartile range (IQR) 66.7–100%) and report forms (86.9%; IQR 62.2–100%) were the most utilized tools. There was a limited use of tally-sheets (77.8%; IQR 35.6–100%). Tools availability at the dispensary was 91.1%, health centre 82.2% and hospital 77.8%, and was low in urban districts. The availability rate at the district level was 65% (IQR 48–75%). Wrongly filled or empty cells in registers and poor adherence to the coding procedures were observed. Reports were highly over-represented in comparison to registers’ records, with large differences observed at the HF phase of the reporting system. 
The OPD and IPD areas indicated the highest levels of mismatch between data source and district office. Indicators with large number of clients, multiple variables, disease categorization, or those linked with dispensing medicine performed poorly. Conclusion There are high variations in the tool utilisation and data accuracy at facility and district levels. The routine HMIS is weak and data at district level inaccurately reflects what is available at the source. These results highlight the need to design tailored and inter-service strategies for improving data quality.

