The Effect of New Data Collection Technologies on Survey Data Quality

Author(s):  
William L. Nicholls ◽  
Reginald P. Baker ◽  
Jean Martin


2017 ◽  
Vol 59 (2) ◽  
pp. 199-220
Author(s):  
G.W. Roughton ◽  
Iain Mackay

This paper investigates whether a ‘wisdom of the crowd’ approach might offer an alternative to recent political polls that have raised questions about survey data quality. Data collection costs have become so low that, beyond data quality itself, concerns have also been raised about low response rates, professional respondents and respondent interaction, as well as uncertainties about self-selecting ‘samples’. This paper looks at more than 100 such surveys and reports that, in five of the six cases discussed, interviews costing £0.08 each delivered results in line with known outcomes. The results discussed in the paper show that such interviews are not a waste of money.
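
The aggregation step behind a ‘wisdom of the crowd’ reading of cheap, self-selected responses can be sketched in a few lines. The figures below are purely illustrative assumptions, not numbers from the paper (only the £0.08 cost is taken from the abstract).

```python
# Illustrative only: average many cheap, noisy, self-selected responses and
# compare the aggregate to a known outcome. All numbers are made up; none come
# from the paper.
import random

random.seed(42)

true_share = 0.52          # hypothetical known outcome (e.g. a vote share)
n_respondents = 2000       # many low-cost responses
cost_per_interview = 0.08  # pounds per interview, the figure quoted in the abstract

# Each respondent answers yes/no with individual noise standing in for
# self-selection and measurement error.
responses = [1 if random.random() < true_share + random.gauss(0, 0.05) else 0
             for _ in range(n_respondents)]

crowd_estimate = sum(responses) / n_respondents
total_cost = n_respondents * cost_per_interview

print(f"Crowd estimate {crowd_estimate:.3f} vs known outcome {true_share:.3f}; "
      f"total field cost £{total_cost:.2f}")
```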


10.2196/17619 ◽  
2020 ◽  
Vol 9 (8) ◽  
pp. e17619
Author(s):  
Neha Shah ◽  
Diwakar Mohan ◽  
Jean Juste Harisson Bashingwa ◽  
Osama Ummer ◽  
Arpita Chakraborty ◽  
...  

Background: Data quality is vital for ensuring the accuracy, reliability, and validity of survey findings. Strategies for ensuring survey data quality have traditionally relied on quality assurance procedures. Data analytics is an increasingly vital part of survey quality assurance, particularly given the growing use of tablets and other electronic tools, which enable rapid, if not real-time, data access. Routine data analytics most often involve outlier analyses that monitor a series of data quality indicators, including response rates, missing data, and reliability coefficients for test-retest interviews. Machine learning is emerging as a possible tool for enhancing real-time data monitoring by identifying trends in data collection that could compromise quality.

Objective: This study aimed to describe methods for the quality assessment of a household survey using both traditional methods and machine learning analytics.

Methods: In the Kilkari impact evaluation’s end-line survey among postpartum women (n=5095) in Madhya Pradesh, India, we plan to use both traditional and machine learning–based quality assurance procedures to improve the quality of survey data captured on maternal and child health knowledge, care-seeking, and practices. The quality assurance strategy aims to identify biases and other impediments to data quality and includes seven main components: (1) tool development, (2) enumerator recruitment and training, (3) field coordination, (4) field monitoring, (5) data analytics, (6) feedback loops for decision making, and (7) outcomes assessment. Analyses will include basic descriptive and outlier analyses using machine learning algorithms, which will involve creating features from time-stamps, “don’t know” rates, and skip rates. We will also obtain labeled data from self-filled surveys and build models using k-fold cross-validation on a training data set with both supervised and unsupervised learning algorithms. Based on these models, results will be fed back to the field through various feedback loops.

Results: Data collection began in late October 2019 and will span through March 2020. We expect to submit quality assurance results by August 2020.

Conclusions: Machine learning is underutilized as a tool to improve survey data quality in low-resource settings. Study findings are anticipated to improve the overall quality of Kilkari survey data and, in turn, enhance the robustness of the impact evaluation. More broadly, the proposed quality assurance approach has implications for data capture applications used for special surveys as well as for the routine collection of health information by health workers.

International Registered Report Identifier (IRRID): DERR1-10.2196/17619
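
A minimal sketch of the kind of analytics this protocol describes: deriving interview-level features from time-stamps, "don't know" rates, and skip rates, applying a simple outlier rule, and scoring a supervised classifier with k-fold cross-validation. The data, column names, and thresholds below are illustrative assumptions, not the study's actual pipeline.

```python
# Illustrative sketch, not the study's actual pipeline: interview-level features
# from time-stamps, "don't know" rates, and skip rates; a simple outlier rule;
# and a supervised model scored with k-fold cross-validation on synthetic data.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)
n = 500  # synthetic interviews standing in for real survey records

interviews = pd.DataFrame({
    "duration_min": rng.normal(35, 8, n),   # derived from start/end time-stamps
    "dont_know_rate": rng.beta(2, 20, n),   # share of "don't know" answers
    "skip_rate": rng.beta(2, 30, n),        # share of skipped items
})

# Simple outlier rule: unusually short/long interviews or unusually high
# "don't know" / skip rates.
z = ((interviews["duration_min"] - interviews["duration_min"].mean())
     / interviews["duration_min"].std())
interviews["outlier_flag"] = (
    (z.abs() > 3)
    | (interviews["dont_know_rate"] > interviews["dont_know_rate"].quantile(0.99))
    | (interviews["skip_rate"] > interviews["skip_rate"].quantile(0.99))
)

# Hypothetical quality labels (e.g. derived from self-filled comparison surveys).
interviews["suspect"] = (rng.random(n) < 0.1).astype(int)

X = interviews[["duration_min", "dont_know_rate", "skip_rate"]]
y = interviews["suspect"]

cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=cv)
print(f"Outliers flagged: {int(interviews['outlier_flag'].sum())}")
print(f"5-fold CV accuracy: {scores.mean():.2f}")
```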


2021 ◽  
pp. 000276422110216
Author(s):  
Kazimierz M. Slomczynski ◽  
Irina Tomescu-Dubrow ◽  
Ilona Wysmulek

This article proposes a new approach to analyzing protest participation measured in surveys of uneven quality. Because single international survey projects cover only a fraction of the world’s nations in specific periods, researchers increasingly turn to ex-post harmonization of different survey data sets not a priori designed to be comparable. However, very few scholars systematically examine the impact of survey data quality on substantive results. We argue that variation in the source data, especially deviations from the standards of survey documentation, data processing, and computer files proposed by methodologists of Total Survey Error, Survey Quality Monitoring, and Fitness for Intended Use, is important for analyzing protest behavior. In particular, we apply the Survey Data Recycling framework to investigate the extent to which indicators of attending demonstrations and signing petitions in 1,184 national survey projects are associated with measures of data quality, controlling for variability in the questionnaire items. We demonstrate that the null hypothesis of no impact of survey quality measures on indicators of protest participation must be rejected. Measures of survey documentation, data processing, and computer records, taken together, explain over 5% of the intersurvey variance in the proportions of the populations attending demonstrations or signing petitions.
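
The closing claim, that quality measures explain over 5% of the intersurvey variance, amounts to regressing survey-level protest proportions on quality indicators and reading off the explained variance. The sketch below uses simulated data and is not the Survey Data Recycling authors' code.

```python
# Illustrative only (simulated data, not the SDR source files): regress
# survey-level protest proportions on three quality indicators and read off
# the share of intersurvey variance they explain (R^2).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n_surveys = 1184  # matches the number of national survey projects analysed

# Hypothetical survey-level quality scores: documentation, processing, computer files.
quality = rng.normal(size=(n_surveys, 3))

# Hypothetical outcome: proportion reporting demonstration attendance, weakly
# related to the quality scores plus noise.
prop_demonstrated = (0.15
                     + quality @ np.array([0.010, 0.005, 0.008])
                     + rng.normal(0, 0.05, n_surveys))

model = LinearRegression().fit(quality, prop_demonstrated)
print(f"Intersurvey variance explained by quality measures: "
      f"{model.score(quality, prop_demonstrated):.1%}")
```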


Author(s):  
Christopher D O’Connor ◽  
John Ng ◽  
Dallas Hill ◽  
Tyler Frederick

Policing is increasingly being shaped by data collection and analysis. However, we still know little about the quality of the data police services acquire and utilize. Drawing on a survey of analysts from across Canada, this article examines several data collection, analysis, and quality issues. We argue that, as we move towards an era of big data policing, it is imperative that police services pay more attention to the quality of the data they collect. We conclude by discussing the implications of ignoring data quality issues and the need to develop a more robust research culture in policing.


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Michelle Amri ◽  
Christina Angelakis ◽  
Dilani Logan

Objective: Through collating observations from various studies and complementing these findings with one author’s study, a detailed overview of the benefits and drawbacks of asynchronous email interviewing is provided. Through this overview, it is evident there is great potential for asynchronous email interviews in the broad field of health, particularly for studies drawing on expertise from participants in academia or professional settings, those across varied geographical settings (i.e. potential for global public health research), and/or in circumstances when face-to-face interactions are not possible (e.g. COVID-19).

Results: Benefits of asynchronous email interviewing and additional considerations for researchers are discussed around: (i) access transcending geographic location and during restricted face-to-face communications; (ii) feasibility and cost; (iii) sampling and inclusion of diverse participants; (iv) facilitating snowball sampling and increased transparency; (v) data collection with working professionals; (vi) anonymity; (vii) verification of participants; (viii) data quality and enhanced data accuracy; and (ix) overcoming language barriers. Similarly, potential drawbacks of asynchronous email interviews are also discussed with suggested remedies, which centre around: (i) time; (ii) participant verification and confidentiality; (iii) technology and sampling concerns; (iv) data quality and availability; and (v) need for enhanced clarity and precision.


2021 ◽  
Vol 13 (6) ◽  
pp. 3320
Author(s):  
Amy R. Villarosa ◽  
Lucie M. Ramjan ◽  
Della Maneze ◽  
Ajesh George

The COVID-19 pandemic has resulted in many changes, including restrictions on indoor gatherings and visitation to residential aged care facilities, hospitals and certain communities. Coupled with potential restrictions imposed by health services and academic institutions, these changes may significantly impact the conduct of population health research. However, the continuance of population health research is beneficial for the provision of health services and sometimes imperative. This paper discusses the impact of COVID-19 restrictions on the conduct of population health research. This discussion unveils important ethical considerations, as well as potential impacts on recruitment methods, face-to-face data collection, data quality and validity. In addition, this paper explores potential recruitment and data collection methods that could replace face-to-face methods. The discussion is accompanied by reflections on the challenges experienced by the authors in their own research at an oral health service during the COVID-19 pandemic and alternative methods that were utilised in place of face-to-face methods. This paper concludes that, although COVID-19 presents challenges to the conduct of population health research, there is a range of alternative methods to face-to-face recruitment and data collection. These alternative methods should be considered in light of project aims to ensure data quality is not compromised.


2004 ◽  
Vol 22 (5) ◽  
pp. 255-265 ◽  
Author(s):  
JAMES A. BOBULA ◽  
LORI S. ANDERSON ◽  
SUSAN K. RIESCH ◽  
JANIE CANTY-MITCHELL ◽  
ANGELA DUNCAN ◽  
...  

2021 ◽  
pp. 147078532098679
Author(s):  
Kylie Brosnan ◽  
Bettina Grün ◽  
Sara Dolnicar

Survey data quality suffers when respondents have difficulty completing complex tasks in questionnaires. Cognitive load theory informed the development of strategies for educators to reduce the cognitive load of learning tasks. We investigate whether these cognitive load reduction strategies can be used in questionnaire design to reduce task difficulty and, in so doing, improve survey data quality. We find that this is not the case and conclude that some traditional survey answer formats, such as grid questions, which have been criticized in the past, lead to equally good data and do not frustrate respondents more than alternative formats.
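
One way to compare data quality across answer formats, for example a grid versus an item-by-item layout, is to contrast a simple indicator such as straightlining. The sketch below uses simulated responses and hypothetical straightlining shares; it only illustrates such a comparison and is not the authors' analysis.

```python
# Assumption-based sketch, not the authors' analysis: compare straightlining
# rates between two questionnaire formats with a two-sample t-test on
# simulated 6-item, 1-5 grid responses.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def simulate_grid(n_respondents: int, straightline_share: float) -> np.ndarray:
    """Simulate 6-item grids where a given share of respondents straightline."""
    answers = rng.integers(1, 6, size=(n_respondents, 6))
    liners = rng.random(n_respondents) < straightline_share
    answers[liners] = rng.integers(1, 6, size=(liners.sum(), 1))  # constant rows
    return answers

def straightlining_rate(responses: np.ndarray) -> np.ndarray:
    """Per-respondent flag: 1.0 if every item in the grid got the same answer."""
    return (responses.std(axis=1) == 0).astype(float)

# Hypothetical straightlining shares for a grid layout vs. an item-by-item layout.
grid_format = simulate_grid(300, straightline_share=0.15)
alt_format = simulate_grid(300, straightline_share=0.10)

t, p = stats.ttest_ind(straightlining_rate(grid_format), straightlining_rate(alt_format))
print(f"Straightlining: grid {straightlining_rate(grid_format).mean():.2%} "
      f"vs alternative {straightlining_rate(alt_format).mean():.2%}; t = {t:.2f}, p = {p:.3f}")
```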

