Going Digital: Added Value of Electronic Data Collection in the 2018 Afghanistan Health Survey (2021, Vol 18 (1))

Author(s):  
Christina Mergenthaler ◽  
Rajpal Singh Yadav ◽  
Sohrab Safi ◽  
Ente Rood ◽  
Sandra Alba

Abstract Background: Through a nationally representative household survey in Afghanistan, we conducted an operational study in two relatively secure provinces comparing the effectiveness of computer-assisted personal interviewing (CAPI) with paper-and-pencil interviewing (PAPI). Methods: In Panjshir and Parwan provinces, household survey data were collected using paper questionnaires in 15 clusters and OpenDataKit (ODK) software on electronic tablets in 15 other clusters. Added value was evaluated from three perspectives: efficiency of implementation, data quality, and acceptability. Efficiency was measured through financial expenditures and time-stamped data. Data quality was measured by examining completeness. Acceptability was studied through focus group discussions with survey staff. Results: Survey costs were 68% higher in CAPI clusters than in PAPI clusters, due primarily to the upfront one-time investment in survey programming. Enumerators spent significantly less time administering surveys in CAPI cluster households (248 minutes of survey time) than in PAPI households (289 minutes), an average saving of 41 minutes per household (95% CI: 25–55). CAPI saved 87 days of data management time compared with PAPI. Among 49 tracer variables (variables for which responses were required from all respondents), small differences were observed between PAPI and CAPI: 2.2% of the cleaned dataset's tracer data points were missing in CAPI surveys (1,216/56,073 data points), compared to 3.2% in PAPI surveys (1,953/60,675 data points). In the pre-cleaned datasets, 3.9% of tracer data points were missing in CAPI surveys (2,151/55,092 data points), compared to 3.2% in PAPI surveys (1,924/60,113 data points). Enumerators from Panjshir and Parwan preferred CAPI over PAPI due to time savings, user-friendliness, improved data security, and being less conspicuous when traveling; however, approximately half of the enumerators trained across all 34 provinces reported feeling unsafe due to Taliban presence. Community and household respondent skepticism could be resolved by enumerator reassurance. Enumerators shared that in the future they would prefer to collect data using CAPI when possible. Conclusions: CAPI offers clear gains in efficiency over PAPI in data collection and management time, although costs are relatively comparable even without the programming investment. However, serious field staff concerns about Taliban threats and general insecurity mean that CAPI should only be used in relatively secure areas.
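As a sanity check on the completeness figures above, the reported percentages can be reproduced directly from the counts in the abstract; the `missing_rate` helper below is illustrative, not part of the study's analysis code.

```python
# Sketch: tracer-variable completeness comparison, using counts from the abstract.

def missing_rate(missing: int, total: int) -> float:
    """Percentage of tracer data points that are missing."""
    return 100 * missing / total

# (missing, total) pairs for each mode, cleaned and pre-cleaned datasets
cleaned = {"CAPI": (1216, 56073), "PAPI": (1953, 60675)}
pre_cleaned = {"CAPI": (2151, 55092), "PAPI": (1924, 60113)}

for label, data in [("cleaned", cleaned), ("pre-cleaned", pre_cleaned)]:
    for mode, (missing, total) in data.items():
        print(f"{label:12s} {mode}: {missing_rate(missing, total):.1f}% missing")
# cleaned: CAPI 2.2%, PAPI 3.2%; pre-cleaned: CAPI 3.9%, PAPI 3.2%
```

The rounded rates match the abstract, including the detail that cleaning reduced CAPI missingness (3.9% to 2.2%) while PAPI stayed at 3.2%.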


2019 ◽  
Author(s):  
Benedikt Ley ◽  
Komal Raj Rijal ◽  
Jutta Marfurt ◽  
Nabaraj Adhikari ◽  
Megha Banjara ◽  
...  

Abstract Objective: Electronic data collection (EDC) has become a suitable alternative to paper-based data collection (PBDC) in biomedical research, even in resource-poor settings. During a survey in Nepal, data were collected using both systems, and data entry errors were compared between the two methods. Collected data were checked for completeness, values outside realistic ranges, internal logic, and date variables for reasonable time frames. Variables were grouped into 5 categories, and the number of discordant entries was compared between the two systems, overall and per variable category. Results: Data from 52 variables collected from 358 participants were available. Discrepancies between the two data sets were found in 12.6% of all entries (2,352/18,616). Differences between data points were identified in 18.0% of continuous variables (643/3,580), 15.8% of time variables (113/716), 13.0% of date variables (140/1,074), 12.0% of text variables (86/716), and 10.9% of categorical variables (1,370/12,530). Overall, 64% of all discrepancies (1,499/2,352) were due to data omissions, and 76.6% of missing entries (1,148/1,499) were among categorical data. Omissions in PBDC (n=1,002) were twice as frequent as in EDC (n=497, p<0.001). Data omissions, specifically among categorical variables, were identified as the greatest source of error. If designed accordingly, EDC can address this shortfall effectively.
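The per-category discordance rates above can be reproduced from the reported numerators and denominators; this sketch uses only the counts given in the abstract and also confirms that the categories sum to the overall totals.

```python
# Sketch: per-category discordance rates, with counts taken from the abstract.

categories = {
    "continuous":  (643, 3580),
    "time":        (113, 716),
    "date":        (140, 1074),
    "text":        (86, 716),
    "categorical": (1370, 12530),
}

# Category counts should sum to the reported overall figures (2,352/18,616).
total_discordant = sum(d for d, _ in categories.values())
total_entries = sum(n for _, n in categories.values())

for name, (discordant, n) in categories.items():
    print(f"{name:11s}: {100 * discordant / n:.1f}% discordant ({discordant}/{n})")
print(f"overall    : {100 * total_discordant / total_entries:.1f}%")
# overall prints 12.6%, matching the abstract
```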


2020 ◽  
Author(s):  
Atinkut Alamirrew Zeleke ◽  
Tolga Naziyok ◽  
Fleur Fritz ◽  
Lara Christianson ◽  
Rainer Röhrig

BACKGROUND Population-level surveys (PLS) are an essential standard method in public health research. They quantify sociodemographic events and support evidence-based public health policy development and intervention design. During a survey, the data collection mechanism is the main point at which mistakes can be prevented before they happen. The use of electronic devices such as smartphones and tablet computers can improve the quality and cost-effectiveness of public health surveys. However, there is a lack of systematically analyzed evidence on the potential impact of electronic data collection tools on data quality and cost reduction in interviewer-administered surveys compared to the standard paper-based data collection system. OBJECTIVE This systematic review aims to evaluate the impact of interviewer-administered electronic data collection methods on data quality and cost reduction in PLS compared to traditional paper-based methods. METHODS A systematic search was conducted in MEDLINE, CINAHL, PsycINFO, the Web of Science, EconLit, Cochrane CENTRAL, and CDSR to identify relevant studies from 2008 to 2018. We included randomized and non-randomized studies that examined data quality and cost reduction outcomes; usability, user experience, and usage parameters from the same studies were also included. Two authors independently screened titles and abstracts and extracted data from the included papers; a third author mediated in case of disagreement. The review authors used EndNote for de-duplication and Rayyan for screening. RESULTS The search of the electronic databases found 3,817 articles. After de-duplication, 2,533 articles were screened, and 14 articles fulfilled the inclusion criteria. None of the studies was designed as a randomized controlled trial. Most of the studies had a quasi-experimental design, such as comparative experimental evaluation studies nested within other ongoing cross-sectional surveys: four comparative evaluations, two pre-post intervention comparative evaluations, two retrospective comparative evaluations, and four one-arm non-comparative studies were included in our review. Meta-analysis was not possible because of heterogeneity in study designs, in the type and level of outcome measurements, and in study settings. Individual article synthesis showed that data from electronic data collection systems were of good quality and were delivered faster than data from paper-based systems. Only two studies linked cost and data quality outcomes to describe the cost-effectiveness of electronic data collection systems. Despite the poor quality of the economic evaluations, most of the reported results favored electronic data collection for large-scale surveys. Field data collectors reported that electronic data collection was a feasible, acceptable, and preferable tool for their work. On-site data error prevention, fast data submission, and easy-to-handle devices were the comparative advantages of electronic data collection systems. Technical difficulties, accidental data loss, device theft, security concerns, power surges, and internet connection problems were reported as challenges during implementation. CONCLUSIONS Although there was positive evidence for the comparative advantage of electronic data capture over paper-based tools, the included studies were not methodologically rigorous enough to combine. More rigorous studies are needed to demonstrate the comparative performance of paper- and electronic-based data collection systems in public health surveys with respect to data quality, work efficiency, and cost reduction. CLINICALTRIAL The review protocol is registered in the International Prospective Register of Systematic Reviews (PROSPERO), CRD42018092259. The protocol of this article was also pre-published (JMIR Res Protoc 2019;8(1):e10678, doi:10.2196/10678).


2021 ◽  
Vol 19 (S1) ◽  
Author(s):  
Sanne M. Thysen ◽  
◽  
Charlotte Tawiah ◽  
Hannah Blencowe ◽  
Grace Manu ◽  
...  

Abstract Background Electronic data collection is increasingly used for household surveys, but the factors influencing its design and implementation have not been widely studied. The Every Newborn-INDEPTH (EN-INDEPTH) study was a multi-site survey using electronic data collection in five INDEPTH health and demographic surveillance system sites. Methods We describe the experiences and learning involved in the design and implementation of the EN-INDEPTH survey, and undertook six focus group discussions with the field and research teams to explore their experiences. Thematic analyses were conducted in NVivo12 using an iterative process guided by a priori themes. Results We describe five steps in selecting, adapting, and implementing electronic data collection in the EN-INDEPTH study. Firstly, we reviewed possible electronic data collection platforms and selected the World Bank's Survey Solutions® as the most suited to the EN-INDEPTH study. Secondly, the survey questionnaire was coded and translated into local languages, and further context-specific adaptations were made. Thirdly, data collectors were selected and trained using a standardised manual; training varied between 4.5 and 10 days. Fourthly, the instruments were piloted in the field and the questionnaires finalised. During data collection, data collectors appreciated the built-in skip patterns and error messages, but unreliable internet connections were a challenge, especially for data synchronisation. For the fifth and final step, data management and analyses, data quality was considered higher and less time was spent on data cleaning. The possibility of using paradata to analyse survey timing and corrections was valued. Synchronisation and data transfer should be given special consideration. Conclusion We synthesised experiences of using electronic data collection in a multi-site household survey, including perceived advantages and challenges. Our recommendations for others considering electronic data collection include adapting tools to the local context, piloting and refining the questionnaire in one site first, buying power banks to mitigate power interruptions, and paying attention to issues such as GPS tracking and synchronisation, particularly in settings with poor internet connectivity.
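The built-in skip patterns and error messages that data collectors appreciated can be sketched generically; the question names, relevance rules, and range checks below are hypothetical illustrations, not the EN-INDEPTH instrument or the Survey Solutions API.

```python
# Hypothetical sketch of electronic-form skip logic and validation, in the
# spirit of Survey Solutions / ODK instruments (question names are invented).

questions = [
    # (name, relevance condition, validity check, error message)
    ("age", lambda a: True, lambda v: 0 <= v <= 120, "age must be 0-120"),
    ("parity", lambda a: a.get("sex") == "female",
     lambda v: 0 <= v <= 20, "parity must be 0-20"),
]

def validate(answers: dict) -> list:
    """Return error messages; skipped (non-relevant) questions are ignored."""
    errors = []
    for name, relevant, valid, message in questions:
        if not relevant(answers):
            continue  # skip pattern: the question is never shown
        if name not in answers:
            errors.append(f"{name}: required answer missing")
        elif not valid(answers[name]):
            errors.append(f"{name}: {message}")
    return errors

print(validate({"sex": "male", "age": 34}))     # [] (parity skipped)
print(validate({"sex": "female", "age": 200}))  # age error + parity missing
```

On paper, both of these mistakes would surface only at data cleaning; the electronic form rejects them at the point of entry.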


2016 ◽  
Vol 07 (03) ◽  
pp. 672-681 ◽  
Author(s):  
Aluísio Barros ◽  
Cauane Blumenberg

Summary: This paper describes the use of Research Electronic Data Capture (REDCap) to conduct one of the follow-up waves of the 2004 Pelotas birth cohort. The aim is to point out the advantages and limitations of using this electronic data capture environment to collect data and control every step of a longitudinal epidemiological study, especially in terms of time savings and data quality. We used REDCap as the main tool to support the conduct of a birth cohort follow-up. By exploiting several REDCap features, we managed to schedule assessments, collect data, and control the study workflow. To enhance data quality, we developed specific reports and field validations to flag inconsistencies in real time. Using REDCap, it was possible to investigate more variables without a significant increase in data collection time compared to a previous birth cohort follow-up. In addition, better data quality was achieved: negligible out-of-range errors and no validation or missing-value inconsistencies were identified after administering over 7,000 interviews. Adopting electronic data capture solutions such as REDCap in epidemiological research can bring several advantages over traditional paper-based data collection methods. To help improve these tools further, more research groups should migrate from paper-based to electronic epidemiological research. Citation: Blumenberg C, Barros AJD. Electronic data collection in epidemiological research: the use of REDCap in the Pelotas birth cohorts.


1996 ◽  
Vol 1 (4) ◽  
pp. 23-37 ◽  
Author(s):  
Edith de Leeuw ◽  
William Nicholls

Whether computer assisted data collection methods should be used for survey data collection is no longer an issue. Most professional research organizations, commercial, government and academic, are adopting these new methods with enthusiasm. Computer assisted telephone interviewing (CATI) is most prevalent, and computer assisted personal interviewing (CAPI) is rapidly gaining in popularity. New forms of electronic data reporting using computers, telephones and voice recognition technology are also emerging. This paper begins with a taxonomy of current computer assisted data collection methods. It then reviews conceptual and theoretical arguments and empirical evidence on such topics as: (1) respondent and interviewer acceptance of new techniques, (2) the effect of computer assisted interviewing on data quality, (3) the consequences for survey costs, and (4) centralized vs. decentralized deployment of CATI.



2019 ◽  
Author(s):  
Katrin Drasch

Retrospective life course data are extremely valuable for analysis. Unfortunately, the quality requirements for this kind of data are high, so the data often have to be edited in a time-consuming process. Notwithstanding such enormous efforts, survey methodology has until now paid little attention to data editing and other post-data-collection procedures. I aim to fill this gap. In the following, I use the IAB-ALWA study, which collected data on the life courses of a nationally representative sample of 10,000 respondents, and examine whether such a design is beneficial from the perspective of data quality and efficiency. The results show that, despite the corrections, the findings are robust across different specifications of the dependent variable stemming from different stages of the editing process.


2018 ◽  
Vol 2 (S1) ◽  
pp. 37-38
Author(s):  
Amelia Barwise ◽  
Lisha Yi ◽  
Jun Guo ◽  
Ognjen Gajic ◽  
Moldovan Sabov ◽  
...  

OBJECTIVES/SPECIFIC AIMS: Missing data is a common problem in research studies that may lead to inconclusive or inaccurate results, and may even cause harm secondary to wrong research conclusions. The purpose of this ancillary study was to measure the differences in missing data following implementation of a variety of mechanisms to improve data quality and documentation in a global quality improvement study. Many of the sites involved in the study were in low-income or middle-income countries with minimal research infrastructure. Missing data are defined as "values that are not available that would be meaningful for analysis if they were observed" (The prevention and treatment of missing data, New Engl J Med 367;14, nejm.org, October 4, 2012). METHODS/STUDY POPULATION: All study sites used REDCap software to enter various data points, including hospital and ICU admission and discharge dates, as well as whether items on a checklist relevant to processes of care in the ICU were reviewed. After the initial general data collection phase, we categorized data as "must have" and "good to have." "Must have" variables were defined as data variables essential for the study outcomes; "good to have" variables would not affect the main outcomes of the study if missing. We measured completeness of data using the built-in REDCap data quality check feature. We used several strategies to encourage reduction of missing data. We initially did random data checks but noted that the amount of missing data was substantial and could not be adequately addressed this way. Second, we created Excel sheets highlighting missing data for each site and notified sites; this proved onerous to create and made it burdensome for sites to identify easily where data were missing. Third, we built a custom report form in REDCap specifically able to identify which "must have" data points were missing. This could be easily accessed by the principal investigator at each site and made completing the data forms more straightforward. We encouraged all sites to complete their data collection by sending weekly data reports to each site highlighting the patients with missing data. An instructional YouTube tutorial was also created, and the link was shared with all sites to demonstrate how to use the custom-built report form in REDCap and how to appropriately fill in the missing data. Since this was a global study, we communicated with sites using a variety of locally favored mechanisms, including Zoom, FaceTime, WeChat, WhatsApp, and email. By harnessing the buy-in of local champions, our approach was successful. RESULTS/ANTICIPATED RESULTS: The total number of patients recruited for the CERTAIN study was 4,843. The rate of all missing variables improved with the efforts described above. Hospital admission dates were missing in 8.4% of records before these efforts and 4.2% afterward (p<0.01). ICU admission dates were missing in 5.5% before and 2.0% after (p<0.01). Documentation of completion of processes of care (including central line review, urinary catheter review, and consideration for blood transfusion) improved significantly from pre to post (p<0.01). DISCUSSION/SIGNIFICANCE OF IMPACT: Missing data can be a problem in all types of research studies. This study provides some preliminary evidence for effective approaches that can reduce the problem of missing data when conducting a global study at sites with limited research infrastructure. By addressing the concern about missing data, we can be more confident that our results can be accurately analyzed and interpreted, improving the quality of the research.
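A per-site "must have" missing-data report of the kind described above could be sketched as follows; the field names and example records are hypothetical, not the CERTAIN study's actual variables or data.

```python
# Hypothetical sketch of a per-site "must have" missing-data report, in the
# spirit of the custom REDCap report described above (field names invented).

MUST_HAVE = ["hospital_admit_date", "icu_admit_date"]

records = [
    {"site": "A", "id": 1, "hospital_admit_date": "2018-01-03",
     "icu_admit_date": "2018-01-04"},
    {"site": "A", "id": 2, "hospital_admit_date": None,
     "icu_admit_date": "2018-01-07"},
    {"site": "B", "id": 3, "hospital_admit_date": "2018-02-11",
     "icu_admit_date": None},
]

def missing_report(records):
    """Map each site to the record ids missing any 'must have' variable."""
    report = {}
    for rec in records:
        missing = [f for f in MUST_HAVE if rec.get(f) is None]
        if missing:
            report.setdefault(rec["site"], []).append((rec["id"], missing))
    return report

print(missing_report(records))
# {'A': [(2, ['hospital_admit_date'])], 'B': [(3, ['icu_admit_date'])]}
```

Sending each site only its own slice of such a report is what made the weekly follow-up tractable compared to hand-built spreadsheets.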

