Streamlining Foodborne Disease Surveillance with Open-Source Data Management Software

2018 ◽  
Vol 10 (1) ◽  
Author(s):  
Michael Judd ◽  
Karen Wong

Objective: “ledsmanageR”, a data management platform built in R, aims to improve the timeliness and accuracy of national foodborne surveillance data submitted to the Laboratory-based Enteric Disease Surveillance (LEDS) system by automating the data processing, validation, and reporting workflow.

Introduction: The National Surveillance Team in the Enteric Diseases Epidemiology Branch of the Centers for Disease Control and Prevention (CDC) collects electronic data in LEDS from all state and regional public health laboratories on human infections caused by Campylobacter, Salmonella, Shiga toxin-producing E. coli, and Shigella. These data inform annual estimates of the burden of illness and assessments of patterns in bacterial subtypes, and can be used to describe trends in incidence. Robust digital infrastructure is required to process, validate, and summarize data on approximately 60,000 infections annually while optimizing use of financial and personnel resources.

Methods: We leveraged the robust and extensible programming facilities of the R programming language and the active community of R users to develop a data integration, processing, and reporting pipeline for LEDS via an internal software package we named “ledsmanageR”. We designed all data retrieval, cleaning, and provisioning algorithms using tools from RStudio software packages [1–3] and tracked changes to source code and data using CDC’s internal GitLab server. We automated data validation requests to reporting partners by generating customizable emails directly from the R console [4]. We streamlined the data reconciliation process using OpenRefine [5], a point-and-click tool for cleaning big data. We automated generation of annual reports, a process that was previously manual, using parameterized RMarkdown documents. Staff epidemiologists performed design and implementation internally, requiring no external consulting.

Results: Developing our free and open-source software platform for national foodborne surveillance data management has saved the Enteric Diseases Epidemiology Branch thousands of dollars because we no longer depend on proprietary software requiring annual licensing fees. This transition occurred without any disruption in surveillance operations. Partial automation of the email-based data validation and annual report generation processes reduced the employee time requirement from one full-time position to one part-time position. The modular nature of ledsmanageR permitted LEDS to collect an expanded set of data elements with no changes to the core data processing and reporting workflow.

Conclusions: We developed and implemented a flexible tool that helps maintain the integrity of surveillance data and reduces the need for manual data cleaning, which can be laborious and error-prone. The user-friendly design features of ledsmanageR demonstrate that data management can be optimized using programming skills that are increasingly common among epidemiologists. Our work on improving the accuracy and efficiency of enteric disease surveillance has served as a proof of concept for plans to streamline data processing for other surveillance systems.
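The validation-request step described in the Methods can be illustrated with a short sketch. This is not the ledsmanageR code (which is written in R and is not shown in the abstract); it is a minimal Python illustration under stated assumptions. The `Record` fields, the validation rules, the pathogen labels, and the email wording are all hypothetical, chosen only to show how record-level checks might be grouped into one message per reporting partner.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

# Hypothetical pathogen labels, loosely based on the organisms named in the abstract.
VALID_PATHOGENS = {"Campylobacter", "Salmonella", "STEC", "Shigella"}


@dataclass
class Record:
    # All field names are assumptions made for this illustration.
    record_id: str
    state: str
    pathogen: str
    collection_year: Optional[int]


def validate(record, report_year):
    """Return a list of human-readable problems found in one record."""
    problems = []
    if record.pathogen not in VALID_PATHOGENS:
        problems.append(f"unrecognized pathogen '{record.pathogen}'")
    if record.collection_year is None:
        problems.append("missing collection year")
    elif record.collection_year != report_year:
        problems.append(
            f"collection year {record.collection_year} "
            f"outside reporting year {report_year}"
        )
    return problems


def draft_validation_emails(records, report_year):
    """Group problems by reporting state and render one plain-text message each."""
    by_state = defaultdict(list)
    for rec in records:
        for problem in validate(rec, report_year):
            by_state[rec.state].append(f"- {rec.record_id}: {problem}")
    return {
        state: f"Please review the following {report_year} records:\n" + "\n".join(lines)
        for state, lines in by_state.items()
    }
```

Calling `draft_validation_emails(records, 2017)` returns a dictionary keyed by state, so partners with no flagged records receive no message. Actually sending the drafts is left out of the sketch.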

Author(s):  
Eleni Galanis ◽  
Marsha Taylor ◽  
Kamila Romanowski ◽  
Olga Bitzikos ◽  
Jennifer Jeyes ◽  
...  

Timely surveillance of enteric diseases is necessary to identify and control cases and outbreaks. Our objective was to evaluate the timeliness of enteric disease surveillance in British Columbia (BC), Canada, compare these results to other settings, and recommend improvements. In 2012 and 2013, information was collected from case report forms and laboratory information systems on 2615 Salmonella, Shiga toxin-producing E. coli, Shigella, and Listeria infections. Twelve date variables representing the surveillance process from onset of symptoms to case interview and final laboratory results were collected, and the intervals between them were measured. The median time from onset of symptoms to reporting of subtyping results to BC epidemiologists was 26–36 days, and from onset of symptoms to case interview was 12–14 days. Our findings were comparable to those in the international literature except for a longer time (up to a 29-day difference) to reporting of PFGE results to epidemiologists in BC. Such a delay may impact our ability to identify and solve outbreaks. Several process and system changes were implemented that should improve the timeliness of enteric disease surveillance.
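The interval measurement described above amounts to subtracting one date variable from another for each case and taking the median. The following is a minimal sketch of that arithmetic, not the study's actual analysis code. The field names `onset` and `interview` are hypothetical, and cases with a missing date are assumed to be excluded from that interval.

```python
from datetime import date
from statistics import median


def median_interval_days(cases, start_field, end_field):
    """Median number of days between two date fields, skipping incomplete cases."""
    intervals = [
        (case[end_field] - case[start_field]).days
        for case in cases
        if case.get(start_field) is not None and case.get(end_field) is not None
    ]
    # Return None rather than raising when no case has both dates.
    return median(intervals) if intervals else None
```

The same function could be applied to each pair of the twelve date variables, for example onset to interview or onset to subtyping report.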


Vaccine ◽  
2016 ◽  
Vol 34 (43) ◽  
pp. 5181-5186 ◽  
Author(s):  
Alain Poy ◽  
Etienne Minkoulou ◽  
Keith Shaba ◽  
Ali Yahaya ◽  
Peter Gaturuku ◽  
...  

Author(s):  
E. E. Akimkina

This paper analyzes the problems of structuring indicators in multidimensional data cubes and then processing them with end-user tools that provide multidimensional visualization and data management. It shows how multidimensional data processing technologies can support management and decision-making at a design and technological enterprise, and gives practical recommendations on using domestic computer environments to structure and visualize multidimensional data cubes.


2020 ◽  
Author(s):  
K. Thirumalesh ◽  
Salgeri Puttaswamy Raju ◽  
Hiriyur Mallaiah Somashekarappa ◽  
Kumaraswamy Swaroop

2008 ◽  
Vol 14 (2) ◽  
pp. 311-313 ◽  
Author(s):  
Craig W. Hedberg ◽  
Jesse F. Greenblatt ◽  
Bela T. Matyas ◽  
Jennifer Lemmings ◽  
Donald J. Sharp ◽  
...  

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Daming Yang ◽  
Yongjian Huang ◽  
Zongyang Chen ◽  
Qinghua Huang ◽  
Yanguang Ren ◽  
...  

Fischer plots are widely used in paleoenvironmental research as graphic representations of sea- and lake-level changes; they map the linearly corrected variation of cumulative cycle thickness against cycle number or stratum depth. Some kinds of paleoenvironmental proxy data (especially subsurface data, such as natural gamma-ray logging data) preserve continuous cyclic signals, have been collected in large quantities, and are therefore potential materials for constructing Fischer plots. However, it is laborious to manually count the cycles preserved in these proxy data and to map Fischer plots from them. In this paper, we introduce an original open-source Python code, “PyFISCHERPLOT”, for constructing Fischer plots in batches from paleoenvironmental proxy data series. We present the principle of constructing Fischer plots from proxy data, the data processing and usage of the PyFISCHERPLOT code, and application cases of the code. The code is compared with existing methods for constructing Fischer plots.
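The core of a Fischer plot, as the abstract describes it, is the cumulative departure of each cycle's thickness from the mean cycle thickness, plotted against cycle number. The sketch below illustrates only that calculation. It is not the PyFISCHERPLOT code or its API: the function name is hypothetical, and the harder step of identifying cycles in proxy data such as gamma-ray logs is assumed to have been done already.

```python
from itertools import accumulate


def fischer_plot_points(cycle_thicknesses):
    """Return (cycle_number, cumulative departure from mean thickness) pairs."""
    if not cycle_thicknesses:
        return []
    mean_thickness = sum(cycle_thicknesses) / len(cycle_thicknesses)
    # Subtracting the mean is the linear correction; the running sum of the
    # residuals is the curve drawn on the plot.
    departures = accumulate(t - mean_thickness for t in cycle_thicknesses)
    return list(enumerate(departures, start=1))
```

Because the mean is subtracted from every cycle, the curve returns to zero (up to floating-point rounding) at the last cycle. Rising segments correspond to thicker-than-average cycles and falling segments to thinner-than-average ones.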

