Comparison of National and Local Syndromic Surveillance Data - Cook County, IL, 2017

2018, Vol 10 (1)
Author(s): Zachary Heth, Kelly Bemis, Demian Christiansen

Objective: This analysis was undertaken to determine how the data completeness, consistency, and other attributes of our local syndromic surveillance program compared to those of the National Syndromic Surveillance Program.
Introduction: In 2005, the Cook County Department of Public Health (CCDPH) began using the Electronic Surveillance System for the Early Notification of Community-based Epidemics (ESSENCE) as an emergency department (ED)-based local syndromic surveillance program (LSSP); 23 (100%) of 23 hospitals in suburban Cook County report to the LSSP. Data are transmitted in delimited ASCII text files (i.e., flat files) and contain a unique patient identifier, visit date and time, zip code, age, sex, and chief complaint. Discharge diagnosis and disposition are optional data elements. Prior to 2017, the Illinois Department of Public Health (IDPH) placed facilities participating in the Cook LSSP in a holding queue to transform their flat file submissions into an HL7-compliant message; however, as of 2017, eligible hospitals must submit HL7-formatted production data to IDPH to fulfill Meaningful Use. The primary syndromic surveillance system for Illinois is the National Syndromic Surveillance Program (NSSP), which transitioned to an ESSENCE interface in 2016. As of December 2016, 20 (87%) of 23 hospitals reporting to the LSSP also reported to IDPH and the NSSP. As both syndromic surveillance systems aim to collect the same data, and can now be analyzed with the same interface, CCDPH sought to compare the LSSP and NSSP for data completeness, consistency, and other attributes.
Methods: Our comparison of the NSSP to the LSSP focused on data completeness for key demographic and medical variables and on consistency in total visit counts. Analysis of completeness used data from December 2016 for the 20 hospitals contributing HL7 production data to IDPH at that time. Total visit counts in both systems were compared for the same 20 hospitals for February 5-11, 2017, a randomly chosen time period. A target threshold of less than 3% difference in total visit counts was set by the CCDPH system users. Analysis was completed in Microsoft Excel 2010. Other attributes of the surveillance systems were qualitatively assessed by the primary system users at CCDPH.
Results: All variables required by the LSSP (unique patient identifier, age, sex, zip code, visit time and date, and chief complaint) had 98-100% completeness in both the LSSP and NSSP. However, the LSSP optional data elements, discharge diagnosis and discharge disposition, were less complete in the LSSP than in the NSSP (diagnosis: 56% versus 83%; disposition: 66% versus 80%). Among variables required for NSSP reporting but not reported to the LSSP, completeness ranged from 100% (race, ethnicity) to 82% (county). Optional data elements within NSSP ranged in completeness from 73% (initial pulse oximetry) to 0% (initial blood pressure, insurance coverage). Of the 20 hospitals evaluated for visit counts, only one hospital had <3% difference in visit counts between the LSSP and NSSP on all seven days assessed. Ten hospitals had >3% difference in visit counts on all seven days. Average seven-day differences for hospitals ranged from 0% to 54%. Eighteen (90%) of 20 hospitals reported larger numbers of visits to NSSP than to the LSSP.
Conclusions: Overall completeness of data was similar between the national and our local ESSENCE systems, with most required variables having over 98% completeness. NSSP had higher completeness than the LSSP for discharge diagnosis and disposition. Additional data elements required by NSSP, but unavailable in the LSSP, had similarly high completeness, but optional NSSP variables of interest showed greater variability in reporting. Differences in visit counts were higher than expected. An ongoing exploration of these differences has shown that they are multifaceted and require hospital-specific interventions. There are strengths and limitations to both the NSSP and LSSP. CCDPH has direct control over data sharing between jurisdictions in the LSSP, and there has historically been less system "down time" in the LSSP than in the NSSP; however, the use of flat files instead of HL7, together with fewer incentives for hospital participation (e.g., Meaningful Use) after 2016, results in limited data collection and stagnant growth compared to the NSSP. Jurisdictions using their own LSSPs should consider analyzing their data completeness, consistency, and quality compared to the NSSP.
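The visit-count comparison described in the Methods can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' Excel workbook: the abstract does not say which denominator was used for the percent difference, so the sketch assumes the larger of the two daily counts, and the function names and sample counts are hypothetical.

```python
from typing import Dict, List

THRESHOLD_PCT = 3.0  # target from the abstract: less than 3% difference


def pct_difference(lssp_count: int, nssp_count: int) -> float:
    """Absolute difference as a percentage of the larger count.

    The choice of denominator is an assumption; the abstract does not state it.
    """
    larger = max(lssp_count, nssp_count)
    if larger == 0:
        return 0.0
    return abs(lssp_count - nssp_count) / larger * 100.0


def summarize_hospital(lssp_daily: List[int], nssp_daily: List[int]) -> Dict[str, float]:
    """Mean daily percent difference and number of days under the target."""
    diffs = [pct_difference(l, n) for l, n in zip(lssp_daily, nssp_daily)]
    return {
        "mean_pct_diff": sum(diffs) / len(diffs),
        "days_within_target": sum(d < THRESHOLD_PCT for d in diffs),
    }


# Hypothetical counts for one hospital over seven days (not data from the study).
lssp = [100, 98, 105, 110, 95, 102, 99]
nssp = [102, 98, 107, 140, 96, 103, 99]
summary = summarize_hospital(lssp, nssp)
print(summary)
```

Under these assumptions, a hospital meets the target only if every one of its seven daily differences falls below the threshold, which is the criterion the Results apply.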

2018, Vol 10 (1)
Author(s): Girum S. Ejigu, Kakshmi Radhakrishnan, Paul McMurray, Roseanne English

Objective: Review the impact of applying regular data quality checks to assess completeness of core data elements that support syndromic surveillance.
Introduction: The National Syndromic Surveillance Program (NSSP) is a community-focused collaboration among federal, state, and local public health agencies and partners for timely exchange of syndromic data. These data, captured in nearly real time, are intended to improve the nation's situational awareness and responsiveness to hazardous events and disease outbreaks. During CDC's previous implementation of a syndromic surveillance system (BioSense 2), there was a reported lack of transparency and sharing of information on the data processing applied to data feeds, which hindered the identification and resolution of data quality issues. The BioSense Governance Group Data Quality Workgroup paved the way to rethink surveillance data flow and quality. Their work and collaboration with state and local partners led to NSSP redesigning the program's data flow. The new data flow gave NSSP analysts an opportunity to study the data landscape (e.g., capturing of HL7 messages and core data elements), assess end-to-end data flow, and make adjustments to ensure all data being reported were processed, stored, and made accessible to the user community. In addition, NSSP extensively documented the new data flow, providing the transparency the community needed to better understand the disposition of facility data. Even with the new and improved data flow, data quality issues that had gone unreported in the past persisted in the new data. However, these issues were now identified. Unlike previous versions, the newly designed data flow provided opportunities to report and act on issues found in the data. Therefore, an important component of the NSSP data flow was the implementation of regularly scheduled standard data quality checks and the release of standard data quality reports summarizing data quality findings.
Methods: NSSP data were assessed for the national-level completeness of chief complaint and discharge diagnosis data. Completeness is the rate of non-null values (Batini et al., 2009). It was defined as the percent of visits (e.g., emergency department, urgent care center) with a non-null value found among the one or more records associated with the visit. National completeness rates for visits in 2016 were compared with completeness rates of visits in 2017 (a partial year including visits through August 2017). In addition, facility-level progress was quantified after scoring each facility based on the percent completeness change between 2016 and 2017. Legacy data processed prior to introducing the new NSSP data flow were not included in this assessment.
Results: Nationally, the percent completeness of chief complaint for visits in 2016 was 82.06% (N=58,192,721), and the percent completeness of chief complaint for visits in 2017 was 87.15% (N=80,603,991). Of the 2,646 facilities that sent visit data in 2016 and 2017, 114 (4.31%) facilities showed an increase of at least 10% in chief complaint completeness in 2017 compared with 2016. As for discharge diagnosis, national results showed the percent completeness of discharge diagnosis for 2016 visits was 50.83% (N=36,048,334), and the percent completeness of discharge diagnosis for 2017 was 59.23% (N=54,776,310). Of the 2,646 facilities that sent data for visits in 2016 and 2017, 306 (11.56%) facilities showed more than a 10% increase in percent completeness of discharge diagnosis in 2017 compared with 2016.
Conclusions: Nationally, the percent completeness of chief complaint for visits in 2016 was 82.06% (N=58,192,721), and the percent completeness of chief complaint for visits in 2017 was 87.15% (N=80,603,991). Of the 2,646 facilities that sent visit data in 2016 and 2017, 114 (4.31%) facilities showed an increase of at least 10% in chief complaint completeness in 2017 compared with 2016. As for discharge diagnosis, national results showed the percent completeness of discharge diagnosis for 2016 visits was 50.83% (N=36,048,334), and the percent completeness of discharge diagnosis for 2017 was 59.23% (N=54,776,310). Of the 2,646 facilities that sent data for visits in 2016 and 2017, 306 (11.56%) facilities showed more than a 10% increase in percent completeness of discharge diagnosis in 2017 compared with 2016.
References: Batini, C., Cappiello, C., Francalanci, C., and Maurino, A. (2009). Methodologies for data quality assessment and improvement. ACM Comput. Surv., 41(3), 1-52.
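The visit-level completeness definition in the Methods (a visit counts as complete if any of its records carries a non-null value) can be illustrated with a minimal sketch. This is not NSSP's implementation: the record layout, the decision to treat blank strings as null, and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical record layout: (visit_id, value of the field being assessed).
Record = Tuple[str, Optional[str]]


def is_non_null(value: Optional[str]) -> bool:
    """Treat None and blank strings as null (an assumption of this sketch)."""
    return value is not None and value.strip() != ""


def visit_completeness(records: Iterable[Record]) -> float:
    """Percent of visits with a non-null value in at least one record."""
    has_value: Dict[str, bool] = defaultdict(bool)
    for visit_id, value in records:
        has_value[visit_id] = has_value[visit_id] or is_non_null(value)
    if not has_value:
        return 0.0
    return 100.0 * sum(has_value.values()) / len(has_value)


# Made-up chief complaint records for one facility in two years.
records_2016 = [("v1", None), ("v1", "cough"), ("v2", None), ("v3", ""), ("v4", "fever")]
records_2017 = [("v5", "chest pain"), ("v6", None), ("v6", "fall"), ("v7", "rash"), ("v8", None)]

pct_2016 = visit_completeness(records_2016)
pct_2017 = visit_completeness(records_2017)
change = pct_2017 - pct_2016  # difference in percentage points
print(pct_2016, pct_2017, change)
```

The abstract does not say whether the 10% facility-level increase is measured in percentage points or as a relative change; the sketch computes a percentage-point difference only as one plausible reading.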


2018, Vol 10 (1)
Author(s): Zachary M. Stein, Sophia Crossen

Objective: To compare and contrast two ESSENCE syndrome definition query methods and establish best practices for syndrome definition creation.
Introduction: The Kansas Syndromic Surveillance Program (KSSP) utilizes the ESSENCE v.1.20 program provided by the National Syndromic Surveillance Program to view and analyze Kansas Emergency Department (ED) data.
Methods that allow an ESSENCE user to query both the Discharge Diagnosis (DD) and Chief Complaint (CC) fields simultaneously allow for more specific and accurate syndromic surveillance definitions. As ESSENCE use has increased, two common methodologies have been developed for querying the data in this way.
The first is a query of the field named "CC and DD." The CC and DD field contains a concatenation of the parsed patient chief complaint and the discharge diagnosis. The discharge diagnosis is the last non-null value for that patient visit ID, and the parsed chief complaint is the first non-null chief complaint value for that patient visit ID, as parsed by the ESSENCE platform. For this comparison, this method shall be called the CCDD method.
The second method involves a query of the fields named "Chief Complaint History" and "Discharge Diagnosis History." While the first method requires only one field to be queried, this method queries the CC History and DD History fields, combines the resulting data, and de-duplicates the final data set by the C_BioSense_ID. Chief Complaint History is a list of all chief complaint values related to a single ED visit, and Discharge Diagnosis History is the same concept applied to all discharge diagnosis values. For this comparison, this method shall be called the CCDDHX method.
While both methods are based on the same query concept, each method can yield different results.
Methods: A program was created in RStudio to analyze a user-provided query.
Simple queries were randomly generated. Twenty randomly generated queries were run through the RStudio program, and disparities between data sets were recorded. All KSSP production facility ED visits during the month of August 2017 were analyzed.
Secondly, three queries actively utilized in KSSP practice were run through the program. These queries were Firework-Related Injuries, Frostbite and Cold Exposure, and Rabies Exposure. The queries were run on all KSSP production facility ED visits and coincided with the timeline of relevant exposures.
Results: In the random query trials, an average of 5.4% of the cases captured using the CCDD field method were unique and not captured by the same query in the CCDDHX method. Using the CCDDHX method, an average of 6.1% of the cases captured were unique and not captured by the CCDD method.
When the program was used to compare syndromes actively utilized in KSSP practice, the disparity between the two methods was much lower.
Firework-Related Injuries: During the time period queried, the CCDD method returned 171 cases and the CCDDHX method returned 169 cases. All CCDDHX method cases were captured by the CCDD method. The CCDD method returned 2 cases not captured by the CCDDHX method. These two cases were confirmed as true positive firework-related injury cases.
Frostbite and Cold Exposure: During the time period queried, the CCDD method returned 328 cases and the CCDDHX method returned 344 cases. The CCDDHX method captured 16 cases that the CCDD method did not. The CCDD method did not capture any additional cases when compared to the CCDDHX method. After review, 10 (62.5%) of these 16 cases not captured by the CCDD method were true positive cases.
Rabies Exposure: During the time period queried, the CCDD method returned 474 cases and the CCDDHX method returned 473 cases. The CCDDHX method captured 7 cases that the CCDD method did not. The CCDD method returned 8 cases not captured by the CCDDHX method. After review, 3 (42.9%) of the 7 cases unique to the CCDDHX method and 3 (37.5%) of the 8 cases unique to the CCDD method were true positives.
Conclusions: The twenty random queries showed a disparity between methods. When the same program was used to analyze three actively utilized KSSP definitions, both methods yielded similar results with a much smaller disparity. The CCDDHX method inherently requires more steps and more queries to be run through ESSENCE, making the method less timely and more difficult to share. Despite these downsides, CCDDHX will capture cases that appear throughout the history of field updates.
Further variance between methods is likely due to the CCDD field utilizing the ESSENCE-processed CC while the CCDDHX fields utilize the CC verbatim as produced by the ED facility. This allows the CCDD method to tap into the powerful spelling correction and abbreviation-parsing steps that ESSENCE employs, but incorrect machine corrections and replacements, while rare, can negatively affect syndrome definition performance.
The greater disparity between methods for the random queries may be due to the short (3-letter) text portion of the queries. Short segments are more likely to be found in multiple words than the text of actual queries. Utilizing longer randomly generated text segments may resolve this and is a planned next step for this research.
Our next step is to share the RStudio program to allow further replication. The Kansas Syndromic Surveillance Program is also continuing similar research to ensure that best practices are being met.
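The comparison logic described above (combine the CC History and DD History results, de-duplicate by C_BioSense_ID, then find the cases unique to each method) can be sketched with set operations. This is not the authors' R program; it is a Python illustration in which the function name and the visit IDs are hypothetical, constructed only so that the totals match the Rabies Exposure counts reported in the Results.

```python
from typing import Dict, Iterable, List, Union


def compare_methods(
    ccdd_ids: Iterable[str],
    cc_history_ids: Iterable[str],
    dd_history_ids: Iterable[str],
) -> Dict[str, Union[int, List[str]]]:
    """Compare case IDs returned by the CCDD and CCDDHX query methods."""
    ccdd = set(ccdd_ids)
    # The union of the two history queries de-duplicates by ID automatically.
    ccddhx = set(cc_history_ids) | set(dd_history_ids)
    return {
        "ccdd_total": len(ccdd),
        "ccddhx_total": len(ccddhx),
        "ccdd_only": sorted(ccdd - ccddhx),
        "ccddhx_only": sorted(ccddhx - ccdd),
    }


# Synthetic IDs: 466 shared cases, 8 found only by CCDD, 7 only by CCDDHX.
shared = [f"visit-{i:04d}" for i in range(466)]
ccdd_only = [f"visit-{i:04d}" for i in range(466, 474)]
hx_only = [f"visit-{i:04d}" for i in range(474, 481)]

result = compare_methods(
    ccdd_ids=shared + ccdd_only,
    cc_history_ids=shared[:300] + hx_only[:4],  # overlaps with the DD History results
    dd_history_ids=shared[200:] + hx_only[3:],
)
print(result["ccdd_total"], result["ccddhx_total"])

# True positive shares reported in the Results: 3 of 7 and 3 of 8.
print(round(100 * 3 / 7, 1), round(100 * 3 / 8, 1))
```

With 466 shared cases, the totals come out to 466 + 8 = 474 for CCDD and 466 + 7 = 473 for CCDDHX, which is consistent with the reported figures.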


2011, Vol 4 (0)
Author(s): Zachary Faigen, Anikah Salim, Kishok Rojohn, Ajit Isaac, Sherry Adams

2021, Vol 6
Author(s): Cara Jane Bergo, Jennifer R. Epstein, Stacey Hoferka, Marynia Aniela Kolak, Mai T. Pho

The current opioid crisis and the increase in injection drug use (IDU) have led to outbreaks of HIV in communities across the country. These outbreaks have prompted nationwide and statewide efforts to identify factors that indicate which areas are at risk of a future HIV outbreak. Based on methodology used in a prior nationwide county-level analysis by the US Centers for Disease Control and Prevention (CDC), we examined Illinois at the ZIP code level (n = 1,383). Combined acute and chronic hepatitis C virus (HCV) infection among persons <40 years of age was used as an outcome proxy measure for IDU. Local and statewide data sources were used to identify variables potentially predictive of high risk for HIV/HCV transmission, which fell within three main groups: health outcomes, access/resources, and the social/economic/physical environment. A multivariable negative binomial regression was performed with population as an offset. The vulnerability score for each ZIP code was created using the final regression model, which consisted of 11 factors: six risk factors and five protective factors. ZIP codes with the highest vulnerability ranking (top 10%) were distributed across the state yet concentrated in the rural southern region. The most populous county, Cook County, had only one vulnerable ZIP code. This analysis reveals more areas vulnerable to future outbreaks than past national analyses did and provides more precise indications of vulnerability at the ZIP code level. The ability to assess risk at the sub-county level allows local jurisdictions to more finely tune surveillance and preventive measures and to target activities in these high-risk areas. The final model contained a mix of protective and risk factors, revealing a heightened level of complexity underlying the relationship between characteristics that impact HCV risk. Following this analysis, Illinois prioritized recommendations that include increasing access to harm reduction services, specifically sterile syringe services, naloxone access, infectious disease screening, and increased linkage to care for HCV and opioid use disorder.
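For readers unfamiliar with the model form, a negative binomial regression with population as an offset is commonly written as below. This is a generic textbook formulation under the usual NB2 parameterization, not the authors' fitted model: the symbols (Y_i for the HCV case count in ZIP code i, N_i for its population, x_{ij} for the j-th of the 11 factors, alpha for the dispersion parameter) are assumptions introduced here, and the abstract does not specify how the vulnerability score was computed from the fitted coefficients.

```latex
\begin{aligned}
Y_i &\sim \mathrm{NegBin}(\mu_i, \alpha), \qquad
\operatorname{Var}(Y_i) = \mu_i + \alpha \mu_i^{2}, \\
\log \mu_i &= \log N_i + \beta_0 + \sum_{j=1}^{11} \beta_j x_{ij}
\quad\Longleftrightarrow\quad
\log\!\left(\frac{\mu_i}{N_i}\right) = \beta_0 + \sum_{j=1}^{11} \beta_j x_{ij}.
\end{aligned}
```

Because the offset term has its coefficient fixed at 1, the linear predictor models the case rate per person rather than the raw count. In this form, a positive coefficient would correspond to a risk factor and a negative coefficient to a protective factor.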


2018, Vol 18 (1)
Author(s): Felipe J. Colón-González, Iain R. Lake, Roger A. Morbey, Alex J. Elliot, Richard Pebody, ...
