Data quality in health care data warehouse environments

Author(s):  
R.L. Leitheiser
Author(s):  
Eric Infield ◽  
Laura Sebastian-Coleman

This paper is a case study of the data quality program implemented for Galaxy, a large health care data warehouse owned by UnitedHealth Group and operated by Ingenix. The paper presents an overview of the program’s goals and components. It focuses on the program’s metrics and includes examples of the practical application of statistical process control (SPC) for measuring and reporting on data quality. These measurements pertain directly to the quality of the data and have implications for the wider question of information quality. The paper provides examples of specific measures, the benefits gained in applying them in a data warehouse setting, and lessons learned in the process of implementing and evolving the program.


2019 ◽  
Vol 10 (05) ◽  
pp. 794-803 ◽  
Author(s):  
Kristine E. Lynch ◽  
Stephen A. Deppen ◽  
Scott L. DuVall ◽  
Benjamin Viernes ◽  
Aize Cao ◽  
...  

Abstract. Background: The development and adoption of health care common data models (CDMs) has addressed some of the logistical challenges of performing research on data generated from disparate health care systems by standardizing data representations and leveraging standardized terminology to express clinical information consistently. However, transforming a data system into a CDM is not a trivial task, and maintaining an operational, enterprise-capable CDM that is incrementally updated within a data warehouse is challenging. Objectives: To develop a quality assurance (QA) process and code base to accompany our incremental transformation of the Department of Veterans Affairs Corporate Data Warehouse (CDW) health care database into the Observational Medical Outcomes Partnership (OMOP) CDM to prevent incremental load errors. Methods: We designed and implemented a multistage QA approach centered on completeness, value conformance, and relational conformance data-quality elements. For each element we describe key incremental load challenges, our extract, transform, and load (ETL) solution to overcome those challenges, and potential impacts of incremental load failure. Results: Completeness and value conformance data-quality elements are most affected by incremental changes to the CDW, while updates to source identifiers impact relational conformance. ETL failures surrounding these elements lead to incomplete and inaccurate capture of clinical concepts as well as data fragmentation across patients, providers, and locations. Conclusion: Development of robust QA processes supporting accurate transformation of OMOP and other CDMs from source data is still in evolution, and opportunities exist to extend the existing QA framework and tools used for incremental ETL QA processes.


2006 ◽  
Vol 63 (2) ◽  
pp. 135-157 ◽  
Author(s):  
Meredith B. Rosenthal ◽  
Richard G. Frank

2009 ◽  
Vol 99 (8) ◽  
pp. 462-466 ◽  
Author(s):  
Susan E. Brien ◽  
Elijah Dixon ◽  
William A. Ghali
