Record Linkage Studies for Postmarketing Drug Surveillance: Data Quality and Validity Considerations

1988 ◽  
Vol 22 (2) ◽  
pp. 157-161 ◽  
Author(s):  
Andy S. Stergachis

Large automated databases are the source of information for many record linkage studies, including postmarketing drug surveillance. Despite this reliance on prerecorded data, there have been few attempts to assess data quality and validity. This article presents some of the basic data quality and validity issues in applying record linkage methods to postmarketing surveillance. Studies based on prerecorded data, as in most record linkage studies, have all the inherent problems of the data from which they are derived. Sources of threats to the validity of record linkage studies include the completeness of data, the ability to accurately identify and follow the records of individuals through time and place, and the validity of data. This article also describes techniques for evaluating data quality and validity. Postmarketing surveillance could benefit from more attention to identifying and solving the problems associated with record linkage studies.
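
As a concrete illustration of the evaluation techniques the article discusses, the sketch below (ours, not the article's) profiles a hypothetical pharmacy-claims file for two of the threats named above: completeness of required fields and validity of recorded dates. The field names and input file are assumptions for illustration only.

```python
import csv
from datetime import datetime

REQUIRED = ["patient_id", "drug_code", "dispense_date"]

def profile_quality(rows):
    """Tally missing required fields and implausible dispense dates."""
    n = 0
    missing = {f: 0 for f in REQUIRED}
    bad_dates = 0
    for row in rows:
        n += 1
        for f in REQUIRED:
            if not (row.get(f) or "").strip():
                missing[f] += 1
        try:
            d = datetime.strptime((row.get("dispense_date") or "").strip(), "%Y-%m-%d")
            if not datetime(1960, 1, 1) <= d <= datetime.now():
                bad_dates += 1  # date parses but falls outside a plausible range
        except ValueError:
            bad_dates += 1      # missing or malformed date
    if n == 0:
        return {"records": 0}
    return {
        "records": n,
        "completeness": {f: 1 - missing[f] / n for f in REQUIRED},
        "plausible_date_rate": 1 - bad_dates / n,
    }

with open("claims.csv", newline="") as fh:  # hypothetical input file
    print(profile_quality(csv.DictReader(fh)))
```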

Author(s):  
William E. Winkler

Fayyad and Uthurusamy (2002) have stated that the majority of the work (representing months or years) in creating a data warehouse lies in cleaning up duplicates and resolving other anomalies. This paper provides an overview of two methods for improving quality. The first is record linkage, for finding duplicates within files or across files. The second is edit/imputation, for maintaining business rules and for filling in missing data. The fastest record linkage methods are suitable for files with hundreds of millions of records (Winkler, 2004a, 2008). The fastest edit/imputation methods are suitable for files with millions of records (Winkler, 2004b, 2007a).
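
To make the first method concrete, here is a toy duplicate-detection sketch in the spirit of blocking plus weighted field comparison. It is our simplification, not Winkler's production matcher; the weights and threshold are arbitrary assumptions standing in for estimated agreement weights.

```python
from itertools import combinations

def block_key(rec):
    # Block on the first 3 letters of the surname plus birth year
    # so that only records within a block are compared.
    return (rec["surname"][:3].upper(), rec["birth_year"])

def agreement_score(a, b):
    # Crude fixed weights standing in for Fellegi-Sunter-style
    # log-likelihood ratios per field.
    score = 4.0 if a["surname"].upper() == b["surname"].upper() else -2.0
    score += 2.0 if a["given"][:1].upper() == b["given"][:1].upper() else -1.0
    score += 3.0 if a["birth_year"] == b["birth_year"] else -3.0
    return score

def find_duplicates(records, threshold=6.0):
    blocks = {}
    for r in records:
        blocks.setdefault(block_key(r), []).append(r)
    pairs = []
    for members in blocks.values():
        for a, b in combinations(members, 2):
            if agreement_score(a, b) >= threshold:
                pairs.append((a["id"], b["id"]))
    return pairs

recs = [
    {"id": 1, "surname": "Winkler", "given": "William", "birth_year": 1950},
    {"id": 2, "surname": "Winckler", "given": "Wm", "birth_year": 1950},
    {"id": 3, "surname": "Winkler", "given": "W", "birth_year": 1950},
]
print(find_duplicates(recs))  # -> [(1, 3)] with these toy weights
```

Blocking is what makes the approach scale: instead of comparing all record pairs, only pairs sharing a block key are scored.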


1979 ◽  
Vol 18 (02) ◽  
pp. 89-97 ◽  
Author(s):  
Martha E. Smith ◽  
H. B. Newcombe

Empirical tests of the application of computer record linkage methods versus the use of routine clerical searching, for bringing together various vital and ill-health records, have shown that the success rate for the computer operation was higher (98.3 versus 96.7 per cent) and the proportion of false linkages very much lower (0.1 versus 2.3 per cent). The rate at which the ill-health records were processed by the computer was approximately 14,000 per minute of central processor time, representing a cost of half a cent apiece. Factors affecting the speed, accuracy, and cost of computerized record linkage are discussed.
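
The two headline figures above can be framed as simple set operations against a verified ("gold standard") set of true links. The sketch below is our illustration with made-up pairs, not the study's data.

```python
def linkage_rates(proposed, gold):
    """Return (success rate, false-linkage proportion) for a set of proposed links."""
    proposed, gold = set(proposed), set(gold)
    true_links = proposed & gold
    success_rate = len(true_links) / len(gold)         # share of real links found
    false_rate = len(proposed - gold) / len(proposed)  # share of proposed links that are wrong
    return success_rate, false_rate

gold = {(1, 101), (2, 102), (3, 103), (4, 104)}
computer = {(1, 101), (2, 102), (3, 103), (4, 104)}    # finds all links, no errors
clerical = {(1, 101), (2, 102), (3, 103), (5, 105)}    # misses one link, adds one false link

print("computer:", linkage_rates(computer, gold))  # (1.0, 0.0)
print("clerical:", linkage_rates(clerical, gold))  # (0.75, 0.25)
```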


2008 ◽  
pp. 137-147 ◽  
Author(s):  
M. Hodkiewicz ◽  
P. Kelly ◽  
J. Sikorska ◽  
L. Gouws

2016 ◽  
Vol 49 (10) ◽  
pp. 3969-3979 ◽  
Author(s):  
Christopher Hall ◽  
Andrea Hamilton

2008 ◽  
Vol 13 (5) ◽  
pp. 378-389 ◽  
Author(s):  
Xiaohua Douglas Zhang ◽  
Amy S. Espeseth ◽  
Eric N. Johnson ◽  
Jayne Chin ◽  
Adam Gates ◽  
...  

RNA interference (RNAi) not only plays an important role in drug discovery but can also be developed directly into drugs. RNAi high-throughput screening (HTS) biotechnology allows us to conduct genome-wide RNAi research. A central challenge in genome-wide RNAi research is to integrate both experimental and computational approaches to obtain high quality RNAi HTS assays. Based on our daily practice in RNAi HTS experiments, we propose the implementation of 3 experimental and analytic processes to improve the quality of data from RNAi HTS biotechnology: (1) select effective biological controls; (2) adopt appropriate plate designs to display and/or adjust for systematic errors of measurement; and (3) use effective analytic metrics to assess data quality. The applications in 5 real RNAi HTS experiments demonstrate the effectiveness of integrating these processes to improve data quality. Due to the effectiveness in improving data quality in RNAi HTS experiments, the methods and guidelines contained in the 3 experimental and analytic processes are likely to have broad utility in genome-wide RNAi research. (Journal of Biomolecular Screening 2008:378-389)
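
For step (3), here is a hedged sketch of two control-based quality metrics commonly used to assess HTS plates, the Z'-factor and SSMD, computed from positive- and negative-control wells. The control readings below are invented numbers for illustration.

```python
import statistics as st

def z_prime(pos, neg):
    """Z'-factor: 1 - 3*(sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    return 1 - 3 * (st.stdev(pos) + st.stdev(neg)) / abs(st.mean(pos) - st.mean(neg))

def ssmd(pos, neg):
    """SSMD estimate: mean difference over the sd of the difference."""
    return (st.mean(pos) - st.mean(neg)) / (st.variance(pos) + st.variance(neg)) ** 0.5

pos = [95, 102, 98, 101, 97, 99]  # e.g., strong-effect control wells (made up)
neg = [10, 12, 9, 11, 13, 10]     # e.g., no-effect control wells (made up)
print(f"Z' = {z_prime(pos, neg):.2f}, SSMD = {ssmd(pos, neg):.1f}")
```

Larger values of either metric indicate better separation between controls, and hence a plate whose hit calls are more trustworthy.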


Author(s):  
G. Shankaranarayanan ◽  
Adir Even

Maintaining data at a high quality is critical to organizational success. Firms, aware of the consequences of poor data quality, have adopted methodologies and policies for measuring, monitoring, and improving it (Redman, 1996; Eckerson, 2002). Today's quality measurements are typically driven by physical characteristics of the data (e.g., item counts, time tags, or failure rates) and assume an objective quality standard, disregarding the context in which the data is used. The alternative is to derive quality metrics from data content and evaluate them within specific usage contexts. The former approach is termed structure-based (or structural) and the latter content-based (Ballou and Pazer, 2003). In this chapter we propose a novel framework that assesses data quality within specific usage contexts and links it to data utility, a measure of the value contribution associated with data within those contexts. Our utility-driven framework addresses the limitations of structural measurements and offers alternative measurements for evaluating completeness, validity, accuracy, and currency, as well as a single measure that aggregates these data quality dimensions.
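
A minimal sketch of the utility-weighting idea follows; it is our illustration, not the authors' formal model, and the records, utilities, and dimension weights are all invented.

```python
def utility_weighted_quality(records, dim_weights):
    """Aggregate per-dimension quality scores, weighting each record by its utility."""
    total_utility = sum(r["utility"] for r in records)
    weight_sum = sum(dim_weights.values())
    overall = 0.0
    for r in records:
        # Per-record quality: weighted mean over the quality dimensions.
        q = sum(dim_weights[d] * r["scores"][d] for d in dim_weights) / weight_sum
        # Records that matter more in this usage context count for more.
        overall += (r["utility"] / total_utility) * q
    return overall

dims = {"completeness": 0.3, "validity": 0.2, "accuracy": 0.3, "currency": 0.2}
records = [
    {"utility": 10.0,  # heavily used in this context
     "scores": {"completeness": 1.0, "validity": 0.9, "accuracy": 0.95, "currency": 0.8}},
    {"utility": 1.0,   # rarely used: its flaws matter less here
     "scores": {"completeness": 0.5, "validity": 0.6, "accuracy": 0.4, "currency": 0.3}},
]
print(f"context quality = {utility_weighted_quality(records, dims):.3f}")
```

The same records scored under a different usage context (different utilities or weights) would yield a different quality figure, which is precisely the contrast with flat, structure-based counts.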


2005 ◽  
Vol 24 (S1) ◽  
pp. 153-170 ◽  
Author(s):  
Leslie L. Roos ◽  
Sumit Gupta ◽  
Ruth-Ann Soodeen ◽  
Laurel Jebamani

This review evaluates the quality of available administrative data in the Canadian provinces, emphasizing the information needed to create integrated systems. We explicitly compare approaches to quality measurement, indicating where record linkage can and cannot substitute for more expensive record re-abstraction. Forty-nine original studies evaluating Canadian administrative data (registries, hospital abstracts, physician claims, and prescription drugs) are summarized in a structured manner. Registries, hospital abstracts, and physician files appear to be generally of satisfactory quality, though much work remains to be done. Data quality did not vary systematically among provinces. Primary data collection is needed to check place of residence and longitudinal follow-up in provincial registries. Promising initial checks of pharmaceutical data should be expanded. Because record linkage studies were "conservative" in reporting reliability, reducing time-consuming record re-abstraction appears feasible in many cases. Finally, expanding the scope of administrative data to study health, as well as health care, seems possible for some chronic conditions. The research potential of the information-rich environments being created highlights the importance of data quality.
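
As one example of how linked administrative data can be validated against re-abstracted charts, the sketch below computes Cohen's kappa over paired binary codings. The codes and the choice of kappa are our assumptions for illustration, not taken from the review.

```python
def cohens_kappa(a, b):
    """Kappa for two binary raters given parallel 0/1 lists."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n      # raw agreement
    pa1, pb1 = sum(a) / n, sum(b) / n
    expected = pa1 * pb1 + (1 - pa1) * (1 - pb1)          # chance agreement
    return (observed - expected) / (1 - expected)

admin  = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]  # diagnosis present in claims data (invented)
charts = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0]  # diagnosis present on re-abstraction (invented)
print(f"kappa = {cohens_kappa(admin, charts):.2f}")  # -> 0.58
```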

