Regression with linked datasets subject to linkage error

Author(s):  
Zhenbang Wang ◽  
Emanuel Ben‐David ◽  
Guoqing Diao ◽  
Martin Slawski
Author(s):  
Zhong Jiang ◽  
Jiexiong Ding ◽  
Qicheng Ding ◽  
Li Du ◽  
Wei Wang

The five-axis machine tool is now one of the most important foundations of the manufacturing industry. To guarantee accuracy when machining complex surfaces, the multi-axis linkage performance of five-axis machine tools must be detected and compensated. RTCP (Rotation Tool Center Point) is one of the basic functions of five-axis machine tools; it keeps the tool center on the machining trajectory while the five axes move synchronously. On the basis of the RTCP function, a method for detecting the multi-axis linkage performance of five-axis machine tools is briefly introduced, and a linkage error model is built in accordance with the topological structure of the machine tool. Based on the features of the linkage errors of the five-axis machine tool, an error tracing and compensation method is proposed. Simulations and experiments are carried out to verify that the error tracing method can locate the linkage error category. This paper thus provides a new approach to detecting and compensating the linkage error of five-axis machine tools.


2018 ◽  
Author(s):  
Shama Sograte-Idrissi ◽  
Nazar Oleksiievets ◽  
Sebastian Isbaner ◽  
Mariana Eggert-Martinez ◽  
Jörg Enderlein ◽  
...  

Abstract DNA-PAINT is a rapidly developing fluorescence super-resolution technique that can reach spatial resolutions below 10 nm. It also enables the imaging of multiple targets in the same sample. However, using DNA-PAINT to observe cellular structures at such resolution remains challenging. Antibodies, which are commonly used for this purpose, lead to a displacement of 20-25 nm between the target protein and the reporting fluorophore, thus limiting the resolving power. Here, we used nanobodies to minimize this linkage error to ~4 nm. We demonstrate multiplexed imaging by using 3 nanobodies, each able to bind to a different family of fluorescent proteins. We couple the nanobodies with single DNA strands via a straightforward and stoichiometric chemical conjugation. Additionally, we built a versatile computer-controlled microfluidic setup to enable multiplexed DNA-PAINT in an efficient manner. As a proof of principle, we labeled and imaged proteins on mitochondria, the Golgi apparatus, and chromatin. We obtained super-resolved images of the 3 targets with 20 nm resolution, within only 35 minutes of acquisition time.


Author(s):  
James C Doidge ◽  
Katie L Harron

Abstract Linked data are increasingly being used for epidemiological research, to enhance primary research, and in planning, monitoring and evaluating public policy and services. Linkage error (missed links between records that relate to the same person or false links between unrelated records) can manifest in many ways: as missing data, measurement error and misclassification, unrepresentative sampling, or as a special combination of these that is specific to analysis of linked data: the merging and splitting of people that can occur when two hospital admission records are counted as one person admitted twice if linked and two people admitted once if not. Through these mechanisms, linkage error can ultimately lead to information bias and selection bias; so identifying relevant mechanisms is key in quantitative bias analysis. In this article we introduce five key concepts and a study classification system for identifying which mechanisms are relevant to any given analysis. We provide examples and discuss options for estimating parameters for bias analysis. This conceptual framework provides the ‘links’ between linkage error, information bias and selection bias, and lays the groundwork for quantitative bias analysis for linkage error.
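The merging and splitting of people described in this abstract can be illustrated with a minimal Python sketch. The function names `count_people` and `linkage_error_rates`, the record layout, and the toy identifiers are assumptions made for illustration; they are not part of the authors' framework.

```python
from collections import Counter


def count_people(link_ids):
    """Summarise admissions per person, given the person ID that
    linkage assigned to each admission record."""
    per_person = Counter(link_ids)
    return {
        "people": len(per_person),
        "admitted_once": sum(1 for n in per_person.values() if n == 1),
        "admitted_more_than_once": sum(1 for n in per_person.values() if n > 1),
    }


def linkage_error_rates(true_ids, link_ids):
    """Compare every pair of records against the truth.

    Returns (missed_link_rate, false_link_rate):
      missed_link_rate = share of truly matching pairs that were not linked
      false_link_rate  = share of linked pairs that are not true matches
    """
    true_pairs = linked_pairs = correct_pairs = 0
    n = len(true_ids)
    for i in range(n):
        for j in range(i + 1, n):
            is_true = true_ids[i] == true_ids[j]
            is_linked = link_ids[i] == link_ids[j]
            true_pairs += is_true
            linked_pairs += is_linked
            correct_pairs += is_true and is_linked
    missed = (true_pairs - correct_pairs) / true_pairs if true_pairs else 0.0
    false = (linked_pairs - correct_pairs) / linked_pairs if linked_pairs else 0.0
    return missed, false


# Two hospital admission records (toy data, hypothetical IDs).
linked = count_people(["p1", "p1"])      # records linked: one person, admitted twice
not_linked = count_people(["p1", "p2"])  # records not linked: two people, admitted once each

# If both records truly belong to person "A", failing to link them is a missed link.
missed_rate, false_rate = linkage_error_rates(["A", "A"], ["p1", "p2"])
```

In this toy case the same two records yield either one person admitted twice or two people admitted once, depending only on the linkage decision, which is the mechanism the abstract singles out as specific to linked data.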


2019 ◽  
Vol 8 (3) ◽  
pp. 483-512 ◽  
Author(s):  
Paulina Pankowska ◽  
Bart F M Bakker ◽  
Daniel L Oberski ◽  
Dimitris Pavlopoulos

Abstract Hidden Markov models (HMMs) are increasingly used to estimate and correct for classification error in categorical, longitudinal data, without the need for a “gold standard,” error-free data source. To accomplish this, HMMs require multiple observations over time on a single indicator and assume that the errors in these indicators are conditionally independent. Unfortunately, this “local independence” assumption is often unrealistic, untestable, and a source of serious bias. Linking independent data sources can solve this problem by making the local independence assumption plausible across sources, while potentially allowing for local dependence within sources. However, record linkage introduces a new problem: the records may be erroneously linked or incorrectly not linked. In this paper, we investigate the effects of linkage error on HMM estimates of transitions between employment contract types. Our data come from linking a labor force survey to administrative employer records; this linkage yields two indicators per time point that are plausibly conditionally independent. Our results indicate that both false-negative and false-positive linkage error turn out to be problematic primarily if the error is large and highly correlated with the dependent variable. Moreover, under certain conditions, false-positive linkage error (mislinkage) in fact acts as another source of misclassification that the HMM can absorb into its error-rate estimates, leaving the latent transition estimates unbiased. In these cases, measurement error modeling already accounts for linkage error. Our results also indicate where these conditions break down and more complex methods would be needed.


2015 ◽  
Vol 31 (3) ◽  
pp. 415-429 ◽  
Author(s):  
Loredana Di Consiglio ◽  
Tiziana Tuoto

Abstract The capture-recapture method is a well-known solution for estimating the unknown size of a population. Administrative data represent sources of independent counts of a population and can be jointly exploited for applying the capture-recapture method. Considered separately, however, administrative sources are affected by over- or undercoverage. The standard Petersen approach is based on strong assumptions, including perfect record linkage between lists. In reality, record linkage results can be affected by errors. A simple method for achieving linkage error-unbiased population total estimates is proposed in Ding and Fienberg (1994). In this article, we propose an extension of the Ding and Fienberg model that relaxes their conditions. The procedures are illustrated by estimating the total number of road casualties on the basis of a probabilistic record linkage between two administrative data sources. Moreover, a simulation study provides evidence that the adjusted estimator always performs better than the Petersen estimator.
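As a rough illustration of why linkage error matters here, the sketch below implements the standard two-list Petersen estimator, N = n1 * n2 / m, together with a deliberately simplified correction. The correction assumes that there are no false links and that each true match is linked with a known probability `p_link`; both the assumption and the function names are hypothetical, and this is not the estimator of Ding and Fienberg (1994) nor the extension proposed in the article.

```python
def petersen(n1, n2, m):
    """Petersen estimate of population size from two lists of sizes
    n1 and n2 with m records matched between them."""
    if m <= 0:
        raise ValueError("at least one matched record is required")
    return n1 * n2 / m


def petersen_missed_link_adjusted(n1, n2, m_observed, p_link):
    """Simplified adjustment (an assumption of this sketch): with no
    false links and each true match linked with probability p_link,
    the expected observed matches are p_link * m_true, so m_true is
    estimated as m_observed / p_link."""
    if not 0 < p_link <= 1:
        raise ValueError("p_link must be in (0, 1]")
    return petersen(n1, n2, m_observed / p_link)


# Toy numbers (hypothetical): 400 true matches, half of which are missed.
naive = petersen(1000, 800, 200)
adjusted = petersen_missed_link_adjusted(1000, 800, 200, p_link=0.5)
```

With these toy numbers, missed links halve the observed matches and double the naive population estimate, while the adjusted estimate recovers the value that perfect linkage would give.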


Author(s):  
Daniel Leightley ◽  
Zoe Chui ◽  
Laura Goodwin

ABSTRACT Objective: Secondary health systems in the United Kingdom (UK) are unique in recording Outpatient, Inpatient and Accident & Emergency (A&E) visits in the form of electronic health (eHealth) records. Linking regional healthcare datasets is problematic, and more challenging still when linking externally, such as to the King’s College Military Cohort Study (KCMCS). We introduce the methodology we used for eRecord linkage. Approach: eHealth records from England, Scotland and Wales offer a variety of parameters such as admission/discharge date, diagnosis, treatment/procedure undertaken and the cost of treatment. To acquire eHealth records, unique patient identifiers (NHS number, forename, surname, sex and date of birth) extracted from KCMCS were provided to each region. The KCMCS contains self-reported questionnaire results for 9,990 serving/ex-serving military personnel, of whom 8,602 consented to linkage. eHealth records were prepared for linkage in two stages. First, admission and discharge dates were checked to ensure they were valid. Second, episodes were checked for consistency, ensuring that no records for individual participants were duplicated. The data available varied by region, and this disparity can result in data type variation. Hence, linkage was performed on mutual variables to ensure a uniform admission history. The linked dataset was created as follows. First, records and episodes relating to an individual were brought together to create a personal admission history. Second, personal admission histories were linked to the KCMCS. Results: Linking to regional health datasets is not without its challenges. England, Scotland and Wales obtain, store and process eHealth records using different methodologies. A total of 6,336 (73.66%) participants were matched by regional health providers, with a total of 61,558 eHealth records. A total of 187 eHealth records were identified and excluded from linkage because they failed to meet the criteria listed above. When diagnosis completeness was verified, Inpatient admissions were consistently coded, with full completeness. Conversely, Outpatient admissions were poorly coded, with 98% lacking any type of diagnosis. In addition, A&E records were sparsely coded; we identified four different regional and local coding systems used to record the reason for admission. The eHealth records show promise for identifying health traits of the military. However, further work is required to identify synergy and overcome regional variations. Conclusion: Linkage techniques provide new opportunities for exploring the health of the serving and veteran population. However, identifier quality and linkage error are still of major concern. Further, record completeness, diagnosis accuracy and data cleaning affect data quality.
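The preparation and linkage stages described in this abstract can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the field names (`nhs_number`, `admission`, `discharge`), ISO-formatted dates, the rule that discharge must not precede admission, and exact matching on a single identifier. The abstract does not state how the regional providers performed the matching.

```python
from datetime import date


def valid_dates(episode):
    """Stage 1: both dates must parse, and (assumed rule) discharge
    must not be earlier than admission."""
    try:
        admitted = date.fromisoformat(episode["admission"])
        discharged = date.fromisoformat(episode["discharge"])
    except (KeyError, TypeError, ValueError):
        return False
    return admitted <= discharged


def prepare(episodes, mutual_fields):
    """Stage 2: keep only the variables shared by all regions and
    drop exact duplicate episodes."""
    seen, cleaned = set(), []
    for episode in episodes:
        if not valid_dates(episode):
            continue
        slim = {field: episode.get(field) for field in mutual_fields}
        key = tuple(slim[field] for field in mutual_fields)
        if key not in seen:
            seen.add(key)
            cleaned.append(slim)
    return cleaned


def admission_histories(cleaned, id_field="nhs_number"):
    """Bring together all episodes of one individual, ordered by admission date."""
    histories = {}
    for episode in cleaned:
        histories.setdefault(episode[id_field], []).append(episode)
    for person_episodes in histories.values():
        person_episodes.sort(key=lambda e: e["admission"])
    return histories


def link_to_cohort(cohort, histories, id_field="nhs_number"):
    """Attach an admission history to each cohort member who has one."""
    return {p[id_field]: histories[p[id_field]]
            for p in cohort if p[id_field] in histories}


MUTUAL = ("nhs_number", "admission", "discharge")
episodes = [  # toy records with made-up identifiers
    {"nhs_number": "111", "admission": "2015-01-02", "discharge": "2015-01-05", "region": "E"},
    {"nhs_number": "111", "admission": "2015-01-02", "discharge": "2015-01-05", "region": "E"},
    {"nhs_number": "111", "admission": "2014-03-01", "discharge": "2014-03-01"},
    {"nhs_number": "222", "admission": "2015-13-40", "discharge": "2015-01-05"},
]
cohort = [{"nhs_number": "111"}, {"nhs_number": "333"}]

cleaned = prepare(episodes, MUTUAL)
linked = link_to_cohort(cohort, admission_histories(cleaned))
```

In the toy data, the duplicate episode and the episode with an unparseable date are dropped, and only the cohort member with at least one surviving episode ends up in the linked dataset.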


Author(s):  
James Doidge ◽  
Joan Morris ◽  
Katie Harron ◽  
Sarah Stevens ◽  
Ruth Gilbert

Background with rationale: Patient registers and electronic health records are both valuable resources for disease surveillance but can be limited by variation in data quality over time. Variation may stem from changes in data collection methods, in the accuracy or completeness of clinical information, or in the quality of patient identifiers and the linkage that relies on these. Main Aim: By linking the National Down Syndrome Cytogenetic Register (NDSCR) to Hospital Episode Statistics for England (HES), we aimed to assess the quality of each and establish a consistent approach for analysis of trends in prevalence of Down’s syndrome among live births in England. Methods/Approach: Probabilistic record linkage of NDSCR to HES for the period 1998–2013, supported by linkage of babies to mothers within HES. Comparison of prevalence estimates in England using NDSCR only, HES data only, and linked data. Capture-recapture analysis and quantitative bias analysis were used to account for potential errors, including false positive diagnostic codes, unrecorded diagnoses, and linkage error. Results: Analyses of single-source data indicated increasing live birth prevalence of Down’s syndrome, particularly steep in analysis of HES. Linked data indicated a contrastingly stable prevalence of 12.3 cases per 10,000 live births, with a plausible range of 11.6–12.7 cases per 10,000 live births allowing for potential errors. Conclusion: Case ascertainment in NDSCR improved slightly over time, creating a picture of slowly increasing prevalence. The emerging epidemic suggested by HES primarily reflects improving linkage within HES (assignment of unique patient identifiers to hospital episodes). Administrative data are valuable but trends should be interpreted with caution, and with assessment of data quality over time. Linked data with quantitative bias analysis can provide more robust estimation and, in this case, reassurance that prevalence of Down’s syndrome is not increasing. Routine linkage of administrative and register data can enhance the value of each.

