linkage error
Recently Published Documents


Total documents: 40 (five years: 21)
H-index: 5 (five years: 2)

2021 ◽  
Author(s):  
Federica Liccardo ◽  
Matteo Lo Monte ◽  
Brunella Corrado ◽  
Martina Veneruso ◽  
Simona Celentano ◽  
...  

Currently, a major technical limitation of microscopy-based image analysis is the linkage error, which describes the distance between, for example, the target epitope of a cellular protein and the fluorescence emitter whose position is ultimately detected by the microscope. With the continuously improving resolution of today's (super-resolution) microscopes, the linkage error can severely hamper the correct interpretation of images. It is usually introduced into experiments by the use of standard intracellular staining reagents such as fluorescently labelled antibodies. The linkage error of standard labelled antibodies is caused by the size of the antibody and the random distribution of fluorescent emitters on the antibody surface. Together, these two factors account for a fluorescence displacement of ~40 nm when staining proteins by indirect immunofluorescence and ~20 nm when staining with fluorescently coupled primary antibodies. In this study, we describe a class of staining reagents that effectively reduce the linkage error by more than five-fold compared with conventional staining techniques. These reagents, called Fluo–N–Fabs, consist of an antigen-binding fragment of a full-length antibody (Fab, fragment antigen binding) that is selectively conjugated at the N-terminal amino group with fluorescent organic molecules, thereby reducing the distance between the fluorescent emitter and the protein target of the analysis. Fluo–N–Fabs can also penetrate tissues and highly crowded cell compartments, thus allowing efficient detection of cellular epitopes of interest in a wide range of fixed samples. We believe this class of reagents meets an unmet need in cell biological super-resolution imaging studies, where precise localization of the target of interest is crucial for understanding complex biological phenomena.


Author(s):  
Harrison G Zhang ◽  
Boris P Hejblum ◽  
Griffin M Weber ◽  
Nathan P Palmer ◽  
Susanne E Churchill ◽  
...  

Abstract. Objective: Large amounts of health data are becoming available for biomedical research. Synthesizing information across databases may capture more comprehensive pictures of patient health and enable novel research studies. When no gold standard mappings between patient records are available, researchers may probabilistically link records from separate databases and analyze the linked data. However, previous linked data inference methods are constrained to certain linkage settings and exhibit low power. Here, we present ATLAS, an automated, flexible, and robust association testing algorithm for probabilistically linked data. Materials and Methods: Missing variables are imputed at various thresholds using a weighted average method that propagates uncertainty from probabilistic linkage. Next, estimated effect sizes are obtained using a generalized linear model. ATLAS then conducts the threshold combination test by optimally combining P values obtained from data imputed at varying thresholds using Fisher’s method and perturbation resampling. Results: In simulations, ATLAS controls for type I error and exhibits high power compared to previous methods. In a real-world genetic association study, meta-analysis of ATLAS-enabled analyses on a linked cohort with analyses using an existing cohort yielded additional significant associations between rheumatoid arthritis genetic risk score and laboratory biomarkers. Discussion: Weighted average imputation weathers false matches and increases contribution of true matches to mitigate linkage error-induced bias. The threshold combination test avoids arbitrarily choosing a threshold to rule a match, thus automating linked data-enabled analyses and preserving power. Conclusion: ATLAS promises to enable novel and powerful research studies using linked data to capitalize on all available data sources.
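
The P value combination step mentioned in this abstract can be illustrated with a minimal sketch of plain Fisher's method. This is not the ATLAS procedure: it assumes the P values are independent, whereas P values computed from the same data at different thresholds are correlated, which is presumably why the authors add perturbation resampling. The function name `fisher_combine` is hypothetical.

```python
import math


def fisher_combine(p_values):
    """Combine P values with Fisher's method, ASSUMING independence.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null. For even degrees of freedom
    the survival function has a closed form, so no external library is needed.
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("P values must be non-empty and lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    # P(chi2_{2k} > X) = exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total
```

For two P values the formula reduces to p1 * p2 * (1 - ln(p1 * p2)), which gives a quick sanity check on the implementation.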


Author(s):  
Zhenbang Wang ◽  
Emanuel Ben‐David ◽  
Guoqing Diao ◽  
Martin Slawski

2021 ◽  
Vol 37 (3) ◽  
pp. 699-718
Author(s):  
Daan Zult ◽  
Peter-Paul de Wolf ◽  
Bart F. M. Bakker ◽  
Peter van der Heijden

Abstract: The size of a partly observed population is often estimated with the capture-recapture model. An important assumption of this model is that sources can be perfectly linked. This assumption is of relevance if the identification of records is not obtained by some perfect identifier (such as an id code) but by indirect identifiers (such as name and address). In that case, the perfect linkage assumption is often violated, which in general leads to biased population size estimates. Initial suggestions to solve this problem use record linkage probabilities to correct the capture-recapture model. In this article we provide a general framework, based on the standard log-linear modelling approach, that generalises this work towards the inclusion of additional sources and covariates. We show that the method performs well in a simulation study.
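
To illustrate why imperfect linkage biases a capture-recapture estimate, here is a minimal two-source sketch using the Lincoln-Petersen estimator. It is not the log-linear framework of the article. The correction shown assumes that the only linkage errors are missed links, with a known linkage sensitivity and no false links; the function names and figures are hypothetical.

```python
def lincoln_petersen(n1, n2, m):
    """Two-source population size estimate: N_hat = n1 * n2 / m.

    n1, n2: number of records in each source
    m:      number of records found in both sources (linked pairs)
    """
    if m <= 0:
        raise ValueError("at least one linked pair is required")
    return n1 * n2 / m


def corrected_estimate(n1, n2, m_observed, sensitivity):
    """Adjust the observed overlap for missed links before estimating.

    ASSUMPTION: every observed link is correct, and a fraction
    `sensitivity` of the true links is found, so the true overlap is
    approximately m_observed / sensitivity.
    """
    if not 0.0 < sensitivity <= 1.0:
        raise ValueError("sensitivity must lie in (0, 1]")
    return lincoln_petersen(n1, n2, m_observed / sensitivity)
```

With sources of 1000 and 800 records and a true overlap of 400, the estimate is 2000. If only 320 of those links are found, the naive estimate rises to 2500, while the corrected estimate with a sensitivity of 0.8 returns to approximately 2000.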


Author(s):  
Nicole Watson

An important aspect of an indefinite life household panel study is to provide a sample of children who become new generations of respondents over time. The representativity of children and young adults in the Household, Income and Labour Dynamics in Australia (HILDA) Survey is assessed after 16 waves. Estimates from the HILDA Survey are compared to official data sources of the Australian Bureau of Statistics (ABS) and include demographic, education, employment, income and residential mobility variables. Both cross-section and longitudinal estimates are assessed. Overall, the HILDA Survey estimates are relatively close to the ABS estimates with the exception of the year of arrival of recent immigrants, having foreign-born parents, having a certificate level qualification, type of relationship in household, having zero income, the main source of income, and residential mobility. Most of these exceptions can be explained by differences in questionnaire design, respondent recall error, linkage error, and differences in the amount of missing data. The estimate of particular concern is the proportion of immigrants arriving in the last five years, which is underestimated in the HILDA Survey due to undercoverage of recent immigrants. This could be addressed by regular refreshment samples of recent immigrants.

Key messages:
- A vital part of an indefinite life panel study is the children who later become respondents.
- This paper compares the HILDA Survey to official sources cross-sectionally and longitudinally.
- New generations of children and young adults are found to be representative.
- Some concern is identified regarding undercoverage of recent immigrants.


2020 ◽  
Vol 36 (4) ◽  
pp. 1261-1279
Author(s):  
Paulina Pankowska ◽  
Dimitris Pavlopoulos ◽  
Bart Bakker ◽  
Daniel L. Oberski

This paper discusses how National Statistical Institutes (NSIs) can use hidden Markov models (HMMs) to produce consistent official statistics for categorical, longitudinal variables using inconsistent sources. Two main challenges are addressed. First, the reconciliation of inconsistent sources with multi-indicator HMMs requires linking the sources on the micro level. Such linkage might lead to bias due to linkage error. Second, applying and estimating HMMs regularly is a complicated and expensive procedure. Therefore, it is preferable to use the error parameter estimates as a correction factor for a number of years. However, this might lead to biased structural estimates if measurement error changes over time or if the data collection process changes. Our results on these issues are highly encouraging and imply that the suggested method is appropriate for NSIs. Specifically, linkage error only leads to (substantial) bias in very extreme scenarios. Moreover, measurement error parameters are largely stable over time if no major changes in the data collection process occur. However, when a substantial change in the data collection process occurs, such as a switch from dependent (DI) to independent (INDI) interviewing, re-using measurement error estimates is not advisable.


BMJ Open ◽  
2020 ◽  
Vol 10 (11) ◽  
pp. e043540
Author(s):  
Emmert Roberts ◽  
James C Doidge ◽  
Katie L Harron ◽  
Matthew Hotopf ◽  
Jonathan Knight ◽  
...  

Objectives: The creation and evaluation of a national record linkage between substance misuse treatment and inpatient hospitalisation data in England. Design: A deterministic record linkage using personal identifiers to link the National Drug Treatment Monitoring System (NDTMS) curated by Public Health England (PHE), and Hospital Episode Statistics (HES) Admitted Patient Care curated by National Health Service (NHS) Digital. Setting and participants: Adults accessing substance misuse treatment in England between 1 April 2018 and 31 March 2019 (n=268 251) were linked to inpatient hospitalisation records available since 1 April 1997. Outcome measures: Using a gold-standard subset, linked using NHS number, we report the overall linkage sensitivity and precision. Predictors for linkage error were identified, and inverse probability weighting was used to interrogate any potential impact on the analysis of length of hospital stay. Results: 79.7% (n=213 814) of people were linked to at least one HES record, with an estimated overall sensitivity of between 82.5% and 83.3%, and a precision of between 90.3% and 96.4%. Individuals were more likely to link if they were women, white and aged between 46 and 60. Linked individuals were more likely to have an average length of hospital stay ≥5 days if they were men, older, had no fixed residential address or had problematic opioid use. These associations did not change substantially after probability weighting, suggesting they were not affected by bias from linkage error. Conclusions: Linkage between substance misuse treatment and hospitalisation records offers a powerful new tool to evaluate the impact of treatment on substance related harm in England. While linkage error can produce misleading results, linkage bias appears to have little effect on the association between substance misuse treatment and length of hospital admission. As subsequent analyses are conducted, potential biases associated with the linkage process should be considered in the interpretation of any findings.
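
The two quality measures reported in this abstract, linkage sensitivity and precision, can be computed from a gold-standard subset along the following lines. This is a minimal sketch, not the study's code: it assumes links are represented as pairs of record identifiers, and the function name and toy identifiers are hypothetical.

```python
def linkage_quality(linked_pairs, true_pairs):
    """Return (sensitivity, precision) of a linkage against a gold standard.

    linked_pairs: iterable of (id_a, id_b) pairs produced by the linkage
    true_pairs:   iterable of (id_a, id_b) pairs known to be true matches

    sensitivity = true links found / all true matches
    precision   = true links found / all links made
    """
    linked = set(linked_pairs)
    truth = set(true_pairs)
    if not linked or not truth:
        raise ValueError("both pair collections must be non-empty")
    true_links = len(linked & truth)
    return true_links / len(truth), true_links / len(linked)


# Toy example: three links made, two of them correct, four true matches.
linked = [(1, "a"), (2, "b"), (3, "x")]
truth = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
sensitivity, precision = linkage_quality(linked, truth)
```

In the toy example two of four true matches are found (sensitivity 0.5) and two of three links are correct (precision about 0.67). Missed matches lower sensitivity and false matches lower precision.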


2020 ◽  
Author(s):  
Daniela Almeida ◽  
David Gorender ◽  
Maria Yury Ichihara ◽  
Samila Sena ◽  
Luan Menezes ◽  
...  

Abstract. Background: Research using linked routine population-based data collected for non-research purposes has increased in recent years because such data are a rich and detailed source of information. The objective of this study is to present an approach to prepare and link data from administrative sources in a middle-income country, to estimate its quality and to identify potential sources of bias by comparing linked and non-linked cases. Methods: We linked two administrative datasets from Brazil, covering the period 2001 to 2015, using maternal attributes (name, age, date of birth, and municipality of residence): the live birth information system and the 100 Million Brazilian Cohort (created using administrative records from over 114 million individuals whose families applied for social assistance via the Unified Register for Social Programmes). The linkage was implemented with CIDACS-RL, a linkage tool developed in house. We then estimated the proportion of highly probable links and examined the characteristics of missed matches to identify any potential source of bias. Results: A total of 27,699,891 live births were linked with maternal information recorded in the baseline of the 100 Million Brazilian Cohort dataset; of those, 16,447,414 (59.4%) children were found registered in the 100 Million Brazilian Cohort dataset. The proportion of highly probable links ranged from 39.3% in 2001 to 82.1% in 2014. A substantial improvement in the linkage was observed after the introduction of the maternal date of birth attribute in 2011. Our analyses indicated a slightly higher proportion of missing data among missed matches and a higher proportion of people living in an urban area and self-declared as Caucasian among linked pairs when compared with non-linked sets. Discussion: We demonstrated that CIDACS-RL is capable of performing high quality linkage even with a limited number of common attributes, using indexation as a blocking strategy in large routine databases from a middle-income country. However, residual records occurred more among people under worse living conditions. The results presented in this study reinforce the need to evaluate linkage quality and, when necessary, to take linkage error into account in the analyses of any generated dataset.

