Assessing the accuracy of probabilistic record linkage of social and health databases in the 100 million Brazilian cohort

Author(s):  
Marcos Barreto ◽  
André Alves ◽  
Samila Sena ◽  
Rosemeire Fiaccone ◽  
Leila Amorim ◽  
...  

ABSTRACT Background and aims: The Brazilian government has several social protection programmes that select their beneficiaries based on socioeconomic information kept in the Cadastro Único (CADU) database. The CADU will be used to build a population-based cohort of approximately 100 million individuals. Among the social programmes is the Bolsa Família (PBF), a conditional cash transfer programme that provides extra income to poor families. These two databases must be deterministically linked to identify individuals who received payments from PBF between 2004 and 2012. The resulting cohort will be used in epidemiological studies aiming to assess the impact of PBF on the occurrence and severity of several diseases and health problems (tuberculosis, leprosy, HIV, child health, etc.). This cohort must be probabilistically linked with databases from the Unified Health System (SUS), such as hospitalization, notifiable diseases, mortality, and live births, in order to produce data marts (domain-specific data) for the proposed studies. Our goals comprise the validation of probabilistic record linkage methods to support this cohort setup. Approach: This paper emphasizes the accuracy assessment of our methods based on the linkage of SIH (hospitalization), SINAN (notifications), and SIM (mortality) records to the 2011 extraction of CADU. We focused on hospitalization and notification of tuberculosis, as well as infant mortality from all causes in children under 4, for a small sample of 30,029 records (CADU). Due to the absence of gold standards, we used two approaches to assess accuracy: a clerical review and an automatic (tool-based) search. In the first case, we used different cut-off points of the similarity index to calculate sensitivity and specificity, and a ROC curve to separate matched and non-matched pairs. The second approach retrieves from CADU all matched and non-matched pairs for a given individual, serving as a gold standard for validation.
Results: We retrieved 22 linked pairs, of which 18 are true positives for infant mortality (SIM database). From SINAN, we obtained 434 linked pairs with 166 true positives, and from SIH, 121 linked pairs with 34 true positives. The sensitivity of the manual scan for SIM (child mortality) ranges from 44% (specificity of 100%) to 95% (specificity of 94%), with similarity indices between 0.80 and 0.97, respectively. For automatic search, we obtained a sensitivity of 69.2% and specificity of 91.8%. Conclusion: Our results show the need for continuous improvement of our linkage routines and for consistent evaluation of their accuracy in the absence of adequate gold standards.
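The clerical-review evaluation described in this abstract, sensitivity and specificity at several similarity cut-offs, can be sketched roughly as follows. This is a minimal illustration and not the authors' tooling: the function names and the `reviewed` pairs are hypothetical, and the sketch assumes each candidate pair carries a similarity index plus a reviewer's match/non-match judgement.

```python
from typing import List, Tuple

# Each reviewed pair: (similarity index, True if clerical review judged it a match).
Pair = Tuple[float, bool]


def sensitivity_specificity(pairs: List[Pair], cutoff: float) -> Tuple[float, float]:
    """Classify pairs with similarity >= cutoff as links and score them
    against the clerical-review labels."""
    tp = sum(1 for s, m in pairs if s >= cutoff and m)
    fn = sum(1 for s, m in pairs if s < cutoff and m)
    tn = sum(1 for s, m in pairs if s < cutoff and not m)
    fp = sum(1 for s, m in pairs if s >= cutoff and not m)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def roc_points(pairs: List[Pair], cutoffs: List[float]) -> List[Tuple[float, float, float]]:
    """Return (cutoff, sensitivity, specificity) for each cut-off,
    i.e. the points from which a ROC curve would be drawn."""
    return [(c, *sensitivity_specificity(pairs, c)) for c in cutoffs]


# Hypothetical reviewed pairs (illustrative only, not data from the study).
reviewed = [
    (0.98, True), (0.95, True), (0.91, True), (0.86, True),
    (0.93, False), (0.84, False), (0.81, False), (0.78, False),
]

for cutoff, sens, spec in roc_points(reviewed, [0.80, 0.90, 0.97]):
    print(f"cutoff={cutoff:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Raising the cut-off trades sensitivity for specificity, which is why the abstract reports both as ranges over the cut-offs tried.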

Blood ◽  
2008 ◽  
Vol 112 (11) ◽  
pp. 563-563 ◽  
Author(s):  
Ann E Woolfrey ◽  
John Klein ◽  
Michael D Haagenson ◽  
Stephen R Spellman ◽  
Minoo Battiwalla ◽  
...  

Abstract Criteria for the selection of HLA mismatched donors are needed when an HLA matched unrelated donor is not available. To define the risks associated with mismatching at HLA loci, and the impact of number of HLA mismatches on outcome, we studied 1933 patients receiving URD peripheral blood stem cell (PBSC) transplants facilitated by the National Marrow Donor Program between 1999 and 2006 for treatment of AML, ALL, CML or MDS. Myeloablative (65%) and reduced intensity (35%) regimens were included. The transplanted PBSC grafts were T cell-replete, and most patients (99%) received calcineurin inhibitor-based GVHD prophylaxis. Median follow-up was 2 years. Pairs were typed for HLA-A, B, C, DRB1, DQA1 and DQB1 by high resolution typing methods. Matching was classified as low resolution (antigen-equivalent) or high resolution (allele) involving HLA-A, B, C, and DRB1 (8/8 match). Because of multiple comparisons, p-values <0.01 were considered significant. All analyses were adjusted for patient and transplant characteristics. Results: No effect of HLA-DQ mismatching was found for 8/8 or 7/8 matched transplant pairs; therefore, DQ mismatch was removed from subsequent models. Matching for 8/8 alleles was associated with better survival at one year (56% vs. 47%, p=0.001) compared with 7/8 matched pairs. Using patients with 8/8 match for comparison (n=1243), a single HLA-antigen mismatch (n=293) was associated with a significantly higher risk for overall mortality (OM) (relative risk (RR)=1.32, 95% confidence interval [CI] 1.12–1.55, p=0.0007), transplant-related mortality (TRM) (RR 1.54 [1.24–1.91] p=0.0001), grades III-IV graft-vs.-host disease (GVHD) (RR 1.93 [1.53–2.44] p<0.0001), and lower disease-free survival (DFS) (RR 1.29 [1.10–1.51] p=0.0013).
No statistically significant decrement in survival was seen for those with a single (n=208) or double (n=28) HLA-allele mismatch involving HLA-A, B, C, and/or DRB1, although small sample size limits the power of the analysis. Two antigen or antigen plus allele mismatches [6/8 pairs] were associated with 2 to 3 times the risk for OM and TRM compared with 8/8 matched pairs, all p<0.001. Comparing 8/8 to 7/8 donor-recipient pairs mismatched at specific loci, only HLA-C antigen mismatches (n=187) were significantly associated with lower DFS (RR=1.36 [1.13–1.64] p=0.0010), and increased risk for OM (RR=1.41 [1.16–1.70], p=0.0005), TRM (RR=1.61 [1.25–2.08], p=0.0002), and GVHD grades III-IV (RR=1.98 [1.50–2.62], p<0.0001). No differences in outcome were observed for HLA-C allele mismatch (n=61), nor for mismatches at HLA-A antigen/allele (n=136), -B antigen/allele (n=73), -DRB1 allele (n=39) or -DQ antigen/allele (n=114) compared to 8/8 matching. HLA mismatching was not associated with relapse or chronic GVHD. Conclusion: These data suggest that when 8/8 matched PBSC donors are not available, HLA-C antigen mismatched donors should be avoided. The effects of HLA-mismatching in URD PBSC may be distinct from marrow transplants, although additional studies with larger numbers of patients may increase the power to detect effects of other specific locus mismatches.


2019 ◽  
Vol 82 (S 02) ◽  
pp. S131-S138
Author(s):  
Sebastian Bartholomäus ◽  
Yannik Siegert ◽  
Hans Werner Hense ◽  
Oliver Heidinger

Abstract Background The evaluation of population-based screening programs, like the German Mammography Screening Program (MSP), requires collecting and linking data from population-based cancer registries and other sources of the healthcare system on a case-specific level. To link such sensitive data, we developed a method that is compliant with German data protection regulations and does not require written individual consent. Methods Our method combines a probabilistic record linkage on encrypted identifying data with ‘blinded anonymisation’. It ensures that all data either are encrypted or have a defined and measurable degree of anonymity. The data sources use software to transform plain-text identifying data into a set of irreversibly encrypted person cryptograms, while the evaluation attributes are aggregated in multiple stages and are reversibly encrypted. A pseudonymisation service encrypts the person cryptograms into record assignment numbers and a downstream data-collecting centre uses them to perform the probabilistic record linkage. The blinded anonymisation solves the problem of quasi-identifiers within the evaluation data. It allows selecting a specific set of the encrypted aggregations to produce a data export with ensured k-anonymity, without any plain-text information. These data are finally transferred to an evaluation centre where they are decrypted and analysed. Our approach allows creating several such generalisations with different resulting suppression rates, which makes it possible to dynamically balance information depth against privacy protection, and it also highlights how this affects data analysability. Results German data protection authorities approved our concept for the evaluation of the impact of the German MSP on breast cancer mortality. We implemented a prototype and tested it with 1.5 million simulated records containing realistically distributed identifying data, and calculated different generalisations and the respective suppression rates.
Here, we also discuss limitations for large data sets in the cancer registry domain, as well as approaches for further improvements like l-diversity and how to reduce the amount of manual post-processing. Conclusion Our approach enables secure linking of data from population-based cancer registries and other sources of the healthcare system. Despite some limitations, it enables evaluation of the German MSP and can be generalised to other projects.
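The trade-off between generalisation and suppression rate under k-anonymity, mentioned in this abstract, can be illustrated with a small sketch. It is an assumption-laden simplification: it works on plain-text quasi-identifiers, whereas the described method operates on encrypted aggregations, and the records, the age-band generaliser and all function names are hypothetical.

```python
from collections import Counter
from typing import Sequence, Tuple


def suppression_rate(records: Sequence[Tuple], k: int) -> float:
    """Fraction of records whose quasi-identifier combination occurs fewer
    than k times; these would have to be suppressed to reach k-anonymity."""
    if not records:
        return 0.0
    counts = Counter(records)
    suppressed = sum(n for n in counts.values() if n < k)
    return suppressed / len(records)


def generalise_age(age: int, width: int) -> str:
    """Replace an exact age by an age band of the given width (hypothetical generaliser)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


# Hypothetical quasi-identifiers: (age, region). Not data from the study.
raw = [(50, "A"), (52, "A"), (57, "A"), (61, "B"), (63, "B"), (69, "A")]

fine = [(generalise_age(age, 5), region) for age, region in raw]     # 5-year bands
coarse = [(generalise_age(age, 10), region) for age, region in raw]  # 10-year bands

print("5-year bands :", suppression_rate(fine, k=2))
print("10-year bands:", suppression_rate(coarse, k=2))
```

Coarser bands lose detail but leave fewer records in groups smaller than k, so fewer records need to be suppressed.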


2004 ◽  
Vol 20 (4) ◽  
pp. 915-925 ◽  
Author(s):  
Carla Jorge Machado ◽  
Kenneth Hill

Probabilistic record linkage allows the assembling of information from different data sources. We present a procedure when a one-to-one relationship between records in different files is expected but not found. Data were births and infant deaths, 1998-birth cohort, city of São Paulo, Brazil. Pairs for which a one-to-one relationship was obtained and a best-link was found with the highest weight were taken as unequivocally matched pairs and provided information to decide on the remaining pairs. For these, an expected relationship between differences in dates of death and birth registration was found; and places of birth and death registration for neonatal deaths were likely to be the same. Such evidence was used to solve for the remaining pairs. We reduced the number of non-uniquely matched records and of uncertain matches, and increased the number of uniquely matched pairs from 2,249 to 2,827. Future research using record linkage should use strategies from first record linkage runs before a full clerical review (the standard procedure under uncertainty) to efficiently retrieve matches.
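One way to read the first step described above is as a "mutual best link" rule: a pair is kept as unequivocal when each record is the other's highest-weight candidate. The sketch below follows that reading, which is an assumption and not necessarily the authors' exact procedure; the record identifiers and weights are hypothetical, and ties are resolved in favour of the first pair seen.

```python
from typing import Dict, List, Tuple


def unequivocal_pairs(weights: Dict[Tuple[str, str], float]) -> List[Tuple[str, str]]:
    """Return pairs (a, b) where b is a's highest-weight candidate and
    a is b's highest-weight candidate (ties keep the first pair seen)."""
    best_for_a: Dict[str, str] = {}
    best_for_b: Dict[str, str] = {}
    for (a, b), w in weights.items():
        if a not in best_for_a or w > weights[(a, best_for_a[a])]:
            best_for_a[a] = b
        if b not in best_for_b or w > weights[(best_for_b[b], b)]:
            best_for_b[b] = a
    return sorted((a, b) for a, b in best_for_a.items() if best_for_b[b] == a)


# Hypothetical linkage weights between birth records (b*) and death records (d*).
weights = {
    ("b1", "d1"): 12.0,
    ("b1", "d2"): 7.5,
    ("b2", "d2"): 9.0,
    ("b3", "d2"): 8.0,
    ("b3", "d3"): 3.0,
}

matched = unequivocal_pairs(weights)
print(matched)
# Records absent from `matched` (here b3) are the "remaining pairs" that the
# abstract resolves using evidence learned from the unequivocal matches.
```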


2019 ◽  
Vol 129 (4) ◽  
pp. 127-131
Author(s):  
Agnieszka Parfin ◽  
Krystian Wdowiak ◽  
Marzena Furtak-Niczyporuk ◽  
Jolanta Herda

Abstract Introduction. COVID-19 is the name of an infectious disease caused by a new strain of coronavirus, SARS-CoV-2 (Severe Acute Respiratory Syndrome Coronavirus 2). It was first diagnosed in December 2019 in patients in Wuhan City, Hubei Province, China. The symptoms are dominated by features of respiratory tract infection, in some patients with a very severe course leading to respiratory failure and, in extreme cases, to death. Due to the spread of the infection worldwide, the WHO declared a pandemic in March 2020. Aim. An investigation of the impact of social isolation introduced due to the coronavirus pandemic on selected aspects of life. The researchers focused on observing changes in habits related to physical activity and their connections with people’s subjective well-being and emotional state. Material and methods. The study was carried out within the international project of the group “IRG on COVID and exercise”. The research tool was a standardized questionnaire. Results. Based on the data collected and the analysis of the percentage results, it can be observed that the overwhelming majority of people taking up physical activity reported a better mood during the pandemic. However, statistical tests do not confirm these relationships due to the small sample size. Conclusions. Isolation favours physical activity. Future in-depth studies with a larger study population are necessary to confirm the above observations.


2019 ◽  
Vol 118 (4) ◽  
pp. 129-141
Author(s):  
Mr. Y. EBENEZER

This paper deals with economic growth and the infant mortality rate (IMR) in Tamil Nadu. The objectives of this paper are to test the relationship between per capita Net State Domestic Product and the infant mortality rate, and to measure the impact of per capita Net State Domestic Product on the infant mortality rate in Tamil Nadu. The analysis employs the ADF test and the ARDL approach. The results show that IMR fell and per capita Net State Domestic Product rose during the study period. The analysis also reveals a negative relationship between IMR and the economic growth of Tamil Nadu. In addition, the ARDL bounds test indicates that the per capita Net State Domestic Product of Tamil Nadu has a long-run association with IMR.


2019 ◽  
pp. 392-400 ◽  
Author(s):  
Gunnar Kleuker ◽  
Christa M. Hoffmann

The harvest of sugar beet leads to root tip breakage and surface damage through mechanical impacts, which increase storage losses. For the determination of textural properties of sugar beet roots with a texture analyzer a reliable method description is missing. This study aimed to evaluate the impact of washing, soil tare, storage period from washing until measurement, sample distribution and number of roots on puncture and compression measurements. For this purpose, in 2017 comprehensive tests were conducted with sugar beet roots grown in a greenhouse. In a second step these tests were carried out with different Beta varieties from a field trial, and in addition, a flexural test was included. Results show that the storage period after washing and the sample distribution had an influence on the puncture and compression strength. It is suggested to wash the roots by hand before the measurement and to determine the strength no later than 48 h after washing. For reliable and comparable results a radial distribution of measurement points around the widest circumference of the root is recommended for the puncture test. The sample position of the compression test had an influence on the compressive strength and therefore, needs to be clearly defined. For the puncture and the compression test it was possible to achieve stable results with a small sample size, but with increasing heterogeneity of the plant stand a higher number of roots is required. The flexural test showed a high variability and is, therefore, not recommended for the analysis of sugar beet textural properties.


2020 ◽  
Author(s):  
Qing Zhao ◽  
Pei Chen ◽  
Yu Zhang ◽  
Haining Liu ◽  
Xianwen Li

BACKGROUND Mobile health (mHealth) applications have become an important tool for healthcare systems. One use of such applications is to assist people with cognitive impairment and their caregivers. OBJECTIVE This scoping review aims to explore and evaluate the existing evidence and challenges on the use of mHealth applications that assist people with cognitive impairment and their caregivers. METHODS Nine databases, including PubMed, EMBASE, Cochrane, PsycARTICLES, CINAHL, Web of Science, Applied Science & Technology Source, IEEE Xplore and the ACM Digital Library, were searched from inception through June 2020 for studies of mHealth applications for people with cognitive impairment and their caregivers. Two reviewers independently extracted, checked, and synthesized the data. RESULTS Of the 6101 studies retrieved, 64 met the inclusion criteria. Three categories emerged from this scoping review: ‘application functionality’, ‘evaluation strategies’, and ‘barriers and challenges’. All the included studies were categorized into 7 groups based on functionality: (1) cognitive assessment; (2) cognitive training; (3) life support; (4) caregiver support; (5) symptom management; (6) reminiscence therapy; (7) exercise intervention. The included studies were broadly categorized into four types: (1) usability testing; (2) pilot and feasibility studies; (3) validation studies; and (4) efficacy or effectiveness designs. These studies had many defects in research design, such as: (1) small sample size; (2) lack of an active control group; (3) insufficient analysis of the effectiveness of intervention components; (4) lack of adverse reaction reporting and economic evaluation; (5) lack of consideration of the education level, electronic health literacy and smartphone proficiency of the participants; (6) deficient assessment tools; (7) no rating of the quality of the mHealth application.
The design of smartphone application functionality also needs improvement: (1) cognitive measurements and training games need to be differentiated; (2) the impact of the learning effect should be reduced. In addition, few studies used health behavior theory or followed standardized reporting. CONCLUSIONS Preliminary results show that mobile technologies facilitate assistance for people with cognitive impairment and their caregivers. The majority of mHealth application interventions incorporated usability outcomes and health outcomes. However, these studies have many defects in research design that limit the extrapolation of their findings. The content of mHealth applications urgently needs improvement in order to demonstrate their real effect. In addition, further research with strong methodological rigor and adequate sample size is needed to examine the feasibility, effectiveness, and cost-effectiveness of mHealth applications for people with cognitive impairment and their caregivers.


2020 ◽  
Author(s):  
Ahmed Youssef Kada

BACKGROUND Covid-19 is an emerging infectious disease, a viral zoonosis caused by the new coronavirus SARS-CoV-2. On December 31, 2019, the Wuhan Municipal Health Commission in Hubei province (China) reported cases of pneumonia caused by a new coronavirus. The disease spread rapidly around the world, and the World Health Organization (WHO) declared it a pandemic on March 11, 2020. The pandemic reached Algeria on February 25, 2020, the date on which the Algerian minister of health announced the first case of Covid-19, in a foreign citizen. From March 1, a cluster formed in Blida, which became the epicentre of the coronavirus epidemic in Algeria; its total quarantine was established on March 24, 2020, and was gradually eased from April 24. A therapeutic protocol based on hydroxychloroquine and azithromycin was put in place on March 23 for complicated cases and was extended to all confirmed cases on April 06. OBJECTIVE This study aimed to demonstrate the effectiveness of the hydroxychloroquine/azithromycin protocol in Algeria, in particular after its extension to all patients diagnosed COVID-19 positive on RT-PCR test. We were able to illustrate this graphically, but not to prove it statistically because of the design of our study: in the 7 days that followed generalization of the therapeutic protocol, the case fatality rate decreased and the doubling time increased, thus confirming the impact of wide and early prescription of the hydroxychloroquine/azithromycin protocol. METHODS We analyzed data collected from press releases and follow-ups published daily by the Ministry of Health. We studied the possible correlations of these data with certain events or decisions that could affect their development, such as confinement at home and its easing, the prescription of the hydroxychloroquine/azithromycin combination for serious patients, and its extension to all COVID-positive subjects. Results are presented as graphs; data collection was closed on 31/05/2020.
RESULTS The Covid-19 pandemic spread in Algeria from February 25, 2020, when a foreign citizen tested positive. On March 1, a cluster formed in the city of Blida, where sixteen members of the same family were infected during a wedding party. The wilaya of Blida became the epicentre of the coronavirus epidemic in Algeria and lockdown measures were taken, while the number of cases diagnosed nationally began to increase. In any event, the association of early containment measures with a generalized initial treatment for all positive cases, whatever their degree of severity, will have contributed to a reduction in the fatality rate of COVID-19 and a lengthening of its doubling time. CONCLUSIONS In Algeria, the rapid combination of rigorous containment at home and early generalized treatment with hydroxychloroquine demonstrated its effectiveness in terms of morbidity and mortality. The classic measures of social distancing and hygiene will make it possible to sustain these results by reducing viral transmission. The only unknown is the reopening procedure, which can only be started with precautions aimed at ensuring the understanding of the population. CLINICALTRIAL Algeria, Covid-19, pandemic, hydroxychloroquine, azithromycin, case fatality rate
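The two indicators this abstract tracks, case fatality rate and doubling time, can be computed along the following lines. The abstract does not state which formulas were used, so this sketch assumes the usual definitions: cumulative deaths divided by cumulative confirmed cases, and a doubling time derived from exponential growth between two cumulative counts. The function names and example figures are hypothetical.

```python
import math


def case_fatality_rate(deaths: int, cases: int) -> float:
    """Cumulative deaths divided by cumulative confirmed cases."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return deaths / cases


def doubling_time(count_start: float, count_end: float, days: float) -> float:
    """Doubling time in days, assuming exponential growth between two
    cumulative counts observed `days` apart: T = days * ln(2) / ln(end / start)."""
    if count_start <= 0 or count_end <= count_start:
        raise ValueError("counts must be positive and increasing")
    return days * math.log(2) / math.log(count_end / count_start)


# Hypothetical figures (illustrative only, not data from the study).
print(f"CFR: {case_fatality_rate(deaths=10, cases=200):.1%}")
print(f"Doubling time: {doubling_time(100, 400, days=10):.1f} days")
```

A longer doubling time means slower growth in cumulative cases, which is the direction of change the abstract reports after the protocol was generalised.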


2019 ◽  
Vol 11 (1) ◽  
pp. 156-173
Author(s):  
Spenser Robinson ◽  
A.J. Singh

This paper shows that Leadership in Energy and Environmental Design (LEED) certified hospitality properties exhibit increased expenses and earn lower net operating income (NOI) than non-certified buildings. ENERGY STAR certified properties demonstrate lower overall expenses than non-certified buildings with statistically neutral NOI effects. Using a custom sample of all green buildings and their competitive data set as of 2013 provided by Smith Travel Research (STR), the paper documents potential reasons for this result including increased operational expenses, potential confusion between certified and registered LEED projects in the data, and qualitative input. The qualitative input comes from a small sample survey of five industry professionals. The paper provides one of the few analyses of operating efficiencies in LEED and ENERGY STAR hospitality properties.

