Defining Populations Within Linked Administrative Data

Author(s):  
Duncan Wijnberg

This presentation will outline methods developed by the ABS to define populations within linked administrative data. The methods use a combination of direct and indirect signals (referred to as ‘signs of life’) to infer whether individuals are in a particular population at a given point in time. Data from multiple government sources are used.
Introduction: Linked administrative datasets hold significant potential to unlock new insights and understanding about different populations of interest. A key challenge is to create a research dataset that is properly representative of the population of interest at a given point in time. Without a representative population to serve as the basis of analysis, research outcomes are harder to interpret and compare against those from other populations. This has flow-on consequences for any research findings derived from linked administrative data.
Objectives and Approach: This project sought to develop methods to define a representative Australian population from the Multi-Agency Data Integration Project (MADIP) data asset. The core of the asset is created through a three-way linkage between Australian Medicare, Social Security and Taxation datasets. Together, these datasets have very high coverage of the Australian population and enable high-quality linkage of other datasets into the asset. A ‘signs of life’ approach was used to distinguish a representative population at a given point in time within the MADIP asset.
Results: An experimental representative population has been developed from the linkage spine that closely approximates the national age distribution and other breakdowns of the ABS’s Estimated Resident Population.
Conclusion / Implications: This work demonstrates one approach that can be used to derive useful analytical populations from linked datasets with overly exhaustive scopes.
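The abstract does not describe the ABS rules in detail, so the following is only a minimal sketch of how a ‘signs of life’ filter might look. It assumes that a person counts as in scope when at least `min_sources` distinct datasets record activity for them within a look-back window ending at the reference date. The `Signal` record, the `in_population` function, the source names and the 365-day window are all hypothetical choices made for illustration, not the method used for MADIP.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Signal:
    """One hypothetical activity record ('sign of life') for a person."""
    person_id: str
    source: str        # illustrative labels, e.g. "medicare", "tax"
    event_date: date


def in_population(signals, reference_date, window_days=365, min_sources=1):
    """Return the ids of people with recent activity in enough distinct sources.

    Assumption: activity inside [reference_date - window_days, reference_date]
    is treated as evidence that the person is in the population at
    reference_date. The real rules are not given in the abstract.
    """
    window_start = reference_date - timedelta(days=window_days)
    sources_by_person = {}
    for s in signals:
        if window_start <= s.event_date <= reference_date:
            sources_by_person.setdefault(s.person_id, set()).add(s.source)
    return {
        pid for pid, sources in sources_by_person.items()
        if len(sources) >= min_sources
    }
```

Raising `min_sources` or shortening `window_days` trades coverage for confidence that a person is genuinely present, which is the kind of tuning that comparison against a benchmark such as the Estimated Resident Population could inform.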

Author(s):  
James Mowle

Introduction: The Census is the largest statistical collection undertaken by the Australian Bureau of Statistics (ABS), with its data critical to informing the planning and delivery of Government and community services. While the Census measure of income supports a wide range of analysis, demand exists for additional income topics to complement and extend the range of socio-economic analysis that can be undertaken. The ABS has recently developed three experimental income topics for the 2016 Census using linked administrative data: main source of income; main source of government payments; and previous financial year income.
Objectives and Approach: This research utilised administrative data integrated by the ABS for the Multi-Agency Data Integration Project (MADIP). Taxation data from the Australian Taxation Office (ATO) and social security data from the Department of Social Services (DSS) were used in conjunction with the 2016 Census data to derive the additional topics.
Results: Overall, the three measures compare relatively closely to similar measures from the ABS Survey of Income and Housing (SIH). The ‘Main source of income’ and ‘Main source of government payments’ measures exhibit similar distributions to those from the SIH. The ‘Previous financial year income’ measures compare more closely with Census and SIH at the higher end of the income distribution, with some differences apparent at the lower end.
Conclusion / Implications: This work demonstrates the potential to supplement and enhance existing Census topics with linked administrative data. Further research, development and consultation with data users and the Australian community are needed.


Author(s):  
Heidi J Welberry ◽  
Henry Brodaty ◽  
Benjumin Hsu ◽  
Sebastiano Barbieri ◽  
Louisa R Jorm

Introduction: There is no gold standard method for monitoring dementia incidence in Australia. Routinely collected linked administrative data are increasingly being used to monitor endpoints in observational studies and clinical trials and could benefit dementia research.
Objectives and Approach: This study examines dementia incidence within different Australian administrative datasets and how characteristics vary across datasets for groups detected as having dementia. This was an observational data linkage study based on a prospective cohort of 267,153 people in New South Wales, Australia from the 45 and Up Study. Participants completed a survey in 2006-2009 and dementia was identified using linked pharmaceutical claims (provided by Services Australia), hospitalisations, assessments of aged care eligibility, care needs at entry to residential aged care and death certificates. Data linkage was undertaken by the Centre for Health Record Linkage (CHeReL) and the Australian Institute of Health and Welfare. Age-specific and age-standardised incidence rates, incidence rate ratios and survival from first dementia diagnosis were calculated.
Results: Age-standardised dementia incidence was 16.9 cases per 1,000 person-years (PY) for people aged 65 years and over. Estimates for those aged 80-89 years were closest to published incidence rates (91% of rates for high-income countries). Relationships with dementia incidence were inconsistent across datasets for characteristics including sex, relative socio-economic disadvantage, support network size, marital status, functional limitations and diabetes. Median survival from first pharmaceutical claim for an anti-dementia medicine was 3.7 years, compared to 3.0 years from first aged care eligibility assessment, 2.0 years from a dementia-related hospitalisation and 1.8 years from first residential aged care needs assessment.
Conclusion / Implications: People identified with dementia in different administrative datasets have different characteristics, reflecting the factors that drive interaction with specific services. Bias may be introduced if single data sources are used to identify dementia as an outcome in observational studies.
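The abstract reports age-standardised incidence rates per 1,000 person-years but does not say how they were standardised. As an illustration only, the sketch below assumes direct standardisation: each age group's crude rate (cases divided by person-years) is weighted by that group's share of a standard population. The function name `age_standardised_rate`, the tuple layout and the example figures are hypothetical and are not taken from the study.

```python
def age_standardised_rate(strata, per=1000):
    """Directly age-standardised incidence rate.

    strata: iterable of (cases, person_years, standard_weight) tuples,
            one per age group. Weights need not sum to 1; they are
            normalised here.
    per:    rate multiplier, e.g. 1000 for 'per 1,000 person-years'.

    Assumption: direct standardisation; the abstract does not state
    which method the authors used.
    """
    strata = list(strata)
    total_weight = sum(weight for _, _, weight in strata)
    if total_weight <= 0:
        raise ValueError("standard population weights must sum to a positive value")
    weighted = sum(
        (cases / person_years) * weight
        for cases, person_years, weight in strata
    )
    return weighted / total_weight * per


# Illustrative, made-up figures for two age groups with equal standard weights.
example = [(10, 1000.0, 0.5), (30, 1000.0, 0.5)]
rate = age_standardised_rate(example)
```

Standardising in this way makes rates comparable between data sources whose cohorts have different age structures, which matters here because each administrative dataset captures people at a different stage of dementia.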


2015 ◽  
Vol 139 (9) ◽  
pp. 1149-1155 ◽  
Author(s):  
Xiaohui Niu ◽  
Hairong Xu ◽  
Carrie Y. Inwards ◽  
Yuan Li ◽  
Yi Ding ◽  
...  

Context: Although primary bone tumors are extremely rare, the literature suggests that their epidemiologic characteristics vary across populations. The most frequently cited epidemiologic characteristics of primary bone tumors are derived from a large US series (Mayo Clinic), with no comparable study thus far performed in China.
Objective: To identify any potential epidemiologic differences between Chinese patients and a US series of patients.
Design: We performed a comparison study between 9200 patients treated at Beijing Ji Shui Tan Hospital (JST) and 10 165 patients treated at Mayo Clinic (MC), Rochester, Minnesota. Detailed epidemiologic features were analyzed.
Results: We found that giant cell tumor and osteosarcoma had significantly higher incidences in the JST patients than in the MC patients (P < .001). However, JST patients had a significantly lower incidence of Ewing sarcoma, chordoma, fibrosarcoma, myeloma, and malignant lymphoma (P < .001). For most benign and malignant bone tumors, the Chinese cohort had a more distinct male predominance than the US cohort. Malignant bone tumors had a monomodal age distribution in the JST patient group but a bimodal age distribution in the MC cohort. There was also a predilection for tumors of the femur and tibia among the JST patients (P < .001).
Conclusions: Our data confirm that epidemiologic variations of primary bone tumors exist in different populations. Factors that may contribute to these observed differences are proposed and discussed.


2021 ◽  
Vol 12 ◽  
Author(s):  
Simon Boitard ◽  
Cyriel Paris ◽  
Natalia Sevane ◽  
Bertrand Servin ◽  
Kenza Bazi-Kabbaj ◽  
...  

Gene banks, framed within the efforts for conserving animal genetic resources to ensure the adaptability of livestock production systems to population growth, income, and climate change challenges, have emerged as invaluable resources for biodiversity and scientific research. Allele frequency trajectories over the last few generations contain rich information about the selection history of populations, which cannot be obtained from classical selection scan approaches based on present-time data only. Here we apply a new statistical approach that takes advantage of genomic time series, together with a state-of-the-art statistic (nSL) based on present-time data, to disentangle both old and recent signatures of selection in the Asturiana de los Valles cattle breed. This local Spanish breed, native to Asturias and originally multipurpose, has been selected for beef production over the last few generations. With the use of SNP chip and whole-genome sequencing (WGS) data, we detect candidate regions under selection reflecting the effort of breeders to produce economically valuable beef individuals, e.g., by improving carcass and meat traits with genes such as MSTN, FLRT2, CRABP2, ZNF215, RBPMS2, OAZ2, or ZNF609, while maintaining the ability to thrive under a semi-intensive production system, with the selection of immune (GIMAP7, GIMAP4, GIMAP8, and TICAM1) or olfactory receptor (OR2D2, OR2D3, OR10A4, and OR6A2) genes. This kind of information will allow us to take advantage of the invaluable resources provided by gene bank collections from local, less competitive breeds, enabling the livestock industry to exploit the different mechanisms fine-tuned by natural and human-driven selection on different populations to improve productivity.

