NEPS Starting Cohort 6 survey data linked to administrative data of the IAB (NEPS-SC6-ADIAB)

Author(s):  
Nadine Bachbauer

Background: NEPS-SC6-ADIAB is a new linked data product containing survey data from Starting Cohort 6 of the German National Educational Panel Study (NEPS) and administrative employment data from the Institute for Employment Research (IAB), the research institute of the Federal Employment Agency. NEPS is provided by the Leibniz Institute for Educational Trajectories (LIfBi). Starting Cohort 6 of this panel survey covers adults in their working lives; the survey focuses on education in adulthood and lifelong learning. The administrative data in NEPS-SC6-ADIAB consist of comprehensive information on employment histories. Objectives: Combining these two data sources increases, for example, the information available about individual employment histories. Overall, linking the survey data with the administrative data increases the data volume. Methods: A record linkage process was used to link the two data sources. Data access is free for the whole scientific community. In addition to a large number of on-site access locations within Germany, there are international on-site access locations, including London and Colchester. Remote data access is also offered. Conclusions: This data linkage project is innovative and creates an extensive database with considerable analytical potential. A short application example illustrates this potential. This ongoing project deals with nonresponse in survey data. The linked data contain a variety of variables collected in both data sources, administratively and through the NEPS survey, allowing for comparative analyses. Here, an approach to compensating for nonresponse in income data with administrative data is outlined.

Author(s):  
Cordell Golden ◽  
Lisa Mirel

Introduction: The linkage of survey data with administrative data enhances the scientific value and analytic potential of both sources of information. Combining multiple data sources facilitates richer analyses and allows data users to answer research questions that cannot be addressed easily using a single data source. Objectives and Approach: Recently, the United States National Center for Health Statistics (NCHS) and Department of Housing and Urban Development (HUD) collaborated to link two population health surveys conducted by NCHS with housing assistance program data maintained by HUD. The resulting linked data files enable researchers to examine relationships between the receipt of federal housing assistance and health. In this talk, we will describe some of the challenges faced when initiating a data sharing agreement between two federal agencies governed by distinct legislative authorities, particularly issues related to legal requirements and data access. Results: We will describe each of the data sources used in the linkage as well as the methodology used to combine the data. Lastly, the discussion will focus on the inter-agency collaboration that led to the production of the supporting technical documentation developed to assist researchers using the linked data files. The linkage of NCHS survey data and HUD administrative data serves as an example of how two agencies were able to overcome challenges to successfully form a data sharing partnership as a cost-effective means to develop a robust data source that benefits the collaborating agencies as well as policy makers and outside researchers. Conclusion/Implications: Both agencies anticipate that this partnership will continue as additional survey and administrative data are collected.


2018 ◽  
Vol 108 ◽  
pp. 287-291 ◽  
Author(s):  
Michael D. Carr ◽  
Emily E. Wiemers

Despite the rise in cross-sectional inequality since the late 1990s, there is little consensus on trends in earnings volatility during this period. Using consistent samples and methods in administrative earnings data matched to the Survey of Income and Program Participation (SIPP GSF) and survey data from the Panel Study of Income Dynamics (PSID), we examine earnings volatility for men from 1978 through 2011. In contrast to the apparent inconsistency in trends across administrative and survey data in the existing literature, we find recent increases in volatility in the SIPP GSF and the PSID, though increases are larger in the PSID.


2015 ◽  
Vol 31 (3) ◽  
pp. 415-429 ◽  
Author(s):  
Loredana Di Consiglio ◽  
Tiziana Tuoto

Abstract The Capture-recapture method is a well-known solution for evaluating the unknown size of a population. Administrative data represent sources of independent counts of a population and can be jointly exploited for applying the capture-recapture method. Of course, administrative sources are affected by over- or undercoverage when considered separately. The standard Petersen approach is based on strong assumptions, including perfect record linkage between lists. In reality, record linkage results can be affected by errors. A simple method for achieving linkage error-unbiased population total estimates is proposed in Ding and Fienberg (1994). In this article, an extension of the Ding and Fienberg model by relaxing their conditions is proposed. The procedures are illustrated for estimating the total number of road casualties, on the basis of a probabilistic record linkage between two administrative data sources. Moreover, a simulation study is developed, providing evidence that the adjusted estimator always performs better than the Petersen estimator.
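
As a minimal sketch of the standard Petersen estimator mentioned in the abstract above, the population size can be estimated as the product of the two list sizes divided by the number of linked records. The counts below are hypothetical, and the code does not implement the Ding and Fienberg correction or the extension proposed in the article; it only shows how missed links push the uncorrected estimate upward.

```python
def petersen_estimate(n1: int, n2: int, m: int) -> float:
    """Petersen (Lincoln-Petersen) population size estimate.

    n1: records in the first list
    n2: records in the second list
    m:  records linked between the two lists
    """
    if m <= 0:
        raise ValueError("at least one linked record is required")
    return n1 * n2 / m


# Hypothetical counts, for illustration only.
n_list_a = 1200   # size of the first administrative list
n_list_b = 900    # size of the second administrative list
n_linked = 600    # records matched under (assumed) perfect linkage

estimate = petersen_estimate(n_list_a, n_list_b, n_linked)

# If the linkage misses 30 true matches, the same formula overestimates
# the population, because the denominator is too small.
estimate_missed_links = petersen_estimate(n_list_a, n_list_b, n_linked - 30)

print(estimate, estimate_missed_links)
```

Under these assumed counts, the estimate with missed links is larger than the estimate under perfect linkage, which is the kind of bias that linkage-error adjustments aim to remove.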


2021 ◽  
Vol 10 (1) ◽  
Author(s):  
Tarek Al Baghal ◽  
Alexander Wenz ◽  
Luke Sloan ◽  
Curtis Jessop

Abstract Linked social media and survey data have the potential to be a unique source of information for social research. While the potential usefulness of this methodology is widely acknowledged, very few studies have explored methodological aspects of such linkage. Respondents produce planned amounts of survey data, but highly variant amounts of social media data. This study explores this asymmetry by examining the amount of social media data available to link to surveys. The extent of variation in the amount of data collected from social media could affect the ability to derive meaningful linked indicators and could introduce possible biases. Linked Twitter data from respondents to two longitudinal surveys representative of Great Britain, the Innovation Panel and the NatCen Panel, show that there is indeed substantial variation in the number of tweets posted and the number of followers and friends respondents have. Multivariate analyses of both data sources show that only a few respondent characteristics have a statistically significant effect on the number of tweets posted, with the number of followers being the strongest predictor of posting in both panels, women posting less than men, and some evidence that people with higher education post less, but only in the Innovation Panel. We use sentiment analyses of tweets to provide an example of how the amount of Twitter data collected can impact outcomes using these linked data sources. Results show that more negatively coded tweets are related to general happiness, but not the number of positive tweets. Taken together, the findings suggest that the amount of data collected from social media which can be linked to surveys is an important factor to consider and indicate the potential for such linked data sources in social research.


SERIEs ◽  
2021 ◽  
Author(s):  
Luis Ayala ◽  
Ana Pérez ◽  
Mercedes Prieto-Alaiz

Abstract This paper aims to analyze the effect on measured inequality and its structure of using administrative data instead of survey data. Different analyses are carried out based on the Spanish Survey on Income and Living Conditions (ECV) that continued to ask households for their income despite assigning their income data as provided by the Tax Agency and the Social Security Administration. Our main finding is that the largest discrepancies between administrative and survey data are in the tails of the distribution. In addition to that, there are clear differences in the level and structure of inequality across data sources. These differences matter, and our results should be a wake-up call to interpret the results based on only one source of income data with caution.


PLoS ONE ◽  
2017 ◽  
Vol 12 (8) ◽  
pp. e0183817 ◽  
Author(s):  
Sanja Lujic ◽  
Judy M. Simpson ◽  
Nicholas Zwar ◽  
Hassan Hosseinzadeh ◽  
Louisa Jorm

Author(s):  
Manfred Antoni ◽  
Basha Vicari ◽  
Daniel Bela

Abstract Objectives: We investigate characteristics of respondents and interviewers that influence the accuracy of reported income by comparing survey data with administrative data. Questions on sensitive topics like respondents' income often produce relatively high rates of item nonresponse or measurement error. In this context, several analyses have been done on item nonresponse, but little is known about the accuracy of reporting. Existing evidence shows that it is unpleasant for respondents to report very low or very high income. In the presence of an interviewer, income questions might produce incorrect responses due to social desirability bias. On the other hand, interviewers can create an atmosphere of trust in which respondents give more accurate answers. Approach: Using linked survey and administrative data, we are able to measure the extent of deviation between reported and recorded incomes and explore the influence of respondent and interviewer characteristics on it. The starting point for the linkage is data from the German National Educational Panel Study (NEPS), Starting Cohort 6, which surveys adults from birth cohorts 1944 to 1986. More than 90% of the respondents consented to a linkage of their survey information with administrative data from the German Federal Employment Agency. These longitudinal earnings data are highly reliable as they are based on mandatory notifications of employers to the social security system. We include interviewer and respondent characteristics as well as their interactions in our model to estimate their respective impact on the incidence and size of any bias in reported incomes. This allows us to control for latent interviewer traits that might have influenced the respondent's answering behavior during each interview of a given interviewer. Results: The average deviation of reported from administrative earnings is relatively small (less than 10% of median earnings). Descriptive evidence shows only small variation of deviation across subgroups. Most importantly, female respondents show higher reporting accuracy. Multivariate results hint at a negligible influence of interviewer characteristics. Among respondents' characteristics, the major predictors of deviation are their sex, their absolute monthly personal income, their educational level and being born abroad. Conclusion: Although the average measurement accuracy is rather high, there are some differences in deviations by subgroups. The impact of these deviations depends on the research question at hand. Research with a strong focus on the respondent's earnings, e.g. when using them as a dependent variable, should use the linked data rather than only the NEPS survey data.
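
One way to express a deviation between reported and administrative income as a share of median earnings, as in the result quoted above, can be sketched as follows. This is an assumption for illustration: the abstract does not say whether the deviation is signed or absolute, nor which median is used, so the function below takes the mean absolute deviation relative to the median administrative income. All figures are hypothetical and not taken from the NEPS or IAB data.

```python
from statistics import median


def deviation_share_of_median(reported, administrative):
    """Mean absolute deviation of reported from administrative income,
    expressed as a fraction of the median administrative income.

    Both arguments are equal-length sequences of incomes for the same
    linked respondents (assumed definition, not the authors' measure).
    """
    if len(reported) != len(administrative) or not reported:
        raise ValueError("inputs must be non-empty and of equal length")
    deviations = [abs(r - a) for r, a in zip(reported, administrative)]
    mean_deviation = sum(deviations) / len(deviations)
    return mean_deviation / median(administrative)


# Hypothetical monthly incomes for four linked respondents.
reported_income = [2000, 3100, 2500, 4000]
administrative_income = [2100, 3000, 2500, 4200]

share = deviation_share_of_median(reported_income, administrative_income)
print(f"{share:.1%}")
```

Computing the same quantity separately by subgroup (for example by sex or educational level) would give the kind of descriptive comparison the abstract refers to.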


2019 ◽  
Vol 11 (2) ◽  
pp. 142-164 ◽  
Author(s):  
Nikolas Mittag

Data linkage studies often document, but do not remedy, severe survey errors. To improve survey estimates despite restricted linked data access, this paper develops a convenient and general estimation method that combines public use data with conditional distribution parameters estimated from linked data. Analyses using linked SNAP data show that this method sharply improves estimates and consistently outperforms corrections that mainly rely on survey data. Yet, some univariate corrections perform well when linked data do not exist. For SNAP, extrapolating from linked data across time and geography still improves upon estimates using survey data only, even after survey-based corrections. (JEL C81, C83, H75, I18, I38)


Author(s):  
Sallie-Anne Pearson ◽  
Nicole Pratt ◽  
Juliana de Oliveira Costa ◽  
Helga Zoega ◽  
Tracey-Lea Laba ◽  
...  

Australia spends more than $20 billion annually on medicines, delivering significant health benefits for the population. However, inappropriate prescribing and medicine use also result in harm to individuals and populations, and waste of precious health resources. Medication data linked with other routine collections enable evidence generation in pharmacoepidemiology, the science of quantifying the use, effectiveness and safety of medicines in real-world clinical practice. This review details the history of medicines policy and data access in Australia, the strengths of existing data sources, and the infrastructure and governance enabling and impeding evidence generation in the field. Currently, substantial gaps persist with respect to cohesive, contemporary linked data sources supporting quality use of medicines, effectiveness and safety research, exemplified by Australia’s limited capacity to contribute to the global effort in real-world studies of vaccines and disease-modifying treatments for COVID-19. We propose a roadmap to bolster the discipline, and population health more broadly, underpinned by a distinct capability governing and streamlining access to linked data assets for accredited researchers. Robust real-world evidence generation requires current data roadblocks to be remedied as a matter of urgency to deliver efficient and equitable health care and improve the health and well-being of all Australians.

