U.S. Decennial Census Digitization and Linkage Project

Author(s):  
Trent Alexander ◽  
Katie Genadek

The U.S. Census Bureau maintains a large longitudinal research infrastructure that currently includes linked data from the 1940 census, the 2000-2010 censuses, major national surveys going back to 1973, and administrative records dating back to the 1990s. These restricted data are accessible to researchers around the U.S. via the Federal Statistical Research Data Centers (FSRDC) network. The major shortcoming of this infrastructure is that it lacks linkable files from the decennial censuses of 1950 through 1990. Full-count microdata from these censuses are available for research, but datasets from these years do not include respondent names and therefore have not been linked over time. Respondent names for these censuses are available only via the original census returns, which are stored on 258,000 reels of microfilm. The Decennial Census Digitization and Linkage project (DCDL) is an initiative to recover names from the 1960-1990 censuses and to produce linked restricted microdata files for research use. We describe the results of a pilot project we completed on the 1990 census. For that pilot, we created digital images from census microfilm, hand-keyed "truth data" from those images, supported two teams' attempts to conduct Handwriting Recognition on the images, appended recovered names to already-existing microdata files, and linked the new 1990 census microdata records to previous and subsequent censuses. We describe our processes, the accuracy of the Handwriting Recognition, and the accuracy of the record linkage with the recovered names. We conclude by providing an update on the recently initiated project to carry out these processes on a production scale for the 1960 through 1990 censuses. When combined with existing linkages between the censuses of 1940, 2000, 2010, the soon-to-be-public 1950 census, and the future 2020 census, DCDL will provide the final component in a massive longitudinal data infrastructure that covers most of the U.S. population since 1940. As a multi-purpose statistical tool, the DCDL will further the U.S. Census Bureau's mission to provide high-quality data on the U.S. population and support cutting-edge research in the FSRDC network. The resulting data resource will expand our understanding of population dynamics in the U.S. far beyond what is currently possible, providing transformational opportunities for research, education, and evidence-building across the social, behavioral, and economic sciences.

2021 ◽  
Author(s):  
Patrick Sullivan ◽  
Cory R Woodyatt ◽  
Oskian Kouzouian ◽  
Kristen Parrish ◽  
Jennifer Taussig ◽  
...  

Objectives: America’s HIV Epidemic Analysis Dashboard (AHEAD) is a data visualization tool that displays relevant data on the six HIV indicators provided by CDC that can be used to monitor progress towards ending the HIV epidemic in local communities across the U.S. The objective of AHEAD is to make data available to stakeholders that can be used to measure national and local progress towards 2025 and 2030 Ending the HIV Epidemic in the U.S. (EHE) goals and to help jurisdictions make local decisions that are grounded in high-quality data. Methods: AHEAD displays data from public health data systems (e.g., surveillance systems, Census data), organized around the six EHE indicators (incidence, knowledge of status, diagnoses, linkage to HIV medical care, viral suppression, and PrEP coverage). Data are displayed for each of the EHE priority areas (48 counties, Washington, D.C., and San Juan, PR), which accounted for more than 50% of all U.S. HIV diagnoses in 2016 and 2017, and seven primarily Southern states with high rates of HIV in rural communities. AHEAD also displays data for the 43 remaining states for which data are available. Data features prioritize interactive data-visualization tools that allow users to compare indicator data stratified by sex at birth, race, age, and transmission category within a jurisdiction (when available) or compare data on EHE indicators between jurisdictions. Results: AHEAD was launched on August 14, 2020. In the 11 months since its launch, the Dashboard has been visited 26,591 times by 17,600 unique users. About a third of all users returned to the Dashboard at least once. On average, users engaged with 2.4 pages during their visit to the Dashboard, indicating that the average user goes beyond the informational landing page to engage with one or more pages of data and content. The most frequently visited content pages are the Jurisdictions webpages. 
Conclusions: The Ending the HIV Epidemic plan is described as a “whole of society” effort. Societal public health initiatives require objective indicators and require that all societal stakeholders have transparent access to indicator data at the level of the health jurisdictions responsible for meeting the goals of the plan. Data transparency empowers local stakeholders to track movement towards EHE goals, identify areas in need of improvement, and make data-informed adjustments to deploy the expertise and resources required to locally tailor and implement strategies to end the HIV epidemic in their jurisdiction.


Societies ◽  
2021 ◽  
Vol 11 (2) ◽  
pp. 65
Author(s):  
Clem Brooks ◽  
Elijah Harter

In an era of rising inequality, the U.S. public’s relatively modest support for redistributive policies has been a puzzle for scholars. Deepening the paradox is recent evidence that presenting information about inequality increases subjects’ support for redistributive policies by only a small amount. What explains inequality information’s limited effects? We extend partisan motivated reasoning scholarship to investigate whether political party identification confounds individuals’ processing of inequality information. Our study considers a much larger number of redistribution preference measures (12) than past scholarship. We offer a second novelty by bringing the dimension of historical time into hypothesis testing. Analyzing high-quality data from four American National Election Studies surveys, we find new evidence that partisanship confounds the interrelationship of inequality information and redistribution preferences. Further, our analyses find the effects of partisanship on redistribution preferences grew in magnitude from 2004 through 2016. We discuss implications for scholarship on information, motivated reasoning, and attitudes towards redistribution.


2016 ◽  
Vol 35 (5) ◽  
pp. 685-704 ◽  
Author(s):  
William P. O’Hare ◽  
J. Gregory Robinson ◽  
Kirsten West ◽  
Thomas Mule

2019 ◽  
Vol 48 (3) ◽  
pp. 677-677e ◽  
Author(s):  
Barry J Milne ◽  
June Atkinson ◽  
Tony Blakely ◽  
Hilary Day ◽  
Jeroen Douwes ◽  
...  

Author(s):  
Dimitrios Panagiotou ◽  
Athanassios Stavrakoudis

The objective of this study is to assess the degree and the structure of price dependence between different cuts of the beef industry in the USA. This is pursued using the statistical tool of copulas. To this end, it utilizes retail monthly data of beef cuts, within and between the quality grades of Choice and Select, over the period 2000–2014. For the Choice quality grade, there was evidence of asymmetric price co-movements between all six pairs of beef cuts under consideration. No evidence of asymmetric price co-movements was found between the three pairs of beef cuts for the Select quality grade. For the pairs of beef cuts formed between the Choice and Select quality grades, the empirical results point to the existence of price asymmetry only for the case of the chuck roast cut.


2007 ◽  
Vol 3 (4) ◽  
Author(s):  
Amanda Wolf

Large statistical studies in the social sciences, including one-off or repeated cross-sectional surveys, time-series surveys and cohort longitudinal research, offer important numeric evidence for policy making. Although single studies rarely occasion dramatic policy shifts, statistical research findings can affect policy debate, even if not always directly or openly. At best, these studies reveal shapes and patterns in the social fabric relevant to health, safety, education and other social goals. Numerical measures of many social phenomena, such as unreported crime, illicit drug use, child-rearing practices or family composition, enter into a policy-making milieu crowded with competing numbers and qualitative information, as well as non-evidential values and power-based influences.


2015 ◽  
Vol 10 (6) ◽  
pp. 2253-2258
Author(s):  
Mohammad Salih Memon ◽  
Abdul Sattar Shah ◽  
Pir Roshah Shah Rashdi ◽  
Dr. Muhammad Munir Ahmadani ◽  
Mr. Sarmad Rahat

This research investigates the statistical correlation among internal issues of the textile spinning sector. Data were collected from primary as well as secondary sources. Pareto analysis is a statistical technique in decision making that is used to select a limited number of tasks that produce a significant overall effect; it separates the few major problems from the many possible problems. It is named after Vilfredo Pareto, a 19th-century Italian economist. The analysis was performed to reduce the large number of issues identified in the textile industry of Pakistan in the era of trade liberalization. The research in this part of the framework starts with a survey of the textile industry of Pakistan, in which data are collected through a questionnaire consisting of the identified issues. The survey data are then used to perform Pareto analysis and to bring up the prioritized issues after reducing them using the statistical tool SPSS. Statistical correlation is then performed on the prioritized issues to find the level of relationship among them, so that solving one issue may also resolve the others.
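The Pareto step described in this abstract can be illustrated with a minimal sketch. It is not the authors' SPSS procedure: the issue names and tallies are hypothetical, and the 80% cumulative cutoff is the conventional rule of thumb, assumed here because the abstract does not state a threshold.

```python
from typing import Dict, List, Tuple


def pareto_vital_few(counts: Dict[str, int], threshold: float = 0.8) -> List[Tuple[str, float]]:
    """Return issues, largest first, until their cumulative share reaches `threshold`.

    Each tuple holds the issue name and the cumulative share of all
    responses accounted for up to and including that issue.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    selected: List[Tuple[str, float]] = []
    running = 0  # integer running total avoids floating-point drift
    for issue, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += n
        share = running / total
        selected.append((issue, round(share, 3)))
        if share >= threshold:
            break
    return selected


# Hypothetical questionnaire tallies (not taken from the study).
issue_counts = {
    "raw material cost": 40,
    "energy shortage": 30,
    "labour turnover": 15,
    "machine downtime": 10,
    "other": 5,
}
vital_few = pareto_vital_few(issue_counts)
for name, cumulative_share in vital_few:
    print(f"{name}: {cumulative_share:.0%} cumulative")
```

With these made-up tallies, three of the five issues account for 85% of responses, so only those three would be carried forward to the correlation step.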


2019 ◽  
Author(s):  
◽  
Chris S Clarkson ◽  
Alistair Miles ◽  
Nicholas J Harding ◽  
Eric R Lucas ◽  
...  

Mosquito control remains a central pillar of efforts to reduce malaria burden in sub-Saharan Africa. However, insecticide resistance is entrenched in malaria vector populations, and countries with high malaria burden face a daunting challenge to sustain malaria control with a limited set of surveillance and intervention tools. Here we report on the second phase of a project to build an open resource of high-quality data on genome variation among natural populations of the major African malaria vector species Anopheles gambiae and Anopheles coluzzii. We analysed whole genomes of 1,142 individual mosquitoes sampled from the wild in 13 African countries, and a further 234 individuals comprising parents and progeny of 11 lab crosses. The data resource includes high-confidence single nucleotide polymorphism (SNP) calls at 57 million variable sites, genome-wide copy number variation (CNV) calls, and haplotypes phased at biallelic SNPs. We used these data to analyse genetic population structure, and characterise genetic diversity within and between populations. We also illustrate the utility of these data by investigating species differences in isolation by distance, genetic variation within proposed gene drive target sequences, and patterns of resistance to pyrethroid insecticides. This data resource provides a foundation for developing new operational systems for molecular surveillance, and for accelerating research and development of new vector control tools.

