Estimating variance components in population scale family trees

2018 ◽  
Author(s):  
Tal Shor ◽  
Dan Geiger ◽  
Yaniv Erlich ◽  
Omer Weissbrod

Abstract: The rapid digitization of genealogical and medical records enables the assembly of extremely large pedigree records spanning millions of individuals and trillions of pairs of relatives. Such pedigrees provide the opportunity to investigate the sociological and epidemiological history of human populations at scales much larger than previously possible. Linear mixed models (LMMs) are routinely used to analyze extremely large animal and plant pedigrees for the purposes of selective breeding. However, LMMs have not previously been applied to population-scale human family trees. Here, we present Sparse Cholesky factorIzation LMM (Sci-LMM), a modeling framework for studying population-scale family trees that combines techniques from the animal and plant breeding literature and from the human genetics literature. The proposed framework can construct a matrix of relationships between trillions of pairs of individuals and fit the corresponding LMM in several hours. We demonstrate the capabilities of Sci-LMM via simulation studies and by estimating the heritability of longevity and of reproductive fitness (quantified via number of children) in a large pedigree spanning millions of individuals and over five centuries of human history. Sci-LMM provides a unified framework for investigating the epidemiological history of human populations via genealogical records.

Author Summary: The advent of online genealogy services allows the assembly of population-scale family trees, spanning millions of individuals and centuries of human history. Such datasets enable answering genetic epidemiology questions on unprecedented scales. Here we present Sci-LMM, a pedigree analysis framework that combines techniques from animal and plant breeding research and from human genetics research for large-scale pedigree analysis. We apply Sci-LMM to analyze population-scale human genealogical records, spanning trillions of relationships. We have made both Sci-LMM and an anonymized dataset of millions of individuals freely available to download, making the analysis of population-scale human family trees widely accessible to the research community. Together, these resources allow researchers to investigate genetic and epidemiological questions on an unprecedented scale.

2017 ◽  
Author(s):  
Joanna Kaplanis ◽  
Assaf Gordon ◽  
Mary Wahl ◽  
Michael Gershovits ◽  
Barak Markus ◽  
...  

Abstract: Family trees have vast applications in multiple fields, from genetics to anthropology and economics. However, the collection of extended family trees is tedious and usually relies on resources with limited geographical scope and complex data usage restrictions. Here, we collected 86 million profiles from publicly available online data from genealogy enthusiasts. After extensive cleaning and validation, we obtained population-scale family trees, including a single pedigree of 13 million individuals. We leveraged the data to partition the genetic architecture of longevity by inspecting millions of relative pairs and to provide insights into population genetics theories on the dispersion of families. We also report a simple digital procedure to overlay other datasets with our resource in order to empower studies with population-scale genealogical data.

One Sentence Summary: Using massive crowd-sourced genealogy data, we created a population-scale family tree resource for scientific studies.


Science ◽  
2018 ◽  
Vol 360 (6385) ◽  
pp. 171-175 ◽  
Author(s):  
Joanna Kaplanis ◽  
Assaf Gordon ◽  
Tal Shor ◽  
Omer Weissbrod ◽  
Dan Geiger ◽  
...  

Author(s):  
Ana Barahona

Although their history can be traced further back to the study of heredity, variability, and evolution at the beginning of the 20th century, studies on the genetic structure and ancestry of human populations became important at the end of World War II. From 1950 onward, the tools and practices of human genetics were systematically used to attack global health problems with the support of international health organizations and the founding of local institutions that extended these practices, thus contributing to global knowledge. Mexican physicians and human geneticists in the Cold War years were no exception to these developments. The first studies, which appeared in the 1940s, reflect the emerging model of human genetics in clinical practice and in scientific research in postwar Mexico. Studies on the distribution of blood groups as well as on variant forms of hemoglobin in indigenous populations paved the way for long-term research programs on the characterization of Mexican indigenous populations. Research groups were formed at the Ministry of Health, the National Commission of Nuclear Energy, and the Mexican Social Security Institute in the 1960s. The key actors in this narrative were Rubén Lisker, Alfonso León de Garay, and Salvador Armendares, who established solid communities in the fields of population and human genetics. Lisker was particularly interested in the long-term effort to carry out research on indigenous populations in order to provide insights into the biological history of the human species, disease patterns, and biological relationships among populations. Alfonso León de Garay was interested in studying human and Drosophila populations, but in a completely different context: the intersection of the life sciences, particularly genetics and radiobiology, with post-World War II studies of nuclear energy and its effects on human populations.
In parallel, the large-scale study of chromosomes using new experimental techniques, introduced in Mexico by Salvador Armendares in 1960, allowed researchers to tackle child malnutrition and health problems caused by Down and Turner syndromes. The history of population studies and genetics during the Cold War in Mexico (1945–1970s) shows how the Mexican human geneticists of the mid-20th century mobilized scientific resources and laboratory practices in the context of international trends marked by WWII and of national priorities arising from the nation-building efforts of postrevolutionary Mexican governments. These research programs were not limited to collaborations between research laboratories but were developed within an institutional and political framework marked at the international level by the postwar period and at the national level by the construction of the modern Mexican state.


PLoS Genetics ◽  
2019 ◽  
Vol 15 (5) ◽  
pp. e1008124 ◽  
Author(s):  
Tal Shor ◽  
Iris Kalka ◽  
Dan Geiger ◽  
Yaniv Erlich ◽  
Omer Weissbrod
