ancestral recombination graphs Latest Research Papers

Evaluation of methods for the inference of ancestral recombination graphs

The ancestral recombination graph (ARG) is a structure that describes the joint genealogies of sampled DNA sequences along the genome. Recent computational methods have made impressive progress towards scalably estimating whole-genome genealogies. In addition to inferring the ARG, some of these methods can also provide ARGs sampled from a defined posterior distribution. Obtaining good samples of ARGs is crucial for quantifying statistical uncertainty and for estimating population genetic parameters such as effective population size, mutation rate, and allele age. Here, we use simulations to benchmark three popular ARG inference programs: ARGweaver, Relate, and tsdate. We use neutral coalescent simulations to 1) compare the true coalescence times to the inferred times at each locus; 2) compare the distribution of coalescence times across all loci to the expected exponential distribution; 3) evaluate whether the sampled coalescence times have the properties expected of a valid posterior distribution. We find that inferred coalescence times at each locus are more accurate in ARGweaver and Relate than in tsdate. However, all three methods tend to overestimate small coalescence times and underestimate large ones. Lastly, the posterior distribution of ARGweaver is closer to the expected posterior distribution than Relate's, but this higher accuracy comes at a substantial trade-off in scalability. The best choice of method will depend on the number and length of input sequences and on the goal of downstream analyses, and we provide guidelines for the best practices.
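
The second benchmark above (comparing coalescence times to the expected exponential distribution) can be illustrated with a rough sketch. It is not taken from the paper. It assumes pairwise coalescence times in a constant-size diploid population, so that times in generations are exponential with mean 2N. The value of N, the function name `ks_statistic_exponential`, and the "shrunk" estimator (which mimics overestimating small times and underestimating large ones) are all hypothetical.

```python
import math
import random


def ks_statistic_exponential(times, mean):
    """Kolmogorov-Smirnov distance between the samples and an exponential
    distribution with the given mean."""
    xs = sorted(times)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - math.exp(-x / mean)
        # Compare the theoretical CDF with the empirical CDF just before
        # and just after the jump at x.
        d = max(d, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return d


rng = random.Random(42)
N = 10_000                 # hypothetical diploid effective population size
expected_mean = 2 * N      # mean pairwise coalescence time in generations

# "True" pairwise coalescence times under the neutral model (assumption).
true_times = [rng.expovariate(1.0 / expected_mean) for _ in range(5000)]

# Hypothetical biased estimates: pulled halfway toward the mean, so small
# times are overestimated and large times are underestimated.
shrunk_times = [0.5 * t + 0.5 * expected_mean for t in true_times]

d_true = ks_statistic_exponential(true_times, expected_mean)
d_shrunk = ks_statistic_exponential(shrunk_times, expected_mean)
print(f"KS distance, true times:   {d_true:.3f}")
print(f"KS distance, shrunk times: {d_shrunk:.3f}")
```

A small distance for the true times and a much larger one for the shrunk times is the pattern such a comparison is meant to expose; real benchmarks would replace the shrunk values with times extracted from inferred ARGs.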

Biobank-scale inference of ancestral recombination graphs enables genealogy-based mixed model association of complex traits

10.1101/2021.11.03.466843 ◽

2021 ◽

Author(s):

Brian C Zhang ◽

Arjun Biddanda ◽

Pier Francesco Palamara

Keyword(s):

Complex Traits ◽

Large Scale ◽

Mixed Model ◽

Large Population ◽

Complex Trait ◽

Genotype Imputation ◽

Loss Of Function ◽

Genome Wide ◽

Wide Range ◽

Ancestral Recombination Graphs

Accurate inference of gene genealogies from genetic data has the potential to facilitate a wide range of analyses. We introduce a method for accurately inferring biobank-scale genome-wide genealogies from sequencing or genotyping array data, as well as strategies to utilize genealogies within linear mixed models to perform association and other complex trait analyses. We use these new methods to build genome-wide genealogies using genotyping data for 337,464 UK Biobank individuals and to detect associations in 7 complex traits. Genealogy-based association detects more rare and ultra-rare signals (N = 133, frequency range 0.0004% - 0.1%) than genotype imputation from ~65,000 sequenced haplotypes (N = 65). In a subset of 138,039 exome sequencing samples, these associations strongly tag (average r = 0.72) underlying sequencing variants, which are enriched for missense (2.3×) and loss-of-function (4.5×) variation. Inferred genealogies also capture additional association signals in higher frequency variants. These results demonstrate that large-scale inference of gene genealogies may be leveraged in the analysis of complex traits, complementing approaches that require the availability of large, population-specific sequencing panels.

Recoverability of Ancestral Recombination Graph Topologies

10.1101/2021.10.10.463724 ◽

2021 ◽

Author(s):

Elizabeth Hayman ◽

Anastasia Ignatieva ◽

Jotun Hein

Keyword(s):

Genetic Diversity ◽

Gene Conversion ◽

Evolutionary Process ◽

Challenging Problem ◽

Sequencing Data ◽

Ancestral Recombination Graph ◽

Reconstruction Methods ◽

Ancestral Recombination Graphs ◽

Parameter Values ◽

Biological Organisms

Recombination is a powerful evolutionary process that shapes the genetic diversity observed in the populations of many species. Reconstructing genealogies in the presence of recombination from sequencing data is a very challenging problem, as this relies on mutations having occurred on the correct lineages in order to detect the recombination and resolve the placement of edges in the local trees. We investigate the probability of recovering the true topology of ancestral recombination graphs (ARGs) under the coalescent with recombination and gene conversion. We explore how sample size and mutation rate affect the inherent uncertainty in reconstructed ARGs; this sheds light on the theoretical limitations of ARG reconstruction methods. We illustrate our results using estimates of evolutionary rates for several biological organisms; in particular, we find that for parameter values that are realistic for SARS-CoV-2, the probability of reconstructing genealogies that are close to the truth is low.

The distribution of waiting distances in ancestral recombination graphs

Theoretical Population Biology ◽

10.1016/j.tpb.2021.06.003 ◽

2021 ◽

Author(s):

Yun Deng ◽

Yun S. Song ◽

Rasmus Nielsen

Keyword(s):

Ancestral Recombination Graphs

The distribution of waiting distances in ancestral recombination graphs and its applications

10.1101/2020.12.24.424361 ◽

2020 ◽

Author(s):

Yun Deng ◽

Yun S. Song ◽

Rasmus Nielsen

Keyword(s):

Accurate Approximation ◽

Ancestral Recombination Graph ◽

Recombination Rates ◽

Inference Problems ◽

Topology Changes ◽

Population Genetic Inference ◽

Ancestral Recombination Graphs ◽

Inference Methods ◽

Genetic Inference ◽

Genealogical Information

The ancestral recombination graph (ARG) contains the full genealogical information of the sample, and many population genetic inference problems can be solved using inferred or sampled ARGs. In particular, the waiting distance between tree changes along the genome can be used to make inference about the distribution and evolution of recombination rates. To this end, we here derive an analytic expression for the distribution of waiting distances between tree changes under the sequentially Markovian coalescent model and obtain an accurate approximation to the distribution of waiting distances for topology changes. We use these results to show that some of the recently proposed methods for inferring sequences of trees along the genome provide strongly biased distributions of waiting distances. In addition, we provide a correction to an undercounting problem facing all available ARG inference methods, thereby facilitating the use of ARG inference methods to estimate temporal changes in the recombination rate.
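
To give a feel for why waiting distances carry information about recombination rates, here is a simplified sketch. It does not reproduce the analytic expression derived in the paper. It assumes a sample of two sequences, a constant diploid population size N, and a per-base, per-generation recombination rate r (all hypothetical values). Starting from a fixed position, the local tree height T is exponential with mean 2N generations, and conditional on T the distance to the next recombination breakpoint is approximately exponential with rate 2rT. Averaging over T gives the survival function P(D > d) = 1 / (1 + 4Nrd), which the simulation checks.

```python
import random


def simulate_distances(n_reps, pop_size, recomb_rate, rng):
    """Distance from a fixed position to the next recombination breakpoint
    for a sample of two sequences (simplified model, see assumptions above)."""
    distances = []
    for _ in range(n_reps):
        # Height of the local tree in generations: exponential, mean 2N.
        height = rng.expovariate(1.0 / (2 * pop_size))
        # Two lineages of length `height`, each recombining at rate r per base.
        distances.append(rng.expovariate(2 * recomb_rate * height))
    return distances


rng = random.Random(1)
N = 10_000   # hypothetical diploid effective population size
r = 1e-8     # hypothetical recombination rate per base per generation
dists = simulate_distances(20_000, N, r, rng)

d = 2_500.0  # distance in bases at which to compare survival probabilities
empirical = sum(x > d for x in dists) / len(dists)
theoretical = 1.0 / (1.0 + 4 * N * r * d)
print(f"P(D > {d:.0f}) empirical:   {empirical:.3f}")
print(f"P(D > {d:.0f}) theoretical: {theoretical:.3f}")
```

Because the tree height varies between positions, the marginal distribution is a mixture of exponentials with a heavier tail than any single exponential. The distance between successive tree changes, which the paper treats, follows a different distribution because the tree just after a change is not a random draw from the coalescent.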

KwARG: Parsimonious Reconstruction of Ancestral Recombination Graphs with Recurrent Mutation

10.1101/2020.12.17.423233 ◽

2020 ◽

Cited By ~ 1

Author(s):

Anastasia Ignatieva ◽

Rune B. Lyngsø ◽

Paul A. Jenkins ◽

Jotun Hein

Keyword(s):

Heuristic Algorithm ◽

Relative Proportion ◽

Genetic Data ◽

Recurrent Mutation ◽

Greedy Heuristic ◽

Challenging Problem ◽

Recurrent Mutations ◽

Input Dataset ◽

Ancestral Recombination Graphs ◽

Cost Parameters

The reconstruction of possible histories given a sample of genetic data in the presence of recombination and recurrent mutation is a challenging problem, but can provide key insights into the evolution of a population. We present KwARG, which implements a parsimony-based greedy heuristic algorithm for finding plausible genealogical histories (ancestral recombination graphs) that are minimal or near-minimal in the number of posited recombination and mutation events. Given an input dataset of aligned sequences, KwARG outputs a list of possible candidate solutions, each comprising a list of mutation and recombination events that could have generated the dataset; the relative proportion of recombinations and recurrent mutations in a solution can be controlled via specifying a set of ‘cost’ parameters. We demonstrate that the algorithm performs well when compared against existing methods. The software is made available on GitHub.
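
The sketch below is not KwARG's algorithm. It illustrates the classical four-gamete test, a standard way to see why some datasets cannot be explained by a single tree with one mutation per site: if two biallelic sites show all four allele combinations, at least one recombination or recurrent mutation must have occurred. The function name and the toy haplotypes are hypothetical.

```python
from itertools import combinations


def incompatible_site_pairs(haplotypes):
    """Return pairs of site indices that show all four gametes
    (00, 01, 10, 11) across the sample of binary haplotype strings."""
    n_sites = len(haplotypes[0])
    pairs = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            pairs.append((i, j))
    return pairs


# Toy aligned binary sequences (hypothetical data), one string per sample.
sample = ["000", "010", "101", "111"]
conflicts = incompatible_site_pairs(sample)
print(conflicts)
```

Each reported pair marks a conflict that a parsimony method has to resolve by positing either a recombination event between the two sites or a recurrent mutation at one of them.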

Inference of Ancestral Recombination Graphs Using ARGweaver

Methods in Molecular Biology - Statistical Population Genomics ◽

10.1007/978-1-0716-0199-0_10 ◽

2020 ◽

pp. 231-266 ◽

Cited By ~ 1

Author(s):

Melissa Hubisz ◽

Adam Siepel

Keyword(s):

Ancestral Recombination Graphs

Mapping gene flow between ancient hominins through demography-aware inference of the ancestral recombination graph

10.1101/687368 ◽

2019 ◽

Cited By ~ 1

Author(s):

Melissa J. Hubisz ◽

Amy L. Williams ◽

Adam Siepel

Keyword(s):

Gene Flow ◽

Genetic Relationships ◽

Convincing Evidence ◽

Demographic Model ◽

Modern Humans ◽

Bayesian Algorithm ◽

Branch Lengths ◽

Mapping Gene ◽

Ancestral Recombination Graphs ◽

And Migration

The sequencing of Neanderthal and Denisovan genomes has yielded many new insights about interbreeding events between extinct hominins and the ancestors of modern humans. While much attention has been paid to the relatively recent gene flow from Neanderthals and Denisovans into modern humans, other instances of introgression leave more subtle genomic evidence and have received less attention. Here, we present an extended version of the ARGweaver algorithm, ARGweaver-D, which can infer local genetic relationships under a user-defined demographic model that includes population splits and migration events. This Bayesian algorithm probabilistically samples ancestral recombination graphs (ARGs) that specify not only tree topology and branch lengths along the genome, but also indicate migrant lineages. The sampled ARGs can therefore be parsed to produce probabilities of introgression along the genome. We show that this method is well powered to detect the archaic migration into modern humans, even with only a few samples. We then show that the method can also detect introgressed regions stemming from older migration events, or from unsampled populations. We apply it to human, Neanderthal, and Denisovan genomes, looking for signatures of older proposed migration events, including ancient humans into Neanderthals, and unknown archaic hominins into Denisovans. We identify 3% of the Neanderthal genome that is putatively introgressed from ancient humans, and estimate that the gene flow occurred 200-300 kya. We find no convincing evidence that negative selection acted against these regions. We also identify 1% of the Denisovan genome which was likely introgressed from an unsequenced hominin ancestor, and note that 15% of these regions have been passed on to modern humans through subsequent gene flow.

A Hybrid Approach to Optimize the Number of Recombinations in Ancestral Recombination Graphs

Proceedings of the 2019 9th International Conference on Bioscience, Biochemistry and Bioinformatics - ICBBB '19 ◽

10.1145/3314367.3314385 ◽

2019 ◽

Author(s):

Nguyen Thi Phuong Thao ◽

Le Sy Vinh

Keyword(s):

Hybrid Approach ◽

Ancestral Recombination Graphs

Bridging trees for posterior inference on ancestral recombination graphs

Proceedings of The Royal Society A Mathematical Physical and Engineering Sciences ◽

10.1098/rspa.2018.0568 ◽

2018 ◽

Vol 474 (2220) ◽

pp. 20180568

Author(s):

K. Heine ◽

A. Beskos ◽

A. Jasra ◽

D. Balding ◽

M. De Iorio

Keyword(s):

Monte Carlo ◽

Markov Chain ◽

Markov Chain Monte Carlo ◽

Dna Sequence ◽

Dna Sequences ◽

Large Scale ◽

Monte Carlo Algorithm ◽

Posterior Inference ◽

History Of ◽

Ancestral Recombination Graphs

We present a new Markov chain Monte Carlo algorithm, implemented in the software Arbores, for inferring the history of a sample of DNA sequences. Our principal innovation is a bridging procedure, previously applied only for simple stochastic processes, in which the local computations within a bridge can proceed independently of the rest of the DNA sequence, facilitating large-scale parallelization.
