multispecies coalescent
Recently Published Documents


TOTAL DOCUMENTS

161
(FIVE YEARS 92)

H-INDEX

21
(FIVE YEARS 6)

2022 ◽  
Author(s):  
XiaoXu Pang ◽  
Da-Yong Zhang

The species studied in any evolutionary investigation generally constitute a very small proportion of all the species currently existing or that have gone extinct. It is therefore likely that introgression, which is widespread across the tree of life, involves "ghosts," i.e., unsampled, unknown, or extinct lineages. However, the impact of ghost introgression on estimations of species trees has been rarely studied and is thus poorly understood. In this study, we use mathematical analysis and simulations to examine the robustness of species tree methods based on a multispecies coalescent model under gene flow sourcing from an extant or ghost lineage. We found that very low levels of extant or ghost introgression can result in anomalous gene trees (AGTs) on three-taxon rooted trees if accompanied by strong incomplete lineage sorting (ILS). In contrast, even massive introgression, with more than half of the recipient genome descending from the donor lineage, may not necessarily lead to AGTs. In cases involving an ingroup lineage (defined as one that diverged no earlier than the most basal species under investigation) acting as the donor of introgression, the time of root divergence among the investigated species was either underestimated or remained unaffected, but for the cases of outgroup ghost lineages acting as donors, the divergence time was generally overestimated. Under many conditions of ingroup introgression, the stronger the ILS was, the higher was the accuracy of estimating the time of root divergence, although the topology of the species tree is more prone to be biased by the effect of introgression.
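The interplay between ILS and introgression described above can be illustrated with a deliberately simplified calculation. The sketch below is not the authors' analysis: it assumes a rooted species tree ((A,B),C), one sampled lineage per species, and a single pulse of introgression from the ingroup lineage C into B. The names `gene_tree_probs`, `is_anomalous`, `T`, `T2`, and `gamma` are hypothetical and chosen for illustration only.

```python
import math


def gene_tree_probs(T, T2, gamma):
    """Rooted gene-tree topology probabilities for species tree ((A,B),C).

    Assumptions (illustrative, not taken from the paper):
      T     : internal branch length of the species tree, in coalescent units
      T2    : time (coalescent units) that an introgressed B lineage spends
              in the same population as the C lineage before the root
      gamma : probability that B's lineage traces back through C
    """
    miss = math.exp(-T)         # A and B fail to coalesce on the internal branch
    miss_intro = math.exp(-T2)  # introgressed B and C fail to coalesce before the root

    # Without introgression: standard three-taxon coalescent probabilities.
    # With introgression: the same formulas, with B and C as the sister pair.
    p_ab = (1 - gamma) * (1 - 2 / 3 * miss) + gamma * (miss_intro / 3)
    p_bc = (1 - gamma) * (miss / 3) + gamma * (1 - 2 / 3 * miss_intro)
    p_ac = (1 - gamma) * (miss / 3) + gamma * (miss_intro / 3)
    return {"AB": p_ab, "BC": p_bc, "AC": p_ac}


def is_anomalous(probs):
    """True if a discordant topology is more probable than the species-tree topology."""
    return max(probs["BC"], probs["AC"]) > probs["AB"]


# Strong ILS (very short internal branch) plus 5% introgression.
weak = gene_tree_probs(T=0.01, T2=1.0, gamma=0.05)
# Weak ILS (long internal branch) plus 60% introgression shortly before the root.
massive = gene_tree_probs(T=3.0, T2=0.01, gamma=0.6)

print(weak, is_anomalous(weak))
print(massive, is_anomalous(massive))
```

Under these assumed parameters, the first scenario makes the discordant BC topology the most probable one even though only 5% of the genome is introgressed, whereas the second does not, which is the qualitative pattern the abstract reports.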


2022 ◽  

Species delimitation is the process of determining whether a group of sampled individuals belong to the same species or to different species. The criteria used to delimit species differ across taxonomic groups, and the methods for delimiting species have changed over time, with a dramatic rise in the popularity of genomic approaches recently. Because inferred species boundaries have ramifications that extend beyond systematics, affecting all fields that rely upon species as a foundational unit, controversy has unsurprisingly surrounded not only the practices used to delimit species boundaries, but also the idea of what species are, which varies across taxa (e.g., the use of subspecies varies across the tree of life). This lack of consensus has no doubt contributed to the appeal of genetic-based delimitation. Specifically, genomic data can be collected from any taxon. Moreover, it can be analyzed in a common statistical framework (as popularized by the multispecies coalescent as a model for species delimitation). With the ease of collecting genetic data, the power of genomics, and the purported standardization for diagnosing species limits, genetic-based species delimitation is displacing traditional time-honored (albeit time-consuming) taxonomic practices of species diagnosis. It has also become an invaluable tool for discovering species in understudied groups, and genetic-based approaches are the foundation of international endeavors to generate a catalogue of DNA barcodes to illuminate biodiversity for all of life on the planet. Yet, genomic applications, and especially the sole reliance upon genetic data for inferring species boundaries, are not without their own set of challenges.


2021 ◽  
Author(s):  
Rohan S Mehta ◽  
Mike Steel ◽  
Noah A Rosenberg

Monophyly is a feature of a set of genetic lineages in which every lineage in the set is more closely related to all other members of the set than it is to any lineage outside the set. Multiple sets of lineages that are separately monophyletic are said to be reciprocally monophyletic, or jointly monophyletic. The prevalence of reciprocal monophyly, or joint monophyly, has been used to evaluate phylogenetic and phylogeographic hypotheses, as well as to delimit species. These applications often make use of a probability of joint monophyly under models of gene lineage evolution. Studies in coalescent theory have computed this joint monophyly probability for small numbers of separate groups in arbitrary species trees, and for arbitrary numbers of separate groups in trivial species trees. Here, generalizing existing results on monophyly probabilities under the multispecies coalescent, we derive the probability of joint monophyly for arbitrary numbers of separate groups in arbitrary species trees. We illustrate how our result collapses to previously examined cases. We also study the effect of tree height, sample size, and number of species on the probability of joint monophyly. The result also enables computation of relatively simple lower and upper bounds on the joint monophyly probability. Our results expand the scope of joint monophyly calculations beyond small numbers of species, subsuming past formulas that have been used in simpler cases.


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e12169
Author(s):  
Xinghao Li ◽  
Nan Song ◽  
Heng Zhang

The Coccinellidae, the ladybirds, are one of the most familiar beetle families. Despite their great ecological and economic significance, the phylogenetic relationships of Coccinellidae remain poorly understood. One of the reasons is that the sequenced mitogenomes available for this family are very limited. We sequenced complete or nearly complete mitogenomes from seven species of the tribe Coccinellini with next-generation sequencing. All species have the same gene content and gene order as the putatively ancestral insect mitogenome. A large intergenic spacer region (> 890 bp) was found located between trnI and trnQ. The potential for using secondary structures of the large and small ribosomal subunits for phylogenetic reconstruction was predicted. The phylogenetic relationships were explored through comparative analyses across more than 30 coccinellid species. We performed phylogenetic analyses with both concatenation methods (Maximum Likelihood and Bayesian Inference) and a multispecies coalescent method (ASTRAL). Phylogenetic results strongly supported the monophyly of Coccinellidae. Within Coccinellidae, the Epilachnini and the Coccinellini including Halyziini were monophyletic, while the Scymnini and Coccidulini were non-monophyletic.


Genetics ◽  
2021 ◽  
Author(s):  
Mark S Hibbins ◽  
Matthew W Hahn

Phylogenomics has revealed the remarkable frequency with which introgression occurs across the tree of life. These discoveries have been enabled by the rapid growth of methods designed to detect and characterize introgression from whole-genome sequencing data. A large class of phylogenomic methods makes use of data across species to infer and characterize introgression based on expectations from the multispecies coalescent. These methods range from simple tests, such as the D-statistic, to model-based approaches for inferring phylogenetic networks. Here, we provide a detailed overview of the various signals that different modes of introgression are expected to leave in the genome, and how current methods are designed to detect them. We discuss the strengths and pitfalls of these approaches and identify areas for future development, highlighting the different signals of introgression, and the power of each method to detect them. We conclude with a discussion of current challenges in inferring introgression and how they could potentially be addressed.
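As a concrete illustration of the simplest test mentioned above, the D-statistic compares counts of the two discordant biallelic site patterns, ABBA and BABA, in a four-taxon alignment ordered (((P1,P2),P3),O). The sketch below is a minimal version under stated assumptions: one haploid sequence per taxon, the outgroup allele treated as ancestral, and no significance testing (real analyses typically add a block jackknife). The function names `count_abba_baba` and `d_statistic` are hypothetical.

```python
def count_abba_baba(p1, p2, p3, outgroup):
    """Count ABBA and BABA sites in four aligned sequences of equal length.

    The outgroup allele is treated as ancestral ("A"); the other allele is
    derived ("B"). Sites containing anything other than A, C, G, T are skipped.
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if not all(base in "ACGT" for base in (a, b, c, o)):
            continue  # skip gaps and ambiguity codes
        if a == o and b == c and b != o:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P1 and P3 share the derived allele
    return abba, baba


def d_statistic(n_abba, n_baba):
    """D = (ABBA - BABA) / (ABBA + BABA); expected to be 0 under ILS alone."""
    total = n_abba + n_baba
    if total == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (n_abba - n_baba) / total


# Toy alignment (made-up sequences, for illustration only).
abba, baba = count_abba_baba("AAGAA", "CAAAT", "CAGAT", "AAAAA")
print(abba, baba, d_statistic(abba, baba))
```

A positive value suggests excess allele sharing between P2 and P3, a negative value between P1 and P3; with incomplete lineage sorting alone the two patterns are expected to be equally frequent.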


2021 ◽  
Vol 5 (6) ◽  
Author(s):  
Jen-Pan Huang

The genealogical divergence index (gdi) was developed to aid in molecular species delimitation under the multispecies coalescent model, which has been shown to delimit genetic structures but not necessarily species. Although previous studies have used meta-analyses to show that gdi could be informative for distinguishing taxonomically good species, the biological and evolutionary implications of divergences showing different gdi values have yet to be studied. I showed that an increase in gdi value was correlated with later stages of divergence further along a speciation continuum in an Amazonian Hercules beetle system. Specifically, a gdi value of 0.7 or higher was associated with divergence between biological species that can coexist in geographic proximity while maintaining their evolutionary independence. Divergences between allopatric species that were conventionally given subspecific status, such as geographic taxa that may or may not be morphologically divergent, had gdi values that fell within the species delimitation ambiguous zone (0.2 < gdi < 0.7). However, the results could be drastically affected by the sampling design, i.e., the choice of different geographic populations and the lumping of distinct genetic groups when running the analyses. Different gdi values may prove to be biologically and evolutionarily informative should additional speciation continua from different empirical systems be investigated, and the results obtained may help with objectively delimiting species in the era of integrative taxonomy.
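The thresholds quoted above can be made concrete with a short calculation. The abstract does not give the formula, so the sketch below assumes the commonly used definition gdi = 1 - exp(-2τ/θ), where τ is the divergence time and θ the population size parameter, both in expected substitutions per site. The function names and the label for values at or below 0.2 are illustrative assumptions, not taken from the abstract.

```python
import math


def gdi(tau, theta):
    """Genealogical divergence index, assuming gdi = 1 - exp(-2 * tau / theta).

    tau   : divergence time (expected substitutions per site)
    theta : population size parameter of the focal population (same units)
    """
    if theta <= 0:
        raise ValueError("theta must be positive")
    return 1.0 - math.exp(-2.0 * tau / theta)


def classify(value):
    """Apply the 0.2 / 0.7 thresholds quoted in the abstract."""
    if value >= 0.7:
        return "distinct species"
    if value > 0.2:
        return "ambiguous"
    return "single species"  # assumed interpretation of low gdi values


# Made-up parameter values, for illustration only.
for tau, theta in [(0.0005, 0.01), (0.002, 0.01), (0.01, 0.01)]:
    value = gdi(tau, theta)
    print(f"tau={tau}, theta={theta}: gdi={value:.3f} -> {classify(value)}")
```

Because gdi depends only on the ratio τ/θ, pooling distinct genetic groups (which inflates θ) or choosing different populations (which changes τ) shifts the index, consistent with the sensitivity to sampling design noted in the abstract.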


2021 ◽  
Author(s):  
Simone M Gable ◽  
Michael I Byars ◽  
Robert Literman ◽  
Marc Tollis

To examine phylogenetic heterogeneity in turtle evolution, we collected thousands of high-confidence single-copy orthologs from 19 genome assemblies representative of extant turtle diversity and estimated a phylogeny with multispecies coalescent and concatenated partitioned methods. We also collected next-generation sequences from 26 turtle species and assembled millions of biallelic markers to reconstruct phylogenies from annotated regions (coding regions, introns, untranslated regions, intergenic, and others) of the western painted turtle (Chrysemys picta bellii) genome. We then measured gene tree-species tree discordance, as well as gene and site heterogeneity at each node in the inferred trees, and tested for incomplete lineage sorting and temporal patterns in phylogenomic heterogeneity across turtle evolution. We found 100% support for all bifurcations in the inferred turtle species phylogenies. However, a number of genes, sites, and genomic features supported alternate relationships between turtle taxa, and some nodes in the turtle phylogeny were well-explained by incomplete lineage sorting. There was no clear pattern between site concordance, node age, and DNA substitution rate across most annotated genomic regions, suggesting a relatively uniform proportion of informative sites drive phylogenetic inference across the evolution of turtles. We found more gene concordance at older nodes in the turtle phylogeny, and suggest that, in addition to incomplete lineage sorting, an overall lack of gene informativeness stemming from a slow rate of evolution can confound inferred patterns in turtle phylogenomics, particularly at more recent divergences. Our study demonstrates that heterogeneity is to be expected even in well resolved clades such as turtles, and that future phylogenomic studies should aim to sample as much of the genome as possible.


2021 ◽  
Author(s):  
Jordan Douglas ◽  
Cinthy L. Jiménez-Silva ◽  
Remco Bouckaert

As genomic sequence data becomes increasingly available, inferring the phylogeny of the species as that of concatenated genomic data can be enticing. However, this approach makes for a biased estimator of branch lengths and substitution rates and an inconsistent estimator of tree topology. Bayesian multispecies coalescent methods address these issues. This is achieved by embedding a set of gene trees within a species tree and jointly inferring both under a Bayesian framework. However, this approach comes at the cost of increased computational demand. Here, we introduce StarBeast3 – a software package for efficient Bayesian inference of the multispecies coalescent model via Markov chain Monte Carlo. We gain efficiency by introducing cutting-edge proposal kernels and adaptive operators, and StarBeast3 is particularly efficient when a relaxed clock model is applied. Furthermore, gene tree inference is parallelised, allowing the software to scale with the size of the problem. We validated our software and benchmarked its performance using three real and two synthetic datasets. Our results indicate that StarBeast3 is up to one-and-a-half orders of magnitude faster than StarBeast2, and therefore more than two orders faster than *BEAST, depending on the dataset and on the parameter, and is suitable for multispecies coalescent inference on large datasets (100+ genes). StarBeast3 is open-source and is easy to set up with a friendly graphical user interface.


2021 ◽  
Vol 17 (9) ◽  
pp. e1008380
Author(s):  
Charles-Elie Rabier ◽  
Vincent Berry ◽  
Marnus Stoltz ◽  
João D. Santos ◽  
Wensheng Wang ◽  
...  

For various species, high-quality sequences and complete genomes are now available for many individuals. This makes data analysis challenging, as methods need not only to be accurate, but also time-efficient given the tremendous amount of data to process. In this article, we introduce an efficient method to infer the evolutionary history of individuals under the multispecies coalescent model in networks (MSNC). Phylogenetic networks are an extension of phylogenetic trees that can contain reticulate nodes, which make it possible to model complex biological events such as horizontal gene transfer, hybridization and introgression. We present a novel way to compute the likelihood of biallelic markers sampled along genomes whose evolution involved such events. This likelihood computation is at the heart of a Bayesian network inference method called SnappNet, as it extends the Snapp method, which infers evolutionary trees under the multispecies coalescent model, to networks. SnappNet is available as a package of the well-known BEAST 2 software. Recently, the MCMC_BiMarkers method, implemented in PhyloNet, also extended Snapp to networks. Both methods take biallelic markers as input, rely on the same model of evolution and sample networks in a Bayesian framework, though using different methods for computing priors. However, SnappNet relies on algorithms that are exponentially more time-efficient on non-trivial networks. Using simulations, we compare the performance of SnappNet and MCMC_BiMarkers. We show that both methods have similar abilities to recover simple networks, but SnappNet is more accurate than MCMC_BiMarkers on more complex network scenarios. Also, on complex networks, SnappNet is found to be far faster than MCMC_BiMarkers in terms of the time required for the likelihood computation. We finally illustrate the performance of SnappNet on a rice data set. SnappNet infers a scenario that is consistent with previous results and provides additional understanding of rice evolution.

