The probability of monophyly of a sample of gene lineages on a species tree

2016 ◽  
Vol 113 (29) ◽  
pp. 8002-8009 ◽  
Author(s):  
Rohan S. Mehta ◽  
David Bryant ◽  
Noah A. Rosenberg

Monophyletic groups—groups that consist of all of the descendants of a most recent common ancestor—arise naturally as a consequence of descent processes that result in meaningful distinctions between organisms. Aspects of monophyly are therefore central to fields that examine and use genealogical descent. In particular, studies in conservation genetics, phylogeography, population genetics, species delimitation, and systematics can all make use of mathematical predictions under evolutionary models about features of monophyly. One important calculation, the probability that a set of gene lineages is monophyletic under a two-species neutral coalescent model, has been used in many studies. Here, we extend this calculation for a species tree model that contains arbitrarily many species. We study the effects of species tree topology and branch lengths on the monophyly probability. These analyses reveal new behavior, including the maintenance of nontrivial monophyly probabilities for gene lineage samples that span multiple species and even for lineages that do not derive from a monophyletic species group. We illustrate the mathematical results using an example application to data from maize and teosinte.

2021 ◽  
Author(s):  
Helmut E Simon ◽  
Gavin A Huttley

The site frequency spectrum (SFS) is a commonly used statistic to summarize genetic variation in a sample of genomic sequences from a population. Such a genomic sample is associated with an imputed genealogical history with attributes such as branch lengths, coalescence times and the time to the most recent common ancestor (TMRCA) as well as topological and combinatorial properties. We present a Bayesian model for sampling from the joint posterior distribution of coalescence times conditional on the SFS associated with a sample of sequences in the absence of selection. In this model, the combinatorial properties of a genealogy, which is represented as a coalescent tree, are expressed as matrices. This facilitates the calculation of likelihoods and the effective sampling of the entire space of tree structures according to the Equal Rates Markov (or Yule-type) measure. Unlike previous methods, assumptions as to the type of stochastic process that generated the genealogical tree are not required. Novel approaches to defining both uninformative and informative prior distributions are employed. The uncertainty in inference due to the stochastic nature of mutation and the unknown tree structure is expressed by the shape of the posterior distributions. The method is implemented using the general purpose Markov Chain Monte Carlo software PyMC3. From the sampled posterior distribution of coalescence times, one can also infer related quantities such as the number of ancestors of a sample at a given time in the past (ancestral distribution) and the probability of specific relationships between branch lengths (for example, that the most recent branch is longer than all the others). The performance of the method is evaluated against simulated data and is also applied to historic mitochondrial data from the Nuu-Chah-Nulth people of North America. The method can be used to obtain estimates of the TMRCA of the sample. 
The relationship of these estimates to those given by "Thomson's estimator" is explored.

Keywords: coalescent theory; Bayesian inference; time to most recent common ancestor; site frequency spectrum
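As a minimal illustration of the statistic this abstract builds on, the sketch below computes an unfolded SFS from a small made-up 0/1 matrix (rows are sequences, columns are sites, 1 marks the derived allele). The function names, the toy data, and the mutation rate are hypothetical. The Thomson-style TMRCA estimate is written in its commonly cited form (mean number of derived mutations per sequence divided by the mutation rate), which is an assumption here and not the authors' Bayesian method.

```python
from collections import Counter


def site_frequency_spectrum(haplotypes):
    """Return the unfolded SFS as a list of length n - 1.

    Entry i - 1 counts the sites whose derived allele is carried by
    exactly i of the n sequences. Sites carried by 0 or by all n
    sequences are not segregating and are ignored.
    """
    n = len(haplotypes)
    # Transpose rows (sequences) into columns (sites) and count derived alleles.
    derived_counts = [sum(site) for site in zip(*haplotypes)]
    tally = Counter(derived_counts)
    return [tally.get(i, 0) for i in range(1, n)]


def thomson_tmrca(sfs, mutation_rate):
    """Thomson-style TMRCA estimate (assumed form, see the text above).

    A site carried by i sequences adds i to the total number of
    mutations separating the sequences from their common ancestor.
    mutation_rate is per sequence per generation (hypothetical units).
    """
    n = len(sfs) + 1
    total_mutations = sum(i * count for i, count in enumerate(sfs, start=1))
    return total_mutations / (n * mutation_rate)


# Toy data: 4 sequences, 5 segregating sites (made up for illustration).
toy_haplotypes = [
    [1, 0, 0, 1, 0],
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 1, 1, 0, 0],
]

toy_sfs = site_frequency_spectrum(toy_haplotypes)
toy_tmrca = thomson_tmrca(toy_sfs, mutation_rate=0.001)
print(toy_sfs, toy_tmrca)
```

A point estimate of this kind carries no measure of uncertainty, which is the gap the posterior distributions described in the abstract are meant to address.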


Genetics ◽  
1998 ◽  
Vol 150 (3) ◽  
pp. 1187-1198 ◽  
Author(s):  
Mikkel H Schierup ◽  
Xavier Vekemans ◽  
Freddy B Christiansen

Abstract Expectations for the time scale and structure of allelic genealogies in finite populations are formed under three models of sporophytic self-incompatibility. The models differ in the dominance interactions among the alleles that determine the self-incompatibility phenotype: In the SSIcod model, alleles act codominantly in both pollen and style, in the SSIdom model, alleles form a dominance hierarchy, and in SSIdomcod, alleles are codominant in the style and show a dominance hierarchy in the pollen. Coalescence times of alleles rarely differ more than threefold from those under gametophytic self-incompatibility, and transspecific polymorphism is therefore expected to be equally common. The previously reported directional turnover process of alleles in the SSIdomcod model results in coalescence times lower and substitution rates higher than those in the other models. The SSIdom model assumes strong asymmetries in allelic action, and the most recessive extant allele is likely to be the most recent common ancestor. Despite these asymmetries, the expected shape of the allele genealogies does not deviate markedly from the shape of a neutral gene genealogy. The application of the results to sequence surveys of alleles, including interspecific comparisons, is discussed.


Author(s):  
Wenjun Cheng ◽  
Tianjiao Ji ◽  
Shuaifeng Zhou ◽  
Yong Shi ◽  
Lili Jiang ◽  
...  

Abstract Echovirus 6 (E6) is associated with various clinical diseases and is frequently detected in environmental sewage. Despite its high prevalence in humans and the environment, little is known about its molecular phylogeography in mainland China. In this study, 114 of 21,539 (0.53%) clinical specimens from hand, foot, and mouth disease (HFMD) cases collected between 2007 and 2018 were positive for E6. The complete VP1 sequences of 87 representative E6 strains, including 24 strains from this study, were used to investigate the evolutionary genetic characteristics and geographical spread of E6 strains. Phylogenetic analysis based on VP1 nucleotide sequence divergence showed that, globally, E6 strains can be grouped into six genotypes, designated A to F. Chinese E6 strains collected between 1988 and 2018 were found to belong to genotypes C, E, and F, with genotype F being predominant from 2007 to 2018. There was no significant difference in the geographical distribution of each genotype. The evolutionary rate of E6 was estimated to be 3.631 × 10⁻³ substitutions site⁻¹ year⁻¹ (95% highest posterior density [HPD]: 3.2406 × 10⁻³ to 4.031 × 10⁻³ substitutions site⁻¹ year⁻¹) by Bayesian MCMC analysis. The most recent common ancestor of the E6 genotypes was traced back to 1863, whereas their common ancestor in China was traced back to around 1962. A small genetic shift was detected in the Chinese E6 population size in 2009 according to Bayesian skyline analysis, which indicated that there might have been an epidemic around that year.


Genetics ◽  
1999 ◽  
Vol 151 (3) ◽  
pp. 1217-1228 ◽  
Author(s):  
Carsten Wiuf ◽  
Jotun Hein

Abstract In this article we discuss the ancestry of sequences sampled from the coalescent with recombination with constant population size 2N. We have studied a number of variables based on simulations of sample histories, and some analytical results are derived. Consider the leftmost nucleotide in the sequences. We show that the number of nucleotides sharing a most recent common ancestor (MRCA) with the leftmost nucleotide is ≈log(1 + 4NLr)/(4Nr) when two sequences are compared, where L denotes sequence length in nucleotides, and r the recombination rate between any two neighboring nucleotides per generation. For larger samples, the number of nucleotides sharing an MRCA with the leftmost nucleotide decreases and becomes almost independent of 4NLr. Further, we show that a segment of the sequences sharing an MRCA consists on average of 3/(8Nr) nucleotides when two sequences are compared, and that this decreases toward 1/(4Nr) nucleotides when the whole population is sampled. A measure of the correlation between the genealogies of two nucleotides on two sequences is introduced. We show analytically that even when the nucleotides are separated by a large genetic distance, but share an MRCA, the genealogies will show only slight correlation. This is surprising, because the time until the two nucleotides shared an MRCA is reciprocal to the genetic distance. Simulations show that the mean time until all positions in the sample have found an MRCA increases logarithmically with increasing sequence length and is considerably lower than a theoretically predicted upper bound. The simulations also indicate that important properties of the coalescent with recombination for the whole population are reflected in the properties of a small sample.
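To give a feel for the magnitudes in this abstract, the sketch below evaluates its three approximations numerically. It assumes the denominators are read as (4Nr) and (8Nr), and that log is the natural logarithm. The function names and the parameter values (N, L, r) are hypothetical choices for illustration, not values from the article.

```python
import math


def shared_mrca_span_two_sequences(N, L, r):
    """Approximate number of nucleotides sharing an MRCA with the
    leftmost nucleotide when two sequences are compared:
    log(1 + 4*N*L*r) / (4*N*r)."""
    return math.log(1 + 4 * N * L * r) / (4 * N * r)


def mean_segment_length_two_sequences(N, r):
    """Mean length, in nucleotides, of a segment sharing an MRCA
    when two sequences are compared: 3 / (8*N*r)."""
    return 3 / (8 * N * r)


def mean_segment_length_whole_population(N, r):
    """Limit of the mean segment length when the whole population
    is sampled: 1 / (4*N*r)."""
    return 1 / (4 * N * r)


# Hypothetical parameters: N = 10,000, sequence length L = 100,000
# nucleotides, recombination rate r = 1e-8 per neighboring pair
# per generation.
N, L, r = 10_000, 100_000, 1e-8

print(shared_mrca_span_two_sequences(N, L, r))
print(mean_segment_length_two_sequences(N, r))
print(mean_segment_length_whole_population(N, r))
```

With these assumed values the shared span comes to roughly 9,284 nucleotides, and the mean segment length falls from about 3,750 nucleotides for two sequences to about 2,500 for the whole population, matching the direction of change described in the abstract.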


Botany ◽  
2013 ◽  
Vol 91 (9) ◽  
pp. 605-613 ◽  
Author(s):  
Claudia Ciotir ◽  
Chris Yesson ◽  
Joanna Freeland

Understanding the spatial distribution of genetic diversity and its evolutionary history is an essential part of developing effective biodiversity management plans. This may be particularly true when considering the value of peripheral or disjunct populations. Although conservation decisions are often made with reference to geopolitical boundaries, many policy-makers also consider global distributions, and therefore a species’ global status may temper its regional status. Many disjunct populations can be found in the Great Lakes region of North America, including those of Bartonia paniculata subsp. paniculata, a species that has been designated as threatened in Canada but globally secure. We compared chloroplast sequences between disjunct (Canada) and core (USA) populations of B. paniculata subsp. paniculata separated by 600 km, which is the minimum distance between disjunct and core populations in this subspecies. We found that although lineages within the disjunct populations shared a relatively recent common ancestor, the genetic divergence between plants from Ontario and New Jersey was substantially greater than expected for a consubspecific comparison. A coalescence-based analysis dated the most recent common ancestor of the Canadian and US populations at approximately 534 000 years ago with the lower confidence estimate at 226 000 years ago. This substantially predates the Last Glacial Maximum and suggests that disjunct and core populations have followed independent evolutionary trajectories throughout multiple glacial–interglacial cycles. Our findings provide important insight into the diverse processes that have resulted in numerous disjunct species in the Great Lakes region and highlight a need for additional work on Canadian B. paniculata subsp. paniculata taxonomy prior to a reevaluation of its conservation value.


Author(s):  
Satoshi Nakano ◽  
Takao Fujisawa ◽  
Bin Chang ◽  
Yutaka Ito ◽  
Hideki Akeda ◽  
...  

After the introduction of the seven-valent pneumococcal conjugate vaccine, the global spread of multidrug-resistant serotype 19A-ST320 strains became a public health concern. In Japan, the main genotype of serotype 19A was ST3111, and the identification rate of ST320 was low. Although the isolates were sporadically detected in both adults and children, their origin remains unknown. Thus, by combining pneumococcal isolates collected in three nationwide pneumococcal surveillance studies conducted in Japan between 2008 and 2020, we analyzed 56 serotype 19A-ST320 isolates along with 931 global isolates, using whole-genome sequencing to uncover the transmission route of the globally distributed clone in Japan. The clone was frequently detected in Okinawa Prefecture, which the U.S. returned to Japan in 1972. Phylogenetic analysis demonstrated that the isolates from Japan were genetically related to those from the U.S.; therefore, the common ancestor may have originated in the U.S. In addition, Bayesian analysis suggested that the time to the most recent common ancestor of the isolates from Japan and the U.S. was approximately the 1990s to 2000, suggesting the possibility that the common ancestor could have already spread in the U.S. before the Taiwan 19F-14 isolate was first identified in a Taiwanese hospital in 1997. The phylogeographical analysis supported the transmission of the clone from the U.S. to Japan, but the analysis could be influenced by sampling bias. These results suggested the possibility that the serotype 19A-ST320 clone had already spread in the U.S. before being imported into Japan.


2021 ◽  
Vol 83 (2) ◽  
pp. 76-79
Author(s):  
Cristina Sousa

The origin of life is one of the most interesting and challenging questions in biology. This article discusses relevant contemporary theories and hypotheses about the origin of life, recent scientific evidence supporting them, and the main contributions of several scientists of different nationalities and specialties in different disciplines. Also discussed are several ideas about the characteristics of the most recent common ancestor, also called the “last universal common ancestor” (or LUCA), including cellular status (unicellular or community) and homogeneity level.


2020 ◽  
Author(s):  
Chul Lee ◽  
Seoae Cho ◽  
Kyu-Won Kim ◽  
DongAhn Yoo ◽  
Jae Yong Han ◽  
...  

Abstract Single amino acid variants (SAVs) may provide clues to understanding the evolution of traits. A complex trait that has evolved convergently among species is vocal learning, the rare ability to imitate sounds heard and an important component of spoken language. Here we assessed whether convergent vocal learning bird species have convergent SAVs (CSAVs) that could be associated with their specialized trait. We analyzed avian genomes and identified CSAVs in vocal learners, but also in most species combinations tested. The number of CSAVs among species was proportional to the product of the most recent common ancestor (MRCA; origin) branch lengths of the species in question, and vocal learning birds did not exceed the overall proportion in most tests. However, genes with identical CSAVs (iCSAVs) in vocal learning species were uniquely enriched in ‘learning’ functions, and a subset of iCSAV genes were under positive selection and had enriched specialized regulation in vocal learning and their adjacent brain subdivisions. Several top candidate genes converge on the cAMP signaling pathway, including DRD1B and PRKAR2B. Our findings suggest a complex mechanism of amino acid convergences and specialized gene regulation upon which selection acts for specialized convergent traits.


1998 ◽  
Vol 95 (16) ◽  
pp. 9402-9406 ◽  
Author(s):  
Bruce G. Baldwin ◽  
Michael J. Sanderson

Comparisons between insular and continental radiations have been hindered by a lack of reliable estimates of absolute diversification rates in island lineages. We took advantage of rate-constant rDNA sequence evolution and an “external” calibration using paleoclimatic and fossil data to determine the maximum age and minimum diversification rate of the Hawaiian silversword alliance (Compositae), a textbook example of insular adaptive radiation in plants. Our maximum-age estimate of 5.2 ± 0.8 million years ago for the most recent common ancestor of the silversword alliance is much younger than ages calculated by other means for the Hawaiian drosophilids, lobelioids, and honeycreepers and falls approximately within the history of the modern high islands (≤5.1 ± 0.2 million years ago). By using a statistically efficient estimator that reduces error variance by incorporating clock-based estimates of divergence times, a minimum diversification rate for the silversword alliance was estimated to be 0.56 ± 0.17 species per million years. This exceeds average rates of more ancient continental radiations and is comparable to peak rates in taxa with sufficiently rich fossil records that changes in diversification rate can be reconstructed.

