TopHap: Rapid inference of key phylogenetic structures from common haplotypes in large genome collections with limited diversity

2021 ◽  
Author(s):  
Marcos A. Caraballo-Ortiz ◽  
Sayaka Miura ◽  
Maxwell Sanderford ◽  
Tenzin Dolker ◽  
Qiqing Tao ◽  
...  

Motivation: Building reliable phylogenies from very large collections of sequences with a limited number of phylogenetically informative sites is challenging because sequencing errors and recurrent/backward mutations interfere with the phylogenetic signal, confounding true evolutionary relationships. Massive global efforts of sequencing genomes and reconstructing the phylogeny of SARS-CoV-2 strains exemplify these difficulties since there are only hundreds of phylogenetically informative sites and millions of genomes. For such datasets, we set out to develop a method for building the phylogenetic tree of genomic haplotypes consisting of positions harboring common variants to improve the signal-to-noise ratio for more accurate phylogenetic inference of resolvable phylogenetic features. Results: We present the TopHap approach that determines spatiotemporally common haplotypes of common variants and builds their phylogeny at a fraction of the computational time of traditional methods. To assess topological robustness, we develop a bootstrap resampling strategy that resamples genomes spatiotemporally. The application of TopHap to build a phylogeny of 68,057 genomes (68KG) produced an evolutionary tree of major SARS-CoV-2 haplotypes. This phylogeny is concordant with the mutation tree inferred using the co-occurrence pattern of mutations and recovers key phylogenetic relationships from more traditional analyses. We also evaluated alternative roots of the SARS-CoV-2 phylogeny and found that the earliest sampled genomes in 2019 likely evolved by four mutations of the most recent common ancestor of all SARS-CoV-2 genomes. An application of TopHap to more than 1 million genomes reconstructed the most comprehensive evolutionary relationships of major variants, which confirmed the 68KG phylogeny and provided evolutionary origins of major variants of concern. Availability: TopHap is available on the web at https://github.com/SayakaMiura/TopHap.
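The idea of restricting genomes to positions that harbor common variants, and then keeping only the frequent haplotypes, can be illustrated with a minimal sketch. This is not the TopHap implementation: the function name, the frequency thresholds, and the use of a single pooled sample (rather than spatiotemporal slices, as the abstract describes) are all assumptions made for illustration.

```python
from collections import Counter

def common_variant_haplotypes(genomes, reference,
                              min_variant_freq=0.05,
                              min_haplotype_freq=0.05):
    """Hypothetical helper: genomes are aligned strings, same length as reference.

    Thresholds are illustrative assumptions, not values from the paper.
    """
    n = len(genomes)
    # Keep positions whose non-reference allele frequency reaches the threshold.
    common_positions = []
    for pos, ref_base in enumerate(reference):
        alt_count = sum(1 for g in genomes if g[pos] != ref_base)
        if alt_count / n >= min_variant_freq:
            common_positions.append(pos)
    # A haplotype is the string of bases at the common-variant positions.
    counts = Counter("".join(g[p] for p in common_positions) for g in genomes)
    # Keep only haplotypes that are themselves frequent in the sample.
    common = {h: c for h, c in counts.items() if c / n >= min_haplotype_freq}
    return common_positions, common

reference = "AAAA"
genomes = ["AAAA", "ACAA", "ACAA", "ACAT", "AAAA",
           "ACAA", "AAAA", "AAAA", "AAAA", "AAAA"]
positions, haplotypes = common_variant_haplotypes(
    genomes, reference, min_variant_freq=0.3, min_haplotype_freq=0.3)
```

In this toy input the rare variant at the last position is dropped, so a genome carrying it collapses into the same haplotype as its neighbors. That collapse is what the abstract means by improving the signal-to-noise ratio.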

Author(s):  
Ajay Jasra ◽  
Maria De Iorio ◽  
Marc Chadeau-Hyam

In this paper, we consider a simulation technique for stochastic trees. One of the most important areas in computational genetics is the calculation and subsequent maximization of the likelihood function associated with such models. This typically consists of using importance sampling and sequential Monte Carlo techniques. The approach proceeds by simulating the tree, backward in time from observed data, to a most recent common ancestor. However, in many cases, the computational time and variance of estimators are often too high to make standard approaches useful. In this paper, we propose to stop the simulation, subsequently yielding biased estimates of the likelihood surface. The bias is investigated from a theoretical point of view. Results from simulation studies are also given to investigate the balance between loss of accuracy, saving in computing time and variance reduction.


2007 ◽  
Vol 76 (4) ◽  
pp. 261-278 ◽  
Author(s):  
J.W. Arntzen ◽  
G. Espregueira Themudo ◽  
B. Wielstra

Newts of the genus Triturus (Amphibia, Caudata, Salamandridae) are distributed across Europe and adjacent Asia. In spite of its prominence as a model system for evolutionary research, the phylogeny of Triturus has remained incompletely solved. Our aim was to rectify this situation, to which end we employed nuclear encoded proteins (40 loci) and mitochondrial DNA-sequence data (mtDNA, 642 bp of the ND4 gene). We sampled up to four populations per species covering large parts of their ranges. Allozyme and mtDNA data were analyzed separately with parsimony, distance, likelihood and Bayesian methods of phylogenetic inference. Existing knowledge on taxonomic relationships was confirmed, including the monophyly of the genus and the groups of crested newts (four species) and marbled newts (two species). The genetic coherence of species and subspecies was also confirmed, but not always with high statistical support (depending on taxon, characters under consideration, and method of inference). In spite of our efforts we did not obtain sufficient phylogenetic signal to prefer one out of twelve potential topologies representing crested newt relationships. We hypothesize that the lack of phylogenetic resolution reflects a hard polytomy and represents the (near)-simultaneous origin of crested newt species. Using a calibration point of 24 Ma (million years before present) for the most recent common ancestor of Triturus-species, the crested newt radiation event is dated at 7-6 Ma (scenario 1) or at 11-10 Ma (scenario 2), depending on the application of an allozyme versus mtDNA molecular clock. The first biogeographical scenario involves the spread of crested newts from the central Balkans into four compass directions. This scenario cannot be brought into line with potential vicariance events for south-eastern Europe. The second scenario involves the more or less simultaneous origin of four species of crested newts through large-scale vicariance events and is supported by the paleogeographical reconstruction for the region at the end of the Middle Miocene. The subspecies Triturus carnifex macedonicus carries in one large area the mtDNA that is typical for the neighbouring species T. karelinii, which is attributed to introgression and a recent range shift. It is nevertheless a long-distinct evolutionary lineage and we propose to elevate its taxonomic status to that of a species, i.e., from Triturus c. macedonicus (Karaman, 1922) to Triturus macedonicus (Karaman, 1922).


2021 ◽  
Vol 17 (3) ◽  
Author(s):  
Jessica A. Oswald ◽  
Ryan S. Terrill ◽  
Brian J. Stucky ◽  
Michelle J. LeFebvre ◽  
David W. Steadman ◽  
...  

Worldwide decline in biodiversity during the Holocene has impeded a comprehensive understanding of pre-human biodiversity and biogeography. This is especially true on islands, because many recently extinct island taxa were morphologically unique, complicating assessment of their evolutionary relationships using morphology alone. The Caribbean remains an avian hotspot but was more diverse before human arrival in the Holocene. Among the recently extinct lineages is the enigmatic genus Nesotrochis, comprising three flightless species. Based on morphology, Nesotrochis has been considered an aberrant rail (Rallidae) or related to flufftails (Sarothruridae). We recovered a nearly complete mitochondrial genome of Nesotrochis steganinos from fossils, discovering that it is not a rallid but instead is sister to Sarothruridae, volant birds now restricted to Africa and New Guinea, and the recently extinct, flightless Aptornithidae of New Zealand. This result suggests a widespread or highly dispersive most recent common ancestor of the group. Prior to human settlement, the Caribbean avifauna had a far more cosmopolitan origin than is evident from extant species.


2011 ◽  
Vol 57 (2) ◽  
pp. 125-139 ◽  
Author(s):  
Jennifer M. Gumm ◽  
Tamra C. Mendelson

Abstract As complex traits evolve, each component of the trait may be under different selection pressures and could respond independently to distinct evolutionary forces. We used comparative methods to examine patterns of evolution in multiple components of a complex courtship signal in darters, specifically addressing the question of how nuptial coloration evolves across different areas of the body. Using spectral reflectance, we defined 4 broad color classes present on the body and fins of 17 species of freshwater fishes (genus Etheostoma) and quantified differences in hue within each color class. Ancestral state reconstruction suggests that most color traits were expressed in the most recent common ancestor of sampled species and that differences among species are mostly due to losses in coloration. The evolutionary lability of coloration varied across body regions; we found significant phylogenetic signal for orange color on the body but not for most colors on fins. Finally, patterns of color evolution and hue of the colors were correlated among the two dorsal fins and between the anterior dorsal and anal fins, but not between any of the fins and the body. The observed patterns support the hypothesis that different components of complex signals may be subject to distinct evolutionary pressures, and suggest that the combination of behavioral displays and morphology in communication may have a strong influence on patterns of signal evolution.


2008 ◽  
Vol 32 ◽  
pp. 901-938 ◽  
Author(s):  
N. C. A. Moore ◽  
P. Prosser

A phylogenetic tree shows the evolutionary relationships among species. Internal nodes of the tree represent speciation events and leaf nodes correspond to species. A goal of phylogenetics is to combine such trees into larger trees, called supertrees, whilst respecting the relationships in the original trees. A rooted tree exhibits an ultrametric property; that is, for any three leaves of the tree it must be that one pair has a deeper most recent common ancestor than the other pairs, or that all three have the same most recent common ancestor. This inspires a constraint programming encoding for rooted trees. We present an efficient constraint that enforces the ultrametric property over a symmetric array of constrained integer variables, with the inevitable property that the lower bounds of any three variables are mutually supportive. We show that this allows an efficient constraint-based solution to the supertree construction problem. We demonstrate that the versatility of constraint programming can be exploited to allow solutions to variants of the supertree construction problem.
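The ultrametric property described above can be checked directly on a symmetric array of most-recent-common-ancestor depths. The following is a minimal sketch of that check only, not the constraint propagator from the paper; the function name and the matrix representation are assumptions for illustration.

```python
from itertools import combinations

def is_ultrametric(depth):
    """Hypothetical check: depth[i][j] is the depth of the most recent
    common ancestor of leaves i and j (symmetric; diagonal is ignored)."""
    n = len(depth)
    for i, j, k in combinations(range(n), 3):
        a, b, _ = sorted((depth[i][j], depth[i][k], depth[j][k]))
        # Either one pair is strictly deeper and the other two tie,
        # or all three tie. In both cases the two smallest are equal.
        if a != b:
            return False
    return True

# Rooted tree ((A,B),C): A and B share a deeper ancestor than either does with C.
consistent = [
    [0, 2, 1],
    [2, 0, 1],
    [1, 1, 0],
]
# Three distinct depths for one triple cannot come from any rooted tree.
inconsistent = [
    [0, 3, 1],
    [3, 0, 2],
    [1, 2, 0],
]
ok = is_ultrametric(consistent)
bad = is_ultrametric(inconsistent)
```

This brute-force test examines every triple of leaves, so it runs in cubic time. A constraint solver would instead enforce the same condition incrementally on variable bounds.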


Zootaxa ◽  
2017 ◽  
Vol 4329 (1) ◽  
pp. 1 ◽  
Author(s):  
YURI G. ARZANOV ◽  
VASILY V. GREBENNIKOV

We summarize knowledge of the weevil tribe Cleonini worldwide, including its monophyly, relationships, distribution, biology, immature stages, economic significance and paleontology. We score adult morphological characters for 79 of a total of 96 extant genus-group Cleonini taxa considered valid to date. The resulting matrix contains 121 parsimoniously informative characters scored for 145 ingroup (Cleonini) and 29 outgroup terminals. Maximum Parsimony (MP) and Bayesian Inference (BI) analyses consistently recover monophyletic Lixinae and Cleonini. Relationships within the latter remain unresolved with either 47 (BI) or 37 (MP) branches radiating from the tribe’s most recent common ancestor. Most of the speciose genera of Cleonini emerge as monophyletic in both BI and MP analyses (generic names followed by the number of terminals, then by BI posterior probability / MP bootstrap): Adosomus (5, 94/77), Asproparthenis (6, 99/98), Chromonotus (6, 98/85), Cleonis (3, 64/76), Coniocleonus (10, 95/41), Conorhynchus (5, 95/51), Cyphocleonus (4, 65/76), Maximus (4, 84/68), Mecaspis (4, 95/91), Scaphomorphus (4, 90/84), Temnorhinus (8, 99/62) and Xanthochelus (6, 84/71). The genera Pseudocleonus (6, -/26) and Stephanocleonus (22, -/23) are not recovered in BI and are weakly supported in MP. No genera are here added to, or removed from, Cleonini. We suggest that adult morphology of Cleonini was subject to widespread homoplasy obscuring the phylogenetic signal of morphological characters. Unlike the rest of Lixinae, all extant Cleonini are hypothesised to be flightless, even though often being macropterous. All 145 ingroup terminals are illustrated in three standard views; images of the type species of 15 of the 17 genus-group taxa that are not represented in our analysis are provided.


2015 ◽  
Vol 89 (23) ◽  
pp. 11909-11925 ◽  
Author(s):  
Maria Luiza G. Medaglia ◽  
Nissin Moussatché ◽  
Andreas Nitsche ◽  
Pjotr Wojtek Dabrowski ◽  
Yu Li ◽  
...  

ABSTRACT Smallpox was declared eradicated in 1980 after an intensive vaccination program using different strains of vaccinia virus (VACV; Poxviridae). VACV strain IOC (VACV-IOC) was the seed strain of the smallpox vaccine manufactured by the major vaccine producer in Brazil during the smallpox eradication program. However, little is known about the biological and immunological features as well as the phylogenetic relationships of this first-generation vaccine. In this work, we present a comprehensive characterization of two clones of VACV-IOC. Both clones had low virulence in infected mice and induced a protective immune response against a lethal infection comparable to the response of the licensed vaccine ACAM2000 and the parental strain VACV-IOC. Full-genome sequencing revealed the presence of several fragmented virulence genes that probably are nonfunctional, e.g., F1L, B13R, C10L, K3L, and C3L. Most notably, phylogenetic inference supported by the structural analysis of the genome ends provides evidence of a novel, independent cluster in VACV phylogeny formed by VACV-IOC, the Brazilian field strains Cantagalo (CTGV) and Serro 2 viruses, and horsepox virus, a VACV-like virus supposedly related to an ancestor of the VACV lineage. Our data strongly support the hypothesis that CTGV-like viruses represent feral VACV that evolved in parallel with VACV-IOC after splitting from a most recent common ancestor, probably an ancient smallpox vaccine strain related to horsepox virus. Our data, together with an interesting historical investigation, revisit the origins of VACV and propose new evolutionary relationships between ancient and extant VACV strains, mainly horsepox virus, VACV-IOC/CTGV-like viruses, and Dryvax strain. IMPORTANCE First-generation vaccines used to eradicate smallpox had rates of adverse effects that are not acceptable by current health care standards. Moreover, these vaccines are genetically heterogeneous and consist of a pool of quasispecies of VACV. Therefore, the search for new-generation smallpox vaccines that combine low pathogenicity, immune protection, and genetic homogeneity is extremely important. In addition, the phylogenetic relationships and origins of VACV strains are quite nebulous. We show the characterization of two clones of VACV-IOC, a unique smallpox vaccine strain that contributed to smallpox eradication in Brazil. The immunogenicity and reduced virulence make the IOC clones good options for alternative second-generation smallpox vaccines. More importantly, this study reveals the phylogenetic relationship between VACV-IOC, feral VACV established in nature, and the ancestor-like horsepox virus. Our data expand the discussion on the origins and evolutionary connections of VACV lineages.


Genetics ◽  
1998 ◽  
Vol 150 (3) ◽  
pp. 1187-1198 ◽  
Author(s):  
Mikkel H Schierup ◽  
Xavier Vekemans ◽  
Freddy B Christiansen

Abstract Expectations for the time scale and structure of allelic genealogies in finite populations are formed under three models of sporophytic self-incompatibility. The models differ in the dominance interactions among the alleles that determine the self-incompatibility phenotype: In the SSIcod model, alleles act codominantly in both pollen and style, in the SSIdom model, alleles form a dominance hierarchy, and in SSIdomcod, alleles are codominant in the style and show a dominance hierarchy in the pollen. Coalescence times of alleles rarely differ more than threefold from those under gametophytic self-incompatibility, and transspecific polymorphism is therefore expected to be equally common. The previously reported directional turnover process of alleles in the SSIdomcod model results in coalescence times lower and substitution rates higher than those in the other models. The SSIdom model assumes strong asymmetries in allelic action, and the most recessive extant allele is likely to be the most recent common ancestor. Despite these asymmetries, the expected shape of the allele genealogies does not deviate markedly from the shape of a neutral gene genealogy. The application of the results to sequence surveys of alleles, including interspecific comparisons, is discussed.


Author(s):  
Wenjun Cheng ◽  
Tianjiao Ji ◽  
Shuaifeng Zhou ◽  
Yong Shi ◽  
Lili Jiang ◽  
...  

Abstract Echovirus 6 (E6) is associated with various clinical diseases and is frequently detected in environmental sewage. Despite its high prevalence in humans and the environment, little is known about its molecular phylogeography in mainland China. In this study, 114 of 21,539 (0.53%) clinical specimens from hand, foot, and mouth disease (HFMD) cases collected between 2007 and 2018 were positive for E6. The complete VP1 sequences of 87 representative E6 strains, including 24 strains from this study, were used to investigate the evolutionary genetic characteristics and geographical spread of E6 strains. Phylogenetic analysis based on VP1 nucleotide sequence divergence showed that, globally, E6 strains can be grouped into six genotypes, designated A to F. Chinese E6 strains collected between 1988 and 2018 were found to belong to genotypes C, E, and F, with genotype F being predominant from 2007 to 2018. There was no significant difference in the geographical distribution of each genotype. The evolutionary rate of E6 was estimated to be 3.631 × 10⁻³ substitutions site⁻¹ year⁻¹ (95% highest posterior density [HPD]: 3.2406 × 10⁻³ to 4.031 × 10⁻³ substitutions site⁻¹ year⁻¹) by Bayesian MCMC analysis. The most recent common ancestor of the E6 genotypes was traced back to 1863, whereas their common ancestor in China was traced back to around 1962. A small genetic shift was detected in the Chinese E6 population size in 2009 according to Bayesian skyline analysis, which indicated that there might have been an epidemic around that year.


Author(s):  
Ya-Fang Hu ◽  
Li-Ping Jia ◽  
Fang-Yuan Yu ◽  
Li-Ying Liu ◽  
Qin-Wei Song ◽  
...  

Abstract Background: Coxsackievirus A16 (CVA16) is one of the major etiological agents of hand, foot and mouth disease (HFMD). This study aimed to investigate the molecular epidemiology and evolutionary characteristics of CVA16. Methods: Throat swabs were collected from children with HFMD and suspected HFMD during 2010–2019. Enteroviruses (EVs) were detected and typed by real-time reverse transcription-polymerase chain reaction (RT-PCR) and RT-PCR. The genotype, evolutionary rate, most recent common ancestor, population dynamics and selection pressure of CVA16 were analyzed based on the viral protein gene (VP1) using bioinformatics software. Results: A total of 4709 throat swabs were screened. EVs were detected in 3180 samples and 814 were CVA16 positive. More than 81% of CVA16-positive children were under 5 years old. The prevalence of CVA16 showed obvious periodic fluctuations, with a high level during 2010–2012 followed by an apparent decline during 2013–2017. However, the activity of CVA16 increased gradually during 2018–2019. All the Beijing CVA16 strains belonged to sub-genotype B1, and B1b was the dominant strain. One B1c strain was detected in Beijing for the first time in 2016. The estimated mean evolutionary rate of the VP1 gene was 4.49 × 10⁻³ substitutions/site/year. Methionine has gradually become fixed at site 23 of VP1 since 2012. Two sites were detected under episodic positive selection, one of which (site 223) is located in the neutralizing linear epitope PEP71. Conclusions: The dominant strains of CVA16 belonged to clade B1b and evolved at a fast rate during 2010–2019 in Beijing. To provide more useful data for HFMD prevention and control, it is necessary to keep monitoring the molecular epidemiological and evolutionary characteristics of CVA16.

