GHOST: Recovering Historical Signal from Heterotachously-evolved Sequence Alignments

2017 ◽  
Author(s):  
Stephen M Crotty ◽  
Bui Quang Minh ◽  
Nigel G Bean ◽  
Barbara R Holland ◽  
Jonathan Tuke ◽  
...  

Abstract Molecular sequence data that have evolved under the influence of heterotachous evolutionary processes are known to mislead phylogenetic inference. We introduce the General Heterogeneous evolution On a Single Topology (GHOST) model of sequence evolution, implemented under a maximum-likelihood framework in the phylogenetic program IQ-TREE (http://www.iqtree.org). Simulations show that using the GHOST model, IQ-TREE can accurately recover the tree topology, branch lengths and substitution model parameters from heterotachously-evolved sequences. We develop a model selection algorithm based on simulation results, and investigate the performance of the GHOST model on empirical data by sampling phylogenomic alignments of varying lengths from a plastome alignment. We then carry out inference under the GHOST model on a phylogenomic dataset composed of 248 genes from 16 taxa, where we find the GHOST model concurs with the currently accepted view, placing turtles as a sister lineage of archosaurs, in contrast to results obtained using traditional variable rates-across-sites models. Finally, we apply the model to a dataset composed of a sodium channel gene of 11 fish taxa, finding that the GHOST model is able to infer a subtle component of the historical signal, linked to the previously established convergent evolution of the electric organ in two geographically distinct lineages of electric fish. We compare inference under the GHOST model to partitioning by codon position and show that, owing to the minimization of model constraints, the GHOST model is able to offer unique biological insights when applied to empirical data.


Author(s):  
Stephen M Crotty ◽  
Bui Quang Minh ◽  
Nigel G Bean ◽  
Barbara R Holland ◽  
Jonathan Tuke ◽  
...  

Abstract Molecular sequence data that have evolved under the influence of heterotachous evolutionary processes are known to mislead phylogenetic inference. We introduce the General Heterogeneous evolution On a Single Topology (GHOST) model of sequence evolution, implemented under a maximum-likelihood framework in the phylogenetic program IQ-TREE (http://www.iqtree.org). Simulations show that using the GHOST model, IQ-TREE can accurately recover the tree topology, branch lengths, and substitution model parameters from heterotachously evolved sequences. We investigate the performance of the GHOST model on empirical data by sampling phylogenomic alignments of varying lengths from a plastome alignment. We then carry out inference under the GHOST model on a phylogenomic data set composed of 248 genes from 16 taxa, where we find the GHOST model concurs with the currently accepted view, placing turtles as a sister lineage of archosaurs, in contrast to results obtained using traditional variable rates-across-sites models. Finally, we apply the model to a data set composed of a sodium channel gene of 11 fish taxa, finding that the GHOST model is able to elucidate a subtle component of the historical signal, linked to the previously established convergent evolution of the electric organ in two geographically distinct lineages of electric fish. We compare inference under the GHOST model to partitioning by codon position and show that, owing to the minimization of model constraints, the GHOST model offers unique biological insights when applied to empirical data.



2020 ◽  
Vol 20 (4) ◽  
pp. 410-436
Author(s):  
Sarah E Heaps ◽  
Tom MW Nye ◽  
Richard J Boys ◽  
Tom A Williams ◽  
Svetlana Cherlin ◽  
...  

Phylogenetics uses alignments of molecular sequence data to learn about evolutionary trees relating species. Along branches, sequence evolution is modelled using a continuous-time Markov process characterized by an instantaneous rate matrix. Early models assumed the same rate matrix governed substitutions at all sites of the alignment, ignoring variation in evolutionary pressures. Substantial improvements in phylogenetic inference and model fit were achieved by augmenting these models with multiplicative random effects that describe the result of variation in selective constraints and allow sites to evolve at different rates which linearly scale a baseline rate matrix. Motivated by this pioneering work, we consider an extension using a quadratic, rather than linear, transformation. The resulting models allow for variation in the selective coefficients of different types of point mutation at a site in addition to variation in selective constraints. We derive properties of the extended models. For certain non-stationary processes, the extension gives a model that allows variation in sequence composition, both across sites and taxa. We adopt a Bayesian approach, describe an MCMC algorithm for posterior inference and provide software. Our quadratic models are applied to alignments spanning the tree of life and compared with site-homogeneous and linear models.
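
As a rough illustration of the linear rate scaling described in the abstract above, the standard rates-across-sites construction can be written as follows. This is a sketch under stated assumptions rather than the authors' own notation: the symbols Q (baseline rate matrix), r_i (rate multiplier at site i), t (branch length) and the mean-one gamma distribution with shape \alpha are conventional choices introduced here for illustration. The quadratic extension proposed in the paper is not reproduced, because the abstract does not give its form.

```latex
% Transition probabilities along a branch of length t under a
% continuous-time Markov process with instantaneous rate matrix Q:
P(t) = \exp(Q t)

% Rates-across-sites: site i carries a multiplicative random effect r_i
% that linearly scales the baseline rate matrix (assumed notation):
Q_i = r_i \, Q, \qquad P_i(t) = \exp(r_i \, Q \, t)

% A common (assumed) choice is a mean-one gamma distribution,
% so that branch lengths keep their usual interpretation:
r_i \sim \mathrm{Gamma}(\alpha, \alpha), \qquad \mathbb{E}[r_i] = 1
```

Under this construction every site shares the same relative substitution pattern and differs only in overall speed, which is the restriction the abstract says the quadratic models relax.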



2020 ◽  
Author(s):  
Thomas KF Wong ◽  
Subha Kalyaanamoorthy ◽  
Karen Meusemann ◽  
David K Yeates ◽  
Bernhard Misof ◽  
...  

Abstract Multiple sequence alignments (MSAs) play a pivotal role in studies of molecular sequence data, but nobody has developed a minimum reporting standard (MRS) to quantify the completeness of MSAs in terms of completely-specified nucleotides or amino acids. We present an MRS that relies on four simple completeness metrics. The metrics are implemented in AliStat, a program developed to support the MRS. A survey of published MSAs illustrates the benefits and unprecedented transparency offered by the MRS.



2020 ◽  
Vol 2 (2) ◽  
Author(s):  
Thomas K F Wong ◽  
Subha Kalyaanamoorthy ◽  
Karen Meusemann ◽  
David K Yeates ◽  
Bernhard Misof ◽  
...  

Abstract Multiple sequence alignments (MSAs) play a pivotal role in studies of molecular sequence data, but nobody has developed a minimum reporting standard (MRS) to quantify the completeness of MSAs in terms of completely specified nucleotides or amino acids. We present an MRS that relies on four simple completeness metrics. The metrics are implemented in AliStat, a program developed to support the MRS. A survey of published MSAs illustrates the benefits and unprecedented transparency offered by the MRS.



2018 ◽  
Author(s):  
Qiqing Tao ◽  
Koichiro Tamura ◽  
Fabia Battistuzzi ◽  
Sudhir Kumar

Abstract New species arise from pre-existing species and inherit similar genomes and environments. This predicts greater similarity of mutation rates and the tempo of molecular evolution between direct ancestors and descendants, resulting in autocorrelation of evolutionary rates within lineages in the tree of life. Surprisingly, molecular sequence data have not confirmed this expectation, possibly because available methods lack power to detect autocorrelated rates. Here we present a machine learning method to detect the presence of evolutionary rate autocorrelation in large phylogenies. The new method is computationally efficient and performs better than the available state-of-the-art methods. Application of the new method reveals extensive rate autocorrelation in DNA and amino acid sequence evolution of mammals, birds, insects, metazoans, plants, fungi, and prokaryotes. Therefore, rate autocorrelation is a common phenomenon throughout the tree of life. These findings suggest concordance between molecular and non-molecular evolutionary patterns and will foster unbiased and precise dating of the tree of life.



2019 ◽  
Vol 36 (4) ◽  
pp. 811-824 ◽  
Author(s):  
Qiqing Tao ◽  
Koichiro Tamura ◽  
Fabia U. Battistuzzi ◽  
Sudhir Kumar

Abstract New species arise from pre-existing species and inherit similar genomes and environments. This predicts greater similarity of the tempo of molecular evolution between direct ancestors and descendants, resulting in autocorrelation of evolutionary rates in the tree of life. Surprisingly, molecular sequence data have not confirmed this expectation, possibly because available methods lack the power to detect autocorrelated rates. Here, we present a machine learning method, CorrTest, to detect the presence of rate autocorrelation in large phylogenies. CorrTest is computationally efficient and performs better than the available state-of-the-art method. Application of CorrTest reveals extensive rate autocorrelation in DNA and amino acid sequence evolution of mammals, birds, insects, metazoans, plants, fungi, parasitic protozoans, and prokaryotes. Therefore, rate autocorrelation is a common phenomenon throughout the tree of life. These findings suggest concordance between molecular and nonmolecular evolutionary patterns, and they will foster unbiased and precise dating of the tree of life.



Phytotaxa ◽  
2014 ◽  
Vol 176 (1) ◽  
pp. 219 ◽  
Author(s):  
ASHA J. DISSANAYAKE ◽  
RUVISHIKA S. JAYAWARDENA ◽  
SARANYAPHAT BOONMEE ◽  
KASUN M. THAMBUGALA ◽  
QING TIAN ◽  
...  

The family Myriangiaceae is relatively poorly known amongst the Dothideomycetes and includes genera which are saprobic, epiphytic and parasitic on the bark, leaves and branches of various plants. The family has not undergone any recent revision; however, molecular data have shown it to be a well-resolved family closely linked to Elsinoaceae in Myriangiales. Both morphological and molecular characters indicate that Elsinoaceae differs from Myriangiaceae. In Elsinoaceae, small numbers of asci form in locules in light coloured pseudostromata, which form typical scab-like blemishes on leaf or fruit surfaces. The coelomycetous, “Sphaceloma”-like asexual state of Elsinoaceae forms more frequently than the sexual state; conidiogenesis is phialidic and conidia are 1-celled and hyaline. In Myriangiaceae, locules with single asci are scattered in superficial, coriaceous to sub-carbonaceous, black ascostromata and do not form scab-like blemishes. No asexual state is known. In this study, we revisit the family Myriangiaceae, and accept ten genera, providing descriptions and discussion on the generic types of Anhellia, Ascostratum, Butleria, Dictyocyclus, Diplotheca, Eurytheca, Hemimyriangium, Micularia, Myriangium and Zukaliopsis. The genera of Myriangiaceae are compared and contrasted. Myriangium duriaei is the type species of the family, while Diplotheca is similar and may be congeneric. The placement of Anhellia in Myriangiaceae is supported by morphological and molecular data. Because of similarities with Myriangium, Ascostratum (A. insigne), Butleria (B. inaghatahani), Dictyocyclus (D. hydrangea), Eurytheca (E. trinitensis), Hemimyriangium (H. betulae), Micularia (M. merremiae) and Zukaliopsis (Z. amazonica) are placed in Myriangiaceae. Molecular sequence data from fresh collections are required to confirm the relationships and placement of the genera in this family.



Zootaxa ◽  
2017 ◽  
Vol 4238 (1) ◽  
pp. 58 ◽  
Author(s):  
ATSUSHI MOCHIZUKI ◽  
CHARLES S. HENRY ◽  
PETER DUELLI

The small lacewing genus Apertochrysa comprises species from Africa, Asia and Australia. All lack a tignum, but otherwise resemble distantly related genera. We show that Apertochrysa does not form a monophyletic clade, based on analyses of molecular sequence data and morphological traits such as the presence and shape of the male gonapsis, wing venation, and larval setae. Apertochrysa kichijoi forms a clade with Eremochrysa, Suarius and Chrysemosa, whereas A. albolineatoides belongs to a clade that includes Cunctochrysa. Apertochrysa albolineatoides should become a new combination as Cunctochrysa albolineatoides, while A. kichijoi will have to be transferred to a new genus. The Australian A. edwardsi, the African A. eurydera and the type species of the genus Apertochrysa, A. umbrosa, join the large Pseudomallada group. Relationships of A. umbrosa are less certain, because for it we could amplify only one of the three nuclear genes used in the overall analysis. However, in all morphological traits tested, that species strongly resembles A. edwardsi and A. eurydera and thus is very likely just another exceptional Pseudomallada lacking a tignum. The fate of the genus name Apertochrysa depends on additional molecular and morphological analyses of A. umbrosa. 



2009 ◽  
Vol 364 (1527) ◽  
pp. 2197-2207 ◽  
Author(s):  
Peter G. Foster ◽  
Cymon J. Cox ◽  
T. Martin Embley

The three-domains tree, which depicts eukaryotes and archaebacteria as monophyletic sister groups, is the dominant model for early eukaryotic evolution. By contrast, the ‘eocyte hypothesis’, where eukaryotes are proposed to have originated from within the archaebacteria as sister to the Crenarchaeota (also called the eocytes), has been largely neglected in the literature. We have investigated support for these two competing hypotheses from molecular sequence data using methods that attempt to accommodate the across-site compositional heterogeneity and across-tree compositional and rate matrix heterogeneity that are manifest features of these data. When ribosomal RNA genes were analysed using standard methods that do not adequately model these kinds of heterogeneity, the three-domains tree was supported. However, this support was eroded or lost when composition-heterogeneous models were used, with concomitant increase in support for the eocyte tree for eukaryotic origins. Analysis of combined amino acid sequences from 41 protein-coding genes supported the eocyte tree, whether or not composition-heterogeneous models were used. The possible effects of substitutional saturation of our data were examined using simulation; these results suggested that saturation is delayed by among-site rate variation in the sequences, and that phylogenetic signal for ancient relationships is plausibly present in these data.



Phytotaxa ◽  
2021 ◽  
Vol 514 (3) ◽  
pp. 247-260
Author(s):  
KASUN THAMBUGALA ◽  
DINUSHANI DARANAGAMA ◽  
SAGARIKA KANNANGARA ◽  
THENUKA KODITUWAKKU

Endophytic fungi are a diverse group of microorganisms that live asymptomatically in healthy tissues of their hosts, and they have been reported from all kinds of plant tissues such as leaves, stems, roots, flowers, and fruits. In this study, fungal endophytes associated with tea leaves (Camellia sinensis) were collected from Kandy, Kegalle, and Nuwara Eliya districts in Sri Lanka and were isolated, characterized, and identified. A total of twenty endophytic fungal isolates belonging to five genera were recovered and ITS-rDNA sequence data were used to identify them. All isolated endophytic fungal strains belong to the phylum Ascomycota and the majority of these isolates were identified as Colletotrichum species. Phyllosticta capitalensis was the most commonly found fungal endophyte in tea leaves and was recorded in all three districts where the samples were collected. This is the first investigation of fungal endophytes associated with C. sinensis in Sri Lanka based on molecular sequence data. In addition, a comprehensive account of known endophytic fungi reported worldwide on Camellia sinensis is provided.


