Lower statistical support with larger datasets: insights from the Ochrophyta radiation

2021 ◽  
Author(s):  
Arnaud Di Franco ◽  
Denis Baurain ◽  
Gernot Glöckner ◽  
Michael Melkonian ◽  
Hervé Philippe

Abstract It is commonly assumed that increasing the number of characters has the potential to resolve radiations. We studied photosynthetic stramenopiles (Ochrophyta) using alignments of heterogeneous size and origin (6,762 sites for mitochondrion, 21,692 sites for plastid and 209,105 sites for nucleus). While statistical support for the relationships between the six major Ochrophyta lineages increases when comparing the mitochondrion and plastid trees, it decreases in the nuclear tree. Statistical support is not simply related to the dataset size but also to the quantity of phylogenetic signal available at each position and our ability to extract it. Here, we show that proper signal extraction is difficult to attain, as demonstrated by conflicting results obtained when varying taxon sampling. Even though the use of a better fitting model improved signal extraction and reduced the observed conflicts, the plastid dataset provided higher statistical support for the ochrophyte radiation than the larger nucleus dataset. We propose that the higher support observed in the plastid tree is due to an acceleration of the evolutionary rate in one short deep internal branch, implying that more phylogenetic signal per position is available to resolve the Ochrophyta radiation in the plastid than in the nuclear dataset. Our work therefore suggests that, in order to resolve radiations, beyond the obvious use of datasets with more positions, we need to continue developing models of sequence evolution that better extract the phylogenetic signal and design methods to search for genes/characters that contain more signal specifically for short internal branches.

2017 ◽  
Author(s):  
Marek L. Borowiec

Abstract The evolution of the suite of morphological and behavioral adaptations underlying the ecological success of army ants has been the subject of considerable debate. This “army ant syndrome” has been argued to have arisen once or multiple times within the ant subfamily Dorylinae. To address this question I generated data from 2,166 loci and a comprehensive taxon sampling for a phylogenetic investigation. Most analyses show strong support for convergent evolution of the army ant syndrome in the Old and New World but certain relationships are sensitive to analytics. I examine the signal present in this data set and find that conflict is diminished when only loci less likely to violate common phylogenetic model assumptions are considered. I also provide a temporal and spatial context for doryline evolution with time-calibrated, biogeographic, and diversification rate shift analyses. This study underscores the need for cautious analysis of phylogenomic data and calls for more efficient algorithms employing better-fitting models of molecular evolution.

Significance Recent interpretation of army ant evolution holds that army ant behavior and morphology originated only once within the subfamily Dorylinae. An inspection of phylogenetic signal in a large new data set shows that support for this hypothesis may be driven by bias present in the data. Convergent evolution of the army ant syndrome is consistently supported when sequences violating assumptions of a commonly used model of sequence evolution are excluded from the analysis. This hypothesis also fits with a simple scenario of doryline biogeography. These results highlight the importance of careful evaluation of signal and conflict within phylogenomic data sets, even when taxon sampling is comprehensive.


Genetics ◽  
2000 ◽  
Vol 154 (4) ◽  
pp. 1711-1720 ◽  
Author(s):  
Bryant F McAllister ◽  
Gilean A T McVean

Abstract The amino acid sequence of the transformer (tra) gene exhibits an extremely rapid rate of evolution among Drosophila species, although the gene performs a critical step in sex determination. These changes in amino acid sequence are the result of either natural selection or neutral evolution. To differentiate between selective and neutral causes of this evolutionary change, analyses of both intraspecific and interspecific patterns of molecular evolution of tra gene sequences are presented. Sequences of 31 tra alleles were obtained from Drosophila americana. Many replacement and silent nucleotide variants are present among the alleles; however, the distribution of this sequence variation is consistent with neutral evolution. Sequence evolution was also examined among six species representative of the genus Drosophila. For most lineages and most regions of the gene, both silent and replacement substitutions have accumulated in a constant, clock-like manner. In exon 3 of D. virilis and D. americana we find evidence for an elevated rate of nonsynonymous substitution, but no statistical support for a greater rate of nonsynonymous relative to synonymous substitutions. Both levels of analysis of the tra sequence suggest that, although the gene is evolving at a rapid pace, these changes are neutral in function.


2019 ◽  
Vol 70 (1) ◽  
pp. 132-135 ◽  
Author(s):  
Sabelle Jallow ◽  
Jo M Wilmshurst ◽  
Wayne Howard ◽  
Julie Copelyn ◽  
Lerato Seakamela ◽  
...  

Abstract Primary B-cell immunodeficiencies are risk factors for the generation of vaccine-derived polioviruses. We report immunodeficiency-associated vaccine-derived poliovirus serotype 3 in an 11-week-old boy with X-linked agammaglobulinemia. Unique characteristics of this case include early age of presentation, high viral evolutionary rate, and the child’s perinatal exposure to human immunodeficiency virus.


Author(s):  
Romain Sabroux ◽  
Laure Corbari ◽  
Franz Krapp ◽  
Céline Bonillo ◽  
Stéphanie Le Prieur ◽  
...  

The family Ammotheidae is the most diversified group of the class Pycnogonida, with 297 species described in 20 genera. Its monophyly and intergeneric relationships have been highly debated in previous studies. Here, we investigated the phylogeny of Ammotheidae using specimens from poorly studied areas. We sequenced the mitochondrial gene encoding the first subunit of cytochrome c oxidase (CO1) from 104 specimens. The complete nuclear 18S rRNA gene was sequenced from a selection of 80 taxa to provide further phylogenetic signal. The base composition in CO1 shows a higher heterogeneity in Ammotheidae than in other families, which may explain their apparent polyphyly in the CO1 tree. Although deeper nodes of the tree receive no statistical support, Ammotheidae was found to be monophyletic and divided into two clades, here defined as distinct subfamilies: Achelinae comprises the genera Achelia Hodge, 1864, Ammothella Verrill, 1900, Nymphopsis Haswell, 1884 and Tanystylum Miers, 1879; and Ammotheinae includes the genera Ammothea Leach, 1814, Acheliana Arnaud, 1971, Cilunculus Loman, 1908, Sericosura Fry & Hedgpeth, 1969 and also Teratonotum gen. nov., including so far only the type species Ammothella stauromata Child, 1982. The species Cilunculus gracilis Nakamura & Child, 1991 is reassigned to Ammothella, forming the binomen Ammothella gracilis (Nakamura & Child, 1991) comb. nov. Additional taxonomic re-arrangements are suggested for the genera Achelia, Acheliana, Ammothella and Cilunculus.


2020 ◽  
Author(s):  
Gus Waneka ◽  
Yumary M. Vasquez ◽  
Gordon M. Bennett ◽  
Daniel B. Sloan

Abstract Compared to free-living bacteria, endosymbionts of sap-feeding insects have tiny and rapidly evolving genomes. Increased genetic drift, high mutation rates, and relaxed selection associated with host control of key cellular functions all likely contribute to genome decay. Phylogenetic comparisons have revealed massive variation in endosymbiont evolutionary rate, but such methods make it difficult to partition the effects of mutation vs. selection. For example, the ancestor of auchenorrhynchan insects contained two obligate endosymbionts, Sulcia and a betaproteobacterium (BetaSymb; called Nasuia in leafhoppers) that exhibit divergent rates of sequence evolution and different propensities for loss and replacement in the ensuing ∼300 Ma. Here, we use the auchenorrhynchan leafhopper Macrosteles sp. nr. severini, which retains both of the ancestral endosymbionts, to test the hypothesis that differences in evolutionary rate are driven by differential mutagenesis. We used a high-fidelity technique known as duplex sequencing to measure and compare low-frequency variants in each endosymbiont. Our direct detection of de novo mutations reveals that the rapidly evolving endosymbiont (Nasuia) has a much higher frequency of single-nucleotide variants than the more stable endosymbiont (Sulcia) and a mutation spectrum that is even more AT-biased than implied by the 83.1% AT content of its genome. We show that indels are common in both endosymbionts but differ substantially in length and distribution around repetitive regions. Our results suggest that differences in long-term rates of sequence evolution in Sulcia vs. BetaSymb, and perhaps the contrasting degrees of stability of their relationships with the host, are driven by differences in mutagenesis.

Significance Statement Two ancient endosymbionts in the same host lineage display stark differences in genome conservation over phylogenetic scales. We show the rapidly evolving endosymbiont has a higher frequency of mutations, as measured with duplex sequencing. Therefore, differential mutagenesis likely drives evolutionary rate variation in these endosymbionts.


2021 ◽  
Author(s):  
Euki Yazaki ◽  
Akinori Yabuki ◽  
Ayaka Imaizumi ◽  
Keitaro Kume ◽  
Tetsuo Hashimoto ◽  
...  

Abstract As-yet-undescribed branches in the tree of eukaryotes are potentially represented by some “orphan” protists (unicellular micro-eukaryotes), whose phylogenetic affiliations have not been clarified in previous studies. By clarifying the phylogenetic positions of orphan protists, we may fill the previous gaps in the diversity of eukaryotes and further uncover novel affiliations between two (or more) major lineages in eukaryotes. Microheliella maris was originally described as a member of the phylum Heliozoa, but a pioneering large-scale phylogenetic analysis failed to place this organism within the previously described species/lineages with confidence. In this study, we analyzed a 319-gene alignment and demonstrated that M. maris represents a basal lineage of one of the major eukaryotic lineages, Cryptista. We here propose a new clade name “Pancryptista” for Cryptista plus M. maris. The 319-gene analyses also indicated that M. maris is a key taxon to recover the monophyly of Archaeplastida and the sister relationship between Archaeplastida and Pancryptista, which are collectively called the “CAM clade” here. Significantly, Cryptophyceae tend to be attracted to Rhodophyta depending on the taxon sampling (e.g., in the absence of M. maris and Rhodelphidia) and this particular phylogenetic “signal” most likely hindered the stable recovery of the monophyly of Archaeplastida in previous studies. We hypothesize that many cryptophycean genes (including those in the 319-gene alignment) recombined partially with the homologous genes transferred from the red algal endosymbiont during secondary endosymbiosis and bear a faint phylogenetic affinity to the rhodophytan genes.


2017 ◽  
Author(s):  
Brigitte Boeckmann ◽  
David Dylus ◽  
Sebastien Moretti ◽  
Adrian Altenhoff ◽  
Clément-Marie Train ◽  
...  

Abstract Medium to large phylogenetic gene trees constructed from datasets of different species density and taxonomic range are rarely topologically consistent because of missing phylogenetic signal, non-phylogenetic signal and error. In this study, we first use simulations to show that taxon sampling unequally affects nodes in a gene tree, which likely contributes to controversial conclusions from taxon sampling experiments and contradicting species phylogenies such as for the boreoeutherians. Hence, because it is unlikely that a large gene tree can be reconstructed correctly based on a single optimized dataset, we take a two-step approach for the construction of model gene trees. First, stable and unstable clades are identified by comparing phylogenetic trees inferred from multiple datasets and data types (nucleotide, amino acid, codon) from the same gene family. Subsequently, data subsets are optimized for the analysis of individual uncertain clades. Results are summarized in the form of a model tree that illustrates the evolutionary relationship of gene loci. A case study shows how a seemingly complex gene phylogeny becomes increasingly consistent with the reference species tree by attentive taxon sampling and subtree analysis. The procedure is progressively introduced to SwissTree (http://swisstree.vital-it.ch), a resource of high confidence model gene (locus) trees. Finally, we demonstrate the usefulness of SwissTree for orthology benchmarking.


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e9554
Author(s):  
Patrick Evans ◽  
Nancy J. Cox ◽  
Eric R. Gamazon

The development of explanatory models of protein sequence evolution has broad implications for our understanding of cellular biology, population history, and disease etiology. Here we analyze the GTEx transcriptome resource to quantify the effect of the transcriptome on protein sequence evolution in a multi-tissue framework. We find substantial variation among the central nervous system tissues in the effect of expression variance on evolutionary rate, with highly variable genes in the cortex showing significantly greater purifying selection than highly variable genes in subcortical regions (Mann–Whitney U p = 1.4 × 10⁻⁴). The remaining tissues cluster in observed expression correlation with evolutionary rate, enabling evolutionary analysis of genes in diverse physiological systems, including digestive, reproductive, and immune systems. Importantly, the tissue in which a gene attains its maximum expression variance significantly varies (p = 5.55 × 10⁻²⁸⁴) with evolutionary rate, suggesting a tissue-anchored model of protein sequence evolution. Using a large-scale reference resource, we show that the tissue-anchored model provides a transcriptome-based approach to predicting the primary affected tissue of developmental disorders. Using gradient boosted regression trees to model evolutionary rate under a range of model parameters, selected features explain up to 62% of the variation in evolutionary rate and provide additional support for the tissue model. Finally, we investigate several methodological implications, including the importance of evolutionary-rate-aware gene expression imputation models using genetic data for improved search for disease-associated genes in transcriptome-wide association studies. Collectively, this study presents a comprehensive transcriptome-based analysis of a range of factors that may constrain molecular evolution and proposes a novel framework for the study of gene function and disease mechanism.

