Polynomial Phylogenetic Analysis of Tree Shapes

2020 ◽  
Author(s):  
Pengyu Liu ◽  
Priscila Biller ◽  
Matthew Gould ◽  
Caroline Colijn

Abstract Phylogenetic trees are a central tool in evolutionary biology. They demonstrate evolutionary patterns among species, genes, and, with modern sequencing technologies, patterns of ancestry among sets of individuals. Phylogenetic trees usually consist of tree shapes, branch lengths and partial labels. Comparing tree shapes is a challenging aspect of comparing phylogenetic trees as there are few tools to describe tree shapes in a quantitative, accurate, comprehensive and easy-to-interpret way. Current methods to compare tree shapes are often based on scalar indices reflecting tree imbalance, and on frequencies of small subtrees. In this paper, we present tree comparisons and applications based on a polynomial that fully characterizes trees. Polynomials are important tools to describe discrete structures and have been used to study various objects including graphs and knots. There are also polynomials that describe rooted trees. We use tree-defining polynomials to compare tree shapes randomly generated by simulations and tree shapes reconstructed from data. Moreover, we show that the comparisons can be used to estimate parameters and to select the best-fit model that generates specific tree shapes.
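
The abstract does not state the polynomial itself. As an illustration only, the sketch below assumes the two-variable recursive definition commonly given for a tree-distinguishing polynomial: a leaf contributes x, and an internal node contributes y plus the product of its children's polynomials. The nested-tuple tree encoding and the function names are hypothetical choices made for this example, not the authors' implementation.

```python
from collections import defaultdict

# A polynomial in x and y is stored as {(power_of_x, power_of_y): coefficient}.
# ASSUMPTION: leaf -> x, internal node -> y + product of child polynomials.
# This recursion is not given in the abstract; it is used here for illustration.

def poly_mul(p, q):
    """Multiply two polynomials stored as exponent-pair dictionaries."""
    out = defaultdict(int)
    for (a, b), c in p.items():
        for (d, e), f in q.items():
            out[(a + d, b + e)] += c * f
    return dict(out)

def tree_polynomial(tree):
    """Return the polynomial of a rooted tree given as nested tuples.

    Any non-tuple value is treated as a leaf (hypothetical encoding).
    """
    if not isinstance(tree, tuple):
        return {(1, 0): 1}                      # leaf: x
    prod = {(0, 0): 1}                          # start from the constant 1
    for child in tree:
        prod = poly_mul(prod, tree_polynomial(child))
    prod[(0, 1)] = prod.get((0, 1), 0) + 1      # internal node: add y
    return prod

# Two four-leaf tree shapes: a caterpillar and a fully balanced tree.
caterpillar = ((("a", "b"), "c"), "d")
balanced = (("a", "b"), ("c", "d"))

print(tree_polynomial(caterpillar))
print(tree_polynomial(balanced))
```

Under this assumed definition the two shapes give different polynomials (x^4 + x^2 y + x y + y for the caterpillar, x^4 + 2 x^2 y + y^2 + y for the balanced tree), which is the kind of difference a polynomial-based shape comparison could measure.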

ZooKeys ◽  
2022 ◽  
Vol 1081 ◽  
pp. 111-125
Author(s):  
Wenjing Li ◽  
Ning Qiu ◽  
Hejun Du

Rhodeus cyanorostris Li, Liao & Arai, 2020 is a freshwater fish that is endemic to China and restricted to Chengdu City in Sichuan Province. This study is the first to sequence and characterize the complete mitochondrial genome of R. cyanorostris. The mitogenome of R. cyanorostris is 16580 bp in length, including 13 protein-coding genes, two rRNA genes, 22 tRNA genes, and a control region (D-loop). The base composition of the sequence is 28.5% A, 27.6% C, 26.4% T, and 17.5% G, with a bias toward A+T. The genome structure, nucleotide composition, and codon usage of the mitogenome of R. cyanorostris are consistent with those of other species of Rhodeus. To verify the molecular phylogeny of the genus Rhodeus, we provide new insights to better understand the taxonomic status of R. cyanorostris. The phylogenetic trees present four major clades based on 19 mitogenomic sequences from 16 Rhodeus species. Rhodeus cyanorostris exhibits the closest phylogenetic relationship with R. pseudosericeus, R. amarus, and R. sericeus. This study discloses the complete mitochondrial genome sequence of R. cyanorostris for the first time and provides the most comprehensive phylogenetic reconstruction of the genus Rhodeus based on whole mitochondrial genome sequences. The information obtained in this study will provide new insights for conservation, phylogenetic analysis, and evolutionary biology research.
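
The stated A+T bias can be checked directly from the reported percentages. The short sketch below does that arithmetic and also shows how the same figures might be computed from a raw sequence; the function names and the toy sequence are illustrative assumptions, not part of the study.

```python
from collections import Counter

def base_composition(seq):
    """Return the percentage of each base (A, C, G, T) in a DNA string."""
    seq = seq.upper()
    counts = Counter(b for b in seq if b in "ACGT")
    total = sum(counts.values())
    return {base: 100.0 * counts[base] / total for base in "ACGT"}

def at_content(composition):
    """Sum the A and T percentages of a composition dictionary."""
    return composition["A"] + composition["T"]

# Percentages reported in the abstract for the R. cyanorostris mitogenome.
reported = {"A": 28.5, "C": 27.6, "T": 26.4, "G": 17.5}

print(round(at_content(reported), 1))                    # A+T percentage
print(round(100.0 - at_content(reported), 1))            # G+C percentage

# Toy sequence (hypothetical, for demonstration only).
print(round(at_content(base_composition("AATTGC")), 1))
```

With the reported values, A+T comes to 54.9% against 45.1% for G+C, consistent with the bias described.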


Author(s):  
Vadim Puller ◽  
Pavel Sagulenko ◽  
Richard A. Neher

Abstract Natural selection imposes a complex filter on which variants persist in a population, resulting in evolutionary patterns that vary greatly along the genome. Some sites evolve close to neutrally, while others are highly conserved, allow only specific states or only change in concert with other sites. Most commonly used evolutionary models, however, ignore much of this complexity and at best account for variation in the rate at which different sites change. Here, we present an efficient algorithm to estimate more complex models that allow for site-specific preferences and explore the accuracy with which such models can be estimated from simulated data. We find that an iterative approximate maximum likelihood scheme uses information in the data efficiently and accurately estimates site-specific preferences from large data sets with moderately diverged sequences. Ignoring site-specific preferences during estimation of branch length of phylogenetic trees – an assumption of most phylogeny software – results in substantial underestimation comparable to the error incurred when ignoring rate variation. However, the joint estimation of branch lengths, site-specific rates, and site-specific preferences can suffer from identifiability problems and is typically unable to recover the correct branch lengths. Site-specific preferences estimated from large HIV pol alignments show qualitative concordance with intra-host estimates of fitness costs. Analysis of site-specific HIV substitution models suggests near saturation of divergence after a few hundred years. Such saturation can explain the inability to infer deep divergence times of HIV and SIVs using molecular clock approaches and time-dependent rate estimates.


2018 ◽  
Vol 18 (3-4) ◽  
pp. 656-672 ◽  
Author(s):  
THANH HAI NGUYEN ◽  
ENRICO PONTELLI ◽  
TRAN CAO SON

Abstract Evolutionary biologists have long struggled with the challenge of developing analysis workflows in a flexible manner, thus facilitating the reuse of phylogenetic knowledge. An evolutionary biology workflow can be viewed as a plan which composes web services that can retrieve, manipulate, and produce phylogenetic trees. The Phylotastic project was launched two years ago as a collaboration between evolutionary biologists and computer scientists, with the goal of developing an open architecture to facilitate the creation of such analysis workflows. While composition of web services is a problem that has been extensively explored in the literature, including within the logic programming domain, the incarnation of the problem in Phylotastic provides a number of additional challenges. Along with the need to integrate preferences and formal ontologies in the description of the desired workflow, evolutionary biologists tend to construct workflows incrementally, successively refining the workflow by indicating desired changes (e.g., exclusion of certain services, modifications of the desired output). This leads to the need for successive iterations of incremental replanning, to develop a new workflow that integrates the requested changes while minimizing the changes to the original workflow. This paper illustrates how Phylotastic has addressed the challenges of creating and refining phylogenetic analysis workflows using logic programming technology and how such solutions have been used within the general framework of the Phylotastic project.


2012 ◽  
Vol 39 (2) ◽  
pp. 217-233 ◽  
Author(s):  
J. David Archibald

Studies of the origin and diversification of major groups of plants and animals are contentious topics in current evolutionary biology. This includes the study of the timing and relationships of the two major clades of extant mammals – marsupials and placentals. Molecular studies concerned with marsupial and placental origin and diversification can be at odds with the fossil record. Such studies are, however, not a recent phenomenon. Over 150 years ago Charles Darwin weighed two alternative views on the origin of marsupials and placentals. Less than a year after the publication of On the origin of species, Darwin outlined these in a letter to Charles Lyell dated 23 September 1860. The letter concluded with two competing phylogenetic diagrams. One showed marsupials as ancestral to both living marsupials and placentals, whereas the other showed a non-marsupial, non-placental as being ancestral to both living marsupials and placentals. These two diagrams are published here for the first time. These are the only such competing phylogenetic diagrams that Darwin is known to have produced. In addition to examining the question of mammalian origins in this letter and in other manuscript notes discussed here, Darwin confronted the broader issue as to whether major groups of animals had a single origin (monophyly) or were the result of “continuous creation” as advocated for some groups by Richard Owen. Charles Lyell had held similar views to those of Owen, but it is clear from correspondence with Darwin that he was beginning to accept the idea of monophyly of major groups.


Studies of animal behavior often assume that all members of a species exhibit the same behavior. Geographic Variation in Behavior shows that, on the contrary, there is substantial variation within species across a wide range of taxa. Including work from pioneers in the field, this volume provides a balanced overview of research on behavioral characteristics that vary geographically. The authors explore the mechanisms by which behavioral differences evolve and examine related methodological issues. Taken together, the work collected here demonstrates that genetically based geographic variation may be far more widespread than previously suspected. The book also shows how variation in behavior can illuminate both behavioral evolution and general evolutionary patterns. Unique among books on behavior in its emphasis on geographic variation, this volume is a valuable new resource for students and researchers in animal behavior and evolutionary biology.


Insects ◽  
2021 ◽  
Vol 12 (8) ◽  
pp. 668
Author(s):  
Tinghao Yu ◽  
Yalin Zhang

More studies are using mitochondrial genomes of insects to explore the sequence variability, evolutionary traits, monophyly of groups and phylogenetic relationships. Controversies remain on the classification of the Mileewinae and the phylogenetic relationships between Mileewinae and other subfamilies remain ambiguous. In this study, we present two newly completed mitogenomes of Mileewinae (Mileewa rufivena Cai and Kuoh 1997 and Ujna puerana Yang and Meng 2010) and conduct comparative mitogenomic analyses based on several different factors. These species have quite similar features, including their nucleotide content, codon usage of protein genes and the secondary structure of tRNA. Gene arrangement is identical and conserved, the same as the putative ancestral pattern of insects. All protein-coding genes of U. puerana began with the start codon ATN, while 5 Mileewa species had the abnormal initiation codon TTG in ND5 and ATP8. Moreover, M. rufivena had an intergenic spacer of 17 bp that could not be found in other mileewine species. Phylogenetic analysis based on three datasets (PCG123, PCG12 and AA) with two methods (maximum likelihood and Bayesian inference) recovered the Mileewinae as a monophyletic group with strong support values. All results in our study indicate that Mileewinae has a closer phylogenetic relationship to Typhlocybinae compared to Cicadellinae. Additionally, six species within Mileewini revealed the relationship (U. puerana + (M. ponta + (M. rufivena + M. alara) + (M. albovittata + M. margheritae))) in most of our phylogenetic trees. These results contribute to the study of the taxonomic status and phylogenetic relationships of Mileewinae.


Pathogens ◽  
2021 ◽  
Vol 10 (1) ◽  
pp. 41
Author(s):  
Marcos Godoy ◽  
Daniel A. Medina ◽  
Rudy Suarez ◽  
Sandro Valenzuela ◽  
Jaime Romero ◽  
...  

Piscine orthoreovirus (PRV) belongs to the family Reoviridae and has been described mainly in association with salmonid infections. The genome of PRV consists of about 23,600 bp, with 10 segments of double-stranded RNA, classified as small (S1 to S4), medium (M1, M2 and M3) and large (L1, L2 and L3); these range approximately from 1000 bp (segment S4) to 4000 bp (segment L1). How the genetic variation among PRV strains affects the virulence for salmonids is still poorly understood. The aim of this study was to describe the molecular phylogeny of PRV based on an extensive sequence analysis of the S1 and M2 segments of PRV available in the GenBank database to date (May 2020). The analysis was extended to include new PRV sequences for S1 and M2 segments. In addition, subgenotype classifications were assigned to previously published unclassified sequences. It was concluded that the phylogenetic trees are consistent with the original classification using the PRV genomic segment S1, which differentiates PRV into two major genotypes, I and II, and each of these into two subgenotypes, designated as Ia and Ib, and IIa and IIb, respectively. Moreover, some clusters of country- and host-specific PRV subgenotypes were observed in the subset of sequences used. This work strengthens the subgenotype classification of PRV based on the S1 segment and can be used to enhance research on the virulence of PRV.


Biomolecules ◽  
2019 ◽  
Vol 9 (10) ◽  
pp. 572 ◽  
Author(s):  
Wang

MicroRNA (miRNA) is a small non-coding RNA that functions in the epigenetic control of gene expression and can serve as a biomarker for diseases. Anti-NMDA receptor (anti-NMDAR) encephalitis is an acute autoimmune disorder. Some patients have been found to have tumors, specifically teratomas. This disease occurs more often in females than in males. Most of these patients recover significantly after tumor resection, which shows that the tumor may induce anti-NMDAR encephalitis. In this study, I review microRNA (miRNA) biomarkers that are associated with anti-NMDAR encephalitis and related tumors, respectively. To the best of my knowledge, there has not been any research in the literature investigating the relationship between anti-NMDAR encephalitis and tumors through their miRNA biomarkers. I adopt a phylogenetic analysis to plot the phylogenetic trees of their miRNA biomarkers. From the analyzed results, it may be concluded that (i) there is a relationship between these tumors and anti-NMDAR encephalitis, and (ii) this disease occurs more often in females than in males. This sheds light on this issue through miRNA intervention.


2019 ◽  
Vol 1 (1) ◽  
Author(s):  
D C Blackburn ◽  
G Giribet ◽  
D E Soltis ◽  
E L Stanley

Abstract Although our inventory of Earth’s biodiversity remains incomplete, we still require analyses using the Tree of Life to understand evolutionary and ecological patterns. Because incomplete sampling may bias our inferences, we must evaluate how future additions of newly discovered species might impact analyses performed today. We describe an approach that uses taxonomic history and phylogenetic trees to characterize the impact of past species discoveries on phylogenetic knowledge using patterns of branch-length variation, tree shape, and phylogenetic diversity. This provides a framework for assessing the relative completeness of taxonomic knowledge of lineages within a phylogeny. To demonstrate this approach, we use recent large phylogenies for amphibians, reptiles, flowering plants, and invertebrates. Well-known clades exhibit a decline in the mean and range of branch lengths that are added each year as new species are described. With increased taxonomic knowledge over time, deep lineages of well-known clades become known such that most recently described new species are added close to the tips of the tree, reflecting changing tree shape over the course of taxonomic history. The same analyses reveal other clades to be candidates for future discoveries that could dramatically impact our phylogenetic knowledge. Our work reveals that species are often added non-randomly to the phylogeny over multiyear time-scales in a predictable pattern of taxonomic maturation. Our results suggest that we can make informed predictions about how new species will be added across the phylogeny of a given clade, thus providing a framework for accommodating unsampled undescribed species in evolutionary analyses.


2019 ◽  
Author(s):  
Patrick Monnahan ◽  
Yaniv Brandvain

Abstract Searching for population genomic signals left behind by positive selection is a major focus of evolutionary biology, particularly as sequencing technologies develop and costs decline. The effect of the number of chromosome copies (i.e. ploidy) on the manifestation of these signals remains an outstanding question, despite a wide appreciation of ploidy being a fundamental parameter governing numerous biological processes. We clarify the principal forces governing the differential manifestation and persistence of the signal of selection by separating the effects of polyploidy on rates of fixation versus rates of diversity (i.e. mutation and recombination) with a set of coalescent simulations. We explore what the major consequences of polyploidy, such as a more localized signal, greater dependence on dominance, and longer persistence of the signal following fixation, mean for within- and across-ploidy inference on the strength and prevalence of selective sweeps. As genomic advances continue to open doors for interrogating natural systems, studies such as this aid our ability to anticipate, interpret, and compare data across ploidy levels.

