tree estimation
Recently Published Documents


Total documents: 131 (five years: 44)
H-index: 27 (five years: 3)

Biologia ◽  
2021 ◽  
Author(s):  
Tanja Plieger ◽  
Matthias Wolf

Protothecosis is an infectious disease caused by organisms currently classified within the green algal genus Prototheca. The disease can manifest as cutaneous lesions, olecranon bursitis, or disseminated or systemic infections in both immunocompetent and immunosuppressed patients. For diagnostics, taxonomic validity is important. Prototheca, closely related to the Chlorella species complex, is known to be polyphyletic, branching with Auxenochlorella and Helicosporidium. The phylogeny of Prototheca has been discussed and revisited several times in the last decade, and new species have been described. Phylogenetic analyses have been performed using ribosomal DNA (rDNA) and partial mitochondrial cytochrome b (cytb) sequence data. In this work we use Internal Transcribed Spacer 2 (ITS2) as well as 18S rDNA data. However, for the first time, we reconstruct phylogenetic relationships of Prototheca using primary sequence and RNA secondary structure information simultaneously, a concept shown to increase the robustness and accuracy of phylogenetic tree estimation. Using encoded sequence-structure data, Neighbor-Joining, Maximum-Parsimony and Maximum-Likelihood methods yielded well-supported trees that agree with other trees calculated on rDNA but differ in several aspects from trees that use cytb as a phylogenetic marker. ITS2 secondary structures of Prototheca sequences agree with the well-known common core structure of eukaryotes but show unusual differences in their helix lengths. An elongation of the fourth helix in some species seems to have occurred independently in the course of evolution.


2021 ◽  
Author(s):  
Max Hill ◽  
Sebastien Roch

We consider species tree estimation from multiple loci subject to intralocus recombination. We focus on R∗, a summary coalescent-based method using rooted triplets. We demonstrate analytically that intralocus recombination gives rise to an inconsistency zone, in which correct inference is not assured even in the limit of an infinite amount of data. In addition, we validate and characterize this inconsistency zone through a simulation study, which suggests that differential rates of recombination between closely related taxa can amplify the effect of incomplete lineage sorting and contribute to inconsistency.
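The rooted-triplet idea behind summary methods of this kind can be illustrated with a small sketch. It assumes gene trees are given as nested Python tuples of taxon names and only performs the majority vote over triplet topologies; assembling the winning triplets into a species tree, as R∗ does, is not shown. The function names and the toy gene trees are hypothetical and do not come from the paper.

```python
from collections import Counter
from itertools import combinations

def clades(tree):
    """Return (leaf set, list of internal-node leaf sets) for a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)
    return leaves, found

def triplet_topology(tree_clades, a, b, c):
    """Return the pair grouped apart from the third taxon, or None if unresolved."""
    for x, y, z in ((a, b, c), (a, c, b), (b, c, a)):
        if any(x in cl and y in cl and z not in cl for cl in tree_clades):
            return frozenset([x, y])
    return None

def majority_triplets(gene_trees, taxa):
    """For each triple of taxa, pick the topology seen in the most gene trees."""
    votes = {trip: Counter() for trip in combinations(sorted(taxa), 3)}
    for tree in gene_trees:
        leaves, tree_clades = clades(tree)
        for trip in votes:
            if set(trip) <= leaves:
                pair = triplet_topology(tree_clades, *trip)
                if pair is not None:
                    votes[trip][pair] += 1
    return {trip: c.most_common(1)[0][0] for trip, c in votes.items() if c}

# Hypothetical input: two loci support AB|C, one supports AC|B.
gene_trees = [(("A", "B"), "C"), (("A", "B"), "C"), (("A", "C"), "B")]
winners = majority_triplets(gene_trees, "ABC")
```

Ties between topologies are broken arbitrarily by `Counter.most_common` in this sketch. A real analysis would need an explicit rule for them.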


2021 ◽  
Author(s):  
Sazan Mahbub ◽  
Shashata Sawmya ◽  
Arpita Saha ◽  
Rezwana Reaz ◽  
M. Sohel Rahman ◽  
...  

Species tree estimation is frequently based on phylogenomic approaches that use multiple genes from throughout the genome. However, for a combination of reasons (ranging from sampling biases to more biological causes, as in gene birth and loss), gene trees are often incomplete, meaning that not all species of interest have a common set of genes. Incomplete gene trees can potentially impact the accuracy of phylogenomic inference. We, for the first time, introduce the problem of imputing the quartet distribution induced by a set of incomplete gene trees, which involves adding the missing quartets back to the quartet distribution. We present QT-GILD, an automated and specially tailored unsupervised deep learning technique, accompanied by cues from natural language processing (NLP), which learns the quartet distribution in a given set of incomplete gene trees and generates a complete set of quartets accordingly. QT-GILD is a general-purpose technique that needs no explicit modeling of the subject system or of the reasons for missing data or gene tree heterogeneity. Experimental studies on a collection of simulated and empirical data sets suggest that QT-GILD can effectively impute the quartet distribution, which results in a dramatic improvement in species tree accuracy. Remarkably, QT-GILD not only imputes the missing quartets but can also account for gene tree estimation error. Therefore, QT-GILD advances the state of the art in species tree estimation from gene trees in the face of missing data. QT-GILD is freely available in open source form at https://github.com/pythonLoader/QT-GILD.
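To make the notion of a quartet distribution concrete, the sketch below counts the quartet topologies induced by a set of gene trees, some of which lack a taxon. It is not QT-GILD and performs no imputation. It only computes the kind of input the imputation problem starts from, assuming gene trees are written as nested tuples. All names and the toy trees are hypothetical.

```python
from collections import Counter
from itertools import combinations

def leaf_sets(tree, out):
    """Append the leaf set below every internal node to `out`; return all leaves."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(leaf_sets(child, out) for child in tree))
    out.append(leaves)
    return leaves

def quartet_distribution(gene_trees):
    """Count how often each resolved quartet topology ab|cd is induced."""
    counts = Counter()
    for tree in gene_trees:
        splits = []
        leaves = leaf_sets(tree, splits)
        for q in combinations(sorted(leaves), 4):
            for side in splits:
                inside = tuple(t for t in q if t in side)
                if len(inside) == 2:
                    # This edge separates the four taxa two against two.
                    outside = tuple(t for t in q if t not in side)
                    counts[frozenset([inside, outside])] += 1
                    break
    return counts

# Hypothetical gene trees: the second and third are missing taxon E.
gene_trees = [
    ((("A", "B"), "C"), ("D", "E")),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
]
dist = quartet_distribution(gene_trees)
```

In this toy input every quartet that involves E is supported by a single gene tree, which is the sparsity that quartet imputation is meant to address.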


2021 ◽  
Author(s):  
Theo Sanderson

Phylogenetic trees are an important tool for interpreting sequenced genomes and their interrelationships. Estimating the date associated with each node of such a phylogeny creates a "time tree", which can be especially useful for visualising and analysing the evolution of organisms such as viruses. Several tools have been developed for time-tree estimation, but the sequencing explosion in response to the SARS-CoV-2 pandemic has created phylogenies so large that these earlier approaches cannot be applied to full datasets. Here we introduce Chronumental, a tool that can rapidly infer time trees from phylogenies featuring large numbers of nodes. Chronumental uses stochastic gradient descent to identify lengths of time for tree branches that maximise the evidence lower bound under a probabilistic model, implemented in a framework that can be compiled into XLA for rapid computation. We show that Chronumental scales to phylogenies featuring millions of nodes, with chronological predictions made in minutes, and is able to accurately predict the dates of nodes for which it is not provided with metadata.
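The general idea of fitting branch durations by gradient descent can be sketched on a toy tree. This is not Chronumental's model: it swaps the evidence lower bound for a plain squared-error loss on tip dates, uses full-batch rather than stochastic gradient descent, and runs in pure Python rather than an XLA-compiled framework. The tree, the dates, and all names are made up for illustration.

```python
# Hypothetical toy tree: each node maps to its parent (the root maps to None).
PARENT = {"root": None, "n1": "root", "A": "n1", "B": "n1", "C": "root"}
OBSERVED = {"A": 10.0, "B": 12.0, "C": 8.0}  # made-up tip dates

def path_to_root(node):
    """Nodes from `node` up to, but excluding, the root."""
    path = []
    while PARENT[node] is not None:
        path.append(node)
        node = PARENT[node]
    return path

def node_date(node, root_date, duration):
    """Date of a node: root date plus the durations of the branches above it."""
    return root_date + sum(duration[n] for n in path_to_root(node))

def fit(steps=2000, lr=0.05):
    """Minimise the squared error between predicted and observed tip dates."""
    root_date = 0.0
    duration = {n: 1.0 for n in PARENT if PARENT[n] is not None}
    for _ in range(steps):
        grad_root = 0.0
        grad = {n: 0.0 for n in duration}
        for tip, obs in OBSERVED.items():
            err = node_date(tip, root_date, duration) - obs
            grad_root += 2 * err
            for n in path_to_root(tip):
                grad[n] += 2 * err
        root_date -= lr * grad_root
        for n in duration:
            # Project onto non-negative durations after each step.
            duration[n] = max(0.0, duration[n] - lr * grad[n])
    return root_date, duration

root_date, duration = fit()
internal_date = node_date("n1", root_date, duration)  # node without metadata
```

Because the loss here only constrains tip dates, the internal node date is not uniquely determined. A probabilistic model that ties branch durations to observed genetic distances is what would make such estimates meaningful.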


Author(s):  
Siavash Mirarab ◽  
Luay Nakhleh ◽  
Tandy Warnow

Species tree estimation is a basic part of many biological research projects, ranging from answering basic evolutionary questions (e.g., how did a group of species adapt to their environments?) to addressing questions in functional biology. Yet, species tree estimation is very challenging, due to processes such as incomplete lineage sorting, gene duplication and loss, horizontal gene transfer, and hybridization, which can make gene trees differ from each other and from the overall evolutionary history of the species. Over the last 10–20 years, there has been tremendous growth in methods and mathematical theory for estimating species trees and phylogenetic networks, and some of these methods are now in wide use. In this survey, we provide an overview of the current state of the art, identify the limitations of existing methods and theory, and propose additional research problems and directions. Expected final online publication date for the Annual Review of Ecology, Evolution, and Systematics, Volume 52 is November 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


2021 ◽  
Author(s):  
Sumaira Zaman ◽  
Samuel Sledzieski ◽  
Bonnie Berger ◽  
Yi-Chieh Wu ◽  
Mukul S. Bansal

An accurate understanding of the evolutionary history of rapidly-evolving viruses like SARS-CoV-2, responsible for the COVID-19 pandemic, is crucial to tracking and preventing the spread of emerging pathogens. However, viruses undergo frequent recombination, which makes it difficult to trace their evolutionary history using traditional phylogenetic methods. Here, we present a phylogenetic workflow, virDTL, for analyzing viral evolution in the presence of recombination. Our approach leverages reconciliation methods developed for inferring horizontal gene transfer in prokaryotes, and, compared to existing tools, is uniquely able to identify ancestral recombinations while accounting for several sources of inference uncertainty, including in the construction of a strain tree, estimation and rooting of gene family trees, and reconciliation itself. We apply this workflow to the Sarbecovirus subgenus and demonstrate how a principled analysis of predicted recombination gives insight into the evolution of SARS-CoV-2. In addition to providing confirming evidence for the horseshoe bat as its zoonotic origin, we identify several ancestral recombination events that merit further study.


Author(s):  
Fadhah Amer Alanazi

Mixture R-vine copula models have been used in the literature to uncover hidden mixture dependencies among variables. They provide considerable flexibility for modeling multivariate data. As the dimension increases, the number of model parameters to be estimated grows dramatically, which brings massive computational time and effort. The situation becomes even more complicated in regular vine copula mixture models. Incorporating the truncation method into a mixture of regular vine models reduces the computational difficulty of mixture-based models. In this paper, the tree-by-tree estimation mixture model is combined with the truncation method to reduce computational time and the number of parameters that need to be estimated in mixture vine copula models. A simulation study and real data applications illustrate the performance of the method. In addition, the real data applications show the effect of the mixture components on the truncation level.


Algorithms ◽  
2021 ◽  
Vol 14 (5) ◽  
pp. 148
Author(s):  
Minhyuk Park ◽  
Paul Zaharias ◽  
Tandy Warnow

The estimation of phylogenetic trees for individual genes or multi-locus datasets is a basic part of considerable biological research. In order to enable large trees to be computed, Disjoint Tree Mergers (DTMs) have been developed; these methods operate by dividing the input sequence dataset into disjoint sets, constructing trees on each subset, and then combining the subset trees (using auxiliary information) into a tree on the full dataset. DTMs have been used to advantage for multi-locus species tree estimation, enabling highly accurate species trees at reduced computational effort, compared to leading species tree estimation methods. Here, we evaluate the feasibility of using DTMs to improve the scalability of maximum likelihood (ML) gene tree estimation to large numbers of input sequences. Our study shows distinct differences between the three selected ML codes—RAxML-NG, IQ-TREE 2, and FastTree 2—and shows that good DTM pipeline design can provide advantages over these ML codes on large datasets.


2021 ◽  
Vol 26 (2) ◽  
pp. 37
Author(s):  
Noah Giansiracusa

The voting patterns of the nine justices on the United States Supreme Court continue to fascinate and perplex observers of the Court. While it is commonly understood that the division of the justices into a liberal branch and a conservative branch inevitably drives many case outcomes, there are finer, less transparent divisions within these two main branches that have proven difficult to extract empirically. This study imports methods from evolutionary biology to help illuminate the intricate and often overlooked branching structure of the justices’ voting behavior. Specifically, phylogenetic tree estimation based on voting disagreement rates is used to extend ideal point estimation to the non-Euclidean setting of hyperbolic metrics. After introducing this framework, comparing it to one- and two-dimensional multidimensional scaling, and arguing that it flexibly captures important higher-dimensional voting behavior, a handful of potential ways to apply this tool are presented. The emphasis throughout is on interpreting these judicial trees and extracting qualitative insights from them.
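A rough sketch of the first two steps described here, turning votes into pairwise disagreement rates and then into a tree topology, might look as follows. The abstract does not say which tree-building algorithm is used. Neighbor joining is assumed here as a common distance-based choice, and only the topology is computed, without branch lengths. The justices' labels and votes are invented.

```python
from fractions import Fraction
from itertools import combinations

def disagreement_rates(votes):
    """Pairwise fraction of cases in which two voters disagree."""
    names = sorted(votes)
    dist = {a: {} for a in names}
    for a, b in combinations(names, 2):
        differing = sum(x != y for x, y in zip(votes[a], votes[b]))
        dist[a][b] = dist[b][a] = Fraction(differing, len(votes[a]))
    return dist

def neighbor_joining_topology(dist):
    """Join closest neighbors until three nodes remain; return nested tuples."""
    d = {a: dict(row) for a, row in dist.items()}
    nodes = sorted(d)
    while len(nodes) > 3:
        n = len(nodes)
        total = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Standard neighbor-joining criterion: minimise (n-2)*d_ij - r_i - r_j.
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - total[p[0]] - total[p[1]],
        )
        new = (i, j)
        d[new] = {}
        for k in nodes:
            if k != i and k != j:
                d[new][k] = d[k][new] = (d[i][k] + d[j][k] - d[i][j]) / 2
        nodes = [k for k in nodes if k != i and k != j] + [new]
    return tuple(nodes)

# Invented votes (1 = with the majority, 0 = dissent) over six cases.
votes = {
    "J1": [1, 1, 1, 1, 0, 0],
    "J2": [1, 1, 1, 0, 0, 0],
    "J3": [0, 0, 0, 1, 1, 1],
    "J4": [0, 0, 1, 1, 1, 1],
}
dist = disagreement_rates(votes)
tree = neighbor_joining_topology(dist)
```

Exact fractions are used for the rates so that ties in the joining criterion are resolved deterministically by iteration order rather than by floating-point rounding.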


2021 ◽  
Vol 7 (3) ◽  
pp. 191
Author(s):  
Vladimír Antonín ◽  
Hana Ševčíková ◽  
Roberto Para ◽  
Ondrej Ďuriška ◽  
Tomáš Kudláček ◽  
...  

Melanoleuca is one of the taxonomically most complicated genera of Agaricomycetes, with several taxonomic lineages. The subgenus Urticocystis of the genus Melanoleuca contains species with either urticoid or absent cheilocystidia. In this paper, three European species, Melanoleuca galbuserae, Melanoleuca fontenlae, and Melanoleuca acystidiata, are described as new to science. Melanoleuca galbuserae, related to Melanoleuca stepposa and Melanoleuca tristis, was discovered in alpine grasslands in northern Italy. The type specimens and recent collections of Melanoleuca angelesiana, Melanoleuca castaneofusca, Melanoleuca luteolosperma, Melanoleuca pseudopaedida, and Melanoleuca robertiana were sequenced and morphologically examined. Moreover, the related Melanoleuca microcephala and Melanoleuca paedida were included in the morphological examination and DNA sequence analyses. All the species were delimited by macro- and micromorphological characters and by multigene phylogenetic analyses of a combined (ITS, rpb2, and tef1) dataset on the basis of species tree estimation. In accordance with the new molecular and morphological data, we suggest a taxonomic reappraisal of M. pseudopaedida and M. robertiana, and M. fontenlae and M. acystidiata are proposed as new species. The differences between the type material of M. angelesiana from the USA and European M. angelesiana specimens are discussed.

