A supertree pipeline for summarizing phylogenetic and taxonomic information for millions of species

Author(s):  
Benjamin D Redelings ◽  
Mark T Holder

We present a new supertree method that enables rapid estimation of a summary tree on the scale of millions of leaves. This supertree method summarizes a collection of input phylogenies and an input taxonomy. We introduce formal goals and criteria for such a supertree to satisfy in order to transparently and justifiably represent the input trees. In addition to producing a supertree, our method computes annotations that describe which groupings in the input trees support and conflict with each group in the supertree. We compare our supertree construction method to a previously published supertree construction method by assessing their performance on the input trees used to construct the Open Tree of Life version 4, and find that our method increases the number of displayed input splits from 35,518 to 39,639 and decreases the number of conflicting input splits from 2,760 to 1,357. The new supertree method also improves on the previous method in that it produces no unsupported branches and avoids unnecessary polytomies. This pipeline is currently used by the Open Tree of Life project to produce all versions of the project's "synthetic tree" starting at version 5. This software pipeline is called "propinquity". It relies heavily on "otcetera", a set of C++ tools that perform most of the steps of the pipeline. All of the components are free software and are available on GitHub.
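The notions of "displayed" and "conflicting" input splits can be illustrated with a minimal sketch. It assumes rooted trees represented as sets of leaf-label clusters and uses the common textbook definitions (a split is displayed if some supertree cluster, restricted to the input tree's leaves, equals it; it conflicts if some restricted cluster properly overlaps it). The function name `classify_split` and the toy trees are hypothetical, and the paper's own formal criteria may differ in detail.

```python
from typing import FrozenSet, Iterable

Cluster = FrozenSet[str]


def classify_split(input_cluster: Cluster,
                   input_leaves: Cluster,
                   supertree_clusters: Iterable[Cluster]) -> str:
    """Classify one input-tree cluster against a supertree (assumed definitions).

    Returns 'displayed', 'conflicting', or 'unresolved'.
    """
    # Restrict every supertree cluster to the leaves present in the input tree.
    restricted = [c & input_leaves for c in supertree_clusters]

    # Displayed: some restricted supertree cluster matches the input cluster.
    if any(r == input_cluster for r in restricted):
        return "displayed"

    # Conflicting: some restricted cluster overlaps the input cluster
    # while neither one contains the other.
    for r in restricted:
        shared = r & input_cluster
        if shared and shared != r and shared != input_cluster:
            return "conflicting"

    # Otherwise the supertree neither shows nor contradicts the grouping.
    return "unresolved"


# Toy supertree ((A,B),(C,D),E), written as its non-trivial clusters.
supertree = [frozenset("AB"), frozenset("CD")]

print(classify_split(frozenset("AB"), frozenset("ABC"), supertree))
print(classify_split(frozenset("AC"), frozenset("ACD"), supertree))
print(classify_split(frozenset("BC"), frozenset("BCE"), supertree))
```

Counting the first and second outcomes over all input clusters would give totals analogous to the displayed and conflicting split counts reported in the abstract, under the assumptions stated here.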

2016 ◽  
Author(s):  
Benjamin D Redelings ◽  
Mark T Holder



PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e3058 ◽  
Author(s):  
Benjamin D. Redelings ◽  
Mark T. Holder



2020 ◽  
pp. 135910452095281
Author(s):  
Lucy Casdagli ◽  
Glenda Fredman ◽  
Ellie Huckle ◽  
Ella Mahony ◽  
Deborah Christie

This paper describes the involvement of peer trainers in Tree of Life groups for young people living with Type 1 Diabetes. The approach is informed by narrative therapy and collective narrative practice and principles, where people are seen as separate from problems and the focus is on creating opportunities for people to tell and witness one another’s preferred identity stories. Young people who have participated in a Tree of Life day are invited to join the project as peer trainers who help facilitate, engage group participants, witness their stories and consult to the project. Involving peer trainers also aims to create a community where preferred identity stories can be lived and witnessed. This paper describes the training for peer trainers and the building of community.


2015 ◽  
Vol 370 (1662) ◽  
pp. 20140009 ◽  
Author(s):  
Christopher L. Owen ◽  
Heather Bracken-Grissom ◽  
David Stern ◽  
Keith A. Crandall

Phylogenetic systematics is heading for a renaissance where we shift from considering our phylogenetic estimates as a static image in a published paper and taxonomies as a hardcopy checklist to treating both the phylogenetic estimate and dynamic taxonomies as metadata for further analyses. The Open Tree of Life project (opentreeoflife.org) is developing synthesis tools for harnessing the power of phylogenetic inference and robust taxonomy to develop a synthetic tree of life. We capitalize on this approach to estimate a synthesis tree for the freshwater crayfish. The crayfish make an exceptional group to demonstrate the utility of the synthesis approach, as there have recently been a number of phylogenetic studies on the crayfishes along with a robust underlying taxonomic framework. Importantly, the crayfish have also been extensively assessed by an IUCN Red List team and therefore have accurate and up-to-date area and conservation status data available for analysis within a phylogenetic context. Here, we develop a synthesis phylogeny for the world's freshwater crayfish and examine the phylogenetic distribution of threat. We also estimate a molecular phylogeny based on all available GenBank crayfish sequences and use this tree to estimate divergence times and test for divergence rate variation. Finally, we conduct EDGE and HEDGE analyses and identify a number of species of freshwater crayfish of highest priority in conservation efforts.


2018 ◽  
Vol 2 ◽  
pp. e25727
Author(s):  
Emily Jane McTavish ◽  
Mark Holder ◽  
Karen Cranston

The Open Tree of Life project is a collaborative effort to synthesize, share and update a comprehensive tree of life (Fig. 1). We have completed a draft synthesis of a tree summarizing digitally available taxonomic and phylogenetic knowledge for all 2.6 million named species, available at tree.opentreeoflife.org (Hinchliff et al. 2015). This tree provides ready access to phylogenetic information which can link together biodiversity data on the basis of what we know about relevant evolutionary history. Both the unified reference taxonomy (Rees and Cranston 2017) and the published phylogenetic statements underlying the tree (McTavish et al. 2015) are available and accessible online. Taxa in the phylogenies are mapped to the reference taxonomy, which aligns Open Tree taxon identifiers to those from NCBI and GBIF, among several other taxonomy resources. The synthesis tree is revised as new data become available, and captures conflict and consensus across different published phylogenetic estimates. This undertaking requires both the development of novel infrastructure and analysis tools and community engagement with the Open Tree of Life project. I will discuss the challenges in and the progress towards achieving these goals.


Author(s):  
François Michonneau ◽  
Joseph W. Brown ◽  
David Winter

1. While phylogenies have been getting easier to build, it has been difficult to re-use, combine, and synthesize the information they provide because published trees are often only available as image files, and taxonomic information is not standardized across studies. 2. The Open Tree of Life (OTL) project addresses these issues by providing a digital tree that encompasses all organisms, built by combining taxonomic information and published phylogenies. The project also provides tools and services to query and download parts of this synthetic tree, as well as the source data used to build it. Here, we present rotl, an R package to search and download data from the Open Tree of Life directly in R. 3. rotl uses common data structures allowing researchers to take advantage of the rich set of tools and methods that are available in R to manipulate, analyze, and visualize phylogenies. Here, and in the vignettes accompanying the package, we demonstrate how rotl can be used with other R packages to analyze biodiversity data. 4. As phylogenies are being used in a growing number of applications, rotl facilitates access to phylogenetic data, and allows their integration with statistical methods and data sources available in R.


Author(s):  
Luna L. Sanchez Reyes ◽  
Martha Kandziora ◽  
Emily Jane McTavish

Abstract: Phylogenies are a key part of research in many areas of biology. Tools that automate some parts of the process of phylogenetic reconstruction, mainly molecular character matrix assembly, have been developed for the benefit of both specialists in the field of phylogenetics and nonspecialists. However, interpretation of results, comparison with previously available phylogenetic hypotheses, and selection of one phylogeny for downstream analyses and discussion still pose difficulties for anyone who is not a specialist in either phylogenetic methods or a particular study group. Physcraper is a command-line Python program that automates the update of published phylogenies by adding public DNA sequences to the underlying alignments of previously published phylogenies. It also provides a framework for straightforward comparison of published phylogenies with their updated versions, by leveraging tools from the Open Tree of Life project to link taxonomic information across databases. Physcraper can be used by the nonspecialist as a tool to generate phylogenetic hypotheses based on publicly available expert phylogenetic knowledge. Phylogeneticists and taxonomic group specialists will find it useful as a tool to facilitate molecular dataset gathering and comparison of alternative phylogenetic hypotheses (topologies). The Physcraper workflow demonstrates the benefits of doing open science for phylogenetics, encouraging researchers to strive for better sharing practices. Physcraper can be used with any OS and is released under an open-source license. Detailed instructions for installation and use are available at https://physcraper.readthedocs.


2020 ◽  
Author(s):  
Emily Jane McTavish ◽  
Luna L Sanchez Reyes ◽  
Mark T. Holder

The Open Tree of Life project constructs a comprehensive, dynamic and digitally available tree of life by synthesizing published phylogenetic trees along with taxonomic data. Open Tree of Life provides web-service application programming interfaces (APIs) to make the tree estimate, unified taxonomy, and input phylogenetic data available to anyone. Here, we describe the Python package 'opentree', which provides a user-friendly Python wrapper for these APIs and a set of scripts and tutorials for straightforward downstream data analyses. We demonstrate the utility of these tools by generating an estimate of the phylogenetic relationships of all bird families, and by capturing a phylogenetic estimate for all taxa observed at the University of California Merced Vernal Pools and Grassland Reserve.

