MonoPhy: A simple R package to find and visualize monophyly issues

2015 ◽  
Author(s):  
Orlando Schwery ◽  
Brian C O'Meara

Background. The monophyly of taxa is an important attribute of a phylogenetic tree, as a lack of it may hint at shortcomings of either the tree or the current taxonomy and can misguide subsequent analyses. While monophyly is conceptually simple, it is manually tedious and time consuming to assess on modern phylogenies of hundreds to thousands of species. Results. The R package MonoPhy allows assessment and exploration of monophyly of taxa in a phylogeny. It can assess the monophyly of genera using the phylogeny only, and with an additional input file, any other desired higher taxa or unranked groups can be checked as well. Conclusion. Summary tables, easily subsettable results and several visualization options allow quick and convenient exploration of monophyly issues, thus making MonoPhy a valuable tool for any researcher working with phylogenies.


2016 ◽  
Vol 2 ◽  
pp. e56 ◽  
Author(s):  
Orlando Schwery ◽  
Brian C. O’Meara

Background. The monophyly of taxa is an important attribute of a phylogenetic tree. A lack of it may hint at shortcomings of either the tree or the current taxonomy, or can indicate cases of incomplete lineage sorting or horizontal gene transfer. Whatever the reason, a lack of monophyly can misguide subsequent analyses. While monophyly is conceptually simple, it is tedious and time-consuming to assess manually on modern phylogenies of hundreds to thousands of species. Results. The R package MonoPhy allows assessment and exploration of monophyly of taxa in a phylogeny. It can assess the monophyly of genera using the phylogeny only, and with an additional input file any other desired higher-order taxa or unranked groups can be checked as well. Conclusion. Summary tables, easily subsettable results and several visualization options allow quick and convenient exploration of monophyly issues, thus making MonoPhy a valuable tool for any researcher working with phylogenies.
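The monophyly check described in this abstract can be illustrated with a minimal sketch. This is not MonoPhy's code or API (MonoPhy is an R package); it is a Python illustration under stated assumptions: the tree is rooted and given as nested tuples, tip labels follow a hypothetical `Genus_species` pattern so that genera can be read from the phylogeny alone, and a genus counts as monophyletic when its tips form exactly the tip set of some clade. The function names are hypothetical.

```python
from collections import defaultdict


def clade_tip_sets(tree):
    """Return (tips of this subtree, list of tip sets for every clade in it).

    A tree is either a tip label (str) or a tuple of subtrees (assumed rooted).
    """
    if isinstance(tree, str):
        tips = frozenset([tree])
        return tips, [tips]
    tips = frozenset()
    clades = []
    for child in tree:
        child_tips, child_clades = clade_tip_sets(child)
        tips |= child_tips
        clades.extend(child_clades)
    clades.append(tips)  # the subtree rooted here is itself a clade
    return tips, clades


def genus_monophyly(tree):
    """Map each genus to True if its tips make up exactly one clade."""
    all_tips, clades = clade_tip_sets(tree)
    clade_set = set(clades)
    genera = defaultdict(set)
    for tip in all_tips:
        # Assumption: the genus is the part of the label before the first "_".
        genera[tip.split("_")[0]].add(tip)
    return {genus: frozenset(members) in clade_set
            for genus, members in genera.items()}


# Hypothetical example: Bus_a and Bus_b sit in different clades.
example_tree = ((("Aus_a", "Aus_b"), "Bus_a"), ("Bus_b", ("Cus_a", "Cus_b")))
result = genus_monophyly(example_tree)
for genus in sorted(result):
    print(genus, "monophyletic" if result[genus] else "not monophyletic")
```

In this sketch a genus with a single tip is trivially reported as monophyletic; a real tool may prefer to report such cases separately.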


2019 ◽  
Vol 37 (2) ◽  
pp. 599-603 ◽  
Author(s):  
Li-Gen Wang ◽  
Tommy Tsan-Yuk Lam ◽  
Shuangbin Xu ◽  
Zehan Dai ◽  
Lang Zhou ◽  
...  

Abstract Phylogenetic trees and data are often stored in incompatible and inconsistent formats. The outputs of software tools that contain trees with analysis findings are often not compatible with each other, making it hard to integrate the results of different analyses in a comparative study. The treeio package is designed to connect phylogenetic tree input and output. It supports extracting phylogenetic trees as well as the outputs of commonly used analytical software. It can link external data to phylogenies and merge tree data obtained from different sources, enabling analyses of phylogeny-associated data from different disciplines in an evolutionary context. Treeio also supports export of a phylogenetic tree with heterogeneous-associated data to a single tree file, including BEAST compatible NEXUS and jtree formats; these facilitate data sharing as well as file format conversion for downstream analysis. The treeio package is designed to work with the tidytree and ggtree packages. Tree data can be processed using the tidy interface with tidytree and visualized by ggtree. The treeio package is released within the Bioconductor and rOpenSci projects. It is available at https://www.bioconductor.org/packages/treeio/.


2017 ◽  
Author(s):  
Paul Bastide ◽  
Cécile Ané ◽  
Stéphane Robin ◽  
Mahendra Mariadassou

Abstract To study the evolution of several quantitative traits, the classical phylogenetic comparative framework consists of a multivariate random process running along the branches of a phylogenetic tree. The Ornstein-Uhlenbeck (OU) process is sometimes preferred to the simple Brownian Motion (BM) as it models stabilizing selection toward an optimum. The optimum for each trait is likely to be changing over the long periods of time spanned by large modern phylogenies. Our goal is to automatically detect the position of these shifts on a phylogenetic tree, while accounting for correlations between traits, which might exist because of structural or evolutionary constraints. We show that, in the presence of shifts, phylogenetic Principal Component Analysis (pPCA) fails to decorrelate traits efficiently, so that any method aiming at finding shifts needs to deal with correlation simultaneously. We introduce here a simplification of the full multivariate OU model, named scalar OU (scOU), which allows for noncausal correlations and is still computationally tractable. We extend the equivalence between the OU and a BM on a re-scaled tree to our multivariate framework. We describe an Expectation Maximization algorithm that allows for a maximum likelihood estimation of the shift positions, associated with a new model selection criterion, accounting for the identifiability issues for the shift localization on the tree. The method, freely available as an R package (PhylogeneticEM), is fast and can deal with missing values. We demonstrate its efficiency and accuracy compared to another state-of-the-art method (ℓ1ou) on a wide range of simulated scenarios, and use this new framework to re-analyze recently gathered datasets on New World Monkeys and Anolis lizards.


2021 ◽  
Author(s):  
Caizhi Huang ◽  
Benjamin John Callahan ◽  
Michael C Wu ◽  
Shannon T. Holloway ◽  
Hayden Brochu ◽  
...  

Abstract Background: The relationship between host conditions and microbiome profiles, typically characterized by operational taxonomic units (OTUs), contains important information about the microbial role in human health. Traditional association testing frameworks are challenged by the high dimensionality and sparsity of typical microbiome profiles. Incorporating phylogenetic information is often used to address these challenges with the assumption that evolutionarily similar taxa tend to behave similarly. However, this assumption may not always be valid due to the complex effect of microbes, and phylogenetic information should be incorporated in a data-supervised fashion. Results: In this work, we propose a local collapsing test called Phylogeny-guided microbiome OTU-Specific association Test (POST). In POST, whether or not to borrow information and how much information to borrow from the neighboring OTUs in the phylogenetic tree are supervised by phylogenetic distance and the outcome-OTU association. POST is constructed under the kernel machine framework to accommodate complex OTU effects and extends kernel machine microbiome tests from community-level to OTU-level. Using simulation studies, we showed that when the phylogenetic tree is informative, POST has better performance than existing OTU-level association tests. When the phylogenetic tree is not informative, POST achieves performance similar to that of existing methods. Finally, we show that POST can identify more outcome-associated OTUs that are of biological relevance in real data applications on bacterial vaginosis and on preterm birth. Conclusions: Using POST, we show that the power of detecting associated microbiome features can be enhanced by adaptively leveraging the phylogenetic information when testing for a target OTU. We developed a user-friendly R package, POSTm, which is now available at CRAN (https://CRAN.R-project.org/package=POSTm) for public access.


2017 ◽  
Author(s):  
Matthew Hall ◽  
Caroline Colijn

Abstract One approach to the reconstruction of infectious disease transmission trees from pathogen genomic data has been to use a phylogenetic tree, reconstructed from pathogen sequences, and annotate its internal nodes to provide a reconstruction of which host each lineage was in at each point in time. If only one pathogen lineage can be transmitted to a new host (i.e. the transmission bottleneck is complete), this corresponds to partitioning the nodes of the phylogeny into connected regions, each of which represents evolution in an individual host. These partitions define the possible transmission trees that are consistent with a given phylogenetic tree. However, the mathematical properties of the transmission trees given a phylogeny remain largely unexplored. Here, we describe a procedure to calculate the number of possible transmission trees for a given phylogeny, and we show how to uniformly sample from these transmission trees. The procedure is outlined for situations where one sample is available from each host and trees do not have branch lengths, and we also provide extensions for incomplete sampling, multiple sampling, and the application to time trees in a situation where limits on the period during which each host could have been infected are known. The sampling algorithm is available as an R package (STraTUS).


2019 ◽  
Author(s):  
Kang Jin Kim ◽  
Jaehyun Park ◽  
Sang-Chul Park ◽  
Sungho Won

Abstract Motivation Ecological patterns of the human microbiota exhibit high inter-subject variation, with few operational taxonomic units (OTUs) shared across individuals. To overcome these issues, non-parametric approaches, such as the Mann–Whitney U-test and Wilcoxon rank-sum test, have often been used to identify OTUs associated with host diseases. However, these approaches only use the ranks of observed relative abundances, leading to information loss, and are associated with high false-negative rates. In this study, we propose a phylogenetic tree-based microbiome association test (TMAT) to analyze the associations between microbiome OTU abundances and disease phenotypes. Phylogenetic trees illustrate patterns of similarity among different OTUs, and TMAT provides an efficient method for utilizing such information for association analyses. The proposed TMAT provides test statistics for each node, which are combined to identify mutations associated with host diseases. Results Power estimates of TMAT were compared with existing methods using extensive simulations based on real absolute abundances. Simulation studies showed that TMAT preserves the nominal type-1 error rate, and estimates of its statistical power generally outperformed existing methods in the considered scenarios. Furthermore, TMAT can be used to detect phylogenetic mutations associated with host diseases, providing more in-depth insight into bacterial pathology. Availability and implementation The 16S rRNA amplicon sequencing metagenomics datasets for colorectal carcinoma and myalgic encephalomyelitis/chronic fatigue syndrome are available from the European Nucleotide Archive (ENA) database under project accession number PRJEB6070 and PRJEB13092, respectively. TMAT was implemented in the R package. Detailed information is available at http://healthstat.snu.ac.kr/software/tmat. Supplementary information Supplementary data are available at Bioinformatics online.


Planta Medica ◽  
2016 ◽  
Vol 81 (S 01) ◽  
pp. S1-S381
Author(s):  
C Roullier ◽  
Y Guitton ◽  
S Prado ◽  
O Grovel ◽  
YF Pouchus

2020 ◽  
pp. 37-40

Analysis of genetic diversity has proved fundamental to understanding the epidemiological and evolutionary history of human papillomavirus (HPV), to developing accurate diagnostic tests, and to designing efficient vaccines. HPV nucleotide diversity has been investigated widely among high-risk HPV types. This study was carried out to sequence HPV isolates and build a virus database in Thi-Qar province, to compare the sequences of our isolates with previously described isolates from around the world, and to draw their phylogenetic tree. A total of 6 formalin-fixed paraffin-embedded (FFPE) breast tissue samples from female patients were included in the study: 4 FFPE samples of malignant tumors and 2 FFPE samples of benign tumors. PCR was used to detect the presence of HPV in breast tissue, and real-time PCR was used to determine HPV genotypes; the complete nucleotide sequence of the HPV L1 capsid gene was then determined and its phylogenetic tree drawn. Nucleotide sequencing detected a number of substitution mutations (SNPs) in the L1 gene that had not been described before and were identified once in this study population. It also revealed that the HPV16 strains have an evolutionary relationship with the South African race, while HPV33 and HPV6 show an evolutionary association with the North American and East Asian race, respectively.


2019 ◽  
Author(s):  
Shinichi Nakagawa ◽  
Malgorzata Lagisz ◽  
Rose E O'Dea ◽  
Joanna Rutkowska ◽  
Yefeng Yang ◽  
...  

‘Classic’ forest plots show the effect sizes from individual studies and the aggregate effect from a meta-analysis. However, in ecology and evolution meta-analyses routinely contain over 100 effect sizes, making the classic forest plot of limited use. We surveyed 102 meta-analyses in ecology and evolution, finding that only 11% use the classic forest plot. Instead, most used a ‘forest-like plot’, showing point estimates (with 95% confidence intervals; CIs) from a series of subgroups or categories in a meta-regression. We propose a modification of the forest-like plot, which we name the ‘orchard plot’. Orchard plots, in addition to showing overall mean effects and CIs from meta-analyses/regressions, also include 95% prediction intervals (PIs) and the individual effect sizes scaled by their precision. The PI allows the user and reader to see the range in which an effect size from a future study may be expected to fall. The PI, therefore, provides an intuitive interpretation of any heterogeneity in the data. Supplementing the PI, the inclusion of underlying effect sizes also allows the user to see any influential or outlying effect sizes. We showcase the orchard plot with example datasets from ecology and evolution, using the R package, orchard, including several functions for visualizing meta-analytic data using forest-plot derivatives. We consider the orchard plot as a variant on the classic forest plot, cultivated to the needs of meta-analysts in ecology and evolution. Hopefully, the orchard plot will prove fruitful for visualizing large collections of heterogeneous effect sizes regardless of the field of study.
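The difference between a confidence interval and a prediction interval mentioned in this abstract can be made concrete with a small sketch. It is not taken from the package described above; it assumes a simple random-effects model in which the between-study variance (here called `tau2`) is added to the squared standard error of the mean effect, and it uses a normal quantile for simplicity where fitted models often use a t quantile. All names and numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def intervals(mean_effect, se, tau2, level=0.95):
    """Return (confidence interval, prediction interval) for a mean effect.

    Assumptions: normal approximation; tau2 is the between-study variance
    from a random-effects meta-analysis; se is the standard error of the mean.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    ci_half = z * se                     # uncertainty in the mean effect only
    pi_half = z * sqrt(se ** 2 + tau2)   # adds heterogeneity among studies
    ci = (mean_effect - ci_half, mean_effect + ci_half)
    pi = (mean_effect - pi_half, mean_effect + pi_half)
    return ci, pi


# Hypothetical values: mean effect 0.3, standard error 0.1, tau^2 = 0.04.
ci, pi = intervals(0.3, 0.1, 0.04)
print("95% CI:", ci)
print("95% PI:", pi)
```

With any positive `tau2` the prediction interval is wider than the confidence interval, which is why it conveys heterogeneity; with `tau2` equal to zero the two coincide.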

