PhySortR: a fast, flexible tool for sorting phylogenetic trees in R

PeerJ ◽

10.7717/peerj.2038 ◽

2016 ◽

Vol 4 ◽

pp. e2038 ◽

Cited By ~ 7

Author(s):

Timothy G. Stephens ◽

Debashish Bhattacharya ◽

Mark A. Ragan ◽

Cheong Xin Chan

Keyword(s):

Phylogenetic Trees ◽

A Priori ◽

R Package ◽

Command Line ◽

Flexible Tool ◽

Command Line Tool ◽

Whole Tree

A frequent bottleneck in interpreting phylogenomic output is the need to screen often thousands of trees for features of interest, particularly robust clades of specific taxa, as evidence of monophyletic relationship and/or reticulated evolution. Here we present PhySortR, a fast, flexible R package for classifying phylogenetic trees. Unlike existing utilities, PhySortR allows for identification of both exclusive and non-exclusive clades uniting the target taxa based on tip labels (i.e., leaves) on a tree, with customisable options to assess clades within the context of the whole tree. Using simulated and empirical datasets, we demonstrate the potential and scalability of PhySortR in analysis of thousands of phylogenetic trees without a priori assumption of tree-rooting, and in yielding readily interpretable trees that unambiguously satisfy the query. PhySortR is a command-line tool that is freely available and easily automatable.
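To make the distinction between exclusive and non-exclusive clades concrete, the following is a minimal Python sketch of the idea, not PhySortR's implementation or API. It assumes a rooted tree encoded as nested tuples of tip labels (a simplification, since the abstract stresses that PhySortR needs no a priori rooting), matches target taxa by substring in the tip label, and uses a hypothetical `min_prop` threshold for the proportion of target tips in a non-exclusive clade. The names `leaves`, `is_target` and `classify` are illustrative only.

```python
def leaves(node):
    """Return the tip labels under a node of a nested-tuple tree."""
    if isinstance(node, str):
        return [node]
    tips = []
    for child in node:
        tips.extend(leaves(child))
    return tips


def is_target(label, targets):
    """A tip counts as a target if its label contains any target term."""
    return any(term in label for term in targets)


def classify(tree, targets, min_prop=0.7):
    """Classify a rooted tree as 'exclusive', 'non-exclusive' or 'none'.

    exclusive:     some clade holds every target tip and nothing else.
    non-exclusive: some clade holds every target tip, and target tips
                   make up at least min_prop of that clade (assumed rule).
    """
    target_total = sum(is_target(t, targets) for t in leaves(tree))
    if target_total == 0:
        return "none"
    best = "none"
    stack = [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, str):
            continue  # single tips are not clades in this sketch
        stack.extend(node)
        tips = leaves(node)
        hits = sum(is_target(t, targets) for t in tips)
        if hits < target_total:
            continue  # this clade does not unite all target tips
        if hits == len(tips):
            return "exclusive"
        if hits / len(tips) >= min_prop:
            best = "non-exclusive"
    return best


exclusive_tree = ((("Rhodo_a", "Rhodo_b"), "Rhodo_c"), ("Viridi_a", "Viridi_b"))
mixed_tree = ((("Rhodo_a", "Rhodo_b"), ("Rhodo_c", "Viridi_a")), ("Viridi_b", "Viridi_c"))
scattered_tree = (("Rhodo_a", "Viridi_a"), ("Rhodo_b", "Viridi_b"))

print(classify(exclusive_tree, ["Rhodo"]))
print(classify(mixed_tree, ["Rhodo"]))
print(classify(scattered_tree, ["Rhodo"]))
```

Screening thousands of trees then amounts to applying such a classifier to each tree and sorting the files by the returned label. A real tool would also have to parse Newick input, handle unrooted trees and take branch support into account.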


PhySortR: a fast, flexible tool for sorting phylogenetic trees in R

10.7287/peerj.preprints.1609 ◽

2015 ◽

Author(s):

Timothy G Stephens ◽

Debashish Bhattacharya ◽

Mark A Ragan ◽

Cheong Xin Chan

Keyword(s):

Phylogenetic Trees ◽

R Package ◽

Command Line ◽

Flexible Tool ◽

Command Line Tool ◽

Whole Tree

A frequent bottleneck in interpreting phylogenomic output is the need to screen often thousands of trees for features of interest, such as robust clades of specific taxa, as evidence of monophyletic relationship and/or reticulated evolution. Here we present PhySortR, a fast, flexible R package for sorting phylogenetic trees. Unlike existing utilities, PhySortR allows for identification of both exclusive and non-exclusive clades uniting the target taxa, with customisable options to assess clades within the context of the whole tree. PhySortR is a command-line tool that is freely available, highly scalable, and easily automatable.


Genesis and Gappa: processing, analyzing and visualizing phylogenetic (placement) data

Bioinformatics ◽

10.1093/bioinformatics/btaa070 ◽

2020 ◽

Vol 36 (10) ◽

pp. 3263-3265 ◽

Cited By ~ 14

Author(s):

Lucas Czech ◽

Pierre Barbera ◽

Alexandros Stamatakis

Keyword(s):

Phylogenetic Trees ◽

Supplementary Information ◽

Command Line ◽

Supplementary Data ◽

Computationally Efficient ◽

Data Types ◽

Low Level ◽

Phylogenetic Placement ◽

Command Line Tool ◽

High Level

Summary: We present genesis, a library for working with phylogenetic data, and gappa, an accompanying command-line tool for conducting typical analyses on such data. The tools target phylogenetic trees and phylogenetic placements, sequences, taxonomies and other relevant data types, offer high-level simplicity as well as low-level customizability, and are computationally efficient, well-tested and field-proven. Availability and implementation: Both genesis and gappa are written in modern C++11, and are freely available under GPLv3 at http://github.com/lczech/genesis and http://github.com/lczech/gappa. Supplementary information: Supplementary data are available at Bioinformatics online.


CoRC: the COPASI R Connector

Bioinformatics ◽

10.1093/bioinformatics/btab033 ◽

2021 ◽

Author(s):

Jonas Förster ◽

Frank T Bergmann ◽

Jürgen Pahle

Keyword(s):

Graphical User Interface ◽

Academic Research ◽

R Package ◽

Supplementary Information ◽

Command Line ◽

Graphical Interface ◽

Thought Process ◽

Extensive Analysis ◽

Command Line Tool ◽

High Level

Motivation: COPASI is a biochemical simulator and model analyzer which has found widespread use in academic research, teaching and beyond. One of COPASI’s strengths is its graphical user interface, and this is what most users work with. COPASI also provides a command-line tool. Until now, however, it has lacked an intuitive scripting interface for creating and documenting systems biology workflows. Results: We have developed CoRC, the COPASI R Connector, an R package which provides a high-level scripting interface for COPASI. It closely mirrors the thought process of a (graphical interface) user and should therefore be very easy to use. This allows for complex workflows to be reproducibly scripted, utilizing COPASI’s powerful analytic toolset in combination with R’s extensive analysis and package ecosystem. Availability and implementation: CoRC is a free and open-source R package, available via GitHub at https://jpahle.github.io/CoRC/ under the Artistic-2.0 license. Supplementary information: We provide tutorial articles as well as several example scripts on the project’s website.


sangeranalyseR: simple and Interactive Processing of Sanger Sequencing Data in R

Genome Biology and Evolution ◽

10.1093/gbe/evab028 ◽

2021 ◽

Author(s):

Kuan-Hao Chao ◽

Kirston Barton ◽

Sarah Palmer ◽

Robert Lanfear

Keyword(s):

Open Source ◽

Phylogenetic Trees ◽

Input Data ◽

Sanger Sequencing ◽

R Package ◽

Command Line ◽

Sequencing Data ◽

Fasta Format ◽

Online Documentation ◽

Wide Range

sangeranalyseR is a feature-rich, free, and open-source R package for processing Sanger sequencing data. It allows users to go from loading reads to saving aligned contigs in a few lines of R code by using sensible defaults for most actions. It also provides complete flexibility for determining how individual reads and contigs are processed, both at the command-line in R and via interactive Shiny applications. sangeranalyseR provides a wide range of options for all steps in Sanger processing pipelines including trimming reads, detecting secondary peaks, viewing chromatograms, detecting indels and stop codons, aligning contigs, estimating phylogenetic trees, and more. Input data can be in either ABIF or FASTA format. sangeranalyseR comes with extensive online documentation and outputs aligned and unaligned reads and contigs in FASTA format, along with detailed interactive HTML reports. sangeranalyseR supports the use of colourblind-friendly palettes for viewing alignments and chromatograms. It is released under an MIT licence and available for all platforms on Bioconductor (https://bioconductor.org/packages/sangeranalyseR) and on GitHub (https://github.com/roblanf/sangeranalyseR).


Genesis and Gappa: Processing, Analyzing and Visualizing Phylogenetic (Placement) Data

10.1101/647958 ◽

2019 ◽

Cited By ~ 3

Author(s):

Lucas Czech ◽

Pierre Barbera ◽

Alexandros Stamatakis

Keyword(s):

Phylogenetic Trees ◽

Command Line ◽

Computationally Efficient ◽

Data Types ◽

Low Level ◽

Phylogenetic Placement ◽

Link Type ◽

Phylogenetic Data ◽

Command Line Tool ◽

High Level

Summary: We present GENESIS, a library for working with phylogenetic data, and GAPPA, an accompanying command line tool for conducting typical analyses on such data. The tools target phylogenetic trees and phylogenetic placements, sequences, taxonomies, and other relevant data types, offer high-level simplicity as well as low-level customizability, and are computationally efficient, well-tested, and field-proven. Availability and Implementation: Both GENESIS and GAPPA are written in modern C++11, and are freely available under GPLv3 at http://github.com/lczech/genesis and http://github.com/lczech/[email protected] and [email protected].


Alview: Portable Software for Viewing Sequence Reads in BAM Formatted Files

Cancer Informatics ◽

10.4137/cin.s26470 ◽

2015 ◽

Vol 14 ◽

pp. CIN.S26470 ◽

Cited By ~ 2

Author(s):

Richard P. Finney ◽

Qing-Rong Chen ◽

Cu V. Nguyen ◽

Chih Hao Hsu ◽

Chunhua Yan ◽

...

Keyword(s):

Graphical User Interface ◽

Reference Genome ◽

Source Code ◽

Software Tool ◽

Command Line ◽

Sequencing Data ◽

Genome Data ◽

Command Line Tool ◽

Portable Software ◽

Microsoft Windows

The name Alview is a contraction of the term Alignment Viewer. Alview is a software tool, compiled to native architecture, for visualizing the alignment of sequencing data. Inputs are files of short-read sequences aligned to a reference genome in the SAM/BAM format and files containing reference genome data. Outputs are visualizations of these aligned short reads. Alview is written in portable C with optional graphical user interface (GUI) code written in C, C++, and Objective-C. The application can run in three different ways: as a web server, as a command line tool, or as a native GUI program. Alview is compatible with Microsoft Windows, Linux, and Apple OS X. It is available as a web demo at https://cgwb.nci.nih.gov/cgi-bin/alview. The source code and Windows/Mac/Linux executables are available via https://github.com/NCIP/alview.


UCEasy: A software package for automating and simplifying the analysis of ultraconserved elements (UCEs)

Biodiversity Data Journal ◽

10.3897/bdj.9.e78132 ◽

2021 ◽

Vol 9 ◽

Author(s):

Caio Ribeiro ◽

Lucas Oliveira ◽

Romina Batista ◽

Marcos De Sousa

Keyword(s):

Best Practices ◽

Software Package ◽

Phylogenetic Trees ◽

Computational Analysis ◽

Data Matrix ◽

Command Line ◽

Command Line Interface ◽

Ultraconserved Elements ◽

Research Software ◽

Different Levels

The use of Ultraconserved Elements (UCEs) as genetic markers in phylogenomics has become popular and has provided promising results. Although UCE data can be easily obtained from targeted enriched sequencing, the protocol for in silico analysis of UCEs consists of the execution of heterogeneous and complex tools, a challenge for scientists without training in bioinformatics. Developing tools with the adoption of best practices in research software can lessen this problem by improving the execution of computational experiments, thus promoting better reproducibility. We present UCEasy, an easy-to-install and easy-to-use software package with a simple command line interface that facilitates the computational analysis of UCEs from sequencing samples, following the best practices of research software. UCEasy is a wrapper that standardises, automates and simplifies the quality control of raw reads, assembly and extraction and alignment of UCEs, generating at the end a data matrix with different levels of completeness that can be used to infer phylogenetic trees. We demonstrate the functionalities of UCEasy by reproducing the published results of phylogenomic studies of the bird genus Turdus (Aves) and of Adephaga families (Coleoptera) containing genomic datasets to efficiently extract UCEs.


FAN-C: A Feature-rich Framework for the Analysis and Visualisation of C data

10.1101/2020.02.03.932517 ◽

2020 ◽

Cited By ~ 6

Author(s):

Kai Kruse ◽

Clemens B. Hug ◽

Juan M. Vaquerizas

Keyword(s):

High Throughput ◽

Matrix Analysis ◽

Set Covering ◽

Command Line ◽

Chromosome Conformation ◽

C Storage ◽

Data Formats ◽

Analysis Tools ◽

Command Line Tool ◽

Broad Feature

Chromosome conformation capture data, particularly from high-throughput approaches such as Hi-C and its derivatives, are typically very complex to analyse. Existing analysis tools are often single-purpose, or limited in compatibility to a small number of data formats, frequently making Hi-C analyses tedious and time-consuming. Here, we present FAN-C, an easy-to-use command-line tool and powerful Python API with a broad feature set covering matrix generation, analysis, and visualisation for C-like data (https://github.com/vaquerizaslab/fanc). Due to its comprehensiveness and compatibility with the most prevalent Hi-C storage formats, FAN-C can be used in combination with a large number of existing analysis tools, thus greatly simplifying Hi-C matrix analysis.


hilldiv: an R package for the integral analysis of diversity based on Hill numbers

10.1101/545665 ◽

2019 ◽

Cited By ~ 7

Author(s):

Antton Alberdi ◽

M Thomas P Gilbert

Keyword(s):

Phylogenetic Trees ◽

R Package ◽

Diversity Partitioning ◽

Diet Reconstruction ◽

Functional Correlation ◽

Community Profiling ◽

High Throughput Dna Sequencing ◽

Multi Level ◽

Diversity Profile ◽

Hill Numbers

Hill numbers provide a powerful framework for measuring, comparing and partitioning the diversity of biological systems as characterised using high throughput DNA sequencing approaches. In order to facilitate the implementation of Hill numbers into such analyses, whether focusing on diet reconstruction, microbial community profiling or more general ecosystem characterisation analyses, we present a new R package. ‘Hilldiv’ provides a set of functions to assist analysis of diversity based on Hill numbers, using count tables (e.g. OTU, ASV) and associated phylogenetic trees as inputs. Multiple functionalities of the library are introduced, including diversity measurement, diversity profile plotting, diversity comparison between samples and groups, multi-level diversity partitioning and (dis)similarity measurement. All of these are grounded in abundance-based and incidence-based Hill numbers, and can accommodate phylogenetic or functional correlation among OTUs or ASVs. The package can be installed from CRAN or GitHub, and tutorials and example scripts can be found on the package’s page (https://github.com/anttonalberdi/hilldiv).
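For readers unfamiliar with the measure, the Hill number of order q for relative abundances p_i is (sum of p_i^q)^(1/(1-q)), with the limit at q = 1 equal to the exponential of Shannon entropy. The Python sketch below computes this basic, abundance-based form from a single sample's counts. It is an illustration under those assumptions only: the function name `hill_number` is hypothetical, it is not hilldiv's code, and it ignores the phylogenetic and functional extensions the abstract mentions.

```python
import math


def hill_number(counts, q):
    """Hill number (effective number of types) of order q for one sample.

    counts: non-negative abundances, e.g. OTU or ASV read counts.
    q:      order of diversity; q = 0 gives richness, q = 1 the
            exponential of Shannon entropy, q = 2 the inverse Simpson index.
    """
    total = sum(counts)
    # Zero counts are dropped so they do not inflate richness at q = 0.
    props = [c / total for c in counts if c > 0]
    if math.isclose(q, 1.0):
        # The general formula is undefined at q = 1; use its limit.
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


even_sample = [10, 10, 10, 10]
skewed_sample = [90, 10, 0]

for q in (0, 1, 2):
    print(q, hill_number(even_sample, q), hill_number(skewed_sample, q))
```

For a perfectly even sample every order returns the number of types, while for a skewed sample the value falls as q rises, because higher orders give more weight to abundant types.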
