RevGadgets: an R Package for visualizing Bayesian phylogenetic analyses from RevBayes

2021
Author(s):
Carrie M. Tribble
William A. Freyman
Jun Ying Lim
Michael J. Landis
Joëlle Barido-Sottani
...

1. Statistical phylogenetic methods are the foundation for a wide range of evolutionary and epidemiological studies. However, as these methods grow increasingly complex, users often encounter significant challenges with summarizing, visualizing, and communicating their key results. 2. We present RevGadgets, an R package for creating publication-quality figures from the results of a large variety of phylogenetic analyses performed in RevBayes (and other phylogenetic software packages). 3. We demonstrate how to use RevGadgets through a set of vignettes that cover the most common use cases that researchers will encounter. 4. RevGadgets is an open-source, extensible package that will continue to evolve in parallel with RevBayes, helping researchers to make sense of and communicate the results of a diverse array of analyses.

2021
Author(s):
Jason Hunter
Mark Thyer
Dmitri Kavetski
David McInerney

Probabilistic predictions provide crucial information regarding the uncertainty of hydrological predictions, which are a key input for risk-based decision-making. However, they are often excluded from hydrological modelling applications because suitable probabilistic error models can be challenging both to construct and to interpret, and the quality of results often depends on the objective function used to calibrate the hydrological model. We present an open-source R package and an online web application that achieve two aims. First, these resources are easy to use and accessible, so that users need not have specialised knowledge in probabilistic modelling to apply them. Second, the probabilistic error model that we describe provides high-quality probabilistic predictions for a wide range of commonly used hydrological objective functions; this is made possible by an innovation that resolves a long-standing issue relating to model assumptions that previously prevented such broad application. We demonstrate our methods by comparing our new probabilistic error model with an existing reference error model in an empirical case study that uses 54 perennial Australian catchments, the hydrological model GR4J, 8 common objective functions and 4 performance metrics (reliability, precision, volumetric bias and errors in the flow duration curve). The existing reference error model introduces additional flow dependencies into the residual error structure when it is used with most of the study objective functions, which in turn leads to poor-quality probabilistic predictions. In contrast, the new probabilistic error model achieves high-quality probabilistic predictions for all objective functions used in this case study. The new probabilistic error model, the open-source software and the web application aim to facilitate the adoption of probabilistic predictions in the hydrological modelling community, and to improve the quality of predictions and of the decisions that are made using those predictions. In particular, our methods can be used to achieve high-quality probabilistic predictions from hydrological models that are calibrated with a wide range of common objective functions.


2020
Author(s):
Jakub Nowosad

*Context* Pattern-based spatial analysis provides methods to describe and quantitatively compare spatial patterns for categorical raster datasets. It allows for spatial search, change detection, and clustering of areas with similar patterns. *Objectives* We developed an R package **motif** as a set of open-source tools for pattern-based spatial analysis. *Methods* This package provides most of the functionality of existing software (except spatial segmentation), but also extends the existing ideas through support for multi-layer raster datasets. It accepts larger-than-RAM datasets and works across all of the major operating systems. *Results* In this study, we describe the software design of the tool and its capabilities, and present four case studies. They include calculation of spatial signatures based on land cover data for regular and irregular areas, search for regions with similar patterns of geomorphons, detection of changes in land cover patterns, and clustering of areas with similar spatial patterns of land cover and landforms. *Conclusions* The methods implemented in **motif** should be useful in a wide range of applications, including land management, sustainable development, environmental protection, forest cover change and urban growth monitoring, and agriculture expansion studies. The **motif** package homepage is https://nowosad.github.io/motif.


2018
Author(s):
Jatin Nandania
Gopal Peddinti
Alberto Pessia
Meri Kokkonen
Vidya Velagapudi

Abstract The use of metabolomics profiling to understand metabolism under different physiological states has increased in recent years, creating a need for robust analytical platforms. Here, we present a validated method for targeted and semi-quantitative analysis of 102 polar metabolites that covers major metabolic pathways from 24 classes in a single 17.5-min assay. The method has been optimized for a wide range of biological matrices from various organisms, and involves automated sample preparation and data processing using an in-house developed R package. To ensure reliability, the method was validated for accuracy, precision, selectivity, specificity, linearity, recovery, and stability according to European Medicines Agency guidelines. We demonstrated excellent repeatability of the retention times (CV<4%), calibration curves (R²≥0.980) in their respective wide dynamic concentration ranges (CV<3%), and concentrations (CV<25%) of quality control samples interspersed within 25 batches analyzed over a period of one year. The robustness was demonstrated through high correlation between metabolite concentrations measured using our method and NIST reference values (R²=0.967), including cross-platform comparability against the BIOCRATES AbsoluteIDQ p180 kit (R²=0.975) and NMR analyses (R²=0.884). We have shown that our method can be successfully applied in many biomedical research fields and clinical trials, including epidemiological studies for biomarker discovery. In summary, a thorough validation demonstrated that our method is reproducible, robust, reliable, and suitable for metabolomics studies.
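
The repeatability figures above are reported as coefficients of variation (CV). As a minimal sketch of how such a figure could be computed, assuming the conventional definition (sample standard deviation divided by the mean, as a percentage), the following Python function may help. The function name and the example values are hypothetical and are not taken from the authors' R package.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV in percent: 100 * sample standard deviation / |mean|.

    Hypothetical helper for illustration; assumes at least two values
    and a non-zero mean.
    """
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * statistics.stdev(values) / abs(mean)


# Made-up retention times (minutes) for one metabolite across three runs.
retention_times = [9.0, 10.0, 11.0]
cv_percent = coefficient_of_variation(retention_times)
```

A threshold such as CV<4% would then be checked by comparing `cv_percent` against 4.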


2006
Vol 72 (9)
pp. 6049-6052
Author(s):
Tony L. Goldberg
Thomas R. Gillespie
Randall S. Singer

ABSTRACT Repetitive-element PCR (rep-PCR) is a method for genotyping bacteria based on the selective amplification of repetitive genetic elements dispersed throughout bacterial chromosomes. The method has great potential for large-scale epidemiological studies because of its speed and simplicity; however, objective guidelines for inferring relationships among bacterial isolates from rep-PCR data are lacking. We used multilocus sequence typing (MLST) as a “gold standard” to optimize the analytical parameters for inferring relationships among Escherichia coli isolates from rep-PCR data. We chose 12 isolates from a large database to represent a wide range of pairwise genetic distances, based on the initial evaluation of their rep-PCR fingerprints. We conducted MLST with these same isolates and systematically varied the analytical parameters to maximize the correspondence between the relationships inferred from rep-PCR and those inferred from MLST. Methods that compared the shapes of densitometric profiles (“curve-based” methods) yielded consistently higher correspondence values between data types than did methods that calculated indices of similarity based on shared and different bands (maximum correspondences of 84.5% and 80.3%, respectively). Curve-based methods were also markedly more robust in accommodating variations in user-specified analytical parameter values than were “band-sharing coefficient” methods, and they enhanced the reproducibility of rep-PCR. Phylogenetic analyses of rep-PCR data yielded trees with high topological correspondence to trees based on MLST and high statistical support for major clades. These results indicate that rep-PCR yields accurate information for inferring relationships among E. coli isolates and that accuracy can be enhanced with the use of analytical methods that consider the shapes of densitometric profiles.
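
The contrast between "band-sharing coefficient" and "curve-based" comparisons can be illustrated with a short Python sketch. It assumes the Dice coefficient as the band-sharing index and the Pearson correlation as the curve-based measure; the abstract does not say which specific indices the authors used, so both choices, the function names and the sample data are assumptions made for illustration.

```python
import math


def dice_similarity(bands_a, bands_b):
    """Band-sharing index: 2 * shared bands / total bands in both lanes."""
    a, b = set(bands_a), set(bands_b)
    if not a and not b:
        raise ValueError("at least one fingerprint must contain a band")
    return 2 * len(a & b) / (len(a) + len(b))


def pearson_similarity(profile_a, profile_b):
    """Curve-based measure: Pearson correlation of two densitometric profiles."""
    n = len(profile_a)
    if n == 0 or n != len(profile_b):
        raise ValueError("profiles must be non-empty and of equal length")
    mean_a = sum(profile_a) / n
    mean_b = sum(profile_b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(profile_a, profile_b))
    var_a = sum((x - mean_a) ** 2 for x in profile_a)
    var_b = sum((y - mean_b) ** 2 for y in profile_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a flat profile has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


# Made-up data: band positions (arbitrary bins) and intensity traces.
band_score = dice_similarity([1, 2, 3], [2, 3, 4])
curve_score = pearson_similarity([0, 1, 2, 1, 0], [0, 2, 4, 2, 0])
```

The band-sharing index depends on a user-set decision about which peaks count as bands, whereas the curve-based measure uses the whole intensity trace. This is one plausible reason for the greater robustness to parameter choices reported above.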


2020
Author(s):
Jenna Hershberger
Nicolas Morales
Christiano C. Simoes
Bryan Ellerbrock
Guillaume Bauchet
...

ABSTRACT Visible and near-infrared (vis-NIRS) spectroscopy is a promising tool for increasing phenotyping throughput in plant breeding programs, but existing analysis software packages are not optimized for a breeding context. Additionally, commercial software options are often outside of budget constraints for some breeding and research programs. To that end, we developed an open-source R package, waves, for the streamlined analysis of spectral data with several cross-validation schemes to assess prediction accuracy. Waves is compatible with a wide range of spectrometer models and performs visualization, filtering, aggregation, cross-validation set formation, model training, and prediction functions for the association of vis-NIRS spectra with reference measurements. Furthermore, we have integrated this package into the Breedbase family of open-source databases, expanding the analysis capabilities of this growing digital ecosystem to a number of crop species. Taken together, the standalone and Breedbase versions of waves enhance the accessibility of tools for the analysis of spectral data during the plant breeding process. Core ideas: waves is an open-source R package for spectral data analysis in plant breeding; breeding-relevant cross-validation schemes to evaluate predictive accuracy of models; extension of Breedbase, an open-source database, to support spectral data storage; graphical user interface developed for implementation of waves in Breedbase.


2018
Author(s):
Xavier Didelot
Nicholas J Croucher
Stephen D Bentley
Simon R Harris
Daniel J Wilson

ABSTRACT The sequencing and comparative analysis of a collection of bacterial genomes from a single species or lineage of interest can lead to key insights into its evolution, ecology or epidemiology. The tool of choice for such a study is often a phylogenetic tree, and more specifically, when possible, a dated phylogeny, in which the dates of all common ancestors are estimated. Here we propose a new Bayesian methodology to construct dated phylogenies which is specifically designed for bacterial genomics. Unlike previous Bayesian methods aimed at building dated phylogenies, we consider that the phylogenetic relationships between the genomes have been previously evaluated using a standard phylogenetic method, which makes our methodology much faster and more scalable. This two-step approach also allows us to directly exploit existing phylogenetic methods that detect bacterial recombination, and therefore to account for the effect of recombination in the construction of a dated phylogeny. We analysed many simulated datasets in order to benchmark the performance of our approach in a wide range of situations. Furthermore, we present applications to three different real datasets from recent bacterial genomic studies. Our methodology is implemented in an R package called BactDating which is freely available for download at https://github.com/xavierdidelot/BactDating.


2015
Author(s):
Zheng Ning
Yakov A. Tsepilov
Sodbo Zh. Sharapov
Alexander K. Grishenko
Xiao Feng
...

Abstract The ever-growing genome-wide association studies (GWAS) have revealed widespread pleiotropy. To exploit this, various methods which consider variant association with multiple traits jointly have been developed. However, most effort has been put into improving discovery power: how to replicate and interpret these discovered pleiotropic loci using multivariate methods has yet to be discussed fully. Using only multiple publicly available single-trait GWAS summary statistics, we develop a fast and flexible multi-trait framework that contains modules for (i) multi-trait genetic discovery, (ii) replication of locus pleiotropic profile, and (iii) multi-trait conditional analysis. The procedure is able to handle any level of sample overlap. As an empirical example, we discovered and replicated 23 novel pleiotropic loci for human anthropometry and evaluated their pleiotropic effects on other traits. By applying conditional multivariate analysis on the 23 loci, we discovered and replicated two additional multi-trait associated SNPs. Our results provide empirical evidence that multi-trait analysis allows detection of additional, replicable, highly pleiotropic genetic associations without genotyping additional individuals. The methods are implemented in a free and open source R package MultiABEL. Author summary By analyzing large-scale genomic data, geneticists have revealed widespread pleiotropy, i.e. a single genetic variant can affect a wide range of complex traits. Methods have been developed to discover such genetic variants. However, we still lack insights into the relevant genetic architecture: what more can we learn from knowing the effects of these genetic variants? Here, we develop a fast and flexible statistical analysis procedure that includes discovery, replication, and interpretation of pleiotropic effects. The whole analysis pipeline only requires established genetic association study results. We also provide the mathematical theory behind the pleiotropic genetic effects testing. Most importantly, we show how a replication study can be essential to reveal new biology rather than solely increasing sample size in current genomic studies. For instance, we show that, using our proposed replication strategy, we can detect the difference in genetic effects between studies of different geographical origins. We applied the method to the GIANT consortium anthropometric traits to discover new genetic associations, replicated in the UK Biobank, and provided important new insights into growth and obesity. Our pipeline is implemented in an open-source R package, MultiABEL, which is efficient enough for researchers to run it on personal computers in minutes.


2021
Vol 8
Author(s):
Lei Wu
Jiqiu Li
Alan Warren
Xiaofeng Lin

Amphileptus is one of the largest genera of pleurostomatid ciliates and its species diversity has been reported in various habitats all over the world. In the present work, we review its biodiversity based on data with reliable morphological records. Our work confirms that there are 50 valid Amphileptus species, some of which have a wide range of salinity adaptability and diverse lifestyles. This genus has a high diversity in China but this might be because of the relatively intensive sampling. Phylogenetic analyses based on SSU rDNA sequence data verify the non-monophyly of the genus Amphileptus. Furthermore, two new and one poorly known Amphileptus species, namely A. shenzhenensis sp. n., A. cocous sp. n., and A. multinucleatus Wang, 1934, from coastal habitats of southern China were investigated using morphological and molecular phylogenetic methods. These three species are highly similar based on their contractile vacuoles and macronuclear nodules. However, they can be discriminated by details of their living morphology and somatic kineties. We also propose two new combinations, Amphileptus polymicronuclei (Li, 1990) comb. n. (original combination Hemiophrys polymicronuclei Li, 1990) and Amphileptus salimicus (Burkovsky, 1970b) comb. n. (original combination Hemiophrys salimica Burkovsky, 1970b).


Author(s):
Emma H Gail
Anup D Shah
Ralf B Schittenhelm
Chen Davidovich

Abstract Summary: Unbiased detection of protein–protein and protein–RNA interactions within ribonucleoprotein complexes is enabled through crosslinking followed by mass spectrometry. Yet, different methods detect different types of molecular interactions and therefore require the usage of different software packages with limited compatibility. We present crisscrosslinkeR, an R package that maps both protein–protein and protein–RNA interactions detected by different types of approaches for crosslinking with mass spectrometry. crisscrosslinkeR produces output files that are compatible with visualization using popular software packages for the generation of publication-quality figures. Availability and implementation: crisscrosslinkeR is a free and open-source package, available through GitHub: github.com/egmg726/crisscrosslinker. Supplementary information: Supplementary data are available at Bioinformatics online.


2020
Vol 49 (2)
pp. 136-167
Author(s):
Thomas VAN HOEY
Arthur Lewis THOMPSON

Abstract This article introduces the Chinese Ideophone Database (CHIDEOD), an open-source dataset, which collects 4948 unique onomatopoeia and ideophones (mimetics, expressives) of Mandarin, as well as Middle Chinese and Old Chinese. These are analyzed according to a wide range of variables, e.g., description, frequency. Apart from an overview of these variables, we provide a tutorial that shows how the database can be accessed in different formats (.rds, .xlsx, .csv, R package and online app interface), and how the database can be used to explore skewed tonal distribution across Mandarin ideophones. Since CHIDEOD is a data repository, potential future research applications are discussed.

