admixr—R package for reproducible analyses using ADMIXTOOLS

Martin Petr; Benjamin Vernot; Janet Kelso

doi:10.1093/bioinformatics/btz030

Bioinformatics ◽

10.1093/bioinformatics/btz030 ◽

2019 ◽

Vol 35 (17) ◽

pp. 3194-3195 ◽

Cited By ~ 18

Author(s):

Martin Petr ◽

Benjamin Vernot ◽

Janet Kelso

Keyword(s):

Population Genetic ◽

R Package ◽

Supplementary Information ◽

Command Line ◽

Software Suite ◽

Genetic Studies ◽

Convenient Interface ◽

Reference Manual ◽

Population Genetic Analyses ◽

R Functions

Abstract

Summary: We present a new R package, admixr, which provides a convenient interface for performing reproducible population genetic analyses (f3, D, f4, f4-ratio, qpWave and qpAdm), as implemented by command-line programs in the ADMIXTOOLS software suite. In a traditional ADMIXTOOLS workflow, the user must first generate a set of text configuration files tailored to each individual analysis, often using a combination of shell scripting and manual text editing. The non-tabular output files then need to be parsed to extract values of interest prior to further analyses. Our package simplifies this process by automating all low-level configuration and parsing steps, making analyses as simple as running a single R command. Furthermore, we provide a set of R functions for processing, filtering and manipulating datasets in the EIGENSTRAT format. By unifying all steps of the workflow under a single R framework, this package enables the automation of analytic pipelines, significantly improving the reproducibility of population genetic studies.

Availability and implementation: The source code of the R package is available under the MIT license. Installation instructions, reference manual and a tutorial can be found on the package website at https://bioinf.eva.mpg.de/admixr.

Supplementary information: Supplementary data are available at Bioinformatics online.
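To make the "traditional workflow" described in the summary concrete, the following is a minimal Python sketch of the two manual steps that a wrapper of this kind automates: writing a text configuration file for one analysis, and pulling numeric values out of non-tabular program output. It is not admixr's implementation. The function names are hypothetical, and both the parameter-file keys (`genotypename`, `snpname`, `indivname`, `popfilename`) and the `result:` line prefix are assumptions about the ADMIXTOOLS conventions that should be checked against the ADMIXTOOLS documentation before use.

```python
from pathlib import Path


def write_dstat_config(prefix, quartets, workdir):
    """Write a population-quartet file and a parameter file; return both paths.

    `prefix` is the shared path prefix of an EIGENSTRAT trio
    (<prefix>.geno, <prefix>.snp, <prefix>.ind).
    """
    workdir = Path(workdir)
    pop_path = workdir / "quartets.txt"
    par_path = workdir / "dstat.par"

    # One quartet of population labels per line: W X Y Z
    pop_path.write_text("".join(" ".join(q) + "\n" for q in quartets))

    # "key: value" lines; the key names are an assumption (see lead-in).
    par_path.write_text(
        f"genotypename: {prefix}.geno\n"
        f"snpname: {prefix}.snp\n"
        f"indivname: {prefix}.ind\n"
        f"popfilename: {pop_path}\n"
    )
    return par_path, pop_path


def parse_result_lines(log_text):
    """Extract rows from lines that start with 'result:' (assumed layout).

    The first four tokens after the prefix are treated as population
    labels; every remaining token is converted to float without
    assigning it a meaning, since the column order is not assumed here.
    """
    rows = []
    for line in log_text.splitlines():
        tokens = line.split()
        if len(tokens) < 6 or tokens[0] != "result:":
            continue  # skip banner and diagnostic lines
        rows.append(
            {"pops": tuple(tokens[1:5]),
             "values": [float(t) for t in tokens[5:]]}
        )
    return rows
```

Keeping both steps in functions, rather than in ad hoc shell commands and manual edits, is the point the summary makes about reproducibility: the same inputs always produce the same configuration file and the same parsed table.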

SambaR: An R package for fast, easy and reproducible population‐genetic analyses of biallelic SNP data sets

Molecular Ecology Resources ◽

10.1111/1755-0998.13339 ◽

2021 ◽

Author(s):

Menno J. de Jong ◽

Joost F. de Jong ◽

A. Rus Hoelzel ◽

Axel Janke

Keyword(s):

Population Genetic ◽

R Package ◽

Data Sets ◽

Genetic Analyses ◽

Snp Data ◽

Population Genetic Analyses

CoRC: the COPASI R Connector

Bioinformatics ◽

10.1093/bioinformatics/btab033 ◽

2021 ◽

Author(s):

Jonas Förster ◽

Frank T Bergmann ◽

Jürgen Pahle

Keyword(s):

Graphical User Interface ◽

Academic Research ◽

R Package ◽

Supplementary Information ◽

Command Line ◽

Graphical Interface ◽

Thought Process ◽

Extensive Analysis ◽

Command Line Tool ◽

High Level

Abstract

Motivation: COPASI is a biochemical simulator and model analyzer which has found widespread use in academic research, teaching and beyond. One of COPASI’s strengths is its graphical user interface, and this is what most users work with. COPASI also provides a command-line tool. So far, an intuitive scripting interface that allows the creation and documentation of systems biology workflows was missing though.

Results: We have developed CoRC, the COPASI R Connector, an R package which provides a high-level scripting interface for COPASI. It closely mirrors the thought process of a (graphical interface) user and should therefore be very easy to use. This allows for complex workflows to be reproducibly scripted, utilizing COPASI’s powerful analytic toolset in combination with R’s extensive analysis and package ecosystem.

Availability and implementation: CoRC is a free and open-source R package, available via GitHub at https://jpahle.github.io/CoRC/ under the Artistic-2.0 license.

Supplementary information: We provide tutorial articles as well as several example scripts on the project’s website.

MicrobiomeExplorer: an R package for the analysis and visualization of microbial communities

Bioinformatics ◽

10.1093/bioinformatics/btaa838 ◽

2020 ◽

Author(s):

Janina Reeder ◽

Mo Huang ◽

Joshua S Kaminker ◽

Joseph N Paulson

Keyword(s):

Microbial Communities ◽

Automated Analysis ◽

R Package ◽

Supplementary Information ◽

Command Line ◽

Supplementary Data ◽

Report Generation ◽

Shiny Application

Abstract

Summary: We developed the MicrobiomeExplorer R package to facilitate the analysis and visualization of microbial communities. The MicrobiomeExplorer R package allows a user to perform typical microbiome analytic workflows and visualize their results, either through the command line or an interactive Shiny application included with the package. In addition to applying common analytical workflows, the application enables automated analysis report generation.

Availability and implementation: Available at https://github.com/zoecastillo/microbiomeExplorer.

Supplementary information: Supplementary data are available at Bioinformatics online.

SambaR: an R package for fast, easy and reproducible population-genetic analyses of biallelic SNP datasets

10.1101/2020.07.23.213793 ◽

2020 ◽

Author(s):

Menno J. de Jong ◽

Joost F. de Jong ◽

A. Rus Hoelzel ◽

Axel Janke

Keyword(s):

Data Storage ◽

Population Genetic ◽

Natural Populations ◽

R Package ◽

Genetic Analyses ◽

Reduced Representation ◽

Snp Data ◽

Minimum Number ◽

Reduced Representation Library ◽

Population Genetic Analyses

Abstract

Background: SNP datasets can be used to infer a wealth of information about natural populations, including information about their structure, genetic diversity, and the presence of loci under selection. However, SNP data analysis can be a time-consuming and challenging process, not in the least because at present many different software packages are needed to execute and depict the wide variety of mainstream population-genetic analyses. Here we present SambaR, an integrative and user-friendly R package which automates and simplifies quality control and population-genetic analyses of biallelic SNP datasets. SambaR allows users to perform mainstream population-genetic analyses and to generate a wide variety of ready to publish graphs with a minimum number of commands (less than ten). These wrapper commands call functions of existing packages (including adegenet, ape, LEA, poppr, pcadapt and StAMPP) as well as new tools uniquely implemented in SambaR.

Results: We tested SambaR on online available SNP datasets and found that SambaR can process datasets of millions of SNPs and hundreds of individuals within hours, given sufficient computing power. Newly developed tools implemented in SambaR facilitate optimization of filter settings, objective interpretation of ordination analyses, enhance comparability of diversity estimates from reduced representation library SNP datasets, and generate reduced SNP panels and structure-like plots with Bayesian population assignment probabilities.

Conclusion: SambaR facilitates rapid population genetic analyses on biallelic SNP datasets by removing three major time sinks: file handling, software learning, and data plotting. In addition, SambaR provides a convenient platform for SNP data storage and management, as well as several new utilities, including guidance in setting appropriate data filters.

Availability and implementation: The SambaR source script, manual and example datasets are distributed through GitHub: https://github.com/mennodejong1986/SambaR

iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics

Bioinformatics ◽

10.1093/bioinformatics/btz961 ◽

2020 ◽

Vol 36 (8) ◽

pp. 2611-2613 ◽

Cited By ~ 5

Author(s):

Thang V Pham ◽

Alex A Henneman ◽

Connie R Jimenez

Keyword(s):

Open Source ◽

State Of The Art ◽

R Package ◽

Protein Quantification ◽

Supplementary Information ◽

Label Free ◽

Software Suite ◽

Data Independent Acquisition ◽

Free Data ◽

Acquisition Mode

Abstract

Summary: We present an R package called iq to enable accurate protein quantification for label-free data-independent acquisition (DIA) mass spectrometry-based proteomics, a recently developed global approach with superior quantitative consistency. We implement the popular maximal peptide ratio extraction module of the MaxLFQ algorithm, so far only applicable to data-dependent acquisition mode using the software suite MaxQuant. Moreover, our implementation shows, for each protein separately, the validity of quantification over all samples. Hence, iq exports a state-of-the-art protein quantification algorithm to the emerging DIA mode in an open-source implementation.

Availability and implementation: The open-source R package is available on CRAN, https://github.com/tvpham/iq/releases and oncoproteomics.nl/iq.

Supplementary information: Supplementary data are available at Bioinformatics online.

Faculty Opinions recommendation of Population genetic analyses reveal the African origin and strain variation of Cryptococcus neoformans var. grubii.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.13968972.15426079 ◽

2012 ◽

Author(s):

June Kwon-Chung

Keyword(s):

Cryptococcus Neoformans ◽

Population Genetic ◽

African Origin ◽

Strain Variation ◽

Genetic Analyses ◽

Population Genetic Analyses

The Handicap Principle in Parent-Offspring Conflict: Comparison of Optimality and Population-Genetic Analyses

The American Naturalist ◽

10.1086/285152 ◽

1991 ◽

Vol 137 (2) ◽

pp. 167-185 ◽

Cited By ~ 25

Author(s):

Ilan Eshel ◽

Marcus W. Feldman

Keyword(s):

Population Genetic ◽

Handicap Principle ◽

Genetic Analyses ◽

Population Genetic Analyses

ExperimentSubset: an R package to manage subsets of Bioconductor Experiment objects

Bioinformatics ◽

10.1093/bioinformatics/btab179 ◽

2021 ◽

Author(s):

Irzam Sarfraz ◽

Muhammad Asif ◽

Joshua D Campbell

Keyword(s):

Single Cell ◽

R Package ◽

Poor Quality ◽

Data Matrix ◽

Supplementary Information ◽

Data Provenance ◽

Rna Seq ◽

Efficient Management ◽

The Matrix ◽

The Relationship

Abstract

Motivation: R Experiment objects such as the SummarizedExperiment or SingleCellExperiment are data containers for storing one or more matrix-like assays along with associated row and column data. These objects have been used to facilitate the storage and analysis of high-throughput genomic data generated from technologies such as single-cell RNA sequencing. One common computational task in many genomics analysis workflows is to perform subsetting of the data matrix before applying down-stream analytical methods. For example, one may need to subset the columns of the assay matrix to exclude poor-quality samples or subset the rows of the matrix to select the most variable features. Traditionally, a second object is created that contains the desired subset of assay from the original object. However, this approach is inefficient as it requires the creation of an additional object containing a copy of the original assay and leads to challenges with data provenance.

Results: To overcome these challenges, we developed an R package called ExperimentSubset, which is a data container that implements classes for efficient storage and streamlined retrieval of assays that have been subsetted by rows and/or columns. These classes are able to inherently provide data provenance by maintaining the relationship between the subsetted and parent assays. We demonstrate the utility of this package on a single-cell RNA-seq dataset by storing and retrieving subsets at different stages of the analysis while maintaining a lower memory footprint. Overall, the ExperimentSubset is a flexible container for the efficient management of subsets.

Availability and implementation: ExperimentSubset package is available at Bioconductor: https://bioconductor.org/packages/ExperimentSubset/ and Github: https://github.com/campbio/ExperimentSubset.

Supplementary information: Supplementary data are available at Bioinformatics online.
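The idea of keeping subsets linked to their parent assay, instead of copying data into a second object, can be illustrated with a small Python sketch. This is only an illustration of the general design under simple assumptions (assays as nested lists, subsets as stored row and column indices). The class and method names are hypothetical and do not correspond to the ExperimentSubset API.

```python
class SubsetContainer:
    """Toy container: full assays are stored once, subsets only as indices."""

    def __init__(self):
        self.assays = {}   # name -> full matrix (list of rows)
        self.subsets = {}  # name -> (parent name, row indices, column indices)

    def add_assay(self, name, matrix):
        self.assays[name] = matrix

    def add_subset(self, name, parent, rows, cols):
        """Register a subset of `parent`; indices refer to the parent's axes."""
        if parent not in self.assays and parent not in self.subsets:
            raise KeyError(f"unknown parent: {parent}")
        self.subsets[name] = (parent, list(rows), list(cols))

    def get(self, name):
        """Return the assay, materializing a subset from its parent on demand."""
        if name in self.assays:
            return self.assays[name]
        parent, rows, cols = self.subsets[name]
        data = self.get(parent)  # resolves chains of subsets recursively
        return [[data[r][c] for c in cols] for r in rows]

    def lineage(self, name):
        """Provenance chain, from the requested name back to the root assay."""
        chain = [name]
        while name in self.subsets:
            name = self.subsets[name][0]
            chain.append(name)
        return chain


# Usage: drop a poor-quality sample (column 1), then keep one feature row.
store = SubsetContainer()
store.add_assay("counts", [[1, 2, 3], [4, 5, 6], [7, 8, 9]])
store.add_subset("good_cells", "counts", rows=[0, 1, 2], cols=[0, 2])
store.add_subset("top_genes", "good_cells", rows=[2], cols=[0, 1])
```

Because each subset records only its parent and index vectors, the relationship between subset and parent is available at any time through `lineage`, and no second copy of the count matrix is held in memory until a subset is actually requested.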

BloodGen3Module: Blood transcriptional module repertoire analysis and visualization using R

Bioinformatics ◽

10.1093/bioinformatics/btab121 ◽

2021 ◽

Author(s):

Darawan Rinchai ◽

Jessica Roelands ◽

Mohammed Toufiq ◽

Wouter Hendrickx ◽

Matthew C Altman ◽

...

Keyword(s):

Transcript Abundance ◽

R Package ◽

Supplementary Information ◽

Illustrative Case ◽

Bioinformatic Tools ◽

Transcriptional Module ◽

Wide Range ◽

Downstream Analysis ◽

Computing Module ◽

Parallel Workflow

Abstract

Motivation: We previously described the construction and characterization of generic and reusable blood transcriptional module repertoires. More recently we released a third iteration (“BloodGen3” module repertoire) that comprises 382 functionally annotated gene sets (modules) and encompasses 14,168 transcripts. Custom bioinformatic tools are needed to support downstream analysis, visualization and interpretation relying on such fixed module repertoires.

Results: We have developed and describe here an R package, BloodGen3Module. The functions of our package permit group comparison analyses to be performed at the module level and the results to be displayed as annotated fingerprint grid plots. A parallel workflow for computing module repertoire changes for individual samples rather than groups of samples is also available; these results are displayed as fingerprint heatmaps. An illustrative case is used to demonstrate the steps involved in generating blood transcriptome repertoire fingerprints of septic patients. Taken together, this resource could facilitate the analysis and interpretation of changes in blood transcript abundance observed across a wide range of pathological and physiological states.

Availability: The BloodGen3Module package and documentation are freely available from Github: https://github.com/Drinchai/BloodGen3Module

Supplementary information: Supplementary data are available at Bioinformatics online.

movAPA: modeling and visualization of dynamics of alternative polyadenylation across biological samples

Bioinformatics ◽

10.1093/bioinformatics/btaa997 ◽

2020 ◽

Author(s):

Wenbin Ye ◽

Tao Liu ◽

Hongjuan Fu ◽

Congting Ye ◽

Guoli Ji ◽

...

Keyword(s):

Biological Samples ◽

Tissue Specificity ◽

Single Cells ◽

Alternative Polyadenylation ◽

R Package ◽

Supplementary Information ◽

Rna Seq ◽

Mouse Sperm ◽

High Scalability ◽

A Site

Abstract

Motivation: Alternative polyadenylation (APA) has been widely recognized as a widespread mechanism modulated dynamically. Studies based on 3′ end sequencing and/or RNA-seq have profiled poly(A) sites in various species with diverse pipelines, yet no unified and easy-to-use toolkit is available for comprehensive APA analyses.

Results: We developed an R package called movAPA for modeling and visualization of dynamics of alternative polyadenylation across biological samples. movAPA incorporates rich functions for preprocessing, annotation and statistical analyses of poly(A) sites, identification of poly(A) signals, profiling of APA dynamics and visualization. Particularly, seven metrics are provided for measuring the tissue-specificity or usages of APA sites across samples. Three methods are used for identifying 3′ UTR shortening/lengthening events between conditions. APA site switching involving non-3′ UTR polyadenylation can also be explored. Using poly(A) site data from rice and mouse sperm cells, we demonstrated the high scalability and flexibility of movAPA in profiling APA dynamics across tissues and single cells.

Availability and implementation: https://github.com/BMILAB/movAPA.

Supplementary information: Supplementary data are available at Bioinformatics online.
