bWGR: Bayesian whole-genome regression

Bioinformatics ◽

10.1093/bioinformatics/btz794 ◽

2019 ◽

Cited By ~ 1

Author(s):

Alencar Xavier ◽

William M Muir ◽

Katy M Rainey

Keyword(s):

Bayesian Methods ◽

Expectation Maximization ◽

Complex Traits ◽

Hierarchical Models ◽

R Package ◽

Supplementary Information ◽

Whole Genome ◽

Regression Methods ◽

Genome Wide ◽

User Friendly

Abstract Motivation Whole-genome regression methods represent a key framework for genome-wide prediction, cross-validation studies and association analysis. The bWGR package offers a compendium of Bayesian methods with various priors available, allowing users to predict complex traits with different genetic architectures. Results Here we introduce bWGR, an R package that enables users to efficiently fit and cross-validate Bayesian and likelihood whole-genome regression methods. It implements a series of methods referred to as the Bayesian alphabet, using traditional Gibbs sampling and optimized expectation-maximization. The package also enables fitting efficient multivariate models and complex hierarchical models. The package is user-friendly and computationally efficient. Availability and implementation bWGR is an R package available in the CRAN repository. It can be installed in R by typing: install.packages('bWGR'). Supplementary information Supplementary data are available at Bioinformatics online.
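As a rough illustration of what fitting and cross-validating a whole-genome regression involves, the sketch below regresses a phenotype on all markers at once with ridge shrinkage and scores it by k-fold predictive correlation. It is not bWGR's API: it is written in plain Python rather than R, and the function names, the simulated genotypes and the shrinkage value `lam` are all assumptions made for the example. Ridge regression is used because its solution coincides with the posterior mean of marker effects under a Gaussian prior, the simplest member of the family of priors the abstract refers to.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ridge(X, y, lam):
    """Return (intercept, marker effects) from centered ridge regression."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Normal equations with lam added to the diagonal: (X'X + lam I) b = X'y
    xtx = [[sum(r[i] * r[j] for r in Xc) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(Xc, yc)) for i in range(p)]
    effects = solve_linear(xtx, xty)
    intercept = y_mean - sum(m * e for m, e in zip(x_mean, effects))
    return intercept, effects


def predict(X, intercept, effects):
    return [intercept + sum(x * e for x, e in zip(row, effects)) for row in X]


def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (z - mb) for x, z in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((z - mb) ** 2 for z in b)
    return cov / math.sqrt(va * vb)


def cross_validate(X, y, lam, k):
    """k-fold CV: correlation between held-out predictions and phenotypes."""
    n = len(y)
    preds = [0.0] * n
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        intercept, effects = fit_ridge([X[i] for i in train_idx],
                                       [y[i] for i in train_idx], lam)
        fold_preds = predict([X[i] for i in test_idx], intercept, effects)
        for i, yhat in zip(test_idx, fold_preds):
            preds[i] = yhat
    return pearson(preds, y)


# Simulated data (hypothetical): 60 individuals, 8 SNPs coded 0/1/2.
random.seed(1)
true_effects = [1.0, -0.8, 0.5, 0.0, 0.0, 0.7, 0.0, -0.4]
X = [[random.choice([0, 1, 2]) for _ in true_effects] for _ in range(60)]
y = [sum(x * e for x, e in zip(row, true_effects)) + random.gauss(0, 0.1)
     for row in X]

accuracy = cross_validate(X, y, lam=1.0, k=5)
print(round(accuracy, 3))
```

In a Gaussian-prior model the shrinkage parameter corresponds to the ratio of residual variance to marker-effect variance; here it is simply fixed by hand.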

MutSpot: detection of non-coding mutation hotspots in cancer genomes

10.1101/740944 ◽

2019 ◽

Cited By ~ 1

Author(s):

Yu Amanda Guo ◽

Mei Mei Chang ◽

Anders Jacobsen Skanderup

Keyword(s):

Somatic Mutations ◽

R Package ◽

Supplementary Information ◽

Patient Specific ◽

Supplementary Data ◽

Link Type ◽

Genome Wide ◽

Cancer Genomes ◽

User Friendly ◽

Regulatory Dna

Abstract Summary Recurrence and clustering of somatic mutations (hotspots) in cancer genomes may indicate positive selection and involvement in tumorigenesis. MutSpot performs genome-wide inference of mutation hotspots in non-coding and regulatory DNA of cancer genomes. MutSpot performs feature selection across hundreds of epigenetic and sequence features followed by estimation of position and patient-specific background somatic mutation probabilities. MutSpot is user-friendly, works on a standard workstation, and scales to thousands of cancer genomes. Availability and implementation MutSpot is implemented as an R package and is available at https://github.com/skandlab/MutSpot/ Supplementary information Supplementary data are available at https://github.com/skandlab/MutSpot/

bGWAS: an R package to perform Bayesian genome wide association studies

Bioinformatics ◽

10.1093/bioinformatics/btaa549 ◽

2020 ◽

Vol 36 (15) ◽

pp. 4374-4376

Author(s):

Ninon Mounier ◽

Zoltán Kutalik

Keyword(s):

Mendelian Randomization ◽

Causal Effect ◽

Association Studies ◽

R Package ◽

Genome Wide Association ◽

Supplementary Information ◽

Genome Wide Association Studies ◽

Biological Mechanisms ◽

Genome Wide ◽

Related Risk

Abstract Summary Increasing sample size is not the only strategy to improve discovery in Genome Wide Association Studies (GWASs), and we propose here an approach that leverages published studies of related traits to improve inference. Our Bayesian GWAS method derives informative prior effects by leveraging GWASs of related risk factors and their causal effect estimates on the focal trait using multivariable Mendelian randomization. These prior effects are combined with the observed effects to yield Bayes Factors, posterior and direct effects. The approach not only increases power, but also has the potential to dissect direct and indirect biological mechanisms. Availability and implementation The bGWAS package is freely available under a GPL-2 License, and can be accessed, along with user guides and tutorials, from https://github.com/n-mounier/bGWAS. Supplementary information Supplementary data are available at Bioinformatics online.

DiscoRhythm: an easy-to-use web application and R package for discovering rhythmicity

Bioinformatics ◽

10.1093/bioinformatics/btz834 ◽

2019 ◽

Cited By ~ 2

Author(s):

Matthew Carlucci ◽

Algimantas Kriščiūnas ◽

Haohan Li ◽

Povilas Gibas ◽

Karolis Koncevičius ◽

...

Keyword(s):

Web Application ◽

Statistical Significance ◽

R Package ◽

Biological Data ◽

Supplementary Information ◽

Statistical Knowledge ◽

Health And Disease ◽

Phase Amplitude ◽

Almost All ◽

User Friendly

Abstract Motivation Biological rhythmicity is fundamental to almost all organisms on Earth and plays a key role in health and disease. Identification of oscillating signals could lead to novel biological insights, yet its investigation is impeded by the extensive computational and statistical knowledge required to perform such analysis. Results To address this issue, we present DiscoRhythm (Discovering Rhythmicity), a user-friendly application for characterizing rhythmicity in temporal biological data. DiscoRhythm is available as a web application or an R/Bioconductor package for estimating phase, amplitude, and statistical significance using four popular approaches to rhythm detection (Cosinor, JTK Cycle, ARSER, and Lomb-Scargle). We optimized these algorithms for speed, improving their execution times up to 30-fold to enable rapid analysis of -omic-scale datasets in real-time. Informative visualizations, interactive modules for quality control, dimensionality reduction, periodicity profiling, and incorporation of experimental replicates make DiscoRhythm a thorough toolkit for analyzing rhythmicity. Availability and Implementation The DiscoRhythm R package is available on Bioconductor (https://bioconductor.org/packages/DiscoRhythm), with source code available on GitHub (https://github.com/matthewcarlucci/DiscoRhythm) under a GPL-3 license. The web application is securely deployed over HTTPS (https://disco.camh.ca) and is freely available for use worldwide. Local instances of the DiscoRhythm web application can be created using the R package or by deploying the publicly available Docker container (https://hub.docker.com/r/mcarlucci/discorhythm). Supplementary information Supplementary data are available at Bioinformatics online.
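Of the four approaches named above, Cosinor is the simplest to show in a few lines: a cosine of known period is fitted by ordinary least squares, and amplitude and phase are read off the two trigonometric coefficients. The sketch below is a generic single-component Cosinor in plain Python, not DiscoRhythm's implementation; the function names and the synthetic 24-hour series are assumptions, and the significance test is omitted.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def cosinor_fit(times, values, period):
    """Fit y = mesor + beta*cos(wt) + gamma*sin(wt) by least squares.

    Returns (mesor, amplitude, peak_time), where peak_time is the time
    within one period at which the fitted curve reaches its maximum.
    """
    omega = 2 * math.pi / period
    rows = [[1.0, math.cos(omega * t), math.sin(omega * t)] for t in times]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
           for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    mesor, beta, gamma = solve_linear(xtx, xty)
    amplitude = math.hypot(beta, gamma)
    # beta = A*cos(phi), gamma = A*sin(phi) for y = mesor + A*cos(wt - phi)
    phase = math.atan2(gamma, beta) % (2 * math.pi)
    return mesor, amplitude, phase / omega


# Synthetic series (hypothetical): sampled every 2 h for 48 h,
# mean level 10, amplitude 2, peak at hour 6 of a 24 h cycle.
period = 24.0
times = [float(t) for t in range(0, 48, 2)]
values = [10.0 + 2.0 * math.cos(2 * math.pi * (t - 6.0) / period)
          for t in times]

mesor, amplitude, peak_time = cosinor_fit(times, values, period)
print(round(mesor, 3), round(amplitude, 3), round(peak_time, 3))
```

Because the model is linear in its three coefficients once the period is fixed, no iterative optimisation is needed, which is part of why Cosinor is fast on large datasets.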

Multi-SNP mediation intersection-union test

Bioinformatics ◽

10.1093/bioinformatics/btz285 ◽

2019 ◽

Vol 35 (22) ◽

pp. 4724-4729 ◽

Cited By ~ 4

Author(s):

Wujuan Zhong ◽

Cassandra N Spracklen ◽

Karen L Mohlke ◽

Xiaojing Zheng ◽

Jason Fine ◽

...

Keyword(s):

Association Studies ◽

R Package ◽

Alternative Methods ◽

Supplementary Information ◽

Genome Wide Association Studies ◽

Mediation Effects ◽

Coding Regions ◽

Genome Wide ◽

Plasma Adiponectin Level ◽

Intersection Union Test

Abstract Summary Tens of thousands of reproducibly identified GWAS (Genome-Wide Association Studies) variants, with the vast majority falling in non-coding regions resulting in no eventual protein products, call urgently for mechanistic interpretations. Although numerous methods exist, there are few, if any, methods for simultaneously testing the mediation effects of multiple correlated SNPs via some mediator (e.g. the expression of a gene in the neighborhood) on phenotypic outcome. We propose the multi-SNP mediation intersection-union test (SMUT) to fill this methodological gap. Our extensive simulations demonstrate the validity of SMUT as well as substantial power gains, of up to 92%, over alternative methods. In addition, SMUT confirmed known mediators in a real dataset of Finns for plasma adiponectin level, which were missed by many alternative methods. We believe SMUT will become a useful tool to generate mechanistic hypotheses underlying GWAS variants, facilitating functional follow-up. Availability and implementation The R package SMUT is publicly available from CRAN at https://CRAN.R-project.org/package=SMUT. Supplementary information Supplementary data are available at Bioinformatics online.
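The intersection-union principle in the method's name can be stated compactly: mediation requires both the SNPs-to-mediator path and the mediator-to-outcome path to be non-null, so the combined test rejects only when every component test rejects, and its p-value is the largest component p-value. The sketch below shows only that combination rule in plain Python; it does not reproduce the component tests SMUT uses, and the function name and p-values are hypothetical.

```python
def intersection_union_p(p_values):
    """Combine component p-values under the intersection-union principle.

    The composite null ("at least one path is null") is a union of nulls,
    so it is rejected only if all component nulls are rejected; the
    resulting p-value is the maximum of the component p-values.
    """
    ps = list(p_values)
    if not ps:
        raise ValueError("at least one p-value is required")
    for p in ps:
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie in [0, 1]")
    return max(ps)


# Hypothetical component results for one candidate mediator gene.
p_snps_to_mediator = 0.003      # joint effect of the SNPs on the mediator
p_mediator_to_outcome = 0.04    # effect of the mediator on the outcome

p_mediation = intersection_union_p([p_snps_to_mediator,
                                    p_mediator_to_outcome])
print(p_mediation)
```

No multiplicity correction is applied across the two paths, since the rule already demands significance on both.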

A Bayesian linear mixed model for prediction of complex traits

Bioinformatics ◽

10.1093/bioinformatics/btaa1023 ◽

2020 ◽

Author(s):

Yang Hai ◽

Yalu Wen

Keyword(s):

Complex Traits ◽

Mixed Model ◽

Linear Mixed Model ◽

Rare Variants ◽

Disease Risk ◽

R Package ◽

Underlying Disease ◽

Supplementary Information ◽

True Effect Size ◽

Bayes Algorithm

Abstract Motivation Accurate disease risk prediction is essential for precision medicine. Existing models assume either that diseases are caused by groups of predictors with small-to-moderate effects or by a few isolated predictors with large effects. Their performance can be sensitive to the underlying disease mechanisms, which are usually unknown in advance. Results We developed a Bayesian linear mixed model (BLMM), where genetic effects were modelled using a hybrid of sparse regression and a linear mixed model with multiple random effects. The parameters in BLMM were inferred through a computationally efficient variational Bayes algorithm. The proposed method can resemble the shape of the true effect size distributions, captures the predictive effects from both common and rare variants, and is robust against various disease models. Through extensive simulations and the application to a whole-genome sequencing dataset obtained from the Alzheimer’s Disease Neuroimaging Initiative, we have demonstrated that BLMM has better prediction performance than existing methods and can detect variables and/or genetic regions that are predictive. Availability The R package is available at https://github.com/yhai943/BLMM. Supplementary information Supplementary data are available at Bioinformatics online.

AlpsNMR: an R package for signal processing of fully untargeted NMR-based metabolomics

Bioinformatics ◽

10.1093/bioinformatics/btaa022 ◽

2020 ◽

Vol 36 (9) ◽

pp. 2943-2945 ◽

Cited By ~ 3

Author(s):

Francisco Madrid-Gambin ◽

Sergio Oller-Moreno ◽

Luis Fernandez ◽

Simona Bartova ◽

Maria Pilar Giner ◽

...

Keyword(s):

Signal Processing ◽

Statistical Analysis ◽

Processing System ◽

R Package ◽

Supplementary Information ◽

Test Case ◽

Previous Knowledge ◽

Spectral Processing ◽

Computational Tools ◽

User Friendly

Abstract Summary Nuclear magnetic resonance (NMR)-based metabolomics is widely used to obtain metabolic fingerprints of biological systems. While targeted workflows require previous knowledge of metabolites prior to statistical analysis, untargeted approaches remain a challenge. Computational tools dealing with fully untargeted NMR-based metabolomics are still scarce or not user-friendly. Therefore, we developed AlpsNMR (Automated spectraL Processing System for NMR), an R package that provides automated and efficient signal processing for untargeted NMR metabolomics. AlpsNMR includes spectra loading, metadata handling, automated outlier detection, spectra alignment and peak-picking, integration and normalization. The resulting output can be used for further statistical analysis. AlpsNMR proved effective in detecting metabolite changes in a test case. The tool allows less experienced users to easily implement this workflow from spectra to a ready-to-use dataset in their routines. Availability and implementation The AlpsNMR R package and tutorial are freely available to download from http://github.com/sipss/AlpsNMR under the MIT license. Supplementary information Supplementary data are available at Bioinformatics online.

cognac: rapid generation of concatenated gene alignments for phylogenetic inference from large whole genome sequencing datasets

10.1101/2020.10.15.340901 ◽

2020 ◽

Author(s):

Ryan D. Crawford ◽

Evan S. Snitkin

Keyword(s):

Phylogenetic Analysis ◽

Genome Sequencing ◽

Software Package ◽

Genomic Data ◽

Phylogenetic Inference ◽

R Package ◽

Core Gene ◽

Whole Genome ◽

Rapid Generation ◽

User Friendly

Abstract The quantity of genomic data is expanding at an increasing rate. Tools for phylogenetic analysis which scale to the quantity of available data are required. We present cognac, a user-friendly software package to rapidly generate concatenated gene alignments for phylogenetic analysis. We applied this tool to generate core gene alignments for very large genomic datasets, including a dataset of over 11,000 genomes from the genus Escherichia containing 1,353 genes, which was constructed in less than 17 hours. We have released cognac as an R package (https://github.com/rdcrawford/cognac) with customizable parameters for adaptation to diverse applications.

gwasurvivr: an R package for genome wide survival analysis

10.1101/326033 ◽

2018 ◽

Author(s):

Abbas A Rizvi ◽

Ezgi Karaesmen ◽

Martin Morgan ◽

Leah Preus ◽

Junke Wang ◽

...

Keyword(s):

Survival Analysis ◽

Cox Model ◽

R Package ◽

Supplementary Information ◽

Parameter Estimates ◽

Survival Analyses ◽

Link Type ◽

Genome Wide ◽

Size Number ◽

Simple Interface

Abstract Summary To address the limited software options for performing survival analyses with millions of SNPs, we developed gwasurvivr, an R/Bioconductor package with a simple interface for conducting genome wide survival analyses using VCF (outputted from Michigan or Sanger imputation servers), IMPUTE2 or PLINK files. To decrease the number of iterations needed for convergence when optimizing the parameter estimates in the Cox model we modified the R package survival; covariates in the model are first fit without the SNP, and those parameter estimates are used as initial points. We benchmarked gwasurvivr with other software capable of conducting genome wide survival analysis (genipe, SurvivalGWAS_SV, and GWASTools). gwasurvivr is significantly faster and shows better scalability as sample size, number of SNPs and number of covariates increase. Availability and implementation gwasurvivr, including source code, documentation, and vignette, is available at: http://bioconductor.org/packages/gwasurvivr Contact Abbas Rizvi, [email protected]; Lara E Sucheston-Campbell, [email protected] Supplementary information Supplementary data are available at https://github.com/suchestoncampbelllab/gwasurvivr_manuscript

dittoSeq: universal user-friendly single-cell and bulk RNA sequencing visualization toolkit

Bioinformatics ◽

10.1093/bioinformatics/btaa1011 ◽

2020 ◽

Author(s):

Daniel G Bunis ◽

Jared Andrews ◽

Gabriela K Fragiadakis ◽

Trevor D Burt ◽

Marina Sirota

Keyword(s):

Single Cell ◽

R Package ◽

Color Blindness ◽

Ease Of Use ◽

Supplementary Information ◽

Supplementary Data ◽

Rnaseq Data ◽

Visualization Toolkit ◽

User Friendly ◽

Publication Quality

Abstract Summary A visualization suite for major forms of bulk and single-cell RNAseq data in R. dittoSeq is color blindness-friendly by default, robustly documented to power ease-of-use and allows highly customizable generation of both daily-use and publication-quality figures. Availability and implementation dittoSeq is an R package available through Bioconductor via an open source MIT license. Supplementary information Supplementary data are available at Bioinformatics online.

Bayesian weighted Mendelian randomization for causal inference based on summary statistics

Bioinformatics ◽

10.1093/bioinformatics/btz749 ◽

2019 ◽

Author(s):

Jia Zhao ◽

Jingsi Ming ◽

Xianghong Hu ◽

Gang Chen ◽

Jin Liu ◽

...

Keyword(s):

Causal Inference ◽

Complex Traits ◽

Mendelian Randomization ◽

Causal Effect ◽

Association Studies ◽

Supplementary Information ◽

Genome Wide Association Studies ◽

Simulation Studies ◽

Genome Wide ◽

Complex Human Traits

Abstract Motivation The results from Genome-Wide Association Studies (GWAS) on thousands of phenotypes provide an unprecedented opportunity to infer the causal effect of one phenotype (exposure) on another (outcome). Mendelian randomization (MR), an instrumental variable (IV) method, has been introduced for causal inference using GWAS data. Due to the polygenic architecture of complex traits/diseases and the ubiquity of pleiotropy, however, MR has many unique challenges compared to conventional IV methods. Results We propose Bayesian weighted Mendelian randomization (BWMR) for causal inference to address these challenges. In our BWMR model, the uncertainty of weak effects owing to polygenicity has been taken into account and the violation of IV assumptions due to pleiotropy has been addressed through outlier detection by Bayesian weighting. To make the causal inference based on BWMR computationally stable and efficient, we developed a variational expectation-maximization (VEM) algorithm. Moreover, we have also derived an exact closed-form formula to correct the posterior covariance, which is often underestimated in variational inference. Through comprehensive simulation studies, we evaluated the performance of BWMR, demonstrating the advantage of BWMR over its competitors. Then we applied BWMR to make causal inference between 130 metabolites and 93 complex human traits, uncovering novel causal relationships between exposure and outcome traits. Availability and implementation The BWMR software is available at https://github.com/jiazhao97/BWMR. Supplementary information Supplementary data are available at Bioinformatics online.