MOSAIC: a joint modeling methodology for combined circadian and non-circadian analysis of multi-omics data

Author(s):  
Hannah De los Santos ◽  
Kristin P Bennett ◽  
Jennifer M Hurley

Abstract Motivation Circadian rhythms are approximately 24-h endogenous cycles that control many biological functions. To identify these rhythms, biological samples are taken over circadian time and analyzed using a single omics type, such as transcriptomics or proteomics. By comparing data from these single omics approaches, it has been shown that transcriptional rhythms are not necessarily conserved at the protein level, implying extensive circadian post-transcriptional regulation. However, as proteomics methods are known to be noisier than transcriptomic methods, this suggests that previously identified arrhythmic proteins with rhythmic transcripts could have been missed due to noise and may not be due to post-transcriptional regulation. Results To determine if one can use information from less-noisy transcriptomic data to inform rhythms in more-noisy proteomic data, and thus more accurately identify rhythms in the proteome, we have created the Multi-Omics Selection with Amplitude Independent Criteria (MOSAIC) application. MOSAIC combines model selection and joint modeling of multiple omics types to recover significant circadian and non-circadian trends. Using both synthetic data and proteomic data from Neurospora crassa, we showed that MOSAIC accurately recovers circadian rhythms at higher rates in not only the proteome but the transcriptome as well, outperforming existing methods for rhythm identification. In addition, by quantifying non-circadian trends in addition to circadian trends in data, our methodology allowed for the recognition of the diversity of circadian regulation as compared to non-circadian regulation. Availability and implementation MOSAIC’s full interface is available at https://github.com/delosh653/MOSAIC. An R package for this functionality, mosaic.find, can be downloaded at https://CRAN.R-project.org/package=mosaic.find. Supplementary information Supplementary data are available at Bioinformatics online.
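The abstract above does not spell out MOSAIC's models or selection criteria. As a generic illustration of choosing between a circadian and a non-circadian trend by model selection, the sketch below fits a fixed-period cosine model and a linear-trend model by least squares and keeps the one with the lower AIC. The function names, the 24-h default period, and the use of AIC are assumptions made for this example; they are not taken from MOSAIC or the mosaic.find package.

```python
import numpy as np


def fit_rss(design, y):
    """Least-squares fit of y on the design matrix; returns the residual sum of squares."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)


def aic(rss, n, k):
    """Gaussian AIC up to an additive constant, for n samples and k fitted coefficients."""
    return n * np.log(rss / n) + 2 * k


def select_trend(t, y, period=24.0):
    """Compare a cosine model (hypothetical 'circadian') with a straight line ('linear').

    Returns the name of the lower-AIC model and the AIC score of each candidate.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    n = y.size
    ones = np.ones_like(t)
    omega = 2.0 * np.pi / period
    designs = {
        # intercept + cos + sin is a cosine of free amplitude and phase, linear in its parameters
        "circadian": np.column_stack([ones, np.cos(omega * t), np.sin(omega * t)]),
        # intercept + slope
        "linear": np.column_stack([ones, t]),
    }
    scores = {name: aic(fit_rss(X, y), n, X.shape[1]) for name, X in designs.items()}
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(0, 48, 2.0)  # 48 h sampled every 2 h (illustrative design)
    y = 1.0 + 2.0 * np.cos(2 * np.pi * t / 24 - 1.0) + rng.normal(0, 0.1, t.size)
    print(select_trend(t, y)[0])
```

A joint multi-omics fit, as described in the abstract, would additionally share information between the transcript and protein time courses; that step is not attempted here.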

2020 ◽  
Author(s):  
Hannah De los Santos ◽  
Kristin P. Bennett ◽  
Jennifer M. Hurley

Abstract Motivation Circadian rhythms are approximately 24-hour endogenous cycles that control many biological functions. To identify these rhythms, biological samples are taken over circadian time and analyzed using a single omics type, such as transcriptomics or proteomics. By comparing data from these single omics approaches, it has been shown that transcriptional rhythms are not necessarily conserved at the protein level, implying extensive circadian post-transcriptional regulation. However, as proteomics methods are known to be noisier than transcriptomic methods, this suggests that previously identified arrhythmic proteins with rhythmic transcripts could have been missed due to noise and may not be due to post-transcriptional regulation. Results To determine if one can use information from less-noisy transcriptomic data to inform rhythms in more-noisy proteomic data, and thus more accurately identify rhythms in the proteome, we have created the MOSAIC (Multi-Omics Selection with Amplitude Independent Criteria) application. MOSAIC combines model selection and joint modeling of multiple omics types to recover significant circadian and non-circadian trends. Using both synthetic data and proteomic data from Neurospora crassa, we showed that MOSAIC accurately recovers circadian rhythms at higher rates in not only the proteome but the transcriptome as well, outperforming existing methods for rhythm identification. In addition, by quantifying non-circadian trends in addition to circadian trends in data, our methodology allowed for the recognition of the diversity of circadian regulation as compared to non-circadian regulation. Availability MOSAIC’s full interface is available at https://github.com/delosh653/[email protected] Supplementary information Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 36 (9) ◽  
pp. 2862-2871
Author(s):  
Chiung-Ting Wu ◽  
Yizhi Wang ◽  
Yinxue Wang ◽  
Timothy Ebbels ◽  
Ibrahim Karaman ◽  
...  

Abstract Motivation Liquid chromatography–mass spectrometry (LC-MS) is a standard method for proteomics and metabolomics analysis of biological samples. Unfortunately, it suffers from various changes in the retention times (RT) of the same compound in different samples, and these must be subsequently corrected (aligned) during data processing. Classic alignment methods such as in the popular XCMS package often assume a single time-warping function for each sample. Thus, the potentially varying RT drift for compounds with different masses in a sample is neglected in these methods. Moreover, the systematic change in RT drift across run order is often not considered by alignment algorithms. Therefore, these methods cannot effectively correct all misalignments. For a large-scale experiment involving many samples, the existence of misalignment becomes inevitable and concerning. Results Here, we describe an integrated reference-free profile alignment method, neighbor-wise compound-specific Graphical Time Warping (ncGTW), that can detect misaligned features and align profiles by leveraging expected RT drift structures and compound-specific warping functions. Specifically, ncGTW uses individualized warping functions for different compounds and assigns constraint edges on warping functions of neighboring samples. Validated with both realistic synthetic data and internal quality control samples, ncGTW applied to two large-scale metabolomics LC-MS datasets identifies many misaligned features and successfully realigns them. These features would otherwise be discarded or uncorrected using existing methods. The ncGTW software tool is developed currently as a plug-in to detect and realign misaligned features present in standard XCMS output. Availability and implementation An R package of ncGTW is freely available at Bioconductor and https://github.com/ChiungTingWu/ncGTW. A detailed user’s manual and a vignette are provided within the package. 
Supplementary information Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 36 (9) ◽  
pp. 2938-2940
Author(s):  
Olivia Angelin-Bonnet ◽  
Patrick J Biggs ◽  
Samantha Baldwin ◽  
Susan Thomson ◽  
Matthieu Vignes

Abstract Summary We present sismonr, an R package for an integral generation and simulation of in silico biological systems. The package generates gene regulatory networks, which include protein-coding and non-coding genes along with different transcriptional and post-transcriptional regulations. The effect of genetic mutations on the system behaviour is accounted for via the simulation of genetically different in silico individuals. The ploidy of the system is not restricted to the usual haploid or diploid situations but can be defined by the user to higher ploidies. A choice of stochastic simulation algorithms allows us to simulate the expression profiles of the genes in the in silico system. We illustrate the use of sismonr by simulating the anthocyanin biosynthesis regulation pathway for three genetically distinct in silico plants. Availability and implementation The sismonr package is implemented in R and Julia and is publicly available on the CRAN repository (https://CRAN.R-project.org/package=sismonr). A detailed tutorial is available from GitHub at https://oliviaab.github.io/sismonr/. Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Yating Liu ◽  
Joseph D Dougherty

Abstract Summary Whole genome sequencing of patient populations is identifying thousands of new variants in UnTranslated Regions (UTRs). While the consequences of UTR mutations are not as easily predicted from primary sequence as coding mutations are, there are some known features of UTRs that modulate their function. utr.annotation is an R package that can be used to annotate potentially deleterious variants in UTRs for both human and mouse species. Given a CSV or VCF format variant file, utr.annotation reports, for each variant, whether and how it alters known translational regulators, including upstream Open Reading Frames (uORFs), upstream Kozak sequences, polyA signals, Kozak sequences at the annotated translation start site, start codons, and stop codons; the conservation scores at the variant position; and whether and how it changes ribosome loading based on a model derived from empirical data. Availability utr.annotation is freely available on Bitbucket (https://bitbucket.org/jdlabteam/utr.annotation/src/master/) and CRAN (https://cran.r-project.org/web/packages/utr.annotation/index.html). Supplementary information Supplementary data are available at https://wustl.box.com/s/yye99bryfin89nav45gv91l5k35fxo7z.


Author(s):  
Wencan Zhu ◽  
Céline Lévy-Leduc ◽  
Nils Ternès

Abstract Motivation In genomic studies, identifying biomarkers associated with a variable of interest is a major concern in biomedical research. Regularized approaches are classically used to perform variable selection in high-dimensional linear models. However, these methods can fail in highly correlated settings. Results We propose a novel variable selection approach called WLasso, taking these correlations into account. It consists in rewriting the initial high-dimensional linear model to remove the correlation between the biomarkers (predictors) and in applying the generalized Lasso criterion. The performance of WLasso is assessed using synthetic data in several scenarios and compared with recent alternative approaches. The results show that when the biomarkers are highly correlated, WLasso outperforms the other approaches in sparse high-dimensional frameworks. The method is also illustrated on publicly available gene expression data in breast cancer. Availability and implementation Our method is implemented in the WLasso R package which is available from the Comprehensive R Archive Network (CRAN). Supplementary information Supplementary data are available at Bioinformatics online.
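The idea of rewriting a linear model so that its predictors are decorrelated can be sketched as follows. If the predictors have covariance Σ, then X̃ = XΣ^{-1/2} has identity covariance, and with β̃ = Σ^{1/2}β the fitted values are unchanged, since X̃β̃ = Xβ. The abstract does not say how WLasso estimates the correlation structure or solves the generalized Lasso problem; the sample covariance (which requires more samples than predictors), the centering step, and the function name `whiten_design` are assumptions made for this illustration, and the Lasso step itself is not shown.

```python
import numpy as np


def whiten_design(X):
    """Decorrelate the columns of X using the sample covariance (illustrative only).

    Returns the whitened, centered design X_tilde together with the matrices
    Sigma^{-1/2} and Sigma^{1/2} used to map coefficients between the two models.
    """
    Xc = X - X.mean(axis=0)                      # center each predictor
    sigma = np.cov(Xc, rowvar=False)             # p x p sample covariance
    eigval, eigvec = np.linalg.eigh(sigma)       # symmetric eigendecomposition
    sigma_inv_sqrt = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T
    sigma_sqrt = eigvec @ np.diag(eigval ** 0.5) @ eigvec.T
    return Xc @ sigma_inv_sqrt, sigma_inv_sqrt, sigma_sqrt


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    idx = np.arange(5)
    true_sigma = 0.9 ** np.abs(idx[:, None] - idx[None, :])  # strongly correlated predictors
    X = rng.standard_normal((200, 5)) @ np.linalg.cholesky(true_sigma).T
    X_tilde, s_inv_sqrt, s_sqrt = whiten_design(X)
    # The whitened predictors are uncorrelated with unit variance.
    print(np.round(np.cov(X_tilde, rowvar=False), 6))
```

Because a sparse β does not generally give a sparse β̃, the penalty has to be applied to Σ^{-1/2}β̃ rather than to β̃ itself, which is consistent with the abstract's mention of a generalized Lasso criterion instead of the ordinary Lasso.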


2019 ◽  
Author(s):  
Chiung-Ting Wu ◽  
David M. Herrington ◽  
Yizhi Wang ◽  
Timothy Ebbels ◽  
Ibrahim Karaman ◽  
...  

Abstract Motivation Liquid chromatography–mass spectrometry (LC-MS) is a standard method for proteomics and metabolomics analysis of biological samples. Unfortunately, it suffers from various changes in the retention times (RT) of the same compound in different samples, and these must be subsequently corrected (aligned) during data processing. Classic alignment methods such as in the popular XCMS package often assume a single time-warping function for each sample. Thus, the potentially varying RT drift for compounds with different masses in a sample is neglected in these methods. Moreover, the systematic change in RT drift across run order is often not considered by alignment algorithms. Therefore, these methods cannot completely correct misalignments. For a large-scale experiment involving many samples, the existence of misalignment becomes inevitable and concerning. Results Here we describe an integrated reference-free profile alignment method, neighbor-wise compound-specific Graphical Time Warping (ncGTW), that can detect misaligned features and align profiles by leveraging expected RT drift structures and compound-specific warping functions. Specifically, ncGTW uses individualized warping functions for different compounds and assigns constraint edges on warping functions of neighboring samples. Validated with both realistic synthetic data and internal quality control samples, ncGTW applied to two large-scale metabolomics LC-MS datasets identifies many misaligned features and successfully realigns them. These features would otherwise be discarded or uncorrected using existing methods. The ncGTW software tool is developed currently as a plug-in to the XCMS package. Availability and Implementation An R package of ncGTW is freely available at https://github.com/ChiungTingWu/ncGTW. A detailed user’s manual and a vignette are provided within the package. Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Emily J Collins ◽  
Mariana P Cervantes-Silva ◽  
George A Timmons ◽  
James R O’Siorain ◽  
Annie M Curtis ◽  
...  

SUMMARY Our core timekeeping mechanism, the circadian clock, regulates an astonishing amount of cellular physiology and behavior, playing a vital role in organismal fitness. While the mechanics of circadian control over cellular regulation can in part be explained by the transcriptional activation stemming from the positive arm of the clock’s transcription-translation negative feedback loop, research has shown that extensive circadian regulation occurs beyond transcriptional activation in fungal species and data suggest that this post-transcriptional regulation may also be preserved in mammals. To determine the extent to which circadian output is regulated post-transcriptionally in mammalian cells, we comprehensively profiled the transcriptome and proteome of murine bone marrow-derived macrophages in a high-resolution, sample-rich time course. We found that only 15% of the circadian proteome had corresponding oscillating mRNA and this regulation was cell intrinsic. Ontological analysis of oscillating proteins revealed robust temporal enrichment for protein degradation and translation, providing potential insights into the source of this extensive post-transcriptional regulation. We noted post-transcriptional temporal-gating across a number of connected metabolic pathways. This temporal metabolic regulation further corresponded with rhythms we observed in ATP production, mitochondrial morphology, and phagocytosis. With the strong interconnection between cellular metabolic states and macrophage phenotypes/responses, our work demonstrates that post-transcriptional circadian regulation in macrophages is broadly utilized as a tool to confer time-dependent immune function and responses. As macrophages coordinate many immunological and inflammatory functions, an understanding of this regulation provides a framework to determine the impact of circadian regulation on a wide array of disease pathologies.


Diabetes ◽  
2019 ◽  
Vol 68 (Supplement 1) ◽  
pp. 43-OR
Author(s):  
DINA MOSTAFA ◽  
AKINORI TAKAHASHI ◽  
TADASHI YAMAMOTO
