Bayesian functional mixed-effects models with grouped smoothness for analyzing time-course gene expression data

2020 ◽  
Vol 15 ◽  
Author(s):  
Shangyuan Ye ◽  
Ye Liang ◽  
Bo Zhang

Objective: As a result of the development of microarray technologies, gene expression levels of thousands of genes involved in a given biological process can be measured simultaneously, and it is important to study their temporal behavior to understand their mechanisms. Since the dependence between gene expression levels over time for a given gene is often too complicated to model parametrically, sparse functional data analysis has received an increasing amount of attention for analyzing such data. Methods: We propose a new functional mixed-effects model for analyzing time-course gene expression data. Specifically, the model groups individual functions with heterogeneous smoothness. The proposed method utilizes the mixed-effects model representation of penalized splines for both the mean function and the individual functions. Given noninformative or weakly informative priors, Bayesian inference on the proposed models was developed, and Bayesian computation was implemented by using Markov chain Monte Carlo methods. Results: The performance of our new model was studied by two simulation studies and illustrated using a yeast cell cycle gene expression dataset. Simulation results suggest that our proposed methods can outperform the previously used methods in terms of the mean integrated squared error. The yeast gene expression data application suggests that the proposed model with two latent groups should be used on this dataset.
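The abstract compares methods by mean integrated squared error (MISE) but gives no formula. A minimal sketch of one common way to approximate it, assuming each estimated curve is evaluated on a shared time grid and the integral is taken with the trapezoidal rule, is below. The function names and the toy values are hypothetical and are not taken from the paper.

```python
def integrated_squared_error(t, f_true, f_hat):
    """Trapezoidal approximation of the integral of (f_true - f_hat)^2 over t."""
    sq = [(a - b) ** 2 for a, b in zip(f_true, f_hat)]
    return sum((t[i + 1] - t[i]) * (sq[i] + sq[i + 1]) / 2
               for i in range(len(t) - 1))


def mise(t, f_true, estimates):
    """Average the integrated squared error over simulation replicates."""
    return sum(integrated_squared_error(t, f_true, e) for e in estimates) / len(estimates)


# Toy example (hypothetical values): the true curve is zero on [0, 1]
# and two replicates estimate it as the constants 1 and 2.
t = [0.0, 0.5, 1.0]
f_true = [0.0, 0.0, 0.0]
estimates = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]
result = mise(t, f_true, estimates)
print(result)  # 2.5 (the per-replicate errors are 1.0 and 4.0)
```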

2015 ◽  
Author(s):  
Andrew Anand Brown ◽  
Zhihao Ding ◽  
Ana Viñuela ◽  
Dan Glass ◽  
Leopold Parts ◽  
...  

Statistical factor analysis methods have previously been used to remove noise components from high dimensional data prior to genetic association mapping, and in a guided fashion to summarise biologically relevant sources of variation. Here we show how the derived factors summarising pathway expression can be used to analyse the relationships between expression, heritability and ageing. We used skin gene expression data from 647 twins from the MuTHER Consortium and applied factor analysis to concisely summarise patterns of gene expression, both to remove broad confounding influences and to produce concise pathway-level phenotypes. We derived 930 "pathway phenotypes" which summarised patterns of variation across 186 KEGG pathways (five phenotypes per pathway). We identified 69 significant associations of age with phenotype from 57 distinct KEGG pathways at a stringent Bonferroni threshold (P<5.38E-5). These phenotypes are more heritable (h^2=0.32) than gene expression levels. On average, expression levels of 16% of genes within these pathways are associated with age. Several significant pathways relate to metabolising sugars and fatty acids, others with insulin signalling. We have demonstrated that factor analysis methods combined with biological knowledge can produce more reliable phenotypes with less stochastic noise than the individual gene expression levels, which increases our power to discover biologically relevant associations. These phenotypes could also be applied to discover associations with other environmental factors.
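The counts quoted above can be checked with a few lines of arithmetic. The sketch below assumes a family-wise error rate of 0.05, which the abstract does not state; under that assumption, dividing by the number of pathway phenotypes reproduces the reported threshold.

```python
# Figures taken from the abstract
n_pathways = 186
phenotypes_per_pathway = 5

# Assumption: the conventional family-wise alpha of 0.05 (not stated in the text)
alpha = 0.05

n_tests = n_pathways * phenotypes_per_pathway  # number of pathway phenotypes
threshold = alpha / n_tests                    # Bonferroni-corrected threshold

print(n_tests)               # 930
print(f"{threshold:.2E}")    # 5.38E-05
```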


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Cheng Qian ◽  
Amin Emad ◽  
Nicholas D. Sidiropoulos

The biological processes involved in a drug’s mechanisms of action are oftentimes dynamic, complex and difficult to discern. Time-course gene expression data is a rich source of information that can be used to unravel these complex processes, identify biomarkers of drug sensitivity and predict the response to a drug. However, the majority of previous work has not fully utilized this temporal dimension. In these studies, the gene expression data is either considered at one time-point (before the administration of the drug) or two time-points (before and after the administration of the drug). This is clearly inadequate in modeling dynamic gene–drug interactions, especially for applications such as long-term drug therapy. In this work, we present a novel REcursive Prediction (REP) framework for drug response prediction by taking advantage of time-course gene expression data. Our goal is to predict drug response values at every stage of a long-term treatment, given the expression levels of genes collected in the previous time-points. To this end, REP employs a built-in recursive structure that exploits the intrinsic time-course nature of the data and integrates past values of drug responses for subsequent predictions. It also incorporates tensor completion that can not only alleviate the impact of noise and missing data, but also predict unseen gene expression levels (GEXs). These advantages enable REP to estimate drug response at any stage of a given treatment from some GEXs measured in the beginning of the treatment. Extensive experiments on two datasets corresponding to multiple sclerosis patients treated with interferon are included to showcase the effectiveness of REP.


2019 ◽  
Vol 17 (04) ◽  
pp. 1950015 ◽  
Author(s):  
Shuhei Kimura ◽  
Masato Tokuhisa ◽  
Mariko Okada

In using gene expression levels for genetic network inference, we believe that two measurements that are similar to each other are less informative than two measurements that differ from each other. Given, for example, that gene expression levels measured at two adjacent time points in a time-series experiment are often similar to each other, we assume that each measurement in the time-series experiment will be less informative than each measurement in a steady-state experiment. Based on this idea, we propose a new inference method that relies heavily on informative gene expression data. Through numerical experiments, we prove that the quality of an inferred genetic network is slightly improved by heavily weighting informative gene expression data. In this study, we develop a new method by modifying the existing random-forest-based inference method to take advantage of its ability to analyze both time-series and static gene expression data. The idea we propose can be similarly applied to many of the other existing inference methods, as well.


2007 ◽  
Vol 8 (1) ◽  
Author(s):  
Miika Ahdesmäki ◽  
Harri Lähdesmäki ◽  
Andrew Gracey ◽  
Ilya Shmulevich ◽  
Olli Yli-Harja

Cancers ◽  
2019 ◽  
Vol 11 (7) ◽  
pp. 983 ◽  
Author(s):  
Otília Menyhart ◽  
Tatsuhiko Kakisaka ◽  
Lőrinc Sándor Pongor ◽  
Hiroyuki Uetake ◽  
Ajay Goel ◽  
...  

Background: Numerous driver mutations have been identified in colorectal cancer (CRC), but their relevance to the development of targeted therapies remains elusive. The secondary effects of pathogenic driver mutations on downstream signaling pathways offer a potential approach for the identification of therapeutic targets. We aimed to identify differentially expressed genes as potential drug targets linked to driver mutations. Methods: Somatic mutations and the gene expression data of 582 CRC patients were utilized, incorporating the mutational status of 39,916 and the expression levels of 20,500 genes. To uncover candidate targets, the expression levels of various genes in wild-type and mutant cases for the most frequent disruptive mutations were compared with a Mann–Whitney test. A survival analysis was performed in 2100 patients with transcriptomic gene expression data. Up-regulated genes associated with worse survival were filtered for potentially actionable targets. The most significant hits were validated in an independent set of 171 CRC patients. Results: Altogether, 426 disruptive mutation-associated upregulated genes were identified. Among these, 95 were linked to worse recurrence-free survival (RFS). Based on the druggability filter, 37 potentially actionable targets were revealed. We selected seven genes and validated their expression in 171 patient specimens. The best independently validated combinations were DUSP4 (p = 2.6 × 10^−12) in ACVR2A mutated (7.7%) patients; BMP4 (p = 1.6 × 10^−04) in SOX9 mutated (8.1%) patients; TRIB2 (p = 1.35 × 10^−14) in ACVR2A mutated patients; VSIG4 (p = 2.6 × 10^−05) in ANK3 mutated (7.6%) patients, and DUSP4 (p = 7.1 × 10^−04) in AMER1 mutated (8.2%) patients. Conclusions: The results uncovered potentially druggable genes in colorectal cancer. The identified mutations could enable future patient stratification for targeted therapy.
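The wild-type versus mutant comparison described in the Methods can be illustrated with a small Mann–Whitney sketch. This is not the authors' pipeline: it is a pure-Python version that uses the normal approximation without continuity or tie correction, and the expression values are invented for illustration only.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Return (U, two-sided p) for samples x and y.

    U counts the pairs where a value in x exceeds a value in y (ties count 0.5).
    The p-value uses the normal approximation with no continuity or tie
    correction, so it is only a rough guide for small or heavily tied samples.
    """
    u = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u, p


# Hypothetical expression values for one gene (not data from the study)
mutant = [5.1, 6.3, 7.2, 8.0]
wild_type = [1.2, 2.4, 3.3, 4.1]

u_stat, p_value = mann_whitney_u(mutant, wild_type)
print(u_stat)   # 16.0: every mutant value exceeds every wild-type value
print(p_value)
```

In practice a library routine with exact or tie-corrected p-values would be preferable for samples this small; the sketch only shows how the statistic is formed.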

