Methods detecting rhythmic gene expression are biologically relevant only for strong signal

2020 ◽  
Vol 16 (3) ◽  
pp. e1007666 ◽  
Author(s):  
David Laloum ◽  
Marc Robinson-Rechavi
2019 ◽  
Author(s):  
David Laloum ◽  
Marc Robinson-Rechavi

Abstract The nycthemeral transcriptome comprises all genes whose mRNA levels vary rhythmically with a period of 24 hours, including but not restricted to circadian genes. In this study, we show that nycthemeral rhythmicity at the gene expression level is biologically functional and that this functionality is more conserved between orthologous genes than between random genes. We used this conservation of rhythmic expression to assess the ability of seven methods (ARSER, Lomb-Scargle, RAIN, JTK, empirical-JTK, GeneCycle, and meta2d) to detect rhythmic signal in gene expression. We contrasted them with a naive method that is not based on rhythmic parameters. By taking into account the tissue-specificity of rhythmic gene expression and different species comparisons, we show that no method is strongly favored. The results show that these methods designed for rhythm detection, in addition to having quite similar performances, are consistent only among genes with a strong rhythm signal. For instance, rhythmic genes defined with a standard p-value threshold of 0.01 could include genes whose rhythmicity is biologically irrelevant. Although these results were dependent on the datasets used and the evolutionary distance between the species compared, we call for caution about the results of studies reporting or using large sets of rhythmic genes. Furthermore, given the analysis of the behaviors of the methods on real and randomized data, we recommend using primarily ARSER, empirical-JTK, or GeneCycle, which meet the expectations of a classical distribution of p-values. Experimental design should also take into account the circumstances under which the methods seem more efficient, such as giving priority to biological replicates over the number of time-points, or to the number of time-points over the quality of the technique (microarray vs RNAseq). GeneCycle, and to a lesser extent empirical-JTK, might be the most robust methods when applied to weakly informative datasets.
Finally, our analyses suggest that rhythmic genes are mainly highly expressed genes.

Author Summary
To be active, genes have to be transcribed to RNA. For some genes, the transcription rate follows a circadian rhythm with a periodicity of approximately 24 hours; we call these genes “rhythmic”. In this study, we compared methods designed to detect rhythmic genes in gene expression data. The data are measures of the number of RNA molecules for each gene, given at several time-points, usually spaced 2 to 4 hours apart, over one or several periods of 24 hours. There are many such methods, but it is not known which ones work best to detect genes whose rhythmic expression is biologically functional. We compared these methods using a reference group of evolutionarily conserved rhythmic genes. We compared data from baboon, mouse, rat, zebrafish, fly, and mosquitoes. Surprisingly, no method was particularly effective. Furthermore, we found that only very strong rhythmic signals were relevant with each method. More precisely, when we use a usual cut-off to define rhythmic genes, the group of genes considered as rhythmic contains many genes whose rhythmicity cannot be confirmed to be biologically relevant. We also show that rhythmic genes are mainly highly expressed genes. Finally, based on our results, we provide recommendations on which methods to use and how, and suggestions for future experimental designs.
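To make the kind of data described above concrete, here is a minimal sketch of a cosinor-style fit for a single gene sampled every few hours. It is not one of the seven methods compared in the study; it is a plain least-squares fit of a 24-hour cosine, and it assumes evenly spaced time-points covering a whole number of periods (under that assumption the cosine and sine regressors are orthogonal, so the coefficients have a closed form). The function name `cosinor_fit` and the returned keys are hypothetical.

```python
import math

def cosinor_fit(times_h, values, period_h=24.0):
    """Fit y = mesor + a*cos(w*t) + b*sin(w*t) by least squares.

    Assumes evenly spaced samples spanning whole periods, so the
    closed-form projections below are the exact least-squares solution.
    """
    n = len(values)
    w = 2.0 * math.pi / period_h
    mesor = sum(values) / n
    a = 2.0 / n * sum(v * math.cos(w * t) for t, v in zip(times_h, values))
    b = 2.0 / n * sum(v * math.sin(w * t) for t, v in zip(times_h, values))
    amplitude = math.hypot(a, b)
    # Time of the fitted peak, folded into [0, period_h).
    peak_time_h = (math.atan2(b, a) / w) % period_h
    fitted = [mesor + a * math.cos(w * t) + b * math.sin(w * t) for t in times_h]
    ss_res = sum((v - f) ** 2 for v, f in zip(values, fitted))
    ss_tot = sum((v - mesor) ** 2 for v in values)
    # Fraction of variance explained by the 24-hour component.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return {"mesor": mesor, "amplitude": amplitude,
            "peak_time_h": peak_time_h, "r2": r2}

# Illustrative (made-up) profile: 4-hour sampling over two days, peak at hour 8.
times = list(range(0, 48, 4))
profile = [10 + 3 * math.cos(2 * math.pi / 24 * (t - 8)) for t in times]
print(cosinor_fit(times, profile))
```

A fit like this only describes the strength of a 24-hour component; it does not by itself say whether the rhythm is biologically relevant, which is the question the study addresses.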


Author(s):  
Soumya Raychaudhuri

The most interesting and challenging gene expression data sets to analyze are large multidimensional data sets that contain expression values for many genes across multiple conditions. In these data sets the use of scientific text can be particularly useful, since there are a myriad of genes examined under vastly different conditions, each of which may induce or repress expression of the same gene for different reasons. There is an enormous complexity to the data that we are examining: each gene is associated with dozens if not hundreds of expression values as well as multiple documents built up from vocabularies consisting of thousands of words. In Section 2.4 we reviewed common gene expression strategies, most of which revolve around defining groups of genes based on common profiles. A limitation of many gene expression analytic approaches is that they do not incorporate comprehensive background knowledge about the genes into the analysis. We present computational methods that leverage the peer-reviewed literature in the automatic analysis of gene expression data sets. Including the literature in gene expression data analysis offers an opportunity to incorporate background functional information about the genes when defining expression clusters. In Chapter 5 we saw how literature-based approaches could help in the analysis of single condition experiments. Here we will apply the strategies introduced in Chapter 6 to assess the coherence of groups of genes to enhance gene expression analysis approaches. The methods proposed here could, in fact, be applied to any multivariate genomics data type. The key concepts discussed in this chapter are listed in the frame box. We begin with a discussion of gene groups and their role in expression analysis; we briefly discuss strategies to assign keywords to groups and strategies to assess their functional coherence.
We apply functional coherence measures to gene expression analysis; as an example, we focus on a yeast expression data set. We first demonstrate how functional coherence can be used to focus on the key biologically relevant gene groups derived by clustering methods such as self-organizing maps and k-means clustering.


2019 ◽  
Vol 51 (10) ◽  
pp. 981-988 ◽  
Author(s):  
Xiaolan Rao ◽  
Richard A Dixon

Abstract Co-expression network analysis is one of the most powerful approaches for interpretation of large transcriptomic datasets. It enables characterization of modules of co-expressed genes that may share biological functional linkages. Such networks provide an initial way to explore functional associations from gene expression profiling and can be applied to various aspects of plant biology. This review presents the applications of co-expression network analysis in plant biology and addresses optimized strategies from the recent literature for performing co-expression analysis on plant biological systems. Additionally, we describe the combined interpretation of co-expression analysis with other genomic data to enhance the generation of biologically relevant information.
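As an illustration of the general idea of co-expression modules, the sketch below links genes whose expression profiles have a Pearson correlation at or above a threshold and reports the connected components of the resulting graph as modules. This is a simplified assumption for illustration, not the procedure recommended in the review; the function names, the threshold of 0.9, and the toy profiles are all hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def coexpression_modules(expr, threshold=0.9):
    """Group genes into modules: connected components of the graph whose
    edges join gene pairs with correlation >= threshold."""
    neighbors = {gene: set() for gene in expr}
    for g1, g2 in combinations(expr, 2):
        if pearson(expr[g1], expr[g2]) >= threshold:
            neighbors[g1].add(g2)
            neighbors[g2].add(g1)
    modules, seen = [], set()
    for gene in expr:
        if gene in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack, module = [gene], []
        seen.add(gene)
        while stack:
            current = stack.pop()
            module.append(current)
            for nb in neighbors[current]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        modules.append(sorted(module))
    return modules

# Toy expression profiles (made-up values) across four samples.
toy = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8.1],
    "g3": [4, 3, 2, 1],
    "g4": [1, 5, 1, 5],
}
print(coexpression_modules(toy))
```

Using the signed correlation keeps anti-correlated genes (such as `g1` and `g3` here) in separate modules; taking the absolute value instead would be a different design choice.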


Cell Reports ◽  
2019 ◽  
Vol 27 (3) ◽  
pp. 649-657.e5 ◽  
Author(s):  
Ben J. Greenwell ◽  
Alexandra J. Trott ◽  
Joshua R. Beytebiere ◽  
Shanny Pao ◽  
Alexander Bosley ◽  
...  

2019 ◽  
Vol 2019 ◽  
pp. 1-6 ◽  
Author(s):  
Suyan Tian ◽  
Lei Zhang

Multiple sclerosis (MS) is a common neurological disability of the central nervous system. Immune-modulatory therapy with interferon-β (IFN-β) has been used as a first-line treatment to prevent relapses in MS patients. While the therapeutic mechanism of IFN-β has not been fully elucidated, the data of microarray experiments that collected longitudinal gene expression profiles to evaluate the long-term response of IFN-β treatment have been analyzed using statistical methods that were incapable of dealing with such data. In this study, the GeneRank method was applied to generate weighted gene expression values and the monotonically expressed genes (MEGs) for both IFN-β treatment responders and nonresponders were identified. The proposed procedure identified 13 MEGs for the responders and 2 MEGs for the nonresponders, most of which are biologically relevant to MS. Our work here provides some useful insight into the mechanism of IFN-β treatment for MS patients. A full understanding of the therapeutic mechanism will make a more personalized treatment strategy possible.
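For readers unfamiliar with the term, a monotonically expressed gene is one whose expression only rises or only falls across successive time points. The sketch below shows one simple way to check this for a single longitudinal profile. It is an illustrative assumption, not the authors' procedure, and it does not implement GeneRank; the function name and the tolerance parameter `tol` are hypothetical.

```python
def monotonic_direction(values, tol=0.0):
    """Return "increasing" or "decreasing" if every successive change
    exceeds tol in the same direction, otherwise None."""
    diffs = [later - earlier for earlier, later in zip(values, values[1:])]
    if not diffs:
        return None  # fewer than two time points: no trend to assess
    if all(d > tol for d in diffs):
        return "increasing"
    if all(d < -tol for d in diffs):
        return "decreasing"
    return None

# Made-up longitudinal profiles for three genes.
print(monotonic_direction([1.0, 2.0, 3.5, 4.0]))
print(monotonic_direction([5.0, 4.0, 2.0, 1.0]))
print(monotonic_direction([9.0, 7.0, 7.5, 3.0]))
```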


Blood ◽  
2005 ◽  
Vol 106 (11) ◽  
pp. 3524-3524
Author(s):  
Anil Potti ◽  
Holly K. Dressman ◽  
Murat O. Arcasoy

Abstract Hematopoietic proliferation, lineage commitment, and terminal differentiation are characterized by the emergence of cell type-specific gene expression and transcriptional programs that determine the specific phenotype and function of cells in the erythroid lineage. Our objectives in this study were to identify unique gene expression patterns that characterize the transcriptional program of normal primary human erythroid precursors during terminal differentiation, and define the gene expression patterns seen in erythroblasts (EBL) of patients with polycythemia vera (PV). Homogeneous populations of primary proEBL were generated from purified liquid cultures of CD34+ cells collected from healthy volunteers and PV patients. All patients with PV were diagnosed based on established criteria and had the JAK2-V617F mutation. Morphologic examination and surface expression of CD71 confirmed the purity of proEBL cell populations. ProEBL from normal individuals were induced to terminally differentiate, generating orthochromatic EBL. RNA was extracted from normal proEBL, PV proEBL, and normal orthochromatic EBL. Affymetrix U133 Plus 2.0 arrays representing approximately 39,000 human genes were used for gene expression analysis. Four replicates from four independent primary cell cultures were analyzed for each comparison group (e.g., undifferentiated proEBL versus terminally differentiated orthochromatic EBL). Unsupervised hierarchical clustering showed distinct gene expression profiles in the proEBL and terminally differentiated EBL lineages. A total of 1109 genes (2.0-fold change, P<0.01) were found to be differentially expressed. Numerous erythroid genes were found to be upregulated during terminal differentiation [e.g., globin genes, erythropoietin receptor, heme synthesis enzymes (ferrochelatase, ALAS2), erythrocyte membrane proteins (band 3, ankyrin, protein 4.1) and transcription factors (NFE2, Kruppel-like factors, myb, GATA2)].
As validation, the differential expression of 7 genes was verified by Northern blotting. To better understand the biologic role of the gene sets identified, individual genes were integrated into specific regulatory and signaling pathway networks using Ingenuity pathway analysis. A total of 19 networks with significant scores (>23) were identified. Biological functions of the identified networks included RNA post-transcriptional regulation, cell cycle control, translational regulation, DNA replication and repair, and cellular assembly/organization. In a proof-of-principle study, gene expression patterns in PV proEBL (n=6) were compared to normal proEBL (n=5). Unsupervised hierarchical clustering showed a distinct gene expression profile for PV. A binary regression predictive model was also developed to find gene expression patterns predictive for PV. Using this model, a 150-gene predictor was found that could distinguish PV patients from controls with 100% accuracy. Ingenuity pathway analysis of a subset of the gene sets demonstrated several biologically relevant networks that were distinct in patients with PV, including myc, CDC2, and JAK2. Deregulation of normal transcriptional mechanisms in hematopoietic cells is associated with the pathogenesis of PV. Further, our data show that genomic studies provide new insights into transcriptional programs that govern erythroid differentiation, and identify biologically relevant deregulated pathways as potential targets for therapy in PV.


2005 ◽  
Vol 17 (2) ◽  
pp. 223-229 ◽  
Author(s):  
Ueli Schibler ◽  
Felix Naef
