Unbiased Boolean analysis of public gene expression data for cell cycle gene identification

2019 ◽  
Vol 30 (14) ◽  
pp. 1770-1779 ◽  
Author(s):  
Sarah A. Dabydeen ◽  
Arshad Desai ◽  
Debashis Sahoo

Cell proliferation is essential for the development and maintenance of all organisms and is dysregulated in cancer. Using synchronized cells progressing through the cell cycle, pioneering microarray studies defined cell cycle genes based on cyclic variation in their expression. However, the concordance of the small number of synchronized cell studies has been limited, leading to discrepancies in definition of the transcriptionally regulated set of cell cycle genes within and between species. Here we present an informatics approach based on Boolean logic to identify cell cycle genes. This approach used the vast array of publicly available gene expression data sets to query similarity to CCNB1, which encodes the cyclin subunit of the Cdk1-cyclin B complex that triggers the G2-to-M transition. In addition to highlighting conservation of cell cycle genes across large evolutionary distances, this approach identified contexts where well-studied genes known to act during the cell cycle are expressed and potentially acting in nondivision contexts. An accessible web platform enables a detailed exploration of the cell cycle gene lists generated using the Boolean logic approach. The methods employed are straightforward to extend to processes other than the cell cycle.
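The idea of querying expression similarity to CCNB1 with Boolean logic can be illustrated roughly as follows. This is a minimal sketch, not the authors' pipeline: the fixed threshold, the 5% violation tolerance, the function names, and the toy expression values are all assumptions made for illustration. Each gene is reduced to high/low per sample, and "A high implies B high" is accepted when the contradicting quadrant (A high, B low) is nearly empty.

```python
from typing import Dict, List, Sequence, Tuple


def binarize(values: Sequence[float], threshold: float) -> List[bool]:
    """Map each expression value to True (high) or False (low)."""
    return [v > threshold for v in values]


def quadrant_counts(a: Sequence[bool], b: Sequence[bool]) -> Dict[Tuple[bool, bool], int]:
    """Count samples in each of the four (a, b) high/low quadrants."""
    counts = {(x, y): 0 for x in (False, True) for y in (False, True)}
    for x, y in zip(a, b):
        counts[(x, y)] += 1
    return counts


def high_implies_high(a: Sequence[bool], b: Sequence[bool],
                      max_violation_fraction: float = 0.05) -> bool:
    """Accept 'a high => b high' if few samples have a high and b low.

    The 5% tolerance is an arbitrary illustrative choice.
    """
    counts = quadrant_counts(a, b)
    total = sum(counts.values())
    if total == 0:
        return False
    return counts[(True, False)] / total <= max_violation_fraction


# Toy values (hypothetical, one entry per sample); threshold 5.0 is assumed.
ccnb1 = binarize([1, 2, 8, 9, 10, 1], threshold=5.0)
gene_x = binarize([1, 9, 9, 8, 10, 2], threshold=5.0)

print(high_implies_high(ccnb1, gene_x))  # CCNB1 high => gene_x high
print(high_implies_high(gene_x, ccnb1))  # gene_x high => CCNB1 high
```

In this toy data the first implication holds and the second does not, because one sample has gene_x high while CCNB1 is low. A gene whose implications with CCNB1 hold in both directions would be a candidate for co-regulation under this simplified test.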

2000 ◽  
Vol 3 (1) ◽  
pp. 9-15 ◽  
Author(s):  
PETER J. WOOLF ◽  
YIXIN WANG

Woolf, Peter J., and Yixin Wang. A fuzzy logic approach to analyzing gene expression data. Physiol Genomics 3: 9–15, 2000.—We have developed a novel algorithm for analyzing gene expression data. This algorithm uses fuzzy logic to transform expression values into qualitative descriptors that can be evaluated by using a set of heuristic rules. In our tests we designed a model to find triplets of activators, repressors, and targets in a yeast gene expression data set. For the conditions tested, the predictions made by the algorithm agree well with experimental data in the literature. The algorithm can also assist in determining the function of uncharacterized proteins and is able to detect a substantially larger number of transcription factors than could be found at random. This technology extends current techniques such as clustering in that it allows the user to generate a connected network of genes using only expression data.
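The transformation of expression values into qualitative descriptors, followed by rule evaluation, might look like the sketch below. The abstract does not specify the membership functions or the rule set, so everything here is assumed for illustration: triangular low/medium/high memberships over expression normalized to [0, 1], `min` as the fuzzy AND, and a single rule ("if the activator is high and the repressor is low, the target is high"). The names `fuzzify`, `rule_activation`, and `triplet_error` are hypothetical.

```python
from typing import Dict, Sequence


def fuzzify(x: float) -> Dict[str, float]:
    """Convert a normalized expression value in [0, 1] to fuzzy memberships."""
    x = min(max(x, 0.0), 1.0)           # clamp to the normalized range
    low = max(0.0, 1.0 - 2.0 * x)       # 1 at x=0, falls to 0 at x=0.5
    high = max(0.0, 2.0 * x - 1.0)      # 0 until x=0.5, rises to 1 at x=1
    medium = 1.0 - low - high           # peaks at x=0.5
    return {"low": low, "medium": medium, "high": high}


def rule_activation(activator: float, repressor: float) -> float:
    """Degree to which 'activator is high AND repressor is low' holds."""
    return min(fuzzify(activator)["high"], fuzzify(repressor)["low"])


def triplet_error(activators: Sequence[float],
                  repressors: Sequence[float],
                  targets: Sequence[float]) -> float:
    """Mean gap between the rule's prediction and the observed 'target is high'.

    A small error suggests the triplet is consistent with the assumed rule.
    """
    gaps = [abs(rule_activation(a, r) - fuzzify(t)["high"])
            for a, r, t in zip(activators, repressors, targets)]
    return sum(gaps) / len(gaps)


# Toy series (hypothetical): the target tracks 'activator high, repressor low'.
print(triplet_error([1.0, 0.0], [0.0, 1.0], [1.0, 0.0]))
```

Scoring every candidate (activator, repressor, target) combination this way and keeping the lowest-error triplets is one plausible reading of how such rules could yield a connected network from expression data alone.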


Forests ◽  
2022 ◽  
Vol 13 (1) ◽  
pp. 120
Author(s):  
Yijie Li ◽  
Song Chen ◽  
Yuhang Liu ◽  
Haijiao Huang

Research Highlights: This study identified the cell cycle genes in birch that likely play important roles during the plant’s growth and development. This analysis provides a basis for understanding the regulatory mechanism of various cell cycles in Betula pendula Roth. Background and Objectives: Cell cycle factors not only jointly influence cell cycle progression but also regulate the accretion, division, and differentiation of cells, and thereby the growth and development of the plant. In this study, we identified the putative cell cycle genes in the B. pendula genome, based on the annotated cell cycle genes in Arabidopsis thaliana (L.) Heynh. This gene set can serve as a basis for further functional research. Materials and Methods: RNA-seq technology was used to determine the transcription abundance of all cell cycle genes in xylem, roots, leaves, and floral tissues. Results: We identified 59 cell cycle gene models in the genome of B. pendula, with 17 highly expressed genes among them. These genes were BpCDKA.1, BpCDKB1.1, BpCDKB2.1, BpCKS1.2, BpCYCB1.1, BpCYCB1.2, BpCYCB2.1, BpCYCD3.1, BpCYCD3.5, BpDEL1, BpDpa2, BpE2Fa, BpE2Fb, BpKRP1, BpKRP2, BpRb1, and BpWEE1. Conclusions: By combining phylogenetic analysis and tissue-specific expression data, we identified 17 core cell cycle genes in the Betula pendula genome.


2017 ◽  
Author(s):  
Anthony Szedlak ◽  
Spencer Sims ◽  
Nicholas Smith ◽  
Giovanni Paternostro ◽  
Carlo Piermarocchi

Abstract: Modern time series gene expression and other omics data sets have enabled unprecedented resolution of the dynamics of cellular processes such as cell cycle and response to pharmaceutical compounds. In anticipation of the proliferation of time series data sets in the near future, we use the Hopfield model, a recurrent neural network based on spin glasses, to model the dynamics of cell cycle in HeLa (human cervical cancer) and S. cerevisiae cells. We study some of the rich dynamical properties of these cyclic Hopfield systems, including the ability of populations of simulated cells to recreate experimental expression data and the effects of noise on the dynamics. Next, we use a genetic algorithm to identify sets of genes which, when selectively inhibited by local external fields representing gene silencing compounds such as kinase inhibitors, disrupt the encoded cell cycle. We find, for example, that inhibiting the set of four kinases BRD4, MAPK1, NEK7, and YES1 in HeLa cells causes simulated cells to accumulate in the M phase. Finally, we suggest possible improvements and extensions to our model.

Author Summary: Cell cycle – the process in which a parent cell replicates its DNA and divides into two daughter cells – is an upregulated process in many forms of cancer. Identifying gene inhibition targets to regulate cell cycle is important to the development of effective therapies. Although modern high throughput techniques offer unprecedented resolution of the molecular details of biological processes like cell cycle, analyzing the vast quantities of the resulting experimental data and extracting actionable information remains a formidable task. Here, we create a dynamical model of the process of cell cycle using the Hopfield model (a type of recurrent neural network) and gene expression data from human cervical cancer cells and yeast cells. We find that the model recreates the oscillations observed in experimental data. Tuning the level of noise (representing the inherent randomness in gene expression and regulation) to the “edge of chaos” is crucial for the proper behavior of the system. We then use this model to identify potential gene targets for disrupting the process of cell cycle. This method could be applied to other time series data sets and used to predict the effects of untested targeted perturbations.
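A cyclic Hopfield system of the kind described can be sketched in a few lines. The example below is a simplified, noise-free illustration and not the authors' model: it assumes the standard asymmetric Hebbian rule for storing a sequence (each stored pattern is coupled to its successor), synchronous sign updates, and three toy ±1 patterns standing in for binarized expression states at successive cell cycle phases. Noise and the external inhibition fields mentioned in the abstract are omitted.

```python
from typing import List

Pattern = List[int]


def sign(x: float) -> int:
    """Return +1 for non-negative input and -1 otherwise."""
    return 1 if x >= 0 else -1


def build_cyclic_couplings(patterns: List[Pattern]) -> List[List[float]]:
    """Asymmetric Hebbian couplings that map each pattern to the next one.

    J[i][j] = (1/N) * sum over mu of next_pattern[i] * current_pattern[j],
    with the last pattern wrapping around to the first.
    """
    n = len(patterns[0])
    p = len(patterns)
    couplings = [[0.0] * n for _ in range(n)]
    for mu in range(p):
        current = patterns[mu]
        following = patterns[(mu + 1) % p]
        for i in range(n):
            for j in range(n):
                couplings[i][j] += following[i] * current[j] / n
    return couplings


def step(couplings: List[List[float]], state: Pattern) -> Pattern:
    """One synchronous, noise-free update of every unit."""
    return [sign(sum(w * s for w, s in zip(row, state))) for row in couplings]


def run(couplings: List[List[float]], state: Pattern, n_steps: int) -> List[Pattern]:
    """Return the trajectory of states, including the initial one."""
    trajectory = [list(state)]
    for _ in range(n_steps):
        state = step(couplings, state)
        trajectory.append(state)
    return trajectory


# Toy, mutually orthogonal patterns (hypothetical stand-ins for phase states).
PATTERNS = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1]]
COUPLINGS = build_cyclic_couplings(PATTERNS)

for state in run(COUPLINGS, PATTERNS[0], 3):
    print(state)
```

Because the toy patterns are mutually orthogonal, each update moves the state exactly to the next stored pattern, so the trajectory cycles back to its starting point after three steps. Real expression-derived patterns are not orthogonal, which is one reason noise levels matter for the dynamics.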


Author(s):  
Soumya Raychaudhuri

The most interesting and challenging gene expression data sets to analyze are large multidimensional data sets that contain expression values for many genes across multiple conditions. In these data sets the use of scientific text can be particularly useful, since there are a myriad of genes examined under vastly different conditions, each of which may induce or repress expression of the same gene for different reasons. There is an enormous complexity to the data that we are examining: each gene is associated with dozens if not hundreds of expression values as well as multiple documents built up from vocabularies consisting of thousands of words. In Section 2.4 we reviewed common gene expression strategies, most of which revolve around defining groups of genes based on common profiles. A limitation of many gene expression analytic approaches is that they do not incorporate comprehensive background knowledge about the genes into the analysis. We present computational methods that leverage the peer-reviewed literature in the automatic analysis of gene expression data sets. Including the literature in gene expression data analysis offers an opportunity to incorporate background functional information about the genes when defining expression clusters. In Chapter 5 we saw how literature-based approaches could help in the analysis of single condition experiments. Here we will apply the strategies introduced in Chapter 6 to assess the coherence of groups of genes to enhance gene expression analysis approaches. The methods proposed here could, in fact, be applied to any multivariate genomics data type. The key concepts discussed in this chapter are listed in the frame box. We begin with a discussion of gene groups and their role in expression analysis; we briefly discuss strategies to assign keywords to groups and strategies to assess their functional coherence. We apply functional coherence measures to gene expression analysis; for examples we focus on a yeast expression data set. We first demonstrate how functional coherence can be used to focus in on the key biologically relevant gene groups derived by clustering methods such as self-organizing maps and k-means clustering.

