GEREDB: Gene expression regulation database curated by mining abstracts from literature

Understanding how genes are expressed and regulated in different biological processes is a fundamental and challenging issue. Considerable progress has been made in studying the relationship between the expression and regulation of human genes. However, it is difficult to use these resources productively to analyze gene expression data. GEREDB (www.thua45.cn/geredb) has been developed to facilitate analyses that provide insights into the regulation of genes that govern specific biological responses. GEREDB is a publicly available, manually curated biological database that stores data on the relationships between the expression and regulation of human genes. To date, more than 39,000 links have been contextually annotated by reviewing more than 53,000 abstracts. GEREDB can be searched using the official NCBI gene symbol as a query, and it can be downloaded along with the GEREA software package. Using the GEREA bioinformatics tool, GEREDB can analyze user-supplied gene expression data in a causal-analysis-oriented manner.


Determining Physical Mechanisms of Gene Expression Regulation from Single Cell Gene Expression Data

PLoS Computational Biology ◽

10.1371/journal.pcbi.1005072 ◽

2016 ◽

Vol 12 (8) ◽

pp. e1005072 ◽

Cited By ~ 13

Author(s):

Daphne Ezer ◽

Victoria Moignard ◽

Berthold Göttgens ◽

Boris Adryan

Keyword(s):

Gene Expression ◽

Single Cell ◽

Gene Expression Data ◽

Gene Expression Regulation ◽

Expression Regulation ◽

Expression Data ◽

Physical Mechanisms ◽

Cell Gene Expression ◽

Cell Gene


A New Approach to Analysis and Interpretation of Toxicogenomic Gene Expression Data and its Importance in Examining Biological Responses to Low, Environmentally Relevant Doses of Toxicants

Toxicogenomics ◽

10.1002/9780470699638.ch2 ◽

2008 ◽

pp. 27-57 ◽

Cited By ~ 5

Author(s):

Julie A. Gosse ◽

Thomas H. Hampton ◽

Jennifer C. Davey ◽

Joshua W. Hamilton

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Expression Data ◽

New Approach ◽

Biological Responses


Classification Algorithm for Gene Expression Graph and Manhattan Distance

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v5.i2.pp472-478 ◽

2017 ◽

Vol 5 (2) ◽

pp. 472 ◽

Cited By ~ 1

Author(s):

N Sevugapandi ◽

C.P. Chandran

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Euclidean Distance ◽

Classification Algorithm ◽

Manhattan Distance ◽

Expression Data ◽

The Relationship ◽

Gene Information

The proposed method addresses these issues by developing a novel classification algorithm that combines the Gene Expression Graph (GEG) with Manhattan distance. The method is used to represent gene expression data. The Gene Expression Graph provides an optimal view of the relationship between normal and unhealthy genes. The use of a graph-based representation to express gene information was first proposed by the authors in [1] and [2]. It permits the construction of a classifier based on the association between graphs representing well-known classes and graphs representing the samples to be evaluated. Additionally, Euclidean distance is used to measure the strength of the relationship between genes.
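The two distance measures named in this abstract can be illustrated with a minimal sketch. It is not the authors' GEG algorithm: it only shows Manhattan and Euclidean distance between expression vectors and a nearest-profile classifier built on them. The function names, the class profiles, and the sample values are all hypothetical.

```python
import math


def manhattan(a, b):
    """Sum of absolute differences between two expression vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_class(sample, class_profiles, distance=manhattan):
    """Return the label whose reference profile is closest to the sample.

    class_profiles maps a class label to a reference expression vector
    (hypothetical stand-in for a per-class graph in the abstract).
    """
    return min(class_profiles, key=lambda label: distance(sample, class_profiles[label]))


# Hypothetical reference profiles and an unlabeled sample (three genes each).
profiles = {"normal": [1.0, 2.0, 3.0], "tumor": [4.0, 0.5, 6.0]}
sample = [3.5, 1.0, 5.0]

predicted = nearest_class(sample, profiles)
```

Passing `distance=euclidean` to `nearest_class` swaps the metric without changing the rest of the classifier, which makes it easy to compare the two measures on the same data.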


Investigating the Molecular Processes behind the Cell-Specific Toxicity Response to Titanium Dioxide Nanobelts

International Journal of Molecular Sciences ◽

10.3390/ijms22179432 ◽

2021 ◽

Vol 22 (17) ◽

pp. 9432

Author(s):

Laurent A. Winckers ◽

Chris T. Evelo ◽

Egon L. Willighagen ◽

Martina Kutmon

Keyword(s):

Gene Expression ◽

Titanium Dioxide ◽

Cell Lines ◽

Gene Expression Data ◽

Gene Networks ◽

Biological Processes ◽

Expression Data ◽

Molecular Processes ◽

High Exposure ◽

Pathway Gene

Some engineered nanomaterials incite toxicological effects, but the underlying molecular processes are understudied. The varied physicochemical properties cause different initial molecular interactions, complicating toxicological predictions. Gene expression data allow us to study the responses of genes and biological processes. Overrepresentation analysis identifies enriched biological processes using the experimental data but prompts broad results instead of detailed toxicological processes. We demonstrate a targeted filtering approach to compare public gene expression data for low and high exposure on three cell lines to titanium dioxide nanobelts. Our workflow finds cell and concentration-specific changes in affected pathways linked to four Gene Ontology terms (apoptosis, inflammation, DNA damage, and oxidative stress) to select pathways with a clear toxicity focus. We saw more differentially expressed genes at higher exposure, but our analysis identifies clear differences between the cell lines in affected processes. Colorectal adenocarcinoma cells showed resilience to both concentrations. Small airway epithelial cells displayed a cytotoxic response to the high concentration, but not as strongly as monocytic-like cells. The pathway-gene networks highlighted the gene overlap between altered toxicity-related pathways. The automated workflow is flexible and can focus on other biological processes by selecting other GO terms.
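Overrepresentation analysis, as mentioned in this abstract, is commonly carried out with a one-sided hypergeometric test. The sketch below assumes that formulation; the abstract does not say which test or software the authors used, and the function name and the counts in the example are hypothetical.

```python
from math import comb


def ora_p_value(universe, in_term, de_genes, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    universe: number of genes measured
    in_term:  number of measured genes annotated to the GO term or pathway
    de_genes: number of differentially expressed genes
    overlap:  number of differentially expressed genes annotated to the term
    """
    total = comb(universe, de_genes)
    upper = min(in_term, de_genes)
    tail = sum(
        comb(in_term, k) * comb(universe - in_term, de_genes - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 measured genes, 150 in the term,
# 500 differentially expressed, 12 of those in the term.
p = ora_p_value(20000, 150, 500, 12)
```

In practice the p-values for all tested terms would then be corrected for multiple testing, a step omitted here for brevity.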


To Predict Human Biomarker for the Obesity Using Mouse Homologous Expression Data at Different Theiler Stages

International Letters of Natural Sciences ◽

10.18052/www.scipress.com/ilns.45.9 ◽

2015 ◽

Vol 45 ◽

pp. 9-17

Author(s):

Ashok Kumar ◽

Ruchika Puri ◽

Kanika Gupta ◽

Amit Pal

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Molecular Mechanisms ◽

Expression Data ◽

Low Fat Diet ◽

Homologous Expression ◽

Melanocortin 4 Receptor ◽

Human Genes ◽

Genomics And Proteomics ◽

Mechanisms Of Development

Numerous genetic factors, such as MC4R (Melanocortin-4 receptor), POMC (Pro-opiomelanocortin), and SIM1 (Single Minded Gene), are important in obesity and can be used as biomarkers. However, more reliable diagnostic markers are needed, along with new therapeutic strategies that target specific molecules in the disease pathways. In mouse and human genes, mutations in one or both species are associated with phenotypic characteristics observed in human disease. Gene expression data provide crucial insights into the molecular mechanisms of development, differentiation, and disease. Up-regulation and down-regulation of selective genes can have major effects on diet-induced obesity, but there is little or no effect when animals are fed a low-fat diet. In the present study, we examined mouse gene expression data at different Theiler stages using GXD BioMart. The interacting partners and pathways of genes already used as biomarkers in mouse as well as in humans were studied. The gene NPY1R (Neuropeptide Y1 receptor) was selected as common to the STRING and KEGG results on the basis of biochemical pathways and interactions similar to those of MC4R. Our present work focuses on comparative genomics and proteomics analysis of NPY1R, which has led to the identification of a biomarker by comparing it with the already known MC4R human and mouse biomarker. It has been concluded that both proteins are structurally and functionally similar.


CLUSTERING BIOLOGICAL ANNOTATIONS AND GENE EXPRESSION DATA TO IDENTIFY PUTATIVELY CO-REGULATED BIOLOGICAL PROCESSES

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720006002181 ◽

2006 ◽

Vol 04 (04) ◽

pp. 833-852 ◽

Cited By ~ 16

Author(s):

CORNELIU HENEGAR ◽

RAFFAELLA CANCELLO ◽

SOPHIE ROME ◽

HUBERT VIDAL ◽

KARINE CLÉMENT ◽

...

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Regulation Of Gene Expression ◽

Muscular Contraction ◽

Supplementary Information ◽

Biological Processes ◽

Expression Data ◽

Biological Interactions ◽

Microarray Gene Expression ◽

Data Set

Motivation: Functional profiling is a key step of microarray gene expression data analysis. Identifying co-regulated biological processes could help to better understand the underlying biological interactions within the studied biological frame. Results: We present herein an original approach designed to search for putatively co-regulated biological processes sharing a significant number of co-expressed genes. An R language implementation named "FunCluster" was built and tested on two gene expression data sets. A discriminatory functional analysis of the first data set, related to experiments performed on separated adipocytes and stroma vascular fraction cells of human white adipose tissue, highlighted the prevalent role of nonadipose cells in the synthesis of inflammatory and immunity molecules in human adiposity. On the second data set, resulting from a model investigating insulin coordinated regulation of gene expression in human skeletal muscle, FunCluster analysis spotlighted novel functional classes of putatively co-regulated biological processes related to protein metabolism and the regulation of muscular contraction. Availability: Supplementary information about the FunCluster tool is available on-line at .


EXPath 2.0: An Updated Database for Integrating High-Throughput Gene Expression Data with Biological Pathways

Plant and Cell Physiology ◽

10.1093/pcp/pcaa115 ◽

2020 ◽

Vol 61 (10) ◽

pp. 1818-1827

Author(s):

Kuan-Chieh Tseng ◽

Guan-Zhen Li ◽

Yu-Cheng Hung ◽

Chi-Nga Chow ◽

Nai-Yun Wu ◽

...

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Metabolic Pathways ◽

Expression Profiles ◽

Regulatory Mechanisms ◽

Biological Processes ◽

Expression Data ◽

Rna Seq ◽

Correlation Networks ◽

Transcriptional Regulatory Mechanisms

Co-expressed genes tend to have regulatory relationships and participate in similar biological processes. Construction of gene correlation networks from microarray or RNA-seq expression data has been widely applied to study transcriptional regulatory mechanisms and metabolic pathways under specific conditions. Furthermore, since transcription factors (TFs) are critical regulators of gene expression, it is worth investigating TFs on the promoters of co-expressed genes. Although co-expressed genes and their related metabolic pathways can be easily identified from previous resources, such as EXPath and EXPath Tool, this information is not simultaneously available to identify their regulatory TFs. EXPath 2.0 is an updated database for the investigation of regulatory mechanisms in various plant metabolic pathways with 1,881 microarray and 978 RNA-seq samples. There are six significant improvements in EXPath 2.0: (i) the number of species has been extended from three to six to include Arabidopsis, rice, maize, Medicago, soybean and tomato; (ii) gene expression at various developmental stages has been added; (iii) construction of correlation networks according to a group of genes is available; (iv) hierarchical figures of the enriched Gene Ontology (GO) terms are accessible; (v) promoter analysis of genes in a metabolic pathway or correlation network is provided; and (vi) users' gene expression data can be uploaded and analyzed. Thus, EXPath 2.0 is an updated platform for investigating gene expression profiles and metabolic pathways under specific conditions. It helps users access the regulatory mechanisms of plant biological processes. The new version is available at http://EXPath.itps.ncku.edu.tw.
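The idea of building a gene correlation network from expression data can be sketched as follows. This is a generic illustration, not the EXPath 2.0 implementation: it assumes Pearson correlation across samples and a fixed absolute-correlation cutoff, and the gene names, expression values, and threshold are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.

    Assumes neither profile is constant (non-zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpression_edges(expr, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical expression matrix: gene -> values across four samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [5.0, 1.0, 4.0, 2.0],
}

edges = coexpression_edges(expr, threshold=0.9)
```

The resulting edge list can be loaded into any graph library for visualization or module detection; the cutoff of 0.9 is arbitrary and would normally be chosen per data set.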


Integrative Gene Selection on Gene Expression Data: Providing Biological Context to Traditional Approaches

Journal of Integrative Bioinformatics ◽

10.1515/jib-2018-0064 ◽

2018 ◽

Vol 16 (1) ◽

Cited By ~ 2

Author(s):

Cindy Perscheid ◽

Bastien Grasnick ◽

Matthias Uflacker

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Knowledge Integration ◽

Gene Selection ◽

Expression Profiles ◽

Knowledge Bases ◽

Biological Processes ◽

Expression Data ◽

Specific Expression ◽

External Knowledge

The advance of high-throughput RNA-Sequencing techniques enables researchers to analyze the complete gene activity in particular cells. From the insights of such analyses, researchers can identify disease-specific expression profiles, thus understand complex diseases like cancer, and eventually develop effective measures for diagnosis and treatment. The high dimensionality of gene expression data poses challenges to its computational analysis, which is addressed with measures of gene selection. Traditional gene selection approaches base their findings on statistical analyses of the actual expression levels, which implies several drawbacks when it comes to accurately identifying the underlying biological processes. In turn, integrative approaches include curated information on biological processes from external knowledge bases during gene selection, which promises to lead to better interpretability and improved predictive performance. Our work compares the performance of traditional and integrative gene selection approaches. Moreover, we propose a straightforward approach to integrate external knowledge with traditional gene selection approaches. We introduce a framework enabling the automatic external knowledge integration, gene selection, and evaluation. Evaluation results prove our framework to be a useful tool for evaluation and show that integration of external knowledge improves overall analysis results.


Interpretable generative deep learning: an illustration with single cell gene expression data

Human Genetics ◽

10.1007/s00439-021-02417-6 ◽

2022 ◽

Author(s):

Martin Treppner ◽

Harald Binder ◽

Moritz Hess

Keyword(s):

Gene Expression ◽

Single Cell ◽

Gene Expression Data ◽

Latent Variables ◽

Generative Models ◽

Omics Data ◽

Expression Data ◽

Cell Gene Expression ◽

The Relationship ◽

Cell Gene

Deep generative models can learn the underlying structure, such as pathways or gene programs, from omics data. We provide an introduction as well as an overview of such techniques, specifically illustrating their use with single-cell gene expression data. For example, the low dimensional latent representations offered by various approaches, such as variational auto-encoders, are useful to get a better understanding of the relations between observed gene expressions and experimental factors or phenotypes. Furthermore, by providing a generative model for the latent and observed variables, deep generative models can generate synthetic observations, which allow us to assess the uncertainty in the learned representations. While deep generative models are useful to learn the structure of high-dimensional omics data by efficiently capturing non-linear dependencies between genes, they are sometimes difficult to interpret due to their neural network building blocks. More precisely, it is difficult to understand the relationship between learned latent variables and observed variables, e.g., gene transcript abundances and external phenotypes. Therefore, we also illustrate current approaches that allow us to infer the relationship between learned latent variables and observed variables as well as external phenotypes. Thereby, we render deep learning approaches more interpretable. In an application with single-cell gene expression data, we demonstrate the utility of the discussed methods.


Fast gene set enrichment analysis

10.1101/060012 ◽

2016 ◽

Cited By ~ 218

Author(s):

Gennady Korotkevich ◽

Vladimir Sukhov ◽

Alexey Sergushichev

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Polynomial Algorithm ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Biological Processes ◽

Expression Data ◽

Gene Set Enrichment ◽

P Values ◽

Gene Set

Preranked gene set enrichment analysis (GSEA) is a widely used method for interpretation of gene expression data in terms of biological processes. Here we present the FGSEA method, which is able to estimate arbitrarily low GSEA P-values with higher accuracy and much faster than other implementations. We also present a polynomial algorithm to calculate GSEA P-values exactly, which we use to practically confirm the accuracy of the method.
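For readers unfamiliar with preranked GSEA, the statistic whose P-values are being estimated can be sketched as below. This is a plain textbook-style running-sum enrichment score with a naive permutation P-value, not the FGSEA algorithm or its exact polynomial method. All names and data are hypothetical, and the sketch assumes a non-empty gene set that is a proper subset of the ranked list, non-zero scores for set members, and a positive observed score.

```python
import random


def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Weighted running-sum statistic over a ranked gene list.

    Steps up at genes in the set (proportionally to |score|**weight) and
    down at genes outside it; returns the most extreme deviation from zero.
    """
    in_set = [g in gene_set for g in ranked_genes]
    hit_total = sum(abs(s) ** weight for s, h in zip(scores, in_set) if h)
    miss_count = len(ranked_genes) - sum(in_set)
    running, best = 0.0, 0.0
    for s, h in zip(scores, in_set):
        if h:
            running += abs(s) ** weight / hit_total
        else:
            running -= 1.0 / miss_count
        if abs(running) > abs(best):
            best = running
    return best


def permutation_p_value(ranked_genes, scores, gene_set, n_perm=1000, seed=0):
    """Naive one-sided P-value from random gene sets of the same size."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, scores, gene_set)
    extreme = 0
    for _ in range(n_perm):
        null_set = set(rng.sample(ranked_genes, len(gene_set)))
        if enrichment_score(ranked_genes, scores, null_set) >= observed:
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (extreme + 1) / (n_perm + 1)


# Hypothetical ranked list (most up-regulated first) and gene set.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
stats = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es = enrichment_score(genes, stats, {"g1", "g2"})
p = permutation_p_value(genes, stats, {"g1", "g2"}, n_perm=200)
```

With the naive approach the smallest attainable P-value is 1 / (n_perm + 1), which is why estimating very low P-values this way needs a very large number of permutations.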
