Open TG-GATEs: a large-scale toxicogenomics database

Abstract Toxicogenomics focuses on assessing the safety of compounds using gene expression profiles. Gene expression signatures from large toxicogenomics databases are expected to perform better than small databases in identifying biomarkers for the prediction and evaluation of drug safety based on a compound's toxicological mechanisms in animal target organs. Over the past 10 years, the Japanese Toxicogenomics Project consortium (TGP) has been developing a large-scale toxicogenomics database consisting of data from 170 compounds (mostly drugs) with the aim of improving and enhancing drug safety assessment. Most of the data generated by the project (e.g. gene expression, pathology, lot number) are freely available to the public via Open TG-GATEs (Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System). Here, we provide a comprehensive overview of the database, including both gene expression data and metadata, with a description of experimental conditions and procedures used to generate the database. Open TG-GATEs is available from https://toxico.nibiohn.go.jp/english/index.html.

LncGSEA: a versatile tool to infer lncRNA associated pathways from large-scale cancer transcriptome sequencing data

BMC Genomics ◽

10.1186/s12864-021-07900-y ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Yanan Ren ◽

Ting-You Wang ◽

Leah C. Anderton ◽

Qi Cao ◽

Rendong Yang

Keyword(s):

Gene Expression ◽

Large Scale ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Clinical Samples ◽

Sequencing Data ◽

Multiple Cancer ◽

Regulatory Pathways ◽

Cancer Transcriptome ◽

Versatile Tool

Abstract Background: Long non-coding RNAs (lncRNAs) are a growing focus in cancer research. Deciphering the pathways influenced by lncRNAs is important for understanding their role in cancer. Although knock-down or overexpression of lncRNAs followed by gene expression profiling in cancer cell lines is an established way to address this problem, such experimental data are not available for the majority of annotated lncRNAs. Results: As a surrogate, we present lncGSEA, a convenient tool to predict lncRNA-associated pathways through Gene Set Enrichment Analysis of gene expression profiles from large-scale cancer patient samples. We demonstrate that lncGSEA is able to recapitulate lncRNA-associated pathways supported by the literature and by experimental validation in multiple cancer types. Conclusions: LncGSEA allows researchers to infer lncRNA regulatory pathways directly from clinical samples in oncology. LncGSEA is written in R and is freely accessible at https://github.com/ylab-hi/lncGSEA.

Analysis of blood-based gene expression in idiopathic Parkinson disease

Neurology ◽

10.1212/wnl.0000000000004516 ◽

2017 ◽

Vol 89 (16) ◽

pp. 1676-1683 ◽

Cited By ~ 36

Author(s):

Ron Shamir ◽

Christine Klein ◽

David Amar ◽

Eva-Juliane Vollstedt ◽

Michael Bonin ◽

...

Keyword(s):

Gene Expression ◽

Parkinson Disease ◽

Gene Networks ◽

Large Scale ◽

Expression Profiles ◽

Area Under The Curve ◽

Gene Expression Profiles ◽

Gene Signature ◽

Gene Profiles ◽

Independent Test

Objective: To examine whether gene expression analysis of a large-scale Parkinson disease (PD) patient cohort produces a robust blood-based PD gene signature compared to previous studies that have used relatively small cohorts (≤220 samples). Methods: Whole-blood gene expression profiles were collected from a total of 523 individuals. After preprocessing, the data contained 486 gene profiles (n = 205 PD, n = 233 controls, n = 48 other neurodegenerative diseases) that were partitioned into training, validation, and independent test cohorts to identify and validate a gene signature. Batch-effect reduction and cross-validation were performed to ensure signature reliability. Finally, functional and pathway enrichment analyses were applied to the signature to identify PD-associated gene networks. Results: A gene signature of 100 probes that mapped to 87 genes, corresponding to 64 upregulated and 23 downregulated genes differentiating between patients with idiopathic PD and controls, was identified with the training cohort and successfully replicated in both an independent validation cohort (area under the curve [AUC] = 0.79, p = 7.13E–6) and a subsequent independent test cohort (AUC = 0.74, p = 4.2E–4). Network analysis of the signature revealed gene enrichment in pathways, including metabolism, oxidation, and ubiquitination/proteasomal activity, and misregulation of mitochondria-localized genes, including downregulation of COX4I1, ATP5A1, and VDAC3. Conclusions: We present a large-scale study of PD gene expression profiling. This work identifies a reliable blood-based PD signature and highlights the importance of large-scale patient cohorts in developing potential PD biomarkers.
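
An AUC such as those reported above can be read as the probability that a randomly chosen PD sample receives a higher signature score than a randomly chosen control. The Python sketch below computes that pairwise definition directly. The function name and the toy scores are hypothetical illustrations and are not taken from the study, whose scoring procedure is not described in the abstract.

```python
def roc_auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (case, control) pairs in which the case
    scores higher; ties count as half a win.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical signature scores, for illustration only.
toy_cases = [0.9, 0.8, 0.4]
toy_controls = [0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_cases, toy_controls)
```

The double loop is quadratic in the number of samples, which is fine for cohorts of a few hundred; a rank-based formulation would be preferable for much larger inputs.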

Discovering Distinct Patterns in Gene Expression Profiles

Journal of Integrative Bioinformatics ◽

10.1515/jib-2008-105 ◽

2008 ◽

Vol 5 (2) ◽

Cited By ~ 1

Author(s):

Li Teng ◽

Laiwan Chan

Keyword(s):

Gene Expression ◽

Large Scale ◽

Expression Profiles ◽

Expression Patterns ◽

Gene Expression Profiles ◽

Clustering Methods ◽

Gene Expressions ◽

Real Gene ◽

Large Scale Dataset ◽

Coexpressed Genes

Summary Traditional analysis of gene expression profiles uses clustering to find groups of coexpressed genes that have similar expression patterns. However, clustering is time-consuming and can be difficult for very large-scale datasets. We propose the idea of Discovering Distinct Patterns (DDP) in gene expression profiles. Because the patterns shown by gene expression reveal regulatory mechanisms, it is important to find all the different patterns present in a dataset when there is little prior knowledge. It is also a helpful starting point for further analysis. We propose an algorithm for DDP that iteratively picks out pairs of gene expression patterns with the largest dissimilarities. This method can also be used as a preprocessing step to initialize centers for clustering methods such as K-means. Experiments on both synthetic and real gene expression datasets show that our method is effective at finding distinct patterns with functional significance and is also efficient.
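
The abstract gives only the outline of the algorithm, so the following Python sketch is one possible reading of "iteratively picking out pairs with the largest dissimilarities", not the authors' implementation. It assumes a dissimilarity of 1 minus the Pearson correlation and assumes that, after each pick, profiles close to a picked pattern are discarded; the function names, the `similar_below` threshold, and the toy profiles are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_dissimilarity(a, b):
    """Return 1 - Pearson correlation of two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0  # a flat profile is treated as uncorrelated
    return 1.0 - cov / sqrt(var_a * var_b)


def distinct_patterns(profiles, max_patterns, similar_below=0.2):
    """Greedily pick the most dissimilar pairs of profiles.

    profiles: dict mapping a gene name to its expression profile.
    similar_below: assumed cutoff; profiles closer than this to a
    picked pattern are dropped before the next iteration.
    """
    remaining = dict(profiles)
    picked = []
    while len(remaining) >= 2 and len(picked) < max_patterns:
        # Exhaustive search over pairs; quadratic, for illustration only.
        g1, g2 = max(
            combinations(remaining, 2),
            key=lambda pair: pearson_dissimilarity(
                remaining[pair[0]], remaining[pair[1]]
            ),
        )
        picked.extend([g1, g2])
        p1, p2 = remaining[g1], remaining[g2]
        # Drop the picked pair and everything that resembles either of them.
        remaining = {
            gene: profile
            for gene, profile in remaining.items()
            if pearson_dissimilarity(profile, p1) >= similar_below
            and pearson_dissimilarity(profile, p2) >= similar_below
        }
    return picked[:max_patterns]


# Hypothetical toy profiles, for illustration only.
toy_profiles = {
    "up": [1, 2, 3, 4],
    "up2": [1, 2, 3, 5],
    "down": [4, 3, 2, 1],
    "peak": [1, 3, 3, 1],
}
toy_patterns = distinct_patterns(toy_profiles, max_patterns=4)
```

The returned gene names could then seed K-means centers, as the abstract suggests. The exhaustive pair search here costs quadratic time per iteration, so it illustrates the idea rather than the efficiency the paper reports.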

CFTR ΔF508 mutation has minimal effect on the gene expression profile of differentiated human airway epithelia

AJP Lung Cellular and Molecular Physiology ◽

10.1152/ajplung.00065.2005 ◽

2005 ◽

Vol 289 (4) ◽

pp. L545-L553 ◽

Cited By ~ 29

Author(s):

Joseph Zabner ◽

Todd E. Scheetz ◽

Hakeem G. Almabrazi ◽

Thomas L. Casavant ◽

Jian Huang ◽

...

Keyword(s):

Gene Expression ◽

Cystic Fibrosis ◽

Large Scale ◽

Expression Profiles ◽

Expression Patterns ◽

Primary Cultures ◽

Gene Expression Profiles ◽

Filter Method ◽

Tissue Destruction ◽

Airway Epithelia

Cystic fibrosis (CF) is caused by mutations in the cystic fibrosis transmembrane conductance regulator (CFTR), an epithelial chloride channel regulated by phosphorylation. Most of the disease-associated morbidity is the consequence of chronic lung infection with progressive tissue destruction. As an approach to investigate the cellular effects of CFTR mutations, we used large-scale microarray hybridization to contrast the gene expression profiles of well-differentiated primary cultures of human CF and non-CF airway epithelia grown under resting culture conditions. We surveyed the expression profiles for 10 non-CF and 10 ΔF508 homozygote samples. Of the 22,283 genes represented on the Affymetrix U133A GeneChip, we found evidence of significant changes in expression in 24 genes by two-sample t-test (P < 0.00001). A second, three-filter method of comparative analysis found no significant differences between the groups. The levels of CFTR mRNA were comparable in both groups. There were no significant differences in the gene expression patterns between male and female CF specimens. There were 18 genes with significant increases and 6 genes with decreases in CF relative to non-CF samples. Although the function of many of the differentially expressed genes is unknown, one transcript that was elevated in CF, the KCl cotransporter (KCC4), is a candidate for further study. Overall, the results indicate that CFTR dysfunction has little direct impact on airway epithelial gene expression in samples grown under these conditions.

Dynamical consequences of regional heterogeneity in the brain’s transcriptional landscape

10.1101/2020.10.28.359943 ◽

2020 ◽

Cited By ~ 1

Author(s):

Gustavo Deco ◽

Kevin Aquino ◽

Aurina Arnatkevičiūtė ◽

Stuart Oldham ◽

Kristina Sabaroedin ◽

...

Keyword(s):

Gene Expression ◽

Large Scale ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Global Gene Expression ◽

Brain Regions ◽

Biophysical Model ◽

Neuronal Dynamics ◽

Regional Heterogeneity ◽

Magnetic Resonance Imaging (MRI)

Abstract Brain regions vary in their molecular and cellular composition, but how this heterogeneity shapes neuronal dynamics is unclear. Here, we investigate the dynamical consequences of regional heterogeneity using a biophysical model of whole-brain functional magnetic resonance imaging (MRI) dynamics in humans. We show that models in which transcriptional variations in excitatory and inhibitory receptor (E:I) gene expression constrain regional heterogeneity more accurately reproduce the spatiotemporal structure of empirical functional connectivity estimates than do models constrained by global gene expression profiles and MRI-derived estimates of myeloarchitecture. We further show that regional heterogeneity is essential for yielding both ignition-like dynamics, which are thought to support conscious processing, and a wide variance of regional activity timescales, which supports a broad dynamical range. We thus identify a key role for E:I heterogeneity in generating complex neuronal dynamics and demonstrate the viability of using transcriptional data to constrain models of large-scale brain function.

Sex differences in viral entry protein expression and host transcript responses to SARS-CoV-2

10.21203/rs.3.rs-100914/v2 ◽

2020 ◽

Author(s):

Mengying Sun ◽

Rama Shankar ◽

Meehyun Ko ◽

Christopher Daniel Chang ◽

Shan-Ju Yeh ◽

...

Keyword(s):

Gene Expression ◽

Sex Differences ◽

Viral Entry ◽

Large Scale ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Scale Analysis ◽

Transcriptional Responses ◽

Deconvolution Analysis ◽

Large Scale Analysis

Abstract Epidemiological studies suggest that men exhibit a higher mortality rate from COVID-19 than women, yet the underlying biology is largely unknown. Here, we seek to delineate sex differences in the gene expression of viral entry proteins ACE2 and TMPRSS2, and host transcriptional responses to SARS-CoV-2 through large-scale analysis of genomic and clinical data. We first compiled 220,000 human gene expression profiles from three databases and completed the meta-information through machine learning and manual annotation. Large-scale analysis of these profiles indicated that male samples show higher expression levels of ACE2 and TMPRSS2 than female samples, especially in the older group (>60 years) and in the kidney. Subsequent analysis of 6,031 COVID-19 patients at Mount Sinai Health System revealed that men have significantly higher creatinine levels, an indicator of impaired kidney function. Further analysis of 782 COVID-19 patient gene expression profiles taken from upper airway and blood suggested that men and women present distinct expression changes. Computational deconvolution analysis of these profiles revealed that male COVID-19 patients have enriched kidney-specific mesangial cells in blood compared to healthy patients. Together, this study suggests that biological differences in the kidney between sexes may contribute to sex disparity in COVID-19.

RankerGUI: A Computational Framework to Compare Differential Gene Expression Profiles Using Rank Based Statistics

International Journal of Molecular Sciences ◽

10.3390/ijms20236098 ◽

2019 ◽

Vol 20 (23) ◽

pp. 6098 ◽

Cited By ~ 1

Author(s):

Amarinder Singh Thind ◽

Kumar Parijat Tripathi ◽

Mario Rosario Guarracino

Keyword(s):

Gene Expression ◽

Differential Expression ◽

Differential Gene Expression ◽

Web Application ◽

Cellular Response ◽

Expression Profiles ◽

Expression Patterns ◽

Gene Expression Profiles ◽

Experimental Conditions ◽

Differential Gene

The comparison of high-throughput gene expression datasets obtained from different experimental conditions is a challenging task. It provides an opportunity to explore the cellular response to various biological events such as disease, environmental conditions, and drugs. There is a need for tools that allow the integration and analysis of such data. We developed the “RankerGUI pipeline”, a user-friendly web application for the biological community. It allows users to apply various rank-based statistical approaches to compare full differential gene expression profiles between the same or different biological states obtained from different sources. The pipeline modules integrate various open-source packages, a few of which are modified for extended functionality. The main modules include rank-rank hypergeometric overlap, enriched rank-rank hypergeometric overlap, and distance calculations. Additionally, preprocessing steps such as merging differential expression profiles of multiple independent studies can be added before running the main modules. Output plots show the strength, pattern, and trends among complete differential expression profiles. In this paper, we describe the various modules and functionalities of the developed pipeline. We also present a case study that demonstrates how the pipeline can be used to compare differential expression profiles obtained from data on multiple platforms in the Gene Expression Omnibus. Using these comparisons, we investigate gene expression patterns in kidney and lung cancers.
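
The core idea behind a rank-rank hypergeometric overlap can be sketched in a few lines of Python: slide a threshold down each of two ranked gene lists and, at every pair of thresholds, ask how surprising the overlap of the two top sets is under a hypergeometric null. The code below is a simplified illustration under stated assumptions, not the RankerGUI implementation; the function names and the `step` parameter are hypothetical, both lists are assumed to rank the same genes, and refinements such as signed scores or multiple-testing correction are omitted.

```python
from math import comb, log10


def hypergeom_tail(k, population, successes, draws):
    """P(X >= k) for a hypergeometric variable, using exact integers."""
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, x) * comb(population - successes, draws - x)
        for x in range(k, upper + 1)
    )
    return favourable / comb(population, draws)


def rrho_map(ranked_a, ranked_b, step=1):
    """Return {(i, j): -log10 p} for the overlap of top-i and top-j genes.

    ranked_a, ranked_b: lists of the same gene names, most
    differentially expressed first.
    """
    if set(ranked_a) != set(ranked_b):
        raise ValueError("both rankings must contain the same genes")
    n_genes = len(ranked_a)
    cells = {}
    for i in range(step, n_genes + 1, step):
        top_a = set(ranked_a[:i])
        for j in range(step, n_genes + 1, step):
            overlap = len(top_a.intersection(ranked_b[:j]))
            p_value = hypergeom_tail(overlap, n_genes, i, j)
            cells[(i, j)] = -log10(p_value)
    return cells


# Hypothetical four-gene ranking compared with itself and with its reverse.
toy_ranking = ["g1", "g2", "g3", "g4"]
same_map = rrho_map(toy_ranking, toy_ranking)
reversed_map = rrho_map(toy_ranking, toy_ranking[::-1])
```

Plotting the resulting grid as a heat map gives the kind of overlap picture such tools typically display. For real gene lists a larger `step` keeps the number of cells manageable.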

A large-scale electrophoresis- and chromatography-based determination of gene expression profiles in bovine brain capillary endothelial cells after the re-induction of blood-brain barrier properties

Proteome Science ◽

10.1186/1477-5956-8-57 ◽

2010 ◽

Vol 8 (1) ◽

pp. 57 ◽

Cited By ~ 11

Author(s):

Gwënaël Pottiez ◽

Barbara Deracinois ◽

Sophie Duban-Deweer ◽

Roméo Cecchelli ◽

Laurence Fenart ◽

...

Keyword(s):

Gene Expression ◽

Endothelial Cells ◽

Large Scale ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Barrier Properties ◽

Bovine Brain ◽

Brain Capillary ◽

Capillary Endothelial Cells

Large-scale analysis of gene expression profiles

Briefings in Bioinformatics ◽

10.1093/bib/3.1.7 ◽

2002 ◽

Vol 3 (1) ◽

pp. 7-17 ◽

Cited By ~ 2

Author(s):

T. D. Wu

Keyword(s):

Gene Expression ◽

Large Scale ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Scale Analysis ◽

Large Scale Analysis

ImmuSort, a database on gene plasticity and electronic sorting for immune cells

Scientific Reports ◽

10.1038/srep10370 ◽

2015 ◽

Vol 5 (1) ◽

Cited By ~ 19

Author(s):

Pingzhang Wang ◽

Yehong Yang ◽

Wenling Han ◽

Dalong Ma

Keyword(s):

Gene Expression ◽

Immune Cells ◽

Immune Cell ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Experimental Conditions ◽

Evaluation Scores ◽

Cell Groups ◽

Differential Gene ◽

Disease Associated Genes

Abstract Gene expression is highly dynamic and plastic. We present a new immunological database, ImmuSort. Unlike other gene expression databases, ImmuSort provides a convenient way to view global differential gene expression data across thousands of experimental conditions in immune cells. It enables electronic sorting, which is a bioinformatics process to retrieve cell states associated with specific experimental conditions that are mainly based on gene expression intensity. A comparison of gene expression profiles reveals other applications, such as the evaluation of immune cell biomarkers and cell subsets, identification of cell specific and/or disease-associated genes or transcripts, comparison of gene expression in different transcript variants and probe set quality evaluation. A plasticity score is introduced to measure gene plasticity. Average rank and marker evaluation scores are used to evaluate biomarkers. The current version includes 31 human and 17 mouse immune cell groups, comprising 10,422 and 3,929 microarrays derived from public databases, respectively. A total of 20,283 human and 20,963 mouse genes are available to query in the database. Examples show the distinct advantages of the database. The database URL is http://immusort.bjmu.edu.cn/.
