PADGE: analysis of heterogeneous patterns of differential gene expression

2007 ◽  
Vol 32 (1) ◽  
pp. 154-159 ◽  
Li Li ◽  
Amitabha Chaudhuri ◽  
John Chant ◽  
Zhijun Tang

We have devised a novel analysis approach, percentile analysis for differential gene expression (PADGE), for identifying genes differentially expressed between two groups of heterogeneous samples. PADGE was designed to compare expression profiles of sample subgroups at a series of percentile cutoffs and to examine the trend of relative expression between sample groups as expression level increases. Simulation studies showed that PADGE has more statistical power than t-statistics, cancer outlier profile analysis (COPA) (Tomlins SA, Rhodes DR, Perner S, Dhanasekaran SM, Mehra R, Sun XW, Varambally S, Cao X, Tchinda J, Kuefer R, Lee C, Montie JE, Shah RB, Pienta KJ, Rubin MA, Chinnaiyan AM. Science 310: 644–648, 2005), and kurtosis (Teschendorff AE, Naderi A, Barbosa-Morais NL, Caldas C. Bioinformatics 22: 2269–2275, 2006). Application of PADGE to microarray data sets in tumor tissues demonstrated its utility in prioritizing cancer genes encoding potential therapeutic targets or diagnostic markers. A web application was developed for researchers to analyze a large gene expression data set from heterogeneous biological samples and identify differentially expressed genes between subsets of sample classes using PADGE and other available approaches. Availability: .
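The percentile-cutoff idea described above can be illustrated with a minimal sketch. This is not the PADGE implementation: the function names, the cutoffs, the use of a simple difference of percentiles, and the expression values are all assumptions chosen for illustration. It shows only how the gap between two sample groups can be tracked as the percentile cutoff increases.

```python
from typing import Dict, Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """Return the q-th percentile (0-100) of values using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def percentile_differences(
    group_a: Sequence[float],
    group_b: Sequence[float],
    cutoffs: Sequence[float] = (50, 75, 90, 95),
) -> Dict[float, float]:
    """Difference (group_a minus group_b) of the group percentiles at each cutoff.

    Hypothetical helper: a simplified stand-in for comparing two groups
    at a series of percentile cutoffs, not the published PADGE statistic.
    """
    return {q: percentile(group_a, q) - percentile(group_b, q) for q in cutoffs}


# Made-up log2 expression values for one gene: only a subset of the
# "tumor" samples over-express it, so the groups differ mainly at high percentiles.
tumor = [5.0, 5.1, 5.2, 5.0, 5.1, 5.3, 9.0, 9.5]
normal = [5.0, 5.1, 5.2, 5.1, 5.0, 5.2, 5.3, 5.1]

profile = percentile_differences(tumor, normal)
for cutoff, diff in profile.items():
    print(f"percentile {cutoff}: difference {diff:.3f}")
```

In this made-up example the difference is close to zero at the median and grows at the upper cutoffs. A comparison of group means alone can dilute this kind of heterogeneous pattern.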

2021 ◽  
Vol 14 (1) ◽  
pp. 38-45
O. Lykhenko ◽  

The purpose of the study was to provide a pipeline for processing publicly available unprocessed gene expression data through integration and differential gene expression analysis. Data were collected from open gene expression databases, normalized and integrated into a single expression matrix in accordance with the metadata, and differentially expressed genes were determined. To demonstrate all stages of data processing and integrative analysis, gene expression data from the human placenta in the first and second trimesters of normal pregnancy were used. The source code for the integrative analysis was written in the R programming language and is publicly available as a repository on GitHub. Four clusters of functionally enriched differentially expressed genes were identified for the human placenta in the interval between the first and second trimesters of pregnancy. Immune processes, developmental processes, vasculogenesis and angiogenesis, signaling and the processes associated with zinc ions varied in the considered interval between the first and second trimesters of placental development. The proposed sequence of actions for integrative analysis could be applied to any data obtained by microarray technology.

2000 ◽  
Vol 16 (8) ◽  
pp. 685-698 ◽  
E. Manduchi ◽  
G. R. Grant ◽  
S. E. McKenzie ◽  
G. C. Overton ◽  
S. Surrey ◽  

Blood ◽  
2013 ◽  
Vol 122 (21) ◽  
pp. 2779-2779 ◽  
Andrea Pellagatti ◽  
Moritz Gerstung ◽  
Elli Papaemmanuil ◽  
Luca Malcovati ◽  
Aristoteles Giagounidis ◽  

A particular profile of gene expression can reflect an underlying molecular abnormality in malignancy. Distinct gene expression profiles and deregulated gene pathways can be driven by specific gene mutations and may shed light on the biology of the disease and lead to the identification of new therapeutic targets. We selected 143 cases from our large-scale gene expression profiling (GEP) dataset on bone marrow CD34+ cells from patients with myelodysplastic syndromes (MDS), for which matching genotyping data were obtained using next-generation sequencing of a comprehensive list of 111 genes involved in myeloid malignancies (including the spliceosomal genes SF3B1, SRSF2, U2AF1 and ZRSR2, as well as TET2, ASXL1 and many others). The GEP data were then correlated with the mutational status to identify significantly differentially expressed genes associated with each of the most common gene mutations found in MDS. The expression levels of the mutated genes analyzed were generally lower in patients carrying a mutation than in patients wild-type for that gene (e.g. SF3B1, ASXL1 and TP53), with the exception of RUNX1, for which patients carrying a mutation showed higher expression levels than patients without a mutation. Principal components analysis showed that the main directions of gene expression changes (principal components) tend to coincide with some of the common gene mutations, including SF3B1, SRSF2 and TP53. SF3B1 and STAG2 were the mutated genes showing the highest number of associated significantly differentially expressed genes, including ABCB7 as differentially expressed in association with SF3B1 mutation and SULT2A1 in association with STAG2 mutation.
We found distinct differentially expressed genes associated with the four most common splicing gene mutations (SF3B1, SRSF2, U2AF1 and ZRSR2) in MDS, suggesting that different phenotypes associated with these mutations may be driven by different effects on gene expression and that the target gene may be different. We have also evaluated the prognostic impact of the GEP data in comparison with that of the genotype data and, importantly, we have found a larger contribution of gene expression data in predicting progression-free survival compared to mutation-based multivariate survival models. In summary, this analysis correlating gene expression data with genotype data has revealed that the mutational status shapes the gene expression landscape. We have identified deregulated genes associated with the most common gene mutations in MDS and found that the prognostic power of gene expression data is greater than the prognostic power provided by mutation data. AP and MG contributed equally to this work. JB and PJC are co-senior authors. Disclosures: No relevant conflicts of interest to declare.

2019 ◽  
Vol 2 (1) ◽  
Jackson Townsend ◽  
Heather A. Hundley

Background and Hypothesis: RNA editing is one of several mechanisms regulating gene expression. One type of RNA editing, the deamination of adenosine to inosine, is carried out by ADAR enzymes. ADAR enzymes are essential for neural function, and aberrant editing is implicated in various forms of neuropathology. C. elegans lacking the RNA editing enzyme ADR-2 are viable, allowing us to ascertain how loss of RNA editing affects neural gene expression. The effects of loss of adr-2 on neural gene expression will be analyzed in both the first larval (L1) and young adult stages. We hypothesize that the transcriptome will change depending on life stage and the presence of ADR-2. Methods: Three replicates of neural cells isolated from wild type (WT) and adr-2(-) L1 and young adult stage animals were obtained. Total RNA was extracted from each population and mRNA was isolated using oligo-dT beads. The mRNA was fragmented and reverse transcribed to generate a complementary DNA (cDNA) library. The cDNA was sequenced by a facility at Indiana University. Quality of the library was evaluated using FastQC. Differential gene expression was evaluated with the DESeq2 software. Results: I examined differential gene expression in two life stages of the WT and adr-2(-) neural samples. After obtaining the differentially expressed genes, the portions of the transcriptome that require ADR-2 were determined. WT young adults showed increased (3715) and decreased (2504) expression of neural genes when compared to the L1 stage. Many differentially expressed genes required adr-2 (~40% of the upregulated and 78% of the downregulated genes). In addition, some genes were uniquely altered (631 upregulated, 196 downregulated) in the absence of adr-2. Conclusion and Potential Impact: The life stage and presence of ADR-2 alter the neural transcriptome, and this function changes throughout development. Future studies will determine whether these genes are altered due to the lack of RNA editing or binding by ADR-2.
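One way to read the categories reported above is as set comparisons between two lists of differentially expressed genes. The sketch below assumes that interpretation. It is not the authors' analysis, and the function name, category labels and gene identifiers are hypothetical. A gene is treated as requiring adr-2 if it changes between life stages in wild type but not in adr-2(-), and as uniquely altered if it changes only in adr-2(-).

```python
from typing import Dict, Set


def classify_stage_changes(wt_de: Set[str], mutant_de: Set[str]) -> Dict[str, Set[str]]:
    """Split stage-dependent genes by whether the change depends on adr-2.

    wt_de:     genes differentially expressed (young adult vs. L1) in wild type
    mutant_de: genes differentially expressed (young adult vs. L1) in adr-2(-)
    Assumed interpretation, not the published method.
    """
    return {
        "requires_adr2": wt_de - mutant_de,       # change lost without adr-2
        "adr2_independent": wt_de & mutant_de,    # change seen in both backgrounds
        "unique_to_mutant": mutant_de - wt_de,    # change seen only without adr-2
    }


# Hypothetical gene identifiers, for illustration only.
wt_up = {"gene_a", "gene_b", "gene_c", "gene_d", "gene_e"}
mutant_up = {"gene_c", "gene_d", "gene_e", "gene_f"}

groups = classify_stage_changes(wt_up, mutant_up)
fraction_requiring_adr2 = len(groups["requires_adr2"]) / len(wt_up)
print(sorted(groups["requires_adr2"]), fraction_requiring_adr2)
```

The fraction computed at the end corresponds to the kind of percentage quoted in the abstract, here for made-up data.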

2021 ◽  
Magdalena Navarro ◽  
T Ian Simpson

Motivation: Autism spectrum disorder (ASD) has a strong, yet heterogeneous, genetic component. Among the various methods that are being developed to help reveal the underlying molecular aetiology of the disease, one that is gaining popularity is the combination of gene expression and clinical genetic data. For ASD, the SFARI-gene database comprises lists of curated genes in which presumed causative mutations have been identified in patients. In order to predict novel candidate SFARI-genes we built classification models combining differential gene expression data for ASD patients and unaffected individuals with a gene’s status in the SFARI-gene list. Results: SFARI-genes were not found to be significantly associated with differential gene expression patterns, nor were they enriched in gene co-expression network modules that had a strong correlation with ASD diagnosis. However, network analysis and machine learning models that incorporate information from the whole gene co-expression network were able to predict novel candidate genes that share features of existing SFARI genes and have support for roles in ASD in the literature. We found a statistically significant bias related to the absolute level of gene expression for existing SFARI genes and their scores. It is essential that this bias be taken into account when studies interpret ASD gene expression data at gene, module and whole-network levels. Availability: Source code is available from GitHub and the accompanying data from The University of Edinburgh DataStore.

2013 ◽  
Vol 29 (5) ◽  
pp. 622-629 ◽  
Christopher L. Poirel ◽  
Ahsanur Rahman ◽  
Richard R. Rodrigues ◽  
Arjun Krishnan ◽  
Jacqueline R. Addesa ◽  
