Detecting Cancer Outlier Genes with Potential Rearrangement Using Gene Expression Data and Biological Networks

2012 ◽  
Vol 2012 ◽  
pp. 1-13 ◽  
Author(s):  
Mohammed Alshalalfa ◽  
Tarek A. Bismar ◽  
Reda Alhajj

Gene alterations are a major component of the landscape of tumor genomes. To assess the significance of these alterations in the development of prostate cancer, it is necessary to identify them and analyze them from a systems biology perspective. Here, we present a new method (EigFusion) for predicting outlier genes with potential gene rearrangement. EigFusion demonstrated excellent performance in identifying outlier genes with potential rearrangement when tested on synthetic and real data. EigFusion identified previously unrecognized genes such as FABP5 and KCNH8 and confirmed their association with primary and metastatic prostate samples, while confirming the metastatic specificity of other genes such as PAH, TOP2A, and SPINK1. We used protein network based approaches to analyze the network context of potentially rearranged genes. Functional gene rearrangement modules were constructed by integrating functional protein networks. Rearranged genes were shown to be highly connected to well-known altered genes in cancer such as AR, RB1, MYC, and BRCA1. Finally, using clinical outcome data of prostate cancer patients, potentially rearranged genes demonstrated significant association with prostate cancer specific death.
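
The abstract does not describe how EigFusion scores genes, so the following is not that method. It is a minimal sketch of a generic robust outlier score (median-centering and scaling by the median absolute deviation, in the spirit of outlier-profile analyses) that illustrates what "outlier gene" can mean in expression data. The gene names, values, and the threshold of 3.0 are hypothetical.

```python
from statistics import median


def robust_outlier_scores(values):
    """Median-center one gene's expression values and scale them by the MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median: report no outliers rather than divide by zero.
        return [0.0 for _ in values]
    return [(v - med) / mad for v in values]


def count_outlier_samples(values, threshold=3.0):
    """Count samples whose robust score exceeds a (hypothetical) threshold."""
    return sum(1 for s in robust_outlier_scores(values) if s > threshold)


# Toy expression matrix: gene -> expression across eight samples (made-up numbers).
expression = {
    "GENE_A": [1.0, 1.2, 0.9, 1.1, 1.0, 6.5, 7.0, 1.05],  # two samples stand out
    "GENE_B": [2.0, 2.1, 1.9, 2.2, 2.0, 2.1, 1.8, 2.0],   # no sample stands out
}
outlier_counts = {gene: count_outlier_samples(v) for gene, v in expression.items()}
print(outlier_counts)
```

A gene such as GENE_A, overexpressed in only a subset of samples, is the kind of profile that outlier-based approaches are designed to surface; a mean-based test across all samples could miss it.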

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Ramin Hasibi ◽  
Tom Michoel

Abstract Background Molecular interaction networks summarize complex biological processes as graphs, whose structure is informative of biological function at multiple scales. Simultaneously, omics technologies measure the variation or activity of genes, proteins, or metabolites across individuals or experimental conditions. Integrating the complementary viewpoints of biological networks and omics data is an important task in bioinformatics, but existing methods treat networks as discrete structures, which are intrinsically difficult to integrate with continuous node features or activity measures. Graph neural networks map graph nodes into a low-dimensional vector space representation, and can be trained to preserve both the local graph structure and the similarity between node features. Results We studied the representation of transcriptional, protein–protein and genetic interaction networks in E. coli and mouse using graph neural networks. We found that such representations explain a large proportion of variation in gene expression data, and that using gene expression data as node features improves the reconstruction of the graph from the embedding. We further proposed a new end-to-end Graph Feature Auto-Encoder framework for the prediction of node features utilizing the structure of the gene networks, which is trained on the feature prediction task, and showed that it performs better at predicting unobserved node features than regular MultiLayer Perceptrons. When applied to the problem of imputing missing data in single-cell RNAseq data, the Graph Feature Auto-Encoder utilizing our new graph convolution layer called FeatGraphConv outperformed a state-of-the-art imputation method that does not use protein interaction information, showing the benefit of integrating biological networks and omics data with our proposed approach. 
Conclusion Our proposed Graph Feature Auto-Encoder framework is a powerful approach for integrating and exploiting the close relation between molecular interaction networks and functional genomics data.
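
To make the idea of predicting node features from network structure concrete, here is a minimal sketch that fills a missing node value with the mean of its observed neighbors. This is an untrained, single-step neighborhood average, not the FeatGraphConv layer or the Graph Feature Auto-Encoder described above; the node names, values, and edges are hypothetical.

```python
def impute_from_neighbors(features, edges):
    """Fill missing (None) node features with the mean of observed neighbor features.

    features: dict mapping node -> float or None
    edges: iterable of undirected (node, node) pairs
    """
    neighbors = {node: set() for node in features}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    imputed = dict(features)
    for node, value in features.items():
        if value is None:
            observed = [features[n] for n in neighbors[node] if features[n] is not None]
            if observed:  # leave the value missing if no neighbor is observed
                imputed[node] = sum(observed) / len(observed)
    return imputed


# Toy interaction network with one unobserved expression value.
features = {"g1": 2.0, "g2": None, "g3": 4.0, "g4": 1.0}
edges = [("g1", "g2"), ("g2", "g3"), ("g3", "g4")]
imputed = impute_from_neighbors(features, edges)
print(imputed["g2"])
```

A learned graph convolution replaces the fixed average with trainable weights, which is where the abstract reports gains over models that ignore the network.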


2020 ◽  
Vol 13 (1) ◽  
Author(s):  
Ieva Rauluseviciute ◽  
Finn Drabløs ◽  
Morten Beck Rye

Abstract Background Prostate cancer (PCa) has the highest incidence rate of any cancer in men in western countries. Unlike several other types of cancer, PCa has few genetic drivers, which has led researchers to look for additional epigenetic and transcriptomic contributors to PCa development and progression. In particular, DNA methylation, the most commonly studied epigenetic marker, has recently been measured and analysed in several PCa patient cohorts. DNA methylation is most commonly associated with downregulation of gene expression. However, positive associations of DNA methylation with gene expression have also been reported, suggesting a more diverse mechanism of epigenetic regulation. Such additional complexity could have important implications for understanding prostate cancer development but has not been studied at a genome-wide scale. Results In this study, we have compared three sets of genome-wide single-site DNA methylation data from 870 PCa and normal tissue samples with multi-cohort gene expression data from 1117 samples, including 532 samples where DNA methylation and gene expression have been measured on the exact same samples. Genes were classified according to their corresponding methylation and expression profiles. A large group of hypermethylated genes was robustly associated with increased gene expression (UPUP group) in all three methylation datasets. These genes demonstrated distinct patterns of correlation between DNA methylation and gene expression compared to the genes showing the canonical negative association between methylation and expression (UPDOWN group). This indicates a more diversified role of DNA methylation in regulating gene expression than previously appreciated. Moreover, UPUP and UPDOWN genes were associated with different compartments: UPUP genes were related to structures in the nucleus, while UPDOWN genes were linked to extracellular features.
Conclusion We identified a robust association between hypermethylation and upregulation of gene expression when comparing samples from prostate cancer and normal tissue. These results challenge the classical view where DNA methylation is always associated with suppression of gene expression, which underlines the importance of considering corresponding expression data when assessing the downstream regulatory effect of DNA methylation.
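
The UPUP and UPDOWN groups can be illustrated with a small sketch that labels a gene by the direction of its methylation and expression change between tumor and normal tissue. The cutoffs, units, and gene names below are hypothetical, and the study's actual classification criteria are not given in the abstract.

```python
def classify_gene(delta_methylation, delta_expression,
                  meth_cutoff=0.1, expr_cutoff=0.5):
    """Label a gene by the direction of its tumor-vs-normal changes.

    delta_methylation: change in methylation level (tumor minus normal)
    delta_expression: change in expression, e.g. a log fold change
    Both cutoffs are illustrative assumptions, not values from the study.
    """
    if delta_methylation <= meth_cutoff:
        return "not hypermethylated"
    if delta_expression >= expr_cutoff:
        return "UPUP"      # hypermethylated and upregulated
    if delta_expression <= -expr_cutoff:
        return "UPDOWN"    # hypermethylated and downregulated (canonical pattern)
    return "unchanged expression"


# Made-up genes: (methylation change, expression change).
genes = {
    "GENE_X": (0.25, 1.2),
    "GENE_Y": (0.30, -0.9),
    "GENE_Z": (0.02, 1.0),
}
labels = {name: classify_gene(dm, de) for name, (dm, de) in genes.items()}
print(labels)
```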


2014 ◽  
Vol 9 ◽  
pp. BMI.S13729 ◽  
Author(s):  
Chindo Hicks ◽  
Tejaswi Koganti ◽  
Shankar Giri ◽  
Memory Tekere ◽  
Ritika Ramani ◽  
...  

Genome-wide association studies (GWAS) have achieved great success in identifying single nucleotide polymorphisms (SNPs, herein called genetic variants) and genes associated with risk of developing prostate cancer. However, GWAS do not typically link the genetic variants to the disease state or inform the broader context in which the genetic variants operate. Here, we present a novel integrative genomics approach that combines GWAS information with gene expression data to infer the causal association between gene expression and the disease and to identify the network states and biological pathways enriched for genetic variants. We identified gene regulatory networks and biological pathways enriched for genetic variants, including the prostate cancer, IGF-1, JAK2, androgen, and prolactin signaling pathways. The integration of GWAS information with gene expression data provides insights about the broader context in which genetic variants associated with an increased risk of developing prostate cancer operate.


2019 ◽  
Vol 16 (3) ◽  
Author(s):  
Nimisha Asati ◽  
Abhinav Mishra ◽  
Ankita Shukla ◽  
Tiratha Raj Singh

Abstract Gene expression studies have revealed a large degree of variability in gene expression patterns, particularly in tissues, even in genetically identical individuals. This variability helps to reveal the components that fluctuate most during disease. With the advent of gene expression studies, many microarray studies have been conducted in prostate cancer, but the results have varied across studies. To better understand the genetic and biological regulatory mechanisms of prostate cancer, we conducted a meta-analysis of three major pathways, i.e. androgen receptor (AR), mechanistic target of rapamycin (mTOR), and mitogen-activated protein kinase (MAPK), in prostate cancer. The meta-analysis was performed on human gene expression data from prostate cancer. Twelve datasets comprising the AR, mTOR, and MAPK pathways were analyzed, from which thirteen potential biomarkers were identified. These findings were compiled from quantitative data analysis using different tools. Various interconnections were also found among the pathways under study. Our study suggests that microarray analysis of gene expression data and their pathway-level connections allows detection of potential predictors that may prove to be putative therapeutic targets with biological and functional significance in the progression of prostate cancer.


2011 ◽  
Vol 29 (7_suppl) ◽  
pp. 36-36
Author(s):  
E. A. Klein ◽  
S. M. Falzarano ◽  
T. Maddala ◽  
D. Cherbavaz ◽  
W. F. Novotny ◽  
...  

36 Background: The association of TMPRSS2-ERG fusions and ERG expression in prostate cancer (PC) with adverse clinical outcomes has been controversial, with mixed results in the literature. We conducted a study to test whether tumor-derived gene expression profiles, including the presence of TMPRSS2-ERG fusions and ERG gene expression, are associated with clinical recurrence (cR) after radical prostatectomy (RP). Methods: All patients with clinical stage T1/T2 prostate cancer treated with RP at CC from 1987 to 2004 were identified (n∼2,600). A cohort sampling design was used to select 127 patients with cR and 374 patients without cR after RP. For each patient a primary Gleason pattern (GP) sample, secondary (or highest) GP sample, and an adjacent nontumor tissue sample were evaluated. Surgical Gleason Score (GS) and clinical data were centrally reviewed. RNA was extracted from 6 manually dissected 10 μm formalin-fixed paraffin-embedded sections obtained from RP specimens and expression of TMPRSS2-ERGa, TMPRSS2-ERGb, ERG and reference genes was quantified using RT-PCR. Times to cR, PSA recurrence, and PC death were analyzed using Cox PH regression. Results: Blocks from 441 patients were evaluable. Median F/U was 5.8 years. Patients were mostly Caucasian (83%), clinical stage T1 (66%), had baseline PSA <10 ng/mL (82%), and had surgical Gleason score ≤7 (87%). 848 tumor samples and 410 non-tumor samples were assessed. TMPRSS2-ERGa and/or TMPRSS2-ERGb fusions were present in 51.8% of tumor samples and 7.5% of non-tumor samples. There was 89% concordance (95% CI: 86%, 92%) for TMPRSS2-ERG fusion status between the 2 tumor samples for each patient. High ERG expression was strongly associated with the presence of TMPRSS2-ERG fusions (p <0.01). We did not find an association between TMPRSS2-ERG a/b gene rearrangement or ERG expression with cR, PSA recurrence, PC death, or surgical GS (p > 0.2).
Conclusions: This study was notable for the large number of cR events, use of a standardized quantitative assay, and rigorous central review of pathology and clinical data. We did not find an association of TMPRSS2-ERG gene rearrangements or ERG expression with aggressiveness of prostate cancer post RP. [Table: see text]


2013 ◽  
Vol 43 (10) ◽  
pp. 1363-1373 ◽  
Author(s):  
Hyunjin Kim ◽  
Jaegyoon Ahn ◽  
Chihyun Park ◽  
Youngmi Yoon ◽  
Sanghyun Park

Author(s):  
Guro Dørum ◽  
Lars Snipen ◽  
Margrete Solheim ◽  
Solve Saebo

Gene set analysis methods have become a widely used tool for including prior biological knowledge in the statistical analysis of gene expression data. Advantages of these methods include increased sensitivity, easier interpretation and more conformity in the results. However, gene set methods do not employ all the available information about gene relations. Genes are arranged in complex networks where the network distances contain detailed information about inter-gene dependencies. We propose a method that uses gene networks to smooth gene expression data with the aim of reducing the number of false positives and identifying important subnetworks. Gene dependencies are extracted from the network topology and are used to smooth genewise test statistics. To find the optimal degree of smoothing, we propose using a criterion that considers the correlation between the network and the data. The network smoothing is shown to improve the ability to identify important genes in simulated data. Applied to a real data set, the smoothing accentuates parts of the network with a high density of differentially expressed genes.
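
As a rough illustration of smoothing genewise test statistics over a network, the sketch below blends each gene's statistic with the mean statistic of its direct neighbors. This is an assumption-laden simplification: the method above derives dependencies from network distances and selects the degree of smoothing with a correlation criterion, neither of which is reproduced here. The mixing weight `alpha`, the gene names, and the values are hypothetical.

```python
def smooth_statistics(stats, edges, alpha=0.5):
    """Blend each gene's test statistic with the mean of its neighbors' statistics.

    stats: dict mapping gene -> test statistic
    edges: iterable of undirected (gene, gene) pairs
    alpha: weight on the neighbor mean (0 = no smoothing); an illustrative parameter
    """
    neighbors = {gene: [] for gene in stats}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    smoothed = {}
    for gene, t in stats.items():
        if neighbors[gene]:
            nb_mean = sum(stats[n] for n in neighbors[gene]) / len(neighbors[gene])
            smoothed[gene] = (1 - alpha) * t + alpha * nb_mean
        else:
            smoothed[gene] = t  # isolated genes keep their original statistic
    return smoothed


# Genes a, b, c form a cluster of elevated statistics; d is a lone high value
# surrounded by null neighbors e and f (all numbers made up).
stats = {"a": 3.0, "b": 3.0, "c": 2.0, "d": 4.0, "e": 0.0, "f": 0.0}
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("d", "f")]
smoothed = smooth_statistics(stats, edges, alpha=0.5)
print(smoothed)
```

In this toy case the clustered genes keep relatively high values while the isolated high statistic for `d` is pulled down, which is the intuition behind using smoothing to reduce false positives. Note that the null neighbors of `d` are pulled up at the same time, so the amount of smoothing matters.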


2014 ◽  
Vol 13 ◽  
pp. CIN.S19745 ◽  
Author(s):  
Leorey N. Saligan ◽  
Juan Luis Fernández-Martínez ◽  
Enrique J. deAndrés-Galiana ◽  
Stephen Sonis

Background Fatigue is a common side effect of cancer (CA) treatment. We used a novel analytical method to identify and validate a specific gene cluster that is predictive of fatigue risk in prostate cancer patients (PCP) treated with radiotherapy (RT). Methods A total of 44 PCP were categorized into high-fatigue (HF) and low-fatigue (LF) cohorts based on fatigue score change from baseline to RT completion. Fold-change differential and Fisher's linear discriminant analyses (LDA) from 27 subjects with gene expression data at baseline and RT completion generated a reduced base of most discriminatory genes (learning phase). A nearest-neighbor risk (k-NN) prediction model was developed based on small-scale prognostic signatures. The predictive model validity was tested in another 17 subjects using baseline gene expression data (validation phase). Result The model generated in the learning phase predicted HF classification at RT completion in the validation phase with 76.5% accuracy. Conclusion The results suggest that a novel analytical algorithm that incorporates fold-change differential analysis, LDA, and k-NN may be applicable to predicting regimen-related toxicity in cancer patients with high reliability, given these results and the limited amount of data available. Accuracy is expected to improve with more data in the learning phase.
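
The k-NN step can be sketched as follows: a new subject is assigned the majority label among its nearest training subjects in a reduced gene-signature space. This is a generic nearest-neighbor classifier on made-up two-gene signatures, not the study's model; the value of `k`, the Euclidean distance, and all feature values are assumptions.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest training points.

    train: list of (feature_vector, label) pairs
    query: feature vector of the subject to classify
    """
    ranked = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical learning-phase subjects: two-gene signatures labelled LF or HF.
train = [
    ([0.1, 0.2], "LF"), ([0.0, 0.3], "LF"), ([0.2, 0.1], "LF"),
    ([1.0, 1.1], "HF"), ([0.9, 1.2], "HF"), ([1.1, 0.9], "HF"),
]
queries = [[1.0, 1.0], [0.1, 0.1]]
predictions = [knn_predict(train, q) for q in queries]
print(predictions, accuracy(predictions, ["HF", "LF"]))
```

For scale, 76.5% accuracy on 17 validation subjects is consistent with 13 correct predictions (13/17 ≈ 0.765).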


2020 ◽  
Vol 21 (S10) ◽  
Author(s):  
Ichcha Manipur ◽  
Ilaria Granata ◽  
Lucia Maddalena ◽  
Mario R. Guarracino

Abstract Background Biological networks are representative of the diverse molecular interactions that occur within cells. Some of the commonly studied biological networks are modeled through protein-protein interactions, gene regulation, and metabolic pathways. Among these, metabolic networks are probably the most studied, as they directly influence all physiological processes. Exploration of biochemical pathways using multigraph representation is important in understanding complex regulatory mechanisms. Feature extraction and clustering of these networks enable grouping of samples obtained from different biological specimens. Clustering techniques separate networks depending on their mutual similarity. Results We present a clustering analysis on tissue-specific metabolic networks for single samples from three primary tumor sites: breast, lung, and kidney cancer. The metabolic networks were obtained by integrating genome scale metabolic models with gene expression data. We performed network simplification to reduce the time needed to compute network distances. We showed empirically that network clustering can characterize groups of patients in multiple conditions. Conclusions We provide a computational methodology to explore and characterize the metabolic landscape of tumors, thus providing a general methodology to integrate analytic metabolic models with gene expression data. This method represents a first attempt at clustering large scale metabolic networks. Moreover, this approach makes it possible to obtain valuable information on the effects of different conditions on overall metabolism.
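
Clustering networks requires a distance between them. The abstract does not say which network distances were used, so the sketch below uses a simple stand-in: the Jaccard distance between the edge sets of two networks, treated as undirected. The metabolite names and edges are hypothetical.

```python
def edge_jaccard_distance(edges_a, edges_b):
    """Jaccard distance between two networks' undirected edge sets.

    Returns 0.0 for identical edge sets and 1.0 for networks sharing no edges.
    """
    a = {frozenset(edge) for edge in edges_a}
    b = {frozenset(edge) for edge in edges_b}
    union = a | b
    if not union:
        return 0.0  # two empty networks are treated as identical
    return 1.0 - len(a & b) / len(union)


# Two toy sample-specific networks over made-up metabolite identifiers.
net1 = [("glc", "g6p"), ("g6p", "f6p"), ("f6p", "fbp")]
net2 = [("g6p", "glc"), ("g6p", "f6p"), ("f6p", "pyr")]
d = edge_jaccard_distance(net1, net2)
print(d)
```

A matrix of such pairwise distances across samples could then be passed to any standard clustering routine. Real metabolic networks are often directed and weighted, which this stand-in ignores.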

