Unsupervised and supervised learning with neural network for human transcriptome analysis and cancer diagnosis

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Bo Yuan ◽  
Dong Yang ◽  
Bonnie E. G. Rothberg ◽  
Hao Chang ◽  
Tian Xu

Abstract Deep learning analysis of images and text unfolds new horizons in medicine. However, analysis of transcriptomic data, the cause of biological and pathological changes, is hampered by structural complexity distinctive from images and text. Here we conduct unsupervised training on more than 20,000 human normal and tumor transcriptomes and show that the resulting Deep-Autoencoder, DeepT2Vec, has successfully extracted informative features and embedded transcriptomes into 30-dimensional Transcriptomic Feature Vectors (TFVs). We demonstrate that the TFVs could recapitulate expression patterns and be used to track tissue origins. Trained on these extracted features only, a supervised classifier, DeepC, can effectively distinguish tumors from normal samples with an accuracy of 90% for Pan-Cancer and reach an average of 94% for specific cancers. Training on a connected network, the accuracy is further increased to 96% for Pan-Cancer. Together, our study shows that deep learning with an autoencoder is suitable for transcriptomic analysis, and DeepT2Vec could be successfully applied to distinguish cancers, normal tissues, and other potential traits with limited samples.
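The idea of compressing an expression profile into a short feature vector can be illustrated with a toy sketch. This is not DeepT2Vec: it assumes a single-layer linear autoencoder trained by plain gradient descent on synthetic six-gene profiles, with a 2-dimensional code instead of the 30-dimensional TFVs. All function names and data below are hypothetical.

```python
import random


def make_toy_profiles(n=40, seed=1):
    """Synthetic 'expression profiles': 6 genes driven by 2 hidden factors (assumed data)."""
    rng = random.Random(seed)
    load_a = [1.0, 0.8, 0.6, 0.0, 0.0, 0.0]
    load_b = [0.0, 0.0, 0.0, 0.9, 0.7, 0.5]
    profiles = []
    for _ in range(n):
        f1, f2 = rng.gauss(0, 1), rng.gauss(0, 1)
        profiles.append([a * f1 + b * f2 + rng.gauss(0, 0.05)
                         for a, b in zip(load_a, load_b)])
    return profiles


def embed(x, enc):
    """Project one profile (length d) onto the k-dimensional code."""
    k = len(enc[0])
    return [sum(x[i] * enc[i][j] for i in range(len(x))) for j in range(k)]


def reconstruct(z, dec):
    """Map a k-dimensional code back to gene space."""
    d = len(dec[0])
    return [sum(z[j] * dec[j][i] for j in range(len(z))) for i in range(d)]


def mean_loss(data, enc, dec):
    """Mean squared reconstruction error per sample."""
    total = 0.0
    for x in data:
        xhat = reconstruct(embed(x, enc), dec)
        total += sum((a - b) ** 2 for a, b in zip(xhat, x))
    return total / len(data)


def train_linear_autoencoder(data, k, epochs=400, lr=0.01, seed=0):
    """Full-batch gradient descent on a linear encoder (d x k) and decoder (k x d)."""
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    enc = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(d)]
    dec = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(k)]
    initial = mean_loss(data, enc, dec)
    for _ in range(epochs):
        g_enc = [[0.0] * k for _ in range(d)]
        g_dec = [[0.0] * d for _ in range(k)]
        for x in data:
            z = embed(x, enc)
            err = [r - xi for r, xi in zip(reconstruct(z, dec), x)]
            for j in range(k):
                # Gradient of the squared error with respect to code unit j.
                dz = 2.0 * sum(err[i] * dec[j][i] for i in range(d))
                for i in range(d):
                    g_dec[j][i] += 2.0 * z[j] * err[i]
                    g_enc[i][j] += x[i] * dz
        for i in range(d):
            for j in range(k):
                enc[i][j] -= lr * g_enc[i][j] / n
                dec[j][i] -= lr * g_dec[j][i] / n
    return enc, dec, initial, mean_loss(data, enc, dec)
```

In this sketch the output of `embed` plays the role of the feature vector; a separate classifier could then be trained on those vectors alone, which is the two-stage arrangement the abstract describes.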

2021 ◽  
Vol 41 (1) ◽  
Author(s):  
Xinpeng Liu ◽  
Yuanbo Zhan ◽  
Wenxia Xu ◽  
Xiaoyao Liu ◽  
Yawei Geng ◽  
...  

Abstract Background: The family with sequence similarity 20-member C (Fam20C) kinase plays important roles in physiopathological processes and is responsible for the majority of the secreted phosphoproteome, including substrates associated with tumor cell migration. However, it remains unclear whether Fam20C plays a role in cancers. Here, we aimed to analyze the expression and prognostic value of Fam20C in pan-cancer and to gain insights into the association between Fam20C and immune infiltration. Methods: We analyzed Fam20C expression patterns and the associations between Fam20C expression levels and prognosis in pan-cancer via the ONCOMINE, TIMER (Tumor Immune Estimation Resource), PrognoScan, GEPIA (Gene Expression Profiling Interactive Analysis), and Kaplan–Meier Plotter databases. After that, the GEPIA and TIMER databases were applied to investigate the relations between Fam20C expression and immune infiltration across different cancer types, especially BLCA (bladder urothelial carcinoma), LGG (brain lower grade glioma), and STAD (stomach adenocarcinoma). Results: Compared with adjacent normal tissues, Fam20C was widely expressed across many cancers. In general, Fam20C showed a detrimental role in pan-cancer; it was positively associated with poor survival of BLCA, LGG, and STAD patients. Specifically, based on the TCGA (The Cancer Genome Atlas) database, a high expression level of Fam20C was associated with worse prognosis in stages T2–T4 and stages N0–N2 in the cohort of STAD patients. Moreover, Fam20C expression had positive associations with immune infiltration, including CD4+ T cells, macrophages, neutrophils, dendritic cells, and other diverse immune cells in BLCA, LGG, and STAD. Conclusion: Fam20C may serve as a promising prognostic biomarker in pan-cancer and has positive associations with immune infiltrates.


2021 ◽  
Author(s):  
Sungwoo Bae ◽  
Hongyoon Choi ◽  
Dong Soo Lee

Abstract Profiling molecular features associated with the morphological landscape of tissue is crucial for investigating the structural and spatial patterns that underlie the biological function of tissues. In this study, we present a new method, spatial gene expression patterns by deep learning of tissue images (SPADE), to identify important genes associated with morphological contexts by combining spatial transcriptomic data with coregistered images. SPADE incorporates deep learning-derived image patterns with spatially resolved gene expression data to extract morphological context markers. Morphological features that correspond to spatial maps of the transcriptome were extracted from image patches surrounding each spot and were subsequently represented by image latent features. The molecular profiles correlated with the image latent features were identified. The extracted genes could be further analyzed to discover functional terms and exploited to extract clusters maintaining morphological contexts. We apply our approach to spatial transcriptomic data from different tissues, platforms and types of images to demonstrate an unbiased method that is capable of obtaining image-integrated gene expression trends.
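The step of identifying molecular profiles correlated with an image latent feature can be sketched as a per-gene correlation across spots. This is only an illustration under stated assumptions: it uses Pearson correlation as the association measure, which the abstract does not specify, and the gene names and values are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def rank_genes_by_latent(expression, latent):
    """Rank genes by |correlation| between per-spot expression and one latent feature.

    expression: dict mapping gene name -> list of per-spot values (hypothetical layout)
    latent: list of per-spot values of a single image latent feature
    """
    scores = {gene: pearson(values, latent) for gene, values in expression.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Toy data: five spots, one latent feature, three hypothetical genes.
latent_feature = [0.1, 0.4, 0.5, 0.9, 1.2]
toy_expression = {
    "GENE_A": [2 * v + 1 for v in latent_feature],  # tracks the latent feature
    "GENE_B": [5, 1, 4, 2, 3],                      # weakly related
    "GENE_C": [7, 7, 7, 7, 7],                      # constant across spots
}
ranked = rank_genes_by_latent(toy_expression, latent_feature)
```

Genes at the top of `ranked` would be the candidates for morphology-associated markers in this simplified view.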


2020 ◽  
Author(s):  
Yun Feng ◽  
Fang Li ◽  
Xianli Guo ◽  
Fenghui Wang ◽  
Haiyan Shi ◽  
...  

Abstract Background: Discs large-associated protein 5 (DLGAP5), a kinetochore fiber-binding protein, has been found to function as an oncoprotein in many cancers. However, its expression patterns in normal and cancer tissues across pan-cancer, as well as in cell lines, are far from clear. Methods: Data from genotype-tissue expression (GTEx) and The Cancer Genome Atlas (TCGA) were used to analyze DLGAP5 expression in normal tissues and cancer cell lines, respectively. The analysis of DLGAP5 expression in cancer tissues and adjacent tissues was based on combined TCGA and GTEx data. The associations between the expression, prognosis and cancer immune infiltrates in pan-cancer were also investigated based on TCGA and Tumor Immune Estimation Resource (TIMER), respectively. Furthermore, the results for ccRCC were verified in cell lines via RNAi, western blotting, and cytological analysis. Results: Low expression levels of DLGAP5 were observed in 31 types of common human tissues, including kidney tissue. However, its expression was upregulated in all 21 tested cancer cell lines, of which kidney cancer cell lines showed minimal upregulation. As predicted, significant overexpression of DLGAP5 occurred in at least 26 types of common cancer tissues compared with the adjacent normal tissues. Surprisingly, in three types of kidney cancer (KICH, KIRC/ccRCC, KIRP), DLGAP5 exhibited a statistically significant, but minor, overexpression among the 26 types of tested cancers. Furthermore, the survival probability of some tested cancers, including kidney cancer, was significantly related to the upregulated expression of DLGAP5. In addition, among 33 types of tested cancers, KIRC/ccRCC, LGG and LIHC showed a significant positive correlation between DLGAP5 expression and immune infiltration levels. DLGAP5 expression level was also significantly positively correlated with the clinical TNM stage of ccRCC patients. In ccRCC tissues and cell lines, upregulated expression of DLGAP5 was also detected. Its knockdown inhibited cell viability and proliferation, and compromised cell migration and invasion. Conclusions: DLGAP5 overexpression occurred in common human cancers, including the kidney cancers. Notably, ccRCC seemed to be particularly sensitive to its expression. DLGAP5, therefore, may serve as a robust independent prognostic biomarker in ccRCC diagnosis.


2021 ◽  
Author(s):  
Cangyuan Zhang ◽  
Ziyang Long ◽  
Cheng Yan ◽  
Mohamed Said Jalloh ◽  
Yongkun Fang ◽  
...  

Abstract Background The protein meflin encoded by ISLR contains a C2-type immunoglobulin (Ig)-like domain and five leucine-rich repeat (LRR) domains. ISLR is known to play a role in a small number of tumors, but its role in most tumors is unknown. The purpose of this study was to analyze the expression and prognosis of ISLR in pan-cancer, as well as its correlation with tumor immunity. Methods We used multiple databases and R software to conduct bioinformatics analysis to explore the predictive role of ISLR in pan-cancer, mainly involving expression patterns, prognosis, and immune infiltration. Results Compared with normal tissues, the expression of ISLR was significantly increased or decreased in most tumors. Moreover, high expression of ISLR may cause the prognosis of some tumors to become better or worse. ISLR also affects immune infiltration in a variety of tumors, which affects the clinical prognosis. ISLR is also significantly related to TMB and MSI in pan-cancer and is related to immune regulatory genes. ISLR also affects various cancer- and immune-related pathways. Conclusions ISLR is differentially expressed in tumors, may regulate TME, affect tumor prognosis, and is expected to become a prognostic biomarker.


2018 ◽  
Vol 30 (1) ◽  
pp. 90 ◽  
Author(s):  
Peng Zhang ◽  
Xinnan Xu ◽  
Hongwei Wang ◽  
Yuanli Feng ◽  
Haozhe Feng ◽  
...  

2020 ◽  
Author(s):  
Wei Zhang ◽  
Zixing Huang ◽  
Jian Zhao ◽  
Du He ◽  
Mou Li ◽  
...  

2020 ◽  
Vol 20 (11) ◽  
pp. 1288-1299
Author(s):  
Paromita Kundu ◽  
Deepika Singh ◽  
Abhalaxmi Singh ◽  
Sanjeeb K. Sahoo

The panorama of cancer treatment has taken a considerable leap over the last decade with the advancement of novel therapies combined with modern diagnostics. Nanotheranostics is an emerging science that holds tremendous potential by integrating therapy and imaging in a single probe for cancer diagnosis and treatment, thus offering advantages such as tumor-specific drug delivery and, at the same time, reduced side effects on normal tissues. The recent surge in nanomedicine research has also paved the way for multimodal theranostic nanoprobes for personalized therapy through interaction with a specific biological system. This review presents an overview of the nanotheranostics approach in cancer management, a series of different nanomaterials used in theranostics, and the possible challenges with future directions.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Jing Xu ◽  
Xiangdong Liu ◽  
Qiming Dai

Abstract Background Hypertrophic cardiomyopathy (HCM) represents one of the most common inherited heart diseases. To identify key molecules involved in the development of HCM, gene expression patterns of the heart tissue samples in HCM patients from multiple microarray and RNA-seq platforms were investigated. Methods The significant genes were obtained through the intersection of two gene sets, corresponding to the identified differentially expressed genes (DEGs) within the microarray data and within the RNA-Seq data. Those genes were further ranked using the minimum-Redundancy Maximum-Relevance feature selection algorithm. Moreover, the genes were assessed by three different machine learning methods for classification, including support vector machines, random forest and k-Nearest Neighbor. Results Outstanding results were achieved by taking exclusively the top eight genes of the ranking into consideration. Since the eight genes were identified as candidate HCM hallmark genes, the interactions between them and known HCM disease genes were explored through the protein–protein interaction (PPI) network. Most candidate HCM hallmark genes were found to have direct or indirect interactions with known HCM disease genes in the PPI network, particularly the hub genes JAK2 and GADD45A. Conclusions This study highlights the transcriptomic data integration, in combination with machine learning methods, in providing insight into the key hallmark genes in the genetic etiology of HCM.
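The two selection steps in the Methods (intersecting the DEG sets, then ranking by minimum redundancy and maximum relevance) can be sketched as follows. This is a simplified illustration, not the study's pipeline: it assumes absolute Pearson correlation as a stand-in for the mutual-information measure usually used in mRMR, and the gene names, expression values, and labels are invented.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def mrmr_rank(features, labels, candidates):
    """Greedy ranking: at each step pick the gene with the highest
    relevance to the labels minus mean redundancy with genes already picked."""
    remaining = sorted(candidates)
    relevance = {g: abs(pearson(features[g], labels)) for g in remaining}
    ranked = []
    while remaining:
        def score(gene):
            if not ranked:
                return relevance[gene]
            redundancy = sum(abs(pearson(features[gene], features[s]))
                             for s in ranked) / len(ranked)
            return relevance[gene] - redundancy
        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked


# Hypothetical DEG sets from the two platforms; only shared genes are kept.
microarray_degs = {"G1", "G2", "G3", "G4"}
rnaseq_degs = {"G1", "G2", "G3", "G5"}
shared_degs = microarray_degs & rnaseq_degs

# Toy expression values for six samples (0 = control, 1 = disease).
labels = [0, 0, 0, 1, 1, 1]
toy_features = {
    "G1": [1, 2, 1, 5, 6, 5],
    "G2": [2, 4, 2, 10, 12, 10],  # exact multiple of G1, so fully redundant with it
    "G3": [3, 1, 2, 4, 2, 3.5],
}
ranking = mrmr_rank(toy_features, labels, shared_degs)
```

In this toy case the redundant gene drops to the bottom of the ranking even though it is as relevant as the top gene, which is the behavior the redundancy penalty is meant to produce. The ranked genes could then be passed to any classifier for assessment.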

