Identification of diagnostic and prognostic lncRNA biomarkers in oral squamous carcinoma by integrated analysis and machine learning

2020 ◽  
Vol 29 (2) ◽  
pp. 265-275
Author(s):  
Sen Yang ◽  
Yingshu Wang ◽  
Jun Ren ◽  
Xueqin Zhou ◽  
Kaizhi Cai ◽  
...  

BACKGROUND: Oral squamous carcinoma (OSCC) is difficult to diagnose precisely and carries a poor prognosis. OBJECTIVE: We aimed to identify diagnostic and prognostic indicators in OSCC and to provide a basis for investigating its molecular mechanisms. METHODS: We collected sequencing data and clinical data from the TCGA database and screened the differentially expressed mRNAs (DEmRNAs) and lncRNAs (DElncRNAs) in OSCC. Machine learning and modeling were performed to identify the optimal diagnostic markers. To determine lncRNAs with prognostic value, survival analysis was performed by combining the expression profiles with the clinical data. Finally, DEmRNAs co-expressed with the lncRNAs were identified by constructing an interaction network and were functionally annotated by GO and KEGG analysis. RESULTS: A total of 1114 (345 up- and 769 down-regulated) DEmRNAs and 156 (86 up- and 70 down-regulated) DElncRNAs were obtained in OSCC. Following the machine learning and modeling, 15 lncRNAs were identified as the optimal diagnostic indicators of OSCC. Among them, FOXD2.AS1 was significantly associated with the survival rate of patients with OSCC. In addition, the focal adhesion and ECM-receptor interaction pathways were found to be involved in OSCC. CONCLUSIONS: FOXD2.AS1 might be a prognostic marker for OSCC, and our study may inform further research on OSCC.
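As an illustration of how transcripts might be sorted into up- and down-regulated groups during a differential expression screen, the following minimal sketch labels each transcript from its log2 fold change and adjusted p-value. The abstract does not state the thresholds used, so the cutoffs (`lfc_cutoff`, `padj_cutoff`), the function name `classify_genes`, and the toy transcript names are all assumptions made for illustration only.

```python
from typing import Dict, Tuple


def classify_genes(
    stats: Dict[str, Tuple[float, float]],
    lfc_cutoff: float = 1.0,    # assumed threshold, not taken from the abstract
    padj_cutoff: float = 0.05,  # assumed threshold, not taken from the abstract
) -> Dict[str, str]:
    """Label each transcript as 'up', 'down', or 'ns' (not significant).

    `stats` maps a transcript name to (log2 fold change, adjusted p-value).
    """
    labels = {}
    for gene, (log2fc, padj) in stats.items():
        if padj < padj_cutoff and log2fc >= lfc_cutoff:
            labels[gene] = "up"
        elif padj < padj_cutoff and log2fc <= -lfc_cutoff:
            labels[gene] = "down"
        else:
            labels[gene] = "ns"
    return labels


# Toy input with invented values; the names are hypothetical placeholders.
toy_stats = {
    "lncA": (2.3, 0.001),
    "lncB": (-1.8, 0.01),
    "lncC": (0.4, 0.20),
    "lncD": (3.0, 0.30),
}
labels = classify_genes(toy_stats)
```

Counting the "up" and "down" labels would give totals analogous to those reported in the abstract, although the study's actual pipeline and cutoffs may differ.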

Animals ◽  
2021 ◽  
Vol 11 (7) ◽  
pp. 2006
Author(s):  
Hongyu Liu ◽  
Ibrar Muhammad Khan ◽  
Huiqun Yin ◽  
Xinqi Zhou ◽  
Muhammad Rizwan ◽  
...  

The mRNA and long non-coding RNA axes play a vital role in regulating post-transcriptional gene expression. Elucidating the expression patterns of mRNAs and long non-coding RNAs underlying testis development is therefore crucial. In this study, mRNA and long non-coding RNA expression profiles were investigated in the testes of 3-month-old calves and 3-year-old mature bulls by total RNA sequencing. In the gene-level analysis, 21,250 mRNAs and 20,533 long non-coding RNAs were identified. Of these, 7908 long non-coding RNAs (p-adjust < 0.05) and 5122 mRNAs (p-adjust < 0.05) were significantly differentially expressed between the two age groups. Gene ontology and biological pathway analyses revealed that the predicted target genes are enriched in the lysine degradation, cell cycle, propanoate metabolism, adherens junction and cell adhesion molecules pathways. The RT-qPCR validation results showed strong consistency with the sequencing data. The source genes for the mRNAs (CCDC83, DMRTC2, HSPA2, IQCG, PACRG, SPO11, EHHADH, SPP1, NSD2 and ACTN4) and the long non-coding RNAs (COX7A2, COX6B2, TRIM37, PRM2, INHBA, ERBB4, SDHA, ATP6VOA2, FGF9 and TCF21) were found to be actively associated with bull sexual maturity and spermatogenesis. This study provides a comprehensive catalog of long non-coding RNAs in the bovine testes and offers a useful resource for understanding how changes in mRNA and long non-coding RNA interactions and expression between the immature and mature stages contribute to differences in sexual development.
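The abstract reports significance as "p-adjust < 0.05" without naming the correction method. As a hedged illustration, the sketch below implements the Benjamini-Hochberg procedure, one common way of adjusting p-values for multiple testing in sequencing studies; that the authors used this particular method is an assumption, and the function name and input values are invented.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used here only as a representative multiple-testing
    correction; the abstract does not say which method was applied.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Invented raw p-values for four hypothetical transcripts.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
```

Each adjusted value is at least as large as its raw p-value, so filtering on the adjusted values is the more conservative choice.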


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e9656
Author(s):  
Sugandh Kumar ◽  
Srinivas Patnaik ◽  
Anshuman Dixit

Machine learning techniques are increasingly used in the analysis of high-throughput genome sequencing data to better understand disease processes and to design therapeutic modalities. In the current study, we applied state-of-the-art machine learning (ML) algorithms (Random Forest (RF), Support Vector Machine Radial Kernel (svmR), Adaptive Boost (AdaBoost), averaged Neural Network (avNNet), and Gradient Boosting Machine (GBM)) to stratify HNSCC patients into early and late clinical stages (TNM) and to predict risk using miRNA expression profiles. A six-miRNA signature was identified that can stratify patients into the early and late stages. The mean accuracy, sensitivity, specificity, and area under the curve (AUC) were 0.84, 0.87, 0.78, and 0.82, respectively, indicating robust performance of the generated model. A prognostic signature of eight miRNAs was identified using LASSO (least absolute shrinkage and selection operator) penalized regression. These miRNAs were found to be significantly associated with the overall survival of the patients. Pathway and functional enrichment analysis of the identified biomarkers revealed their involvement in important cancer pathways such as GP6 signalling, Wnt signalling, p53 signalling, granulocyte adhesion, and diapedesis. To the best of our knowledge, this is the first such study, and we hope that these signature miRNAs will be useful for the risk stratification of patients and the design of therapeutic modalities.
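For readers unfamiliar with the reported performance measures, the following minimal sketch shows how accuracy, sensitivity, and specificity are derived from the counts of a binary confusion matrix. The counts are invented for illustration and do not reproduce the study's results; the function name is hypothetical, and "late stage" is treated as the positive class by assumption.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute standard metrics from binary confusion-matrix counts.

    tp/fn: positive-class samples predicted correctly / incorrectly.
    tn/fp: negative-class samples predicted correctly / incorrectly.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # share of all samples classified correctly
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "specificity": tn / (tn + fp),   # true negative rate
    }


# Invented counts: 50 positive and 50 negative samples.
metrics = classification_metrics(tp=40, fp=20, tn=30, fn=10)
```

The AUC reported in the abstract cannot be obtained from a single confusion matrix, because it summarizes performance across all classification thresholds.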


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Hao Bo ◽  
Fang Zhu ◽  
Zhizhong Liu ◽  
Qi Deng ◽  
Guangmin Liu ◽  
...  

Abstract Long noncoding RNAs (lncRNAs) are involved in various physiological and pathological processes. However, the role of lncRNAs in testicular germ cell tumor (TGCT) has rarely been reported. Our purpose is to comprehensively survey the expression and function of lncRNAs in TGCT. In this study, we used RNA sequencing to construct the lncRNA expression profiles of 13 TGCT tissues and 4 paraneoplastic tissues to explore the function of lncRNAs in TGCT. The bioinformatics analysis showed that many lncRNAs are differentially expressed in TGCT. GO and KEGG enrichment analyses revealed that the differentially expressed lncRNAs participate, in cis and in trans, in various biological processes associated with tumorigenesis. Further, using WGCNA analysis and GEPIA database mining, we found that the expression of LINC00467 was positively correlated with poor prognosis and pathological grade in TGCT. In vitro experiments revealed that LINC00467 could promote the migration and invasion of TGCT cells by regulating the expression of AKT3 and influencing total AKT phosphorylation. Further analysis of TCGA data revealed that its expression was negatively correlated with the infiltration of immune cells and the response to PD1 immunotherapy. In summary, this study is the first to construct the expression profile of lncRNAs in TGCT. It is also the first study to identify the metastasis-promoting role of LINC00467, which can be used as a potential predictor of TGCT prognosis and immunotherapeutic response to provide a clinical reference for the treatment and diagnosis of TGCT metastasis.


2021 ◽  
Author(s):  
Weina Lu ◽  
Ran Ji

Abstract Background and Aims: Acute respiratory distress syndrome (ARDS) is one of the most common acute thoracopathies in the ICU and has a complicated pathogenesis. This study explores the differentially expressed genes (DEGs) in lung tissue and the underlying mechanisms of alteration in ARDS. Methods: Gene expression profiles GSE2411 and GSE130936, both on the GPL339 platform, were obtained from the GEO database. An integrated analysis of these genes was then performed, including gene ontology (GO) and KEGG pathway enrichment analysis, protein-protein interaction (PPI) network construction, transcription factor (TF) prediction, and assessment of their expression in various organs. Results: A total of 39 differentially expressed genes were screened from the datasets, all 39 up-regulated and none down-regulated. The up-regulated genes were mainly enriched in biological processes such as immune system process, innate immune response, inflammatory response, and cellular response to interferon-beta, and were also involved in signaling pathways including cytokine-cytokine receptor interaction, Salmonella infection, legionellosis, chemokine, and Toll-like receptor signaling. GBP2, IFIT2 and IFIT3 were identified as hub genes in the lung by PPI network analysis with the MCODE plug-in, as well as GO and KEGG re-enrichment. All three hub genes were regulated by the predicted common TFs, including STAT1, E2F1, IRF1, IRF2, and IRF9. Conclusions: This study suggests that the hub genes GBP2, IFIT2 and IFIT3, which might be regulated by STAT1, E2F1, IRF1, IRF2, or IRF9, play significant roles in ARDS. They could be potential diagnostic or therapeutic targets for ARDS patients.


2020 ◽  
Vol 48 (9) ◽  
pp. 030006052095323
Author(s):  
Jun Liu ◽  
Gui-Li Sun ◽  
Shang-Ling Pan ◽  
Meng-Bin Qin ◽  
Rong Ouyang ◽  
...  

Objectives This study aimed to investigate hub genes and their prognostic value in colon cancer via bioinformatics analysis. Methods Differentially expressed genes (DEGs) in expression profiles (GSE33113, GSE20916, and GSE37364) obtained from Gene Expression Omnibus (GEO) were identified using the GEO2R tool and Venn diagram software. Function and pathway enrichment analyses were performed, and a protein–protein interaction (PPI) network was constructed. Hub genes were verified using The Cancer Genome Atlas (TCGA) and Human Protein Atlas (HPA) databases. Results We identified 207 DEGs (62 upregulated and 145 downregulated) that were enriched in the Gene Ontology terms “organic anion transport,” “extracellular matrix,” and “receptor ligand activity,” and in the Kyoto Encyclopedia of Genes and Genomes pathway “cytokine-cytokine receptor interaction.” The PPI network was constructed and nine hub genes were selected by survival analysis and expression validation. We verified these genes in the TCGA database and selected three potential predictors (ZG16, TIMP1, and BGN) that met the independent predictive criteria. TIMP1 and BGN were upregulated in patients with a high cancer risk, whereas ZG16 was downregulated. The immunostaining results from HPA supported these findings. Conclusion This study indicates that these hub genes may be promising prognostic indicators or therapeutic targets for colon cancer.
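The Venn-diagram step described in the Methods amounts to finding genes that are differentially expressed in every dataset. A minimal sketch of that idea using Python sets follows; the dataset keys and gene names are hypothetical placeholders, not values from the study, and the function name `shared_degs` is invented for illustration.

```python
from functools import reduce
from typing import Dict, Set


def shared_degs(deg_sets: Dict[str, Set[str]]) -> Set[str]:
    """Return the genes present in the DEG list of every dataset.

    This corresponds to the central region of a Venn diagram.
    Raises TypeError if `deg_sets` is empty.
    """
    return reduce(set.intersection, deg_sets.values())


# Hypothetical DEG lists for three datasets (placeholder names only).
toy_degs = {
    "dataset_1": {"GENE_A", "GENE_B", "GENE_C", "GENE_D"},
    "dataset_2": {"GENE_B", "GENE_C", "GENE_E"},
    "dataset_3": {"GENE_C", "GENE_B", "GENE_F"},
}
common = shared_degs(toy_degs)
```

In practice the up- and down-regulated lists would usually be intersected separately, so that a gene counts as shared only when its direction of change agrees across datasets.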


2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Chang Su ◽  
Jie Tong ◽  
Fei Wang

Abstract High-throughput techniques have generated abundant genetic and transcriptomic data from Parkinson’s disease (PD) patients, but data analysis approaches such as traditional statistical methods have provided little insightful integrated analysis or interpretation of the data. As an advanced computational approach, machine learning, which makes it possible to identify complex patterns and insights in data, has consequently been harnessed to analyze and interpret large, highly complex genetic and transcriptomic data toward a better understanding of PD. In particular, machine learning models have been developed to integrate patient genotype data, alone or combined with demographic, clinical, neuroimaging, and other information, for PD outcome studies. They have also been used to identify biomarkers of PD based on transcriptomic data, e.g., gene expression profiles from microarrays. This study reviews the relevant literature on using machine learning models for genetic and transcriptomic data analysis in PD, points out remaining challenges, and suggests future directions accordingly. The use of machine learning is amplifying genetic and transcriptomic achievements in PD and accelerating its study. Existing studies have demonstrated the great potential of machine learning in discovering hidden patterns within genetic or transcriptomic information and thus revealing clues underpinning pathology and pathogenesis. Moving forward, by addressing the remaining challenges, machine learning may advance our ability to precisely diagnose, prognose, and treat PD.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Weina Lu ◽  
Ran Ji

Abstract Background and aims Acute respiratory distress syndrome (ARDS), or acute lung injury (ALI), is one of the most common acute thoracopathies in the ICU and has a complicated pathogenesis. This study explores the differentially expressed genes (DEGs) in lung tissue and the underlying mechanisms of alteration in ARDS. Methods Gene expression profiles GSE2411 and GSE130936, both on the GPL339 platform, were obtained from the GEO database. An integrated analysis of these genes was then performed, including gene ontology (GO) and KEGG pathway enrichment analysis in the DAVID database, protein–protein interaction (PPI) network construction evaluated with the online database STRING, transcription factor (TF) prediction based on the Cytoscape plugin iRegulon, and assessment of their expression in various organs in The Human Protein Atlas. Results A total of 39 differentially expressed genes were screened from the two datasets, all 39 up-regulated and none down-regulated. In the integrated analysis, the up-regulated genes were mainly enriched in biological processes such as immune system process, innate immune response, and inflammatory response, and were also involved in signaling pathways including cytokine–cytokine receptor interaction, Salmonella infection, legionellosis, chemokine, and Toll-like receptor signaling. GBP2, IFIT2 and IFIT3 were identified as hub genes in the lung by PPI network analysis with the MCODE plug-in, as well as GO and KEGG re-enrichment. All three hub genes were regulated by the predicted common TFs, including STAT1, E2F1, IRF1, IRF2, and IRF9. Conclusions This study suggests that the hub genes GBP2, IFIT2 and IFIT3, which might be regulated by STAT1, E2F1, IRF1, IRF2, or IRF9, play significant roles in ARDS. They could be potential diagnostic or therapeutic targets for ARDS patients.


2017 ◽  
Vol 95 (3) ◽  
pp. 1092
Author(s):  
J. Sun ◽  
M. Xie ◽  
Z. Huang ◽  
H. Li ◽  
T. Chen ◽  
...  
