Machine Learning Model for Lymph Node Metastasis Prediction in Breast Cancer Using Random Forest Algorithm and Mitochondrial Metabolism Hub Genes

2021 ◽  
Vol 11 (7) ◽  
pp. 2897
Author(s):  
Byung-Chul Kim ◽  
Jingyu Kim ◽  
Ilhan Lim ◽  
Dong Ho Kim ◽  
Sang Moo Lim ◽  
...  

Breast cancer metastasis can be fatal, so predicting metastasis is critical for establishing effective treatment strategies. RNA sequencing (RNA-seq) is a useful tool for identifying genes that promote and support metastasis development. Hub gene analysis is a bioinformatics method that can effectively analyze RNA-seq results and can be used to specify the set of genes most relevant to the function of the cells involved in metastasis. Herein, we propose a new machine learning model based on RNA-seq data, using the random forest algorithm and hub genes, and estimate its accuracy in predicting breast cancer metastasis. Single-cell breast cancer samples (56 metastatic and 38 non-metastatic samples) were obtained from the Gene Expression Omnibus database, and the Weighted Gene Correlation Network Analysis package was used to select gene modules and hub genes (which function in mitochondrial metabolism). A machine learning prediction model using the hub gene set was devised and its accuracy was evaluated. A prediction model comprising 54-functional-gene modules and the hub gene set (NDUFA9, NDUFB5, and NDUFB3) showed an accuracy of 0.769 ± 0.02, 0.782 ± 0.012, and 0.945 ± 0.016, respectively. The test accuracy of the hub gene set was over 93% and that of the prediction model with random forest and hub genes was over 91%. A breast cancer metastasis dataset from The Cancer Genome Atlas was used for external validation, showing an accuracy of over 91%. The hub gene assay can be used to predict breast cancer metastasis by machine learning.
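
As a rough illustration of the workflow this abstract describes, the sketch below trains a random forest on three features and reports cross-validated accuracy. It assumes NumPy and scikit-learn are installed. The expression values are randomly generated, not real data; the feature names only reuse the hub gene symbols from the abstract; and the class shift, number of trees, and fold count are arbitrary choices, not the settings of the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

HUB_GENES = ["NDUFA9", "NDUFB5", "NDUFB3"]  # gene symbols named in the abstract


def make_synthetic_expression(n_metastatic=56, n_non_metastatic=38, seed=0):
    """Return (X, y) with synthetic, NOT real, expression values."""
    rng = np.random.default_rng(seed)
    # Assumption for illustration only: metastatic samples have a shifted mean.
    x_met = rng.normal(loc=2.0, scale=1.0, size=(n_metastatic, len(HUB_GENES)))
    x_non = rng.normal(loc=0.0, scale=1.0, size=(n_non_metastatic, len(HUB_GENES)))
    X = np.vstack([x_met, x_non])
    y = np.array([1] * n_metastatic + [0] * n_non_metastatic)  # 1 = metastatic
    return X, y


def cross_validated_accuracy(X, y, n_folds=5, seed=0):
    """Mean and standard deviation of accuracy over stratified folds."""
    clf = RandomForestClassifier(n_estimators=200, random_state=seed)
    scores = cross_val_score(clf, X, y, cv=n_folds, scoring="accuracy")
    return scores.mean(), scores.std()


X, y = make_synthetic_expression()
mean_acc, std_acc = cross_validated_accuracy(X, y)
print(f"accuracy: {mean_acc:.3f} ± {std_acc:.3f}")
```

Reporting the mean together with the spread across folds mirrors the "value ± deviation" form used in the abstract, although how the authors obtained their deviations is not stated there.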

PLoS ONE ◽  
2013 ◽  
Vol 8 (2) ◽  
pp. e56195 ◽  
Author(s):  
Xinan Yang ◽  
Prabhakaran Vasudevan ◽  
Vishwas Parekh ◽  
Aleks Penev ◽  
John M. Cunningham

Genes ◽  
2019 ◽  
Vol 10 (6) ◽  
pp. 466
Author(s):  
Li ◽  
Zhong ◽  
Zhou

Bone is the most frequent organ for breast cancer metastasis, and thus it is essential to predict the bone metastasis of breast cancer. In our work, we constructed a gene dependency network based on the hypothesis that the relation between one gene and the risk of bone metastasis might be affected by another gene. Then, based on the structure controllability theory, we mined the driver gene set which can control the whole network in the gene dependency network, and the signature genes were selected from them. Survival analysis showed that the signature could distinguish the bone metastasis risks of cancer patients in the test data set and independent data set. Besides, we used the signature genes to construct a centroid classifier. The results showed that our method is effective and performed better than published methods.
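
The centroid classifier mentioned in this abstract can be sketched in a few lines. This is a minimal nearest-centroid version under stated assumptions: Euclidean distance is assumed (the abstract does not name the metric), and the two-gene expression vectors and risk labels are hypothetical values made up for the example.

```python
from math import dist


def fit_centroids(samples, labels):
    """Compute the per-class mean vector (centroid) of the training samples."""
    sums, counts = {}, {}
    for vector, label in zip(samples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vector)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vector)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, vector):
    """Assign the label whose centroid is nearest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))


# Hypothetical expression values for two signature genes (not from the paper).
train_samples = [[5.0, 1.0], [6.0, 2.0], [1.0, 5.0], [2.0, 6.0]]
train_labels = ["high_risk", "high_risk", "low_risk", "low_risk"]

centroids = fit_centroids(train_samples, train_labels)
prediction = predict(centroids, [5.5, 1.2])
print(prediction)
```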


Hereditas ◽  
2019 ◽  
Vol 156 (1) ◽  
Author(s):  
Yun Cai ◽  
Jie Mei ◽  
Zhuang Xiao ◽  
Bujie Xu ◽  
Xiaozheng Jiang ◽  
...  

2021 ◽  
Author(s):  
Zhifeng Wang ◽  
Zihao Chen ◽  
Hongfan Zhao ◽  
Hao Lin ◽  
Junjie Wang ◽  
...  

Abstract Background: Clear cell renal cell carcinoma (ccRCC) is the most common type of renal cell carcinoma. Immunotherapy, especially anti-PD-1, is becoming a pillar of ccRCC treatment. However, precise biomarkers and robust models are needed to select the appropriate patients for immunotherapy. Methods: A total of 831 ccRCC transcriptomic profiles were obtained from 6 datasets. Unsupervised clustering was performed to identify the immune subtypes among ccRCC samples based on immune cell enrichment scores. Weighted correlation network analysis (WGCNA) was used to identify hub genes distinguishing subtypes and related to prognosis. A machine learning model was established using the random forest algorithm and deployed on an open and free online website to predict the immune subtype. Results: Among the identified immune subtypes, subtype2 was enriched in immune cell enrichment scores and immunotherapy biomarkers. WGCNA identified 4 hub genes related to immune subtype: CTLA4, FOXP3, IFNG, and CD19. The random forest model was constructed from the mRNA expression of these four hub genes, and its area under the receiver operating characteristic curve (AUC) was 0.78. Subtype2 patients in the independent validation cohort had a better drug response and prognosis for immunotherapy treatment. Moreover, an open and free website was developed based on the random forest model (https://immunotype.shinyapps.io/ISPRF/). Conclusions: The current study constructs a model and provides a free online website that could identify suitable ccRCC patients for immunotherapy, and it is an important step toward personalized treatment.
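
To make the reported AUC concrete: the area under the ROC curve equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that definition. The labels and scores are hypothetical and unrelated to the study's data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 class labels; scores: model scores for class 1.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions for six samples (1 = subtype of interest).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a value of 0.78 indicates that roughly 78% of positive/negative pairs are ordered correctly.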


2021 ◽  
Author(s):  
Santosh Kumar Paidi ◽  
Joel Rodriguez Troncoso ◽  
Mason G. Harper ◽  
Zhenhui Liu ◽  
Khue G. Nguyen ◽  
...  

Abstract The accurate analytical characterization of metastatic phenotype at primary tumor diagnosis and its evolution with time are critical for controlling metastatic progression of cancer. Here, we report a label-free optical strategy using Raman spectroscopy and machine learning to identify distinct metastatic phenotypes observed in tumors formed by isogenic murine breast cancer cell lines of progressively increasing metastatic propensities. Our Raman spectra-based random forest analysis provided evidence that machine learning models built on spectral data can allow the accurate identification of metastatic phenotype of independent test tumors. By silencing genes critical for metastasis in highly metastatic cell lines, we showed that the random forest classifiers provided predictions consistent with the observed phenotypic switch of the resultant tumors towards lower metastatic potential. Furthermore, the spectral assessment of lipid and collagen content of these tumors was consistent with the observed phenotypic switch. Overall, our findings indicate that Raman spectroscopy may offer a novel strategy to evaluate metastatic risk during primary tumor biopsies in clinical patients.


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. e13558-e13558
Author(s):  
Yousri A. Rostom ◽  
Salah-Eldin Abd-El-Moneim ◽  
Nevine Makram Labib ◽  
Samia Gharib ◽  
Marwa Shaker ◽  
...  

e13558 Background: Artificial intelligence (AI) and machine learning (ML) have made outstanding contributions to oncology. One of the applications is the early detection of breast cancer. Recently, several ML and data mining techniques have been used for both detection and classification of breast cancer cases. About 25% of breast cancer cases are found to have an aggressive cancer with metastatic spread at the time of diagnosis. The absence or presence of metastatic spread largely determines the patient’s survival. Hence, early detection is very important for reducing cancer mortality rates. Methods: This study aims to apply ML and data mining, using AI techniques, to explore and preprocess a breast cancer dataset before building the ML classification model for breast cancer metastasis prediction. The model will be implemented for mass screening, to prioritize patients who are more likely to develop metastases. A dataset of breast cancer cases was provided by the Oncology and Nuclear Medicine Department, Faculty of Medicine, Alexandria University. It contains clinical records of 5236 patients diagnosed with breast cancer. ML libraries in the Python programming language were used to explore the dataset, determine the ratio of missing data, define data types, identify redundant data, and specify the class label and predictors to be used for the classification model. Results: The results showed that the missing data ratio in some columns exceeds 90%, that there are redundant features to be eliminated, and that data type conversion and feature reduction should be applied to prepare the data. Conclusions: Based on these findings, it is recommended to use ML preprocessing Python libraries to prepare the dataset before building the ML classification model for breast cancer metastasis prediction.
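
The exploration steps described in this abstract (missing-data ratio, data types, redundant columns) might look roughly like the following. This is a sketch assuming pandas and NumPy are the libraries in use, which the abstract does not state. The column names, values, and the 90% threshold default are hypothetical stand-ins, not the actual clinical dataset.

```python
import numpy as np
import pandas as pd


def summarize_columns(df, max_missing_ratio=0.9):
    """Report dtype and missing ratio per column; list columns above the threshold."""
    missing_ratio = df.isna().mean()  # fraction of missing values per column
    summary = pd.DataFrame(
        {"dtype": df.dtypes.astype(str), "missing_ratio": missing_ratio}
    )
    to_drop = missing_ratio[missing_ratio > max_missing_ratio].index.tolist()
    return summary, to_drop


def find_duplicate_columns(df):
    """Return columns whose values exactly repeat an earlier column."""
    duplicates = []
    columns = list(df.columns)
    for i, later in enumerate(columns):
        if any(df[later].equals(df[earlier]) for earlier in columns[:i]):
            duplicates.append(later)
    return duplicates


# Hypothetical records; none of these columns come from the study's dataset.
records = pd.DataFrame(
    {
        "age": [52, 61, 47, 58],
        "tumor_size_mm": [21.0, np.nan, 35.0, 18.0],
        "her2_score": [np.nan, np.nan, np.nan, np.nan],
        "age_copy": [52, 61, 47, 58],
        "metastasis": ["yes", "no", "no", "yes"],
    }
)

summary, to_drop = summarize_columns(records)
duplicates = find_duplicate_columns(records)
cleaned = records.drop(columns=to_drop + duplicates)
print(summary)
print(list(cleaned.columns))
```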


2018 ◽  
Vol 120 (6) ◽  
pp. 9522-9531 ◽  
Author(s):  
Dongyang Tang ◽  
Xin Zhao ◽  
Li Zhang ◽  
Zhiwei Wang ◽  
Cheng Wang

2020 ◽  
Author(s):  
Huanxian Wu ◽  
Huining Lian ◽  
Qianqing Chen ◽  
Jinlamao Yang ◽  
Baofang Ou ◽  
...  

Abstract Background: Breast cancer is one of the most common malignant tumors with the highest morbidity and mortality among women. Compared with the other breast cancer subtypes, triple-negative breast cancer (TNBC) has a higher probability of recurrence and is prone to distant metastasis. This study aimed to reveal the underlying disease mechanisms and identify more effective biomarkers for TNBC and breast cancer metastasis. Methods: Gene Ontology and KEGG pathway analyses were used to investigate the role of overlapping differentially expressed genes (DEGs). Hub genes among these DEGs were determined by protein-protein interaction network analysis and CytoHubba. Oncomine databases were used to verify the clinical relevance of the hub genes. Furthermore, the differences in the expression of these genes between cancer and normal tissues were validated in cellular, animal and human tissue. Results: Seven hub genes, including TTK, KIF11, SPAG5, RRM2, BUB1, CDCA8 and CDC25C, were identified that might be associated with TNBC and breast cancer metastasis. These genes were also verified to be highly expressed in tumor cells and tumor tissues, and patients with higher expression of these genes have a poorer prognosis. Conclusions: The seven hub genes are potential biomarkers for the diagnosis and therapy of TNBC and breast cancer metastasis.
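
One simple way to rank hub candidates in a protein-protein interaction network is by node degree, that is, the number of distinct interaction partners. The sketch below does this for a toy edge list. Degree ranking is an assumption here: the abstract does not say which CytoHubba ranking method was used, and the placeholder gene names and edges are invented for the example.

```python
from collections import defaultdict


def rank_by_degree(edges):
    """Rank nodes by number of distinct neighbors (ties broken alphabetically)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    return sorted(neighbors, key=lambda gene: (-len(neighbors[gene]), gene))


def top_hubs(edges, k):
    """Return the k highest-degree nodes as hub candidates."""
    return rank_by_degree(edges)[:k]


# Toy interaction network with placeholder gene names (hypothetical).
toy_edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_B", "GENE_C"),
    ("GENE_D", "GENE_E"),
]

hubs = top_hubs(toy_edges, 2)
print(hubs)
```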

