Predicting breast cancer metastasis by using serum biomarkers and clinicopathological data with machine learning technologies

2019 ◽  
Vol 128 ◽  
pp. 79-86 ◽  
Author(s):  
Yi-Ju Tseng ◽  
Chuan-En Huang ◽  
Chiao-Ni Wen ◽  
Po-Yin Lai ◽  
Min-Hsien Wu ◽  
...  
2021 ◽  
Vol 11 (7) ◽  
pp. 2897
Author(s):  
Byung-Chul Kim ◽  
Jingyu Kim ◽  
Ilhan Lim ◽  
Dong Ho Kim ◽  
Sang Moo Lim ◽  
...  

Breast cancer metastasis can be fatal, so predicting metastasis is critical for establishing effective treatment strategies. RNA sequencing (RNA-seq) is a useful tool for identifying genes that promote and support the development of metastasis. Hub gene analysis is a bioinformatics method that can effectively analyze RNA-seq results and can be used to specify the set of genes most relevant to the function of the cells involved in metastasis. Herein, a new machine learning model based on RNA-seq data, using the random forest algorithm and hub genes, was developed, and its accuracy in predicting breast cancer metastasis was estimated. Single-cell breast cancer samples (56 metastatic and 38 non-metastatic samples) were obtained from the Gene Expression Omnibus database, and the Weighted Gene Correlation Network Analysis package was used to select gene modules and hub genes (which function in mitochondrial metabolism). A machine learning prediction model using the hub gene set was devised and its accuracy was evaluated. A prediction model comprising 54-functional-gene modules and the hub gene set (NDUFA9, NDUFB5, and NDUFB3) showed an accuracy of 0.769 ± 0.02, 0.782 ± 0.012, and 0.945 ± 0.016, respectively. The test accuracy of the hub gene set was over 93%, and that of the prediction model with random forest and hub genes was over 91%. A breast cancer metastasis dataset from The Cancer Genome Atlas was used for external validation, showing an accuracy of over 91%. The hub gene assay can therefore be used to predict breast cancer metastasis by machine learning.
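As a rough illustration of the workflow described in this abstract, the sketch below trains a random forest on a three-feature matrix and reports accuracy as mean ± standard deviation. It is not the authors' pipeline: the feature values are synthetic random numbers (only the 56/38 class sizes come from the abstract), the repeated stratified 5-fold scheme is an assumption because the abstract does not state how accuracy was estimated, and scikit-learn is assumed to be available.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

rng = np.random.default_rng(0)

# Class sizes taken from the abstract: 56 metastatic, 38 non-metastatic samples.
n_met, n_non = 56, 38
y = np.array([1] * n_met + [0] * n_non)

# SYNTHETIC stand-in for the expression of three hub genes
# (hypothetical values, not real RNA-seq data). The class-dependent
# shift only exists so that the classifier has something to learn.
X = rng.normal(size=(n_met + n_non, 3)) + 0.8 * y[:, None]

# Assumed evaluation scheme: 5-fold stratified CV repeated 3 times.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")

# Report in the same "mean ± std" form used in the abstract.
print(f"accuracy: {scores.mean():.3f} ± {scores.std():.3f}")
```

With real data, `X` would hold the measured expression of the selected hub genes, and a held-out external cohort would be scored separately rather than folded into the cross-validation.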


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. e13558-e13558
Author(s):  
Yousri A. Rostom ◽  
Salah-Eldin Abd-El-Moneim ◽  
Nevine Makram Labib ◽  
Samia Gharib ◽  
Marwa Shaker ◽  
...  

e13558 Background: Artificial intelligence (AI) and machine learning (ML) have made notable contributions to oncology. One application is the early detection of breast cancer. Recently, several ML and data mining techniques have been used for both the detection and the classification of breast cancer cases. About 25% of breast cancer cases are found to be aggressive at the time of diagnosis, with metastatic spread. The absence or presence of metastatic spread largely determines the patient’s survival. Hence, early detection is very important for reducing cancer mortality rates. Methods: This study aims to apply ML and data mining, using AI techniques, to explore and preprocess a breast cancer dataset before building an ML classification model for breast cancer metastasis prediction. The model will be implemented for mass screening, to prioritize patients who are more likely to develop metastases. A dataset of breast cancer cases was provided by the Oncology and Nuclear Medicine Department, Faculty of Medicine, Alexandria University. It contains the clinical records of 5236 patients diagnosed with breast cancer. ML libraries in the Python programming language were used to explore the dataset, determine the ratio of missing data, identify data types and redundant data, and specify the class label and the predictors to be used for the classification model. Results: The missing data ratio in some columns exceeds 90%, there are redundant features that should be eliminated, and data type conversion and feature reduction should be applied to prepare the data. Conclusions: Based on these findings, it is recommended to use ML preprocessing libraries in Python to prepare the dataset before building an ML classification model for breast cancer metastasis prediction.
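The exploration steps named in this abstract (missing-data ratio, data types, redundant features, class label and predictors) might look roughly like the following pandas sketch. Everything in it is hypothetical: the column names and values are invented toy records, not the Alexandria University dataset, and the 0.9 cutoff for dropping sparse columns is an assumption suggested by, but not stated in, the abstract.

```python
import numpy as np
import pandas as pd

# Hypothetical toy records; column names are illustrative only.
df = pd.DataFrame({
    "age": [52, 61, np.nan, 47, 58],
    "age_dup": [52, 61, np.nan, 47, 58],             # exact copy of "age"
    "tumor_size_mm": ["22", "35", "18", None, "40"],  # numbers stored as text
    "her2_status": [None, None, None, None, None],    # entirely missing here
    "metastasis": [0, 1, 0, 0, 1],                    # class label
})

# 1. Ratio of missing values per column, and the inferred data types.
missing_ratio = df.isna().mean().sort_values(ascending=False)
print(missing_ratio)
print(df.dtypes)

# 2. Drop columns whose missing ratio exceeds an assumed cutoff of 0.9.
sparse_cols = missing_ratio[missing_ratio > 0.9].index.tolist()
df = df.drop(columns=sparse_cols)

# 3. Data type conversion: parse text-encoded numbers (bad values become NaN).
df["tumor_size_mm"] = pd.to_numeric(df["tumor_size_mm"], errors="coerce")

# 4. Redundant features: later columns that exactly duplicate an earlier one.
cols = list(df.columns)
dup_cols = list(dict.fromkeys(
    b for i, a in enumerate(cols) for b in cols[i + 1:] if df[a].equals(df[b])
))
df = df.drop(columns=dup_cols)

# 5. Separate the class label from the predictors.
y = df["metastasis"]
X = df.drop(columns=["metastasis"])
print(list(X.columns))
```

Exact-duplicate detection is only the simplest notion of redundancy; highly correlated or derived columns would need a separate check.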

