Predicting Outcomes of Hormone and Chemotherapy in the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) Study by Biochemically-inspired Machine Learning

F1000Research ◽  
2016 ◽  
Vol 5 ◽  
pp. 2124 ◽  
Author(s):  
Iman Rezaeian ◽  
Eliseos J. Mucaki ◽  
Katherina Baranova ◽  
Huy Q. Pham ◽  
Dimo Angelov ◽  
...  

Genomic aberrations and gene expression-defined subtypes in the large METABRIC patient cohort have been used to stratify and predict survival. The present study used normalized gene expression signatures of paclitaxel drug response to predict outcome for different survival times in METABRIC patients receiving hormone (HT) and, in some cases, chemotherapy (CT) agents. This machine learning method, which distinguishes sensitivity vs. resistance in breast cancer cell lines and validates predictions in patients, was also used to derive gene signatures of other HT (tamoxifen) and CT agents (methotrexate, epirubicin, doxorubicin, and 5-fluorouracil) used in METABRIC. Paclitaxel gene signatures exhibited the best performance; however, the other agents also predicted survival with acceptable accuracies. A support vector machine (SVM) model of paclitaxel response containing the ABCB1, ABCB11, ABCC1, ABCC10, BAD, BBC3, BCL2, BCL2L1, BMF, CYP2C8, CYP3A4, MAP2, MAP4, MAPT, NR1I2, SLCO1B3, TUBB1, TUBB4A, and TUBB4B genes was 78.6% accurate in 84 patients treated with both HT and CT (median survival ≥ 4.4 yr). Accuracy was lower (73.4%) in 304 untreated patients. The performance of other machine learning approaches was also evaluated at different survival thresholds. A paclitaxel-based SVM classifier derived by minimum redundancy maximum relevance feature selection, based on expression of ABCB11, ABCC1, BAD, BBC3, and BCL2L1, was 79% accurate in 53 CT patients. A random forest (RF) classifier produced a gene signature (ABCB11, ABCC1, BAD, BCL2, CYP2C8, CYP3A4, MAP4, MAPT, NR1I2, TUBB1, GBP1, OPRK1) that predicted >3-year survival with 82.4% accuracy in 420 HT patients. A similar RF gene signature showed 79.6% accuracy in 504 patients treated with CT and/or HT. These results suggest that tumor gene expression signatures refined by machine learning techniques can be useful for predicting survival after drug therapies.

F1000Research ◽  
2017 ◽  
Vol 5 ◽  
pp. 2124 ◽  
Author(s):  
Iman Rezaeian ◽  
Eliseos J. Mucaki ◽  
Katherina Baranova ◽  
Huy Q. Pham ◽  
Dimo Angelov ◽  
...  

Genomic aberrations and gene expression-defined subtypes in the large METABRIC patient cohort have been used to stratify and predict survival. The present study used normalized gene expression signatures of paclitaxel drug response to predict outcome for different survival times in METABRIC patients receiving hormone (HT) and, in some cases, chemotherapy (CT) agents. This machine learning method, which distinguishes sensitivity vs. resistance in breast cancer cell lines and validates predictions in patients, was also used to derive gene signatures of other HT (tamoxifen) and CT agents (methotrexate, epirubicin, doxorubicin, and 5-fluorouracil) used in METABRIC. Paclitaxel gene signatures exhibited the best performance; however, the other agents also predicted survival with acceptable accuracies. A support vector machine (SVM) model of paclitaxel response containing genes ABCB1, ABCB11, ABCC1, ABCC10, BAD, BBC3, BCL2, BCL2L1, BMF, CYP2C8, CYP3A4, MAP2, MAP4, MAPT, NR1I2, SLCO1B3, TUBB1, TUBB4A, and TUBB4B was 78.6% accurate in predicting survival of 84 patients treated with both HT and CT (median survival ≥ 4.4 yr). Accuracy was lower (73.4%) in 304 untreated patients. The performance of other machine learning approaches was also evaluated at different survival thresholds. A paclitaxel-based SVM classifier derived by minimum redundancy maximum relevance feature selection, based on expression of genes BCL2L1, BBC3, FGF2, FN1, and TWIST1, was 81.1% accurate in 53 CT patients. In addition, a random forest (RF) classifier using a gene signature (ABCB1, ABCB11, ABCC1, ABCC10, BAD, BBC3, BCL2, BCL2L1, BMF, CYP2C8, CYP3A4, MAP2, MAP4, MAPT, NR1I2, SLCO1B3, TUBB1, TUBB4A, and TUBB4B) predicted >3-year survival with 85.5% accuracy in 420 HT patients. A similar RF gene signature showed 82.7% accuracy in 504 patients treated with CT and/or HT.
These results suggest that tumor gene expression signatures refined by machine learning techniques can be useful for predicting survival after drug therapies.


F1000Research ◽  
2017 ◽  
Vol 5 ◽  
pp. 2124 ◽  
Author(s):  
Eliseos J. Mucaki ◽  
Katherina Baranova ◽  
Huy Q. Pham ◽  
Iman Rezaeian ◽  
Dimo Angelov ◽  
...  

Genomic aberrations and gene expression-defined subtypes in the large METABRIC patient cohort have been used to stratify and predict survival. The present study used normalized gene expression signatures of paclitaxel drug response to predict outcome for different survival times in METABRIC patients receiving hormone (HT) and, in some cases, chemotherapy (CT) agents. This machine learning method, which distinguishes sensitivity vs. resistance in breast cancer cell lines and validates predictions in patients, was also used to derive gene signatures of other HT (tamoxifen) and CT agents (methotrexate, epirubicin, doxorubicin, and 5-fluorouracil) used in METABRIC. Paclitaxel gene signatures exhibited the best performance; however, the other agents also predicted survival with acceptable accuracies. A support vector machine (SVM) model of paclitaxel response containing genes ABCB1, ABCB11, ABCC1, ABCC10, BAD, BBC3, BCL2, BCL2L1, BMF, CYP2C8, CYP3A4, MAP2, MAP4, MAPT, NR1I2, SLCO1B3, TUBB1, TUBB4A, and TUBB4B was 78.6% accurate in predicting survival of 84 patients treated with both HT and CT (median survival ≥ 4.4 yr). Accuracy was lower (73.4%) in 304 untreated patients. The performance of other machine learning approaches was also evaluated at different survival thresholds. A paclitaxel-based SVM classifier derived by minimum redundancy maximum relevance feature selection, based on expression of genes BCL2L1, BBC3, FGF2, FN1, and TWIST1, was 81.1% accurate in 53 CT patients. In addition, a random forest (RF) classifier using a gene signature (ABCB1, ABCB11, ABCC1, ABCC10, BAD, BBC3, BCL2, BCL2L1, BMF, CYP2C8, CYP3A4, MAP2, MAP4, MAPT, NR1I2, SLCO1B3, TUBB1, TUBB4A, and TUBB4B) predicted >3-year survival with 85.5% accuracy in 420 HT patients. A similar RF gene signature showed 82.7% accuracy in 504 patients treated with CT and/or HT.
These results suggest that tumor gene expression signatures refined by machine learning techniques can be useful for predicting survival after drug therapies.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Leili Tapak ◽  
Saeid Afshar ◽  
Mahlagha Afrasiabi ◽  
Mohammad Kazem Ghasemi ◽  
Pedram Alirezaei

Background. Psoriasis is a chronic autoimmune disease that significantly impairs the patient's quality of life. The disease is diagnosed by dermatologists through visual inspection of the lesional skin. Classification of psoriasis using gene expression is an important issue for the early and effective treatment of the disease. Therefore, gene expression data and selection of suitable gene signatures are effective sources of information. Methods. We aimed to develop a hybrid classifier for the diagnosis of psoriasis that combines two machine learning models, the genetic algorithm and the support vector machine (SVM). The method also conducts gene signature selection. A publicly available gene expression dataset was used to test the model. Results. A total of 181 probe sets were selected from the original 54,675 probes using the hybrid model, with a prediction accuracy of 100% on the test set. Ten hub genes were identified using the protein-protein interaction network. Nine of the 10 identified genes were found in significant modules. Conclusions. The results showed that the genetic algorithm significantly improved the SVM classifier performance, implying that the proposed model is able to detect relevant gene expression signatures as the best features.


2021 ◽  
Vol 11 (2) ◽  
pp. 61
Author(s):  
Jiande Wu ◽  
Chindo Hicks

Background: Breast cancer is a heterogeneous disease defined by molecular types and subtypes. Advances in genomic research have enabled use of precision medicine in clinical management of breast cancer. A critical unmet medical need is distinguishing triple negative breast cancer, the most aggressive and lethal form of breast cancer, from non-triple negative breast cancer. Here we propose use of a machine learning (ML) approach for classification of triple negative breast cancer and non-triple negative breast cancer patients using gene expression data. Methods: We performed analysis of RNA-Sequence data from 110 triple negative and 992 non-triple negative breast cancer tumor samples from The Cancer Genome Atlas to select the features (genes) used in the development and validation of the classification models. We evaluated four different classification models, including Support Vector Machines, K-nearest neighbor, Naïve Bayes, and Decision tree, using features selected at different threshold levels to train the models for classifying the two types of breast cancer. For performance evaluation and validation, the proposed methods were applied to independent gene expression datasets. Results: Among the four ML algorithms evaluated, the Support Vector Machine algorithm classified breast cancer into triple negative and non-triple negative breast cancer more accurately and had fewer misclassification errors than the other three algorithms. Conclusions: The prediction results show that ML algorithms are efficient and can be used for classification of breast cancer into triple negative and non-triple negative breast cancer types.


Breast cancer is a disease whose incidence among women has increased tremendously. Mammography is a low-powered X-ray technique for detecting and diagnosing cancer at an early stage. The proposed system addresses two problems: first, detecting tumors as suspicious regions with weak contrast to their background, and second, extracting features that categorize tumors. This classification can be done with the support vector machine (SVM), a statistical learning method that has achieved significant results in various fields. It was introduced in the early 1990s and led to increased interest in machine learning. Here, images are classified as benign, malignant, or normal using the SVM classifier. The technique detects tumor regions in mammogram images with an accuracy of more than 80% for linear classification using SVM. The proposed system uses 10-fold cross-validation to obtain an accurate outcome. The Wisconsin breast cancer diagnosis data set is taken from the UCI machine learning repository. Accuracy, sensitivity, specificity, false discovery rate, false omission rate, and Matthews correlation coefficient are assessed in the proposed system. It gives good results in both the training and testing phases. The technique also achieves accuracies of 98.57% and 97.14% using Support Vector Machine and K-Nearest Neighbors, respectively.
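
The evaluation measures named in this abstract can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch, assuming a two-class setting (for example, malignant vs. benign); the function name `binary_metrics` and the example counts are illustrative and are not taken from the study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return common binary-classification metrics from confusion-matrix counts.

    tp, fp, tn, fn are true/false positive/negative counts (hypothetical inputs).
    Any ratio with a zero denominator is reported as 0.0, a convention chosen
    for this sketch only.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    # Denominator of the Matthews correlation coefficient.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),           # true positive rate
        "specificity": ratio(tn, tn + fp),           # true negative rate
        "false_discovery_rate": ratio(fp, tp + fp),  # 1 - precision
        "false_omission_rate": ratio(fn, fn + tn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Illustrative counts only; these are not results reported in the abstract.
example = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

With 10-fold cross-validation, one plausible use is to sum the counts over the ten held-out folds before calling the function, although the abstract does not state how the folds were aggregated.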


2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Derui Yan ◽  
Mingjing Shen ◽  
Zixuan Du ◽  
Jianping Cao ◽  
Ye Tian ◽  
...  

Adjuvant radiotherapy is one of the main treatment methods for breast cancer, but its clinical benefit depends largely on the characteristics of the patient. This study aimed to explore the relationship between the expression of zinc finger (ZNF) gene family proteins and the radiosensitivity of breast cancer patients. Clinical and gene expression data on a total of 976 breast cancer samples were obtained from The Cancer Genome Atlas (TCGA) database. ZNF gene expression was dichotomized into groups with a higher or lower level than the median level of expression. Univariate and multivariate Cox regression analyses were used to evaluate the relationship between ZNF gene expression levels and radiosensitivity. The Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) database was used for validation. The results revealed that 4 ZNF genes were possible radiosensitivity markers. High expression of ZNF644 and low expression of the other 3 genes (ZNF341, ZNF541, and ZNF653) were related to the radiosensitivity of breast cancer. Hierarchical cluster, Cox, and CoxBoost analyses based on these 4 ZNF genes indicated that patients with a favorable 4-gene signature had better overall survival on radiotherapy. Thus, this 4-gene signature may have value for selecting those patients most likely to benefit from radiotherapy. ZNF gene clusters could act as radiosensitivity signatures for breast cancer patients and may be involved in determining the radiosensitivity of cancer.
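
The median dichotomization step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `dichotomize_by_median` and the sample values are hypothetical, and placing samples that fall exactly on the median in the low group is an assumption, since the abstract does not say how ties were handled.

```python
from statistics import median


def dichotomize_by_median(expression):
    """Label each sample 'high' or 'low' relative to the cohort median.

    Assumption: samples exactly at the median go to the 'low' group;
    the abstract does not specify tie handling.
    """
    cutoff = median(expression)
    return ["high" if value > cutoff else "low" for value in expression]


# Made-up expression values for one gene across four samples.
groups = dichotomize_by_median([1.0, 5.0, 3.0, 9.0])
```

The resulting group labels would then serve as the covariate in a survival model such as the Cox regression mentioned in the abstract.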


2019 ◽  
Vol 92 (1103) ◽  
pp. 20190198 ◽  
Author(s):  
Venkata SK Manem ◽  
Andrew Dhawan

Objective: Radiation therapy is among the most effective and widely used modalities of cancer therapy in current clinical practice. In this era of personalized radiation medicine, high-throughput data now provide the means to investigate novel biomarkers of radiation response. Large-scale efforts have identified several radiation response signatures, which poses two challenges, namely, their analytical validity and the redundancy of gene signatures. Methods: To address these fundamental radiogenomics questions, we curated a database of gene expression signatures predictive of radiation response under oxic and hypoxic conditions. RadiationGeneSigDB contains 11 oxic and 24 hypoxic signatures, each with a standardized gene list giving gene symbol, Entrez gene ID, and function. We present the utility of this database by gaining an understanding of hypoxia-associated miRNA by applying a penalized multivariate model; by comparing breast cancer oxic signatures in cell line data vs patient data; and by comparing the similarity of head and neck cancer hypoxia signatures at the pathway level in clinical tumour data. Results: We obtained a set of miRNA highly associated, both positively and negatively, with the hypoxia gene signatures across pan-cancer. In addition, we identified moderate correlations between breast cancer oxic signatures in patient data, and significant differences across molecular subtypes. Moreover, we found different sets of pathways to be enriched using the head and neck hypoxia signatures, although they were concordant when applied to the patient data.
Conclusion: This valuable, curated repertoire of published gene expression signatures provides motivating case studies for how to search for similarities in radiation response for tumours arising from different tissues across model systems under oxic and hypoxic conditions, and how a well-curated set of gene signatures can be used to generate novel biological hypotheses about the functions of non-coding RNA. Advances in knowledge: We envision that the RadiationGeneSigDB database will help accelerate preclinical radiotherapeutic discovery pipelines in terms of analytical validity of novel biomarkers of radiation response and the need for ensemble approaches to clinical genomic biomarkers.

