pcaGoPromoter - An R Package for Biological and Regulatory Interpretation of Principal Components in Genome-Wide Gene Expression Data

PLoS ONE ◽  
2012 ◽  
Vol 7 (2) ◽  
pp. e32394 ◽  
Author(s):  
Morten Hansen ◽  
Thomas Alexander Gerds ◽  
Ole Haagen Nielsen ◽  
Jakob Benedict Seidelin ◽  
Jesper Thorvald Troelsen ◽  
...  
2009 ◽  
Vol 15 (7) ◽  
pp. 1032-1038 ◽  
Author(s):  
Jørgen Olsen ◽  
Thomas A. Gerds ◽  
Jakob B. Seidelin ◽  
Claudio Csillag ◽  
Jacob T. Bjerrum ◽  
...  

2022 ◽  
Author(s):  
Kimberly Badal ◽  
Jerome E. Foster ◽  
Rajini Haraksingh ◽  
Melford John

Abstract

Background: Radiation therapy (RT) is frequently recommended for post-surgery treatment of early-stage breast cancer (BC) patients, though not all benefit. Clinical factors currently guide RT treatment decisions. At present, models to predict RT-benefit predominantly use statistical methods with modest performance. In this paper we present a high-accuracy genomic Machine Learning (ML) model to predict RT-benefit in early-stage BC patients. We also present a novel method for selecting genomic features for training ML algorithms.

Methods: Gene expression data from 463 early-stage BC patients treated with surgery and RT from the METABRIC cohort were obtained. Wilcoxon Rank Sum (Wilcoxon RS) test and Cox Proportional Hazards (Cox PH) were used to reduce the number of genes used to train eight ML algorithms. ML algorithms were trained on 80% of data using 10-fold cross validation and tested on 20% of data to assess performance in predicting relapse status.

Results: Genome-wide gene expression data was reduced by 96% using Wilcoxon RS and Cox PH to a 1,596 gene set and a 977 gene set. These gene sets were used to train eight ML algorithms resulting in models that ranged in performance accuracies from 54.01% to 95.6%. Highest accuracies were obtained using Support Vector Machine (SVM977, 93.41%; SVM1596, 95.6%) and Neural Networks algorithms (NN977, 92.31%; NN1596, 93.41%). In RT-untreated patients, accuracies of all models were 30% to 40% lower compared to RT-treated patients. SVM977 had the highest sensitivity of 91.09%. Members of the 977 set were enriched with genes involved in cell cycle and differentiation as well as genes associated with radiosensitivity and radioresistance.

Conclusion: This study presents a novel genomic feature selection approach that used Wilcoxon RS followed by Cox PH to reduce the number of genes from genome-wide gene expression data used for training ML algorithms by 96%. This approach led to an SVM model that used the expression values of 977 genes to predict RT-benefit in early-stage BC patients with 93.41% accuracy. This work demonstrates that ML models can be clinically useful for predicting cancer patient outcomes.
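
The first filtering stage described in the Methods can be illustrated with a small sketch. This is not the authors' code: it only shows the general idea of screening genes with a Wilcoxon rank-sum test and then holding out 20% of patients for testing. The gene names, the toy expression values, the `alpha=0.05` threshold, and all function names are assumptions made for illustration; the abstract does not state the thresholds used. The Cox PH stage, the 10-fold cross validation, and the ML training itself are omitted.

```python
import math
import random


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_p_value(group_a, group_b):
    """Two-sided rank-sum p-value, normal approximation, no tie correction."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    w = sum(ranks[:n_a])  # rank sum of the first group
    mean_w = n_a * (n_a + n_b + 1) / 2
    sd_w = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (w - mean_w) / sd_w
    return math.erfc(abs(z) / math.sqrt(2))


def filter_genes(expression, relapsed, alpha=0.05):
    """Keep genes whose expression differs between relapse groups.

    alpha is an assumed threshold, not one taken from the abstract.
    """
    kept = []
    for gene, values in expression.items():
        a = [v for v, r in zip(values, relapsed) if r]
        b = [v for v, r in zip(values, relapsed) if not r]
        if rank_sum_p_value(a, b) < alpha:
            kept.append(gene)
    return kept


def train_test_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle patient indices and split them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


# Toy data: 10 relapsed and 10 non-relapsed patients, two hypothetical genes.
relapsed = [True] * 10 + [False] * 10
expression = {
    # Clearly higher in relapsed patients.
    "GENE_SHIFTED": [5 + i * 0.1 for i in range(10)]
    + [1 + i * 0.1 for i in range(10)],
    # Interleaved values, so no group difference.
    "GENE_FLAT": list(range(0, 20, 2)) + list(range(1, 20, 2)),
}

kept = filter_genes(expression, relapsed)
train_idx, test_idx = train_test_split(len(relapsed))
print(kept, len(train_idx), len(test_idx))
```

In a real analysis a tested statistics library would normally replace the hand-written rank-sum test, and the surviving genes would then pass through the Cox PH stage before any model is trained.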


2020 ◽  
Vol 14 ◽  
Author(s):  
Mette Soerensen ◽  
Dominika Marzena Hozakowska-Roszkowska ◽  
Marianne Nygaard ◽  
Martin J. Larsen ◽  
Veit Schwämmle ◽  
...  
