An Overview of Mathematical Models for RNA Sequence-based Glioblastoma Subclassification

2021 ◽  
Vol 3 (1) ◽  
pp. 001-007
Author(s):  
Yilin Wu ◽  
Eric Zander ◽  
Andrew Ardeleanu ◽  
Ryan Singleton ◽  
Barnabas Bede

Molecular marker-based glioblastoma (GBM) subclassification is emerging as a key factor in personalized GBM treatment planning. Multiple genetic alterations, including methylation status and mutations, have been proposed for GBM subclassification. RNA sequencing (RNA-Seq)-based molecular profiling of GBM is widely implemented and readily quantifiable. Machine learning (ML) algorithms have been reported to subgroup GBM consistently. In this study, we systematically evaluated the applicability of commonly used ML algorithms on The Cancer Genome Atlas Glioblastoma Multiforme (TCGA-GBM) dataset and cross-validated them on the Chinese Glioma Genome Atlas (CGGA) dataset. The ML algorithms studied include binomial and multinomial logistic regression, linear discriminant analysis, decision trees, k-nearest neighbors, Gaussian naive Bayes, support vector machines, gradient boosting, voting ensemble, and multi-layer perceptron. RNA-Seq data for 44 biomarkers were passed through the algorithms for performance evaluation. We found that support vector machines, multi-layer perceptrons, and voting ensembles are best equipped to assign GBM samples to the correct molecular subgroups without histological studies.
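As a rough illustration of the voting-ensemble idea mentioned in the abstract above, the following sketch trains three simple classifiers on one synthetic cohort and checks the majority vote on a second, slightly shifted cohort. Everything in it is an assumption made for illustration: the subtype labels, the four toy features, the cohort sizes, and the choice of ensemble members (k-nearest neighbors and Gaussian naive Bayes appear in the abstract's list; the nearest-centroid rule is a simple stand-in that does not). It does not reproduce the authors' pipeline, the TCGA-GBM or CGGA data, or the 44 biomarkers.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical subtype centres in a 4-feature expression space (toy values).
CENTRES = {
    "subtype_1": [0.0, 0.0, 0.0, 0.0],
    "subtype_2": [4.0, 4.0, 0.0, 0.0],
    "subtype_3": [0.0, 0.0, 4.0, 4.0],
}

def make_cohort(seed, n_per_class=20, offset=0.0):
    """Draw a synthetic cohort; `offset` mimics a shift between cohorts."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in CENTRES.items():
        for _ in range(n_per_class):
            X.append([c + offset + rng.gauss(0.0, 0.7) for c in centre])
            y.append(label)
    return X, y

def group_by_class(X, y):
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    return groups

def knn_predict(X, y, x, k=3):
    """k-nearest neighbours: majority label among the k closest samples."""
    nearest = sorted(zip(X, y), key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def fit_gaussian_nb(X, y):
    """Per-class log prior, feature means and (slightly smoothed) variances."""
    params = {}
    for label, rows in group_by_class(X, y).items():
        cols = list(zip(*rows))
        means = [sum(col) / len(col) for col in cols]
        variances = [sum((v - m) ** 2 for v in col) / len(col) + 1e-9
                     for col, m in zip(cols, means)]
        params[label] = (math.log(len(rows) / len(X)), means, variances)
    return params

def gaussian_nb_predict(params, x):
    def log_posterior(label):
        log_prior, means, variances = params[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, means, variances))
    return max(params, key=log_posterior)

def fit_centroids(X, y):
    return {label: [sum(col) / len(col) for col in zip(*rows)]
            for label, rows in group_by_class(X, y).items()}

def centroid_predict(centroids, x):
    return min(centroids, key=lambda label: math.dist(centroids[label], x))

def voting_predict(train_X, train_y, nb_params, centroids, x):
    """Hard voting: each classifier casts one vote; the most common label wins."""
    votes = [
        knn_predict(train_X, train_y, x),
        gaussian_nb_predict(nb_params, x),
        centroid_predict(centroids, x),
    ]
    return Counter(votes).most_common(1)[0][0]

def external_accuracy(train, test):
    """Fit on the training cohort, then score the vote on the other cohort."""
    train_X, train_y = train
    nb_params = fit_gaussian_nb(train_X, train_y)
    centroids = fit_centroids(train_X, train_y)
    hits = sum(voting_predict(train_X, train_y, nb_params, centroids, x) == label
               for x, label in zip(*test))
    return hits / len(test[0])

discovery = make_cohort(seed=1)                # stands in for the training cohort
validation = make_cohort(seed=2, offset=0.3)   # stands in for the second cohort
accuracy = external_accuracy(discovery, validation)
```

Hard voting is used here because it needs only class labels from each member; a soft-voting variant would average class probabilities instead, which requires every member to produce calibrated scores.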

2004 ◽  
Vol 18 (5) ◽  
pp. 309-323 ◽  
Author(s):  
J.M. Matías ◽  
A. Vaamonde ◽  
J. Taboada ◽  
W. González-Manteiga

2014 ◽  
Author(s):  
Gokmen Zararsiz ◽  
Dincer Goksuluk ◽  
Selcuk Korkmaz ◽  
Vahap Eldem ◽  
Izzet Parug Duru ◽  
...  

Background: RNA sequencing (RNA-Seq) is a powerful technique for transcriptome profiling that uses the capabilities of next-generation sequencing (NGS) technologies. Recent advances in NGS make it possible to measure the expression levels of tens to thousands of transcripts simultaneously. Using such information, developing expression-based classification algorithms is an emerging and powerful approach to diagnosis, disease classification, and monitoring at the molecular level, as well as to identifying potential disease markers. Here, we present bagging support vector machines (bagSVM), a machine learning approach based on bagged ensembles of support vector machines (SVM), for classification of RNA-Seq data. bagSVM draws bootstrap samples and trains a separate SVM on each; it then combines the predictions of the individual SVM models by majority voting. Results: We demonstrate the performance of bagSVM on simulated and real datasets. Simulated datasets were generated from the negative binomial distribution under different scenarios, and real datasets were obtained from publicly available resources. DESeq normalization and a variance stabilizing transformation (vst) were applied to all datasets. We compared the results with several classifiers, including Poisson linear discriminant analysis (PLDA), single SVM, classification and regression trees (CART), and random forests (RF). In slightly overdispersed data, all methods except CART performed well. PLDA seemed to perform best, and RF second best, for very slightly and substantially overdispersed datasets. As the data became more spread, bagSVM turned out to be the best classifier. Overall, bagSVM and PLDA had the highest accuracies. Conclusions: According to our results, bagSVM after vst transformation can be a good choice of classifier for RNA-Seq datasets, mostly for overdispersed ones. Thus, we recommend that researchers use bagSVM for classification of RNA-Seq data. PLDA should be the method of choice for slightly and moderately overdispersed datasets. An R/Bioconductor package, MLSeq, with a vignette is freely available at http://www.bioconductor.org/packages/2.14/bioc/html/MLSeq.html Keywords: bagging, machine learning, RNA-Seq classification, support vector machines, transcriptomics
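The bootstrap-then-vote procedure described in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the MLSeq implementation: the base learner is a plain linear SVM fitted by stochastic subgradient descent on the hinge loss, the function names (`train_linear_svm`, `fit_bag_svm`, `bag_predict`) and all hyperparameters are hypothetical, and the data are two toy Gaussian clusters rather than normalized RNA-Seq counts.

```python
import random
from collections import Counter

def train_linear_svm(X, y, rng, lam=0.01, eta=0.1, epochs=100):
    """Linear soft-margin SVM via SGD on the L2-regularised hinge loss.

    X: list of feature lists; y: labels in {-1, +1}. Returns (weights, bias).
    """
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # Shrink the weights (gradient of the L2 penalty).
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Hinge-loss subgradient step when the margin is violated.
            if y[i] * score < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def svm_predict(model, x):
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0.0 else -1

def fit_bag_svm(X, y, n_models=11, seed=0):
    """Train one SVM per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_linear_svm([X[i] for i in idx],
                                       [y[i] for i in idx], rng))
    return models

def bag_predict(models, x):
    """Majority vote over the individual SVM predictions."""
    votes = Counter(svm_predict(m, x) for m in models)
    return votes.most_common(1)[0][0]

# Toy two-class data (an assumption; real input would be transformed counts).
data_rng = random.Random(1)
X = ([[data_rng.gauss(0.0, 0.5), data_rng.gauss(0.0, 0.5)] for _ in range(20)]
     + [[data_rng.gauss(4.0, 0.5), data_rng.gauss(4.0, 0.5)] for _ in range(20)])
y = [-1] * 20 + [1] * 20

models = fit_bag_svm(X, y)
label_near_origin = bag_predict(models, [0.0, 0.0])
label_far = bag_predict(models, [4.0, 4.0])
```

An odd number of ensemble members is chosen so that a two-class majority vote cannot tie. For more than two classes, each base SVM would need a multi-class scheme such as one-vs-rest, which this sketch omits.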

