Glioblastoma gene expression subtypes and correlation with clinical, molecular and immunohistochemical characteristics in a homogenously treated cohort: GLIOCAT project.

2019 ◽  
Vol 37 (15_suppl) ◽  
pp. 2029-2029 ◽  
Author(s):  
Estela Pineda ◽  
Anna Esteve-Codina ◽  
Maria Martinez-Garcia ◽  
Francesc Alameda ◽  
Cristina Carrato ◽  
...  

2029 Background: Glioblastoma (GBM) gene expression subtypes have been described in recent years, but data from homogeneously treated patients are lacking. Methods: Clinical, molecular and immunohistochemistry (IHC) data from patients with newly diagnosed GBM who were homogeneously treated with standard radiochemotherapy were studied. Samples were classified by expression profile into three subtypes (classical, mesenchymal, proneural) using the Support Vector Machine (SVM), K-nearest neighbor (K-NN) and single-sample Gene Set Enrichment Analysis (ssGSEA) classification algorithms provided by the GlioVis web application. Results: The GLIOCAT project recruited 432 patients from 6 Catalan institutions, all of whom received standard first-line treatment (2004-2015). The best paraffin tissue samples were selected for RNAseq, and reliable data were obtained from 124. Eighty-two cases (66%) were classified into the same subtype by all three classification algorithms. The SVM and ssGSEA algorithms gave the most similar results (87%). No differences in clinical variables were found between the 3 GBM subtypes. The proneural subtype was enriched in IDH1-mutated and G-CIMP-positive tumors. The mesenchymal subtype (SVM) was enriched in unmethylated MGMT tumors (p = 0.008), and the classical subtype (SVM) in methylated MGMT tumors (p = 0.008). Long survivors (> 30 months) were rarely classified as mesenchymal (0-7.5%) and were more frequently classified as proneural (23.1-26.). Known clinical (age, resection, KPS) and molecular (IDH1, MGMT) prognostic factors were confirmed in this series. Overall, no differences in prognosis were observed between the 3 subtypes, but a trend toward worse survival in the mesenchymal subtype was observed with K-NN (9.6 vs 15). The mesenchymal subtype showed lower expression of Olig2 (p < 0.001) and SOX2 (p = 0.003) by IHC, but higher YKL-40 expression (p = 0.023, SVM). The classical subtype, in contrast, expressed more Nestin (p = 0.004) than the other subtypes (K-NN).
Conclusions: In our study we found no correlation between glioblastoma expression subtype and outcome. This large series provides reproducible data on the clinical, molecular and immunohistochemical features of glioblastoma genetic subtypes.
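The agreement figures above (cases given the same subtype by all three algorithms, and pairwise similarity between algorithms) can be computed along the following lines. This is only a minimal sketch: the function name `agreement_summary` and the toy labels are hypothetical and are not GLIOCAT data or part of GlioVis.

```python
from itertools import combinations


def agreement_summary(calls):
    """calls maps an algorithm name to its list of subtype labels, one per sample."""
    names = list(calls)
    n = len(calls[names[0]])
    if any(len(calls[name]) != n for name in names):
        raise ValueError("all algorithms must label the same samples")
    # Fraction of samples on which every algorithm assigns the same subtype.
    unanimous = sum(
        len({calls[name][i] for name in names}) == 1 for i in range(n)
    ) / n
    # Fraction of samples on which each pair of algorithms agrees.
    pairwise = {
        (a, b): sum(x == y for x, y in zip(calls[a], calls[b])) / n
        for a, b in combinations(names, 2)
    }
    return unanimous, pairwise


# Hypothetical toy labels for five samples (illustration only).
toy_calls = {
    "SVM": ["classical", "mesenchymal", "proneural", "classical", "mesenchymal"],
    "KNN": ["classical", "mesenchymal", "classical", "classical", "proneural"],
    "ssGSEA": ["classical", "mesenchymal", "proneural", "classical", "mesenchymal"],
}
unanimous, pairwise = agreement_summary(toy_calls)
```

In the toy data, three of five samples are labelled identically by all three algorithms, and SVM and ssGSEA agree on every sample.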

Author(s):  
Maria Morgan ◽  
Carla Blank ◽  
Raed Seetan

This paper investigates the capability of six existing classification algorithms (Artificial Neural Network, Naïve Bayes, k-Nearest Neighbor, Support Vector Machine, Decision Tree and Random Forest) in classifying and predicting diseases in soybean and mushroom datasets using datasets with numerical or categorical attributes. While many similar studies have been conducted on datasets of images to predict plant diseases, the main objective of this study is to suggest classification methods that can be used for disease classification and prediction in datasets that contain raw measurements instead of images. A fungus and a plant dataset, which had many differences, were chosen so that the findings in this paper could be applied to future research for disease prediction and classification in a variety of datasets which contain raw measurements. A key difference between the two datasets, other than one being a fungus and one being a plant, is that the mushroom dataset is balanced and contains only two classes, while the soybean dataset is imbalanced and contains eighteen classes. All six algorithms performed well on the mushroom dataset, while the Artificial Neural Network and k-Nearest Neighbor algorithms performed best on the soybean dataset. The findings of this paper can be applied to future research on disease classification and prediction in a variety of dataset types such as fungi, plants, humans, and animals.


2016 ◽  
Vol 1 (1) ◽  
pp. 13 ◽  
Author(s):  
Debby Erce Sondakh

This study aims to measure and compare the performance of five machine-learning-based text classification algorithms, namely decision rules, decision tree, k-nearest neighbor (k-NN), naïve Bayes, and Support Vector Machine (SVM), on multi-class text documents. The comparison addresses the effectiveness of each algorithm, that is, its ability to assign documents to the correct category, using the holdout (percentage split) method. The effectiveness measures used are precision, recall, F-measure, and accuracy. The experimental results show that, for naïve Bayes, the larger the percentage of training documents, the higher the accuracy of the resulting model. Naïve Bayes reached its highest accuracy at a 90/10 split, SVM at 80/20, and decision tree at 70/30. The results also show that naïve Bayes had the highest effectiveness among the five algorithms tested and the fastest time to build the classification model, 0.02 seconds. The decision tree classified text documents with higher accuracy than SVM, but it was slower to build its model. In terms of model-building time, k-NN was the fastest, but its accuracy was lower.
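The holdout (percentage split) evaluation and the four effectiveness measures can be sketched as follows. The helper names `percentage_split` and `macro_scores`, the use of macro averaging across classes, and the toy labels are all assumptions made for illustration; the abstract does not state how the per-class scores were averaged or which tool was used.

```python
import random


def percentage_split(items, train_fraction, seed=0):
    """Shuffle a copy of items and split it into (train, test)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def macro_scores(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F-measure)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f_measures = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        precisions.append(precision)
        recalls.append(recall)
        f_measures.append(2 * precision * recall / denom if denom else 0.0)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    k = len(labels)
    return accuracy, sum(precisions) / k, sum(recalls) / k, sum(f_measures) / k


# A 90/10 split of ten hypothetical document ids.
docs = [f"doc{i}" for i in range(10)]
train, test = percentage_split(docs, 0.9)

# Hypothetical true and predicted categories for six test documents.
y_true = ["sport", "sport", "news", "news", "tech", "tech"]
y_pred = ["sport", "news", "news", "news", "tech", "sport"]
accuracy, precision, recall, f_measure = macro_scores(y_true, y_pred)
```

Changing `train_fraction` to 0.8 or 0.7 gives the other splits mentioned in the abstract.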


Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1692 ◽  
Author(s):  
Iván Silva ◽  
José Eugenio Naranjo

Identifying driving styles using classification models with in-vehicle data can provide automated feedback to drivers on their driving behavior, particularly on whether they are driving safely. Although several classification models have been developed for this purpose, there is no consensus on which classifier performs best at identifying driving styles. Therefore, more research is needed to evaluate classification models by comparing performance metrics. In this paper, a data-driven machine-learning methodology for classifying driving styles is introduced. This methodology is grounded in well-established machine-learning (ML) methods and literature related to driving-styles research. The methodology is illustrated through a study involving data collected from 50 drivers from two different cities in a naturalistic setting. Five features were extracted from the raw data. Fifteen experts were involved in the data labeling to derive the ground truth of the dataset. The dataset was used to train five different models (Support Vector Machines (SVM), Artificial Neural Networks (ANN), fuzzy logic, k-Nearest Neighbor (kNN), and Random Forests (RF)). These models were evaluated in terms of a set of performance metrics and statistical tests. The experimental results from performance metrics showed that SVM outperformed the other four models, achieving an average accuracy of 0.96, F1-Score of 0.9595, Area Under the Curve (AUC) of 0.9730, and Kappa of 0.9375. In addition, Wilcoxon tests indicated that ANN predicts differently from the other four models. These promising results demonstrate that the proposed methodology may support researchers in making informed decisions about which ML model performs better for driving-styles classification.
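Of the metrics listed, Kappa is the least self-explanatory: Cohen's kappa rescales observed agreement by the agreement expected from the label frequencies alone, kappa = (p_o - p_e) / (1 - p_e). A minimal sketch follows, assuming the reported Kappa is Cohen's kappa between expert labels and model predictions; the labels are hypothetical and unrelated to the study's data.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa between two label sequences of equal length."""
    n = len(y_true)
    # Observed agreement: fraction of items given the same label.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: product of marginal label frequencies, summed over labels.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        # Both sequences use a single identical label; returning 1.0 here
        # is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical expert labels and model predictions for eight trips.
expert = ["safe", "safe", "safe", "safe", "risky", "risky", "risky", "risky"]
model = ["safe", "safe", "safe", "risky", "risky", "risky", "risky", "safe"]
kappa = cohen_kappa(expert, model)
```

Here the raw agreement is 0.75 and chance agreement is 0.5, so kappa is 0.5, which shows why kappa is typically lower than accuracy on the same predictions.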


Author(s):  
Jiahua Jin ◽  
Lu Lu

Hotel social media provides access to dissatisfied customers and their experiences with services. However, because of the massive number of topics and posts in social media and the sparse distribution of complaint-related posts, manually identifying complaints is inefficient and time-consuming. In this study, we propose a supervised learning method comprising training sample enlargement and classifier construction. We first identified reliable complaint and noncomplaint samples from the unlabeled dataset by using a small set of labeled samples as training samples. Combining the labeled samples and enlarged samples, the support vector machine and k-nearest neighbor classification algorithms were then adopted to build binary classifiers during the classifier construction process. Experimental results indicate that the proposed method can identify complaints from social media efficiently, especially when the number of labeled training samples is small. This study provides an efficient approach for hotel companies to distinguish a certain kind of consumer complaint information from a large amount of unrelated information in hotel social media.


Author(s):  
Duan Mei ◽  
Qiang Liu

Based on microRNA (miRNA) expression profiles, this article proposes a new algorithm, SVM-RFE-FKNN, which combines the support vector machine-recursive feature elimination (SVM-RFE) algorithm and the fuzzy K-nearest neighbor (FKNN) algorithm, to realize binary classification of tumors. First, the SVM-RFE algorithm was used to select features from the miRNA expression profile dataset to constitute feature subsets and to determine the maximum number of support vectors. Next, this maximum number was regarded as the upper limit of the parameter K in the FKNN algorithm, which was then used to classify the samples to be tested. Finally, the leave-one-out cross-validation method was adopted to assess the classification performance of the proposed algorithm. In experiments, the proposed algorithm was compared with twelve other classification methods, and the results show that it had better classification performance. Specifically, with only a few miRNA biomarkers, the proposed algorithm could reach an accuracy of 99.46% and an area under the receiver operating characteristic curve (AUC) of 0.9874.
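The classification and validation steps can be illustrated with a small fuzzy K-nearest neighbor classifier evaluated by leave-one-out cross-validation. This is a sketch under stated assumptions, not the authors' implementation: it omits the SVM-RFE feature selection step entirely, takes K as a plain parameter instead of bounding it by the number of support vectors, uses crisp training labels with inverse-distance weights, and runs on hypothetical two-feature data.

```python
import math


def fuzzy_knn_memberships(train_x, train_y, query, k=3, m=2.0):
    """Class memberships of query from its k nearest training points.

    Assumes crisp training labels and weights 1 / d**(2 / (m - 1)).
    """
    neighbours = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    exponent = 2.0 / (m - 1.0)
    # Guard against zero distance so an exact match does not divide by zero.
    weights = [(1.0 / max(d, 1e-12) ** exponent, y) for d, y in neighbours]
    total = sum(w for w, _ in weights)
    memberships = {}
    for w, y in weights:
        memberships[y] = memberships.get(y, 0.0) + w / total
    return memberships


def leave_one_out_accuracy(xs, ys, k=3, m=2.0):
    """Hold out each sample in turn, classify it from the rest, and score."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        memberships = fuzzy_knn_memberships(train_x, train_y, xs[i], k, m)
        predicted = max(memberships, key=memberships.get)
        correct += predicted == ys[i]
    return correct / len(xs)


# Hypothetical expression values for two selected features (illustration only).
xs = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (2.0, 2.1), (2.2, 1.9), (1.9, 2.3)]
ys = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]
loo_accuracy = leave_one_out_accuracy(xs, ys, k=3)
```

Unlike plain K-NN voting, the returned memberships sum to 1 and weight close neighbours more heavily, so one distant neighbour of the wrong class has little influence.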


Author(s):  
Seyma Kiziltas Koc ◽  
Mustafa Yeniad

Technologies used in the healthcare industry are changing rapidly because technology is constantly evolving to improve people's lifestyles. For instance, different technological devices are used for the diagnosis and treatment of diseases. With developing technology, it has been shown that diseases can be diagnosed by computer systems. Machine learning algorithms are frequently used tools because of their high performance in health care as well as in many other fields. The aim of this study is to investigate different machine learning classification algorithms that can be used in the diagnosis of diabetes and to make comparative analyses according to the metrics in the literature. In the study, seven classification algorithms from the literature were used: Logistic Regression, K-Nearest Neighbor, Multilayer Perceptron, Random Forest, Decision Trees, Support Vector Machine and Naive Bayes. The classification performance of the algorithms was compared on the basis of accuracy, sensitivity, precision, and F1-score. The results showed that the support vector machine algorithm had the highest accuracy, at 78.65%.


Text classification plays a vital role in data mining, and the same is true of classification algorithms in text categorization. There are many techniques for text classification, but this paper focuses on three approaches: Support Vector Machine (SVM), Naïve Bayes (NB), and k-nearest neighbor (k-NN). The paper reports the results of these classifiers on the mini-newsgroups data, which consists of a large number of documents, and describes the step-by-step tasks: listing the files, preprocessing, creating a specific subset of terms, and applying the classifiers to specific subsets of the dataset. From the experiments on the dataset, it is concluded that SVM achieves good classification output in terms of accuracy, precision, F-measure and recall, whereas execution time is better for the k-NN approach.


The world today has made giant leaps in the field of medicine. A tremendous amount of research is being carried out in this field, leading to new discoveries that have a heavy impact on mankind. The data being generated in this field are increasing enormously, and a need has arisen to analyze these data in order to find meaningful and relevant hidden patterns. These patterns can be used for clinical diagnosis. Data mining is an efficient approach to discovering these patterns. Among the many data mining techniques that exist, this paper aims at analyzing medical data using various classification techniques. The classification techniques used in this study include k-Nearest Neighbor (kNN), Decision Tree, and Naive Bayes, which are hard computing algorithms, whereas the soft computing algorithms used in this study include Support Vector Machine (SVM), Artificial Neural Networks (ANN) and Fuzzy k-Means clustering. We applied these algorithms to three datasets: Breast Cancer Wisconsin, Haberman Data and Contraceptive Method Choice. Our results show that soft-computing-based classification algorithms give better classifications than the traditional classification algorithms in terms of various classification performance measures.


2021 ◽  
Author(s):  
Haiqiang Duan ◽  
Chenyun Dai ◽  
Wei Chen

Background: The transmission of human body movements to other devices through wearable smart bracelets has attracted increasing attention in human-machine interface (HMI) applications. However, because the collection range of wearable bracelets is limited, it is necessary to study the relationship between the superposition of independent wrist and finger motions and their cooperative motion in order to simplify the device's collection system. Methods: The multi-channel high-density surface electromyogram (HD-sEMG) signal has high spatial resolution and can improve the accuracy of multi-channel fitting. In this study, we quantified the HD-sEMG forearm spatial activation features of 256 channels during hand movement and performed a linear fitting of the quantified features of finger and wrist movements to verify the linear superposition relationship between cooperative finger-and-wrist movements and the corresponding independent movements. Most importantly, we classified and predicted the fitted and the actually measured cooperative finger-and-wrist actions with four commonly used classifiers, Linear Discriminant Analysis (LDA), K-Nearest Neighbor (KNN), Support Vector Machine (SVM) and Random Forest (RF), and evaluated the performance of the four classifiers in gesture fitting in detail according to the classification results. Results: For a total of 12 synthetic gesture actions, the four classifiers (LDA, SVM, RF and KNN) were used for classification prediction with 8, 32 and 64 fitting channels. With 8 fitting channels, the accuracy of LDA was 99.70%, KNN 99.40%, SVM 99.20%, and RF 93.75%. With 32 fitting channels, the accuracy of LDA was 98.51%, KNN 97.92%, SVM 96.73%, and RF 86.61%. With 64 fitting channels, the accuracy of LDA was 95.83%, KNN 91.67%, SVM 86.90%, and RF 83.30%. Conclusion: With 8 fitting channels, the classification accuracies of LDA, KNN and SVM are essentially the same, but SVM takes very little time, so SVM should be preferred as the classifier when the amount of data is large. As the number of fitting channels increases, the LDA classifier achieves higher accuracy than the other three classifiers and is therefore the more appropriate choice. The RF classifier was consistently far less accurate than the other three classifiers on this type of problem, so it is not recommended for gesture-superposition work.
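The linear fitting step can be pictured as a least-squares problem: find coefficients a and b such that the feature vector of the cooperative gesture is approximately a times the finger-only features plus b times the wrist-only features. The sketch below assumes this two-coefficient model, which the abstract does not spell out; the function name `fit_superposition` and the four-channel toy values are hypothetical.

```python
def fit_superposition(finger, wrist, combined):
    """Least-squares (a, b) such that combined ~ a * finger + b * wrist.

    Each argument is a list of per-channel feature values of equal length.
    """
    ff = sum(f * f for f in finger)
    ww = sum(w * w for w in wrist)
    fw = sum(f * w for f, w in zip(finger, wrist))
    fc = sum(f * c for f, c in zip(finger, combined))
    wc = sum(w * c for w, c in zip(wrist, combined))
    # Solve the 2x2 normal equations [[ff, fw], [fw, ww]] @ [a, b] = [fc, wc].
    det = ff * ww - fw * fw
    if abs(det) < 1e-12:
        raise ValueError("finger and wrist features are collinear")
    a = (fc * ww - wc * fw) / det
    b = (wc * ff - fc * fw) / det
    return a, b


# Hypothetical activation features on four channels (illustration only).
finger = [1.0, 0.0, 2.0, 1.0]
wrist = [0.0, 1.0, 1.0, 3.0]
combined = [0.8 * f + 0.5 * w for f, w in zip(finger, wrist)]
a, b = fit_superposition(finger, wrist, combined)
fitted = [a * f + b * w for f, w in zip(finger, wrist)]
```

The `fitted` vector plays the role of the synthetic gesture that would then be passed to a classifier alongside the measured cooperative gesture.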

