Comparison of Computational Algorithms for the Classification of Liver Cancer using SELDI Mass Spectrometry: A Case Study

Introduction: As an alternative to DNA microarrays, mass spectrometry-based analysis of proteomic patterns has shown great potential in cancer diagnosis. The ultimate application of this technique in clinical settings relies on the advancement of the technology itself and the maturity of the computational tools used to analyze the data. A number of computational algorithms constructed on different principles are available for the classification of disease status based on proteomic patterns. Nevertheless, few studies have addressed the difference in the performance of these approaches. In this report, we describe a comparative case study on the classification accuracy of hepatocellular carcinoma based on the serum proteomic pattern generated from a Surface Enhanced Laser Desorption/Ionization (SELDI) mass spectrometer.

Methods: Nine supervised classification algorithms were implemented in R and compared for classification accuracy.

Results: We found that the support vector machine with a radial function is preferable as a tool for classification of hepatocellular carcinoma using features in SELDI mass spectra. Among the remaining methods, random forest and prediction analysis of microarrays perform better. A permutation-based technique reveals that the support vector machine with a radial function seems intrinsically superior in learning from the training data, since it has a lower prediction error than the others when there is essentially no differential signal. On the other hand, the performance of random forest and prediction analysis of microarrays relies on their capability of capturing signals with substantial differentiation between groups.

Conclusions: Our finding is similar to that of a previous study, in which classification methods based on Matrix Assisted Laser Desorption/Ionization (MALDI) mass spectrometry were compared for the prediction accuracy of ovarian cancer. The support vector machine, random forest and prediction analysis of microarrays provide better prediction accuracy for hepatocellular carcinoma using SELDI proteomic data than the six other approaches.
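The permutation idea in the abstract can be illustrated with a small sketch: shuffle the class labels so that no real differential signal remains, then re-estimate the cross-validated prediction error. This is only an illustration under stated assumptions and not the authors' R implementation. The data are synthetic, and the nearest-centroid classifier is a hypothetical stand-in for the nine methods compared in the study.

```python
import random

def nearest_centroid_predict(train_x, train_y, test_x):
    """Predict each test row as the class with the closest mean (stand-in classifier)."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return [min(centroids, key=lambda c: dist2(x, centroids[c])) for x in test_x]

def cv_error(xs, ys, k=5):
    """k-fold cross-validated misclassification rate (interleaved folds)."""
    n = len(xs)
    errors = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        ordered = sorted(test_idx)
        preds = nearest_centroid_predict(train_x, train_y, [xs[i] for i in ordered])
        errors += sum(p != ys[i] for p, i in zip(preds, ordered))
    return errors / n

def permuted_errors(xs, ys, n_perm=20, seed=0):
    """Cross-validated error after shuffling labels, repeated n_perm times."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_perm):
        shuffled = ys[:]
        rng.shuffle(shuffled)  # breaks the link between features and class
        results.append(cv_error(xs, shuffled))
    return results

# Synthetic two-class data (an assumption): class 1 is shifted by 2.0 in every feature.
data_rng = random.Random(1)
ys = [i % 2 for i in range(60)]
xs = [[data_rng.gauss(0, 1) + (2.0 if y else 0.0) for _ in range(5)] for y in ys]

true_error = cv_error(xs, ys)
null_errors = permuted_errors(xs, ys)
print("error with real labels:", true_error)
print("mean error with permuted labels:", sum(null_errors) / len(null_errors))
```

With real labels the error should be low, while the permuted runs should hover near chance. Comparing classifiers on the permuted runs is the kind of no-signal comparison the abstract describes.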

Image Classification of Tourist Attractions with K-Nearest Neighbor, Logistic Regression, Random Forest, and Support Vector Machine

International Journal on Advanced Science Engineering and Information Technology ◽

10.18517/ijaseit.10.6.9098 ◽

2020 ◽

Vol 10 (6) ◽

pp. 2207

Author(s):

Herry Sujaini

Keyword(s):

Support Vector Machine ◽

Logistic Regression ◽

Random Forest ◽

Image Classification ◽

Nearest Neighbor ◽

Support Vector ◽

K Nearest Neighbor ◽

Tourist Attractions

A method for handling metabonomics data from liquid chromatography/mass spectrometry: combinational use of support vector machine recursive feature elimination, genetic algorithm and random forest for feature selection

Metabolomics ◽

10.1007/s11306-011-0274-7 ◽

2011 ◽

Vol 7 (4) ◽

pp. 549-558 ◽

Cited By ~ 40

Author(s):

Xiaohui Lin ◽

Quancai Wang ◽

Peiyuan Yin ◽

Liang Tang ◽

Yexiong Tan ◽

...

Keyword(s):

Mass Spectrometry ◽

Genetic Algorithm ◽

Support Vector Machine ◽

Feature Selection ◽

Liquid Chromatography ◽

Random Forest ◽

Recursive Feature Elimination ◽

Support Vector ◽

Liquid Chromatography Mass Spectrometry ◽

Chromatography Mass Spectrometry

Comparison between Support Vector Machine and Random Forest for Hepatocellular Carcinoma (HCC) Classification

2020 International Conference on Decision Aid Sciences and Application (DASA) ◽

10.1109/dasa51403.2020.9317083 ◽

2020 ◽

Author(s):

Velery Virgina Putri Wibowo ◽

Zuherman Rustam ◽

Sri Hartini ◽

Qisthina Syifa Setiawan ◽

Jane Eva Aurelia

Keyword(s):

Hepatocellular Carcinoma ◽

Support Vector Machine ◽

Random Forest ◽

Support Vector

Comparison of Machine Learning Algorithms Using WEKA and Sci-Kit Learn in Classifying Online Shopper Intention

JOURNAL OF INFORMATICS AND TELECOMMUNICATION ENGINEERING ◽

10.31289/jite.v3i1.2599 ◽

2019 ◽

Vol 3 (1) ◽

pp. 58

Author(s):

Yefta Christian

Keyword(s):

Support Vector Machine ◽

Random Forest ◽

Kappa Statistic ◽

Absolute Error ◽

Machine Learning Algorithms ◽

Support Vector ◽

Test Results ◽

Online Shoppers ◽

Online Stores

Online stores are growing very rapidly, supported by faster and better internet infrastructure. This growth makes competition in the field more difficult. Online stores need a website or an application that can measure and classify consumers' spending intentions, so that consumers pay attention to the items offered on the site or application and eventually make purchases. Online shoppers' intentions can be classified with several algorithms, such as Naïve Bayes, Multi-Layer Perceptron, Support Vector Machine, Random Forest and J48 Decision Trees. In this study, the algorithms are compared using two tools, WEKA and Sci-Kit Learn, by comparing the values of F1-Score, accuracy, Kappa Statistic and mean absolute error. The test results from WEKA and Sci-Kit Learn differ for the Support Vector Machine algorithm. Based on this research, Random Forest is the most appropriate algorithm for classifying online shoppers' intentions.
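As a rough illustration of the four measures named in the abstract, the sketch below computes accuracy, F1-score, Cohen's kappa and mean absolute error for a binary problem in plain Python. The label vectors are made up for the example. Computing the mean absolute error on hard 0/1 labels is an assumption here, since tools may compute it differently (for instance on predicted class probabilities).

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def cohen_kappa(y_true, y_pred):
    """Agreement beyond chance: (p_observed - p_expected) / (1 - p_expected)."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n ** 2
    if p_expected == 1.0:
        return 0.0  # degenerate case: only one class present on both sides
    return (p_observed - p_expected) / (1 - p_expected)

def mean_absolute_error(y_true, y_pred):
    """Mean absolute difference between numeric labels (assumed 0/1 here)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up labels: 1 = shopper purchased, 0 = shopper did not purchase.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

print("accuracy:", accuracy(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred))
print("kappa:", cohen_kappa(y_true, y_pred))
print("MAE:", mean_absolute_error(y_true, y_pred))
```

Kappa discounts the agreement expected by chance, so it is lower than accuracy whenever the classes are imbalanced. This is one reason studies often report both.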

Comparison of Random Forest and Support Vector Machine for Indonesian Tweet Complaint Classification

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit195628 ◽

2019 ◽

pp. 202-207 ◽

Cited By ~ 1

Author(s):

Desi Ramayanti

Keyword(s):

Support Vector Machine ◽

Random Forest ◽

Text Classification ◽

Research Area ◽

Computational Time ◽

Support Vector ◽

Svm Classifier ◽

Text Documents ◽

Case Organization

In digital business, managers commonly need to process text so that it can support decision-making. The number of text documents containing ideas and opinions keeps growing, and they are difficult to read one by one. If the data are processed and presented correctly using machine learning, however, they can quickly give a general overview of a particular case, organization, or object. Much research has been done in this area, but most studies have concentrated on English text classification. Each language calls for different text classification techniques depending on the characteristics of its grammar, and classification results may differ between languages even when the same algorithm is used. Given the importance of text classification, two algorithms that can be applied are the support vector machine (SVM) and random forest (RF). This research therefore aims to measure the performance of the SVM and RF algorithms in classifying Indonesian text. The SVM classifier with 10-fold cross-validation achieved the best accuracy, 0.9648, with a computational time of 40.118 seconds. The RF classifier with the parameter values 'bootstrap': False, 'min_samples_leaf': 1, 'n_estimators': 10, 'min_samples_split': 3, 'criterion': 'entropy', 'max_features': 3, 'max_depth': None achieved an accuracy of 0.9561 with a computational time of 109.399 seconds.
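The abstract reports both accuracy under 10-fold cross-validation and computational time. A minimal sketch of how such a pair of numbers might be collected is given below. It is not the author's code: the fold construction, the `timed_cross_val` helper and the majority-class placeholder model are all hypothetical, and the toy data stand in for the Indonesian tweets.

```python
import random
import time
from collections import Counter

def k_fold_indices(n, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

def majority_fit_predict(train_x, train_y, test_x):
    """Placeholder model: always predict the most common training label."""
    label = Counter(train_y).most_common(1)[0][0]
    return [label] * len(test_x)

def timed_cross_val(fit_predict, xs, ys, k=10):
    """Return (mean fold accuracy, wall-clock seconds) for one model."""
    start = time.perf_counter()
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        preds = fit_predict(
            [xs[i] for i in train_idx],
            [ys[i] for i in train_idx],
            [xs[i] for i in test_idx],
        )
        hits = sum(p == ys[i] for p, i in zip(preds, test_idx))
        scores.append(hits / len(test_idx))
    return sum(scores) / k, time.perf_counter() - start

# Toy stand-in data: 70 "complaint" items (label 1) and 30 others (label 0).
texts = ["complaint"] * 70 + ["other"] * 30
labels = [1] * 70 + [0] * 30

mean_acc, seconds = timed_cross_val(majority_fit_predict, texts, labels)
print(f"mean accuracy: {mean_acc:.4f}, time: {seconds:.4f} s")
```

Any model exposed through the same `fit_predict(train_x, train_y, test_x)` signature could be passed in, so two classifiers can be timed and scored on identical folds.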

Classification of Cyclooxygenase-2 Inhibitors Using Support Vector Machine and Random Forest Methods

Journal of Chemical Information and Modeling ◽

10.1021/acs.jcim.8b00876 ◽

2019 ◽

Vol 59 (5) ◽

pp. 1988-2008 ◽

Cited By ~ 3

Author(s):

Zijian Qin ◽

Yao Xi ◽

Shengde Zhang ◽

Guiping Tu ◽

Aixia Yan

Keyword(s):

Support Vector Machine ◽

Random Forest ◽

Cyclooxygenase 2 ◽

Support Vector ◽

Cyclooxygenase 2 Inhibitors

A Comparison of WorldView-2 and Landsat 8 Images for the Classification of Forests Affected by Bark Beetle Outbreaks Using a Support Vector Machine and a Neural Network: A Case Study in the Sumava Mountains

Geosciences ◽

10.3390/geosciences9090396 ◽

2019 ◽

Vol 9 (9) ◽

pp. 396 ◽

Cited By ~ 2

Author(s):

Premysl Stych ◽

Barbora Jerabkova ◽

Josef Lastovicka ◽

Martin Riedl ◽

Daniel Paluba

Keyword(s):

Neural Network ◽

Support Vector Machine ◽

Bark Beetle ◽

National Park ◽

Support Vector ◽

Landsat 8 ◽

Kappa Index ◽

Sentinel 2

The objective of this paper is to assess WorldView-2 (WV2) and Landsat OLI (L8) images in the detection of bark beetle outbreaks in the Sumava National Park. WV2 and L8 images were used for the classification of forests infected by bark beetle outbreaks using a Support Vector Machine (SVM) and a Neural Network (NN). After evaluating all the available results, the SVM can be considered the best method used in this study. This classifier achieved the highest overall accuracy and Kappa index for both classified images. In the cases of WV2 and L8, total overall accuracies of 86% and 71% and Kappa indices of 0.84 and 0.66 were achieved with SVM, respectively. The NN algorithm using WV2 also produced very promising results, with over 80% overall accuracy and a Kappa index of 0.79. The methods used in this study may be inspirational for testing other types of satellite data (e.g., Sentinel-2) or other classification algorithms such as the Random Forest Classifier.

Combination of support vector machine, artificial neural network and random forest for improving the classification of convective and stratiform rain using spectral features of SEVIRI data

Atmospheric Research ◽

10.1016/j.atmosres.2017.12.006 ◽

2018 ◽

Vol 203 ◽

pp. 118-129 ◽

Cited By ~ 14

Author(s):

Mourad Lazri ◽

Soltane Ameur

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Support Vector Machine ◽

Random Forest ◽

Support Vector ◽

Spectral Features ◽

Stratiform Rain ◽

Artificial Neural

Entropy based disease classification of proteomic mass spectrometry data of the human serum by a support vector machine

Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005. ◽

10.1109/ijcnn.2005.1555889 ◽

2006 ◽

Author(s):

T. Kristensen ◽

G. Kumar

Keyword(s):

Mass Spectrometry ◽

Support Vector Machine ◽

Human Serum ◽

Disease Classification ◽

Mass Spectrometry Data ◽

Support Vector

SAR and LIDAR Datasets for Building Damage Evaluation Based on Support Vector Machine and Random Forest Algorithms—A Case Study of Kumamoto Earthquake, Japan

Applied Sciences ◽

10.3390/app10248932 ◽

2020 ◽

Vol 10 (24) ◽

pp. 8932

Author(s):

Masoud Hajeb ◽

Sadra Karimzadeh ◽

Masashi Matsuoka

Keyword(s):

Support Vector Machine ◽

Random Forest ◽

Machine Learning Algorithms ◽

Support Vector ◽

Kumamoto Earthquake ◽

Svm Algorithm ◽

Before And After ◽

Elevation Difference ◽

Collapsed Buildings

The evaluation of building damage following disasters from natural hazards is a crucial step in determining the extent of the damage and measuring renovation needs. In this study, a combination of synthetic aperture radar (SAR) and light detection and ranging (LIDAR) data from before and after the earthquake was used to assess the damage to buildings caused by the Kumamoto earthquake. For damage assessment, three variables were considered and their results extracted: elevation difference (ELD) and texture difference (TD) in pre- and post-event LIDAR images, and coherence difference (CD) in SAR images before and after the event. Machine learning algorithms including random forest (RDF) and the support vector machine (SVM) were used to classify and predict the rate of damage. The results showed that the ELD parameter played a key role in identifying the damaged buildings. The SVM algorithm using the ELD parameter and considering three damage rates, namely D0 and D1 (negligible to slight damage), D2, D3 and D4 (moderate to heavy damage) and D5 and D6 (collapsed buildings), provided an overall accuracy of about 87.1%. In addition, for four damage rates, the overall accuracy was about 78.1%.
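The three-class grouping described in the abstract can be sketched as a simple label mapping followed by an overall accuracy calculation. The grade lists below are invented for illustration, the class names are hypothetical, and the four-class scheme is not shown because the abstract does not specify it.

```python
# Three-class grouping of damage grades as stated in the abstract.
# The class names on the right are labels chosen for this sketch.
THREE_CLASS = {
    "D0": "negligible_to_slight",
    "D1": "negligible_to_slight",
    "D2": "moderate_to_heavy",
    "D3": "moderate_to_heavy",
    "D4": "moderate_to_heavy",
    "D5": "collapsed",
    "D6": "collapsed",
}

def regroup(grades, mapping=THREE_CLASS):
    """Map fine-grained damage grades onto the coarser classes."""
    return [mapping[g] for g in grades]

def overall_accuracy(reference, predicted):
    """Share of buildings whose predicted class equals the reference class."""
    return sum(r == p for r, p in zip(reference, predicted)) / len(reference)

# Invented reference and predicted grades for eight buildings.
reference = ["D0", "D1", "D2", "D3", "D4", "D5", "D6", "D2"]
predicted = ["D1", "D0", "D3", "D2", "D5", "D6", "D5", "D1"]

fine = overall_accuracy(reference, predicted)
coarse = overall_accuracy(regroup(reference), regroup(predicted))
print("accuracy on individual grades:", fine)
print("accuracy on three grouped classes:", coarse)
```

In this toy case most errors fall between neighbouring grades, so merging grades raises the overall accuracy. That is consistent with the abstract reporting a higher accuracy for three damage rates than for four.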
