Rice Yield Forecasting using Support Vector Machine

In the domain of soft computing, Support Vector Machines (SVMs) have acquired considerable significance. They are widely used for prediction because of their ability to generalize. This paper describes the development of SVM-based classification models for predicting rice yield in India. Experiments were conducted using the one-against-one multi-class classification method, k-fold cross-validation, and a polynomial kernel function for SVM training. Rice production data for India were sourced from the Directorate of Economics and Statistics, Ministry of Agriculture, Government of India. The best prediction accuracy for the 4-year relative average increase was 75.06%, achieved using 4-fold cross-validation. MATLAB was used for the experiments.
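The evaluation setup named in this abstract (k-fold cross-validation with one-against-one multi-class training) can be sketched in plain Python. This is a minimal illustration, not the authors' MATLAB code: the function names, the contiguous unshuffled folds, and the `fit_and_score` callback are all assumptions made here.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(n_samples, k, fit_and_score):
    """Average a score over k train/test splits.

    fit_and_score(train_idx, test_idx) is a hypothetical callback that
    trains a model on train_idx and returns its score on test_idx.
    """
    folds = k_fold_indices(n_samples, k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        scores.append(fit_and_score(train_idx, test_idx))
    return sum(scores) / k


def one_against_one_pairs(classes):
    """List the class pairs on which a one-against-one scheme trains binary models."""
    return [(a, b) for i, a in enumerate(classes) for b in classes[i + 1:]]
```

With c classes, the one-against-one scheme trains c(c-1)/2 binary classifiers and predicts by voting among them. In practice the sample order would usually be shuffled before the folds are cut.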

Hierarchy-Based File Fragment Classification

Machine Learning and Knowledge Extraction ◽

10.3390/make2030012 ◽

2020 ◽

Vol 2 (3) ◽

pp. 216-232

Author(s):

Manish Bhatt ◽

Avdesh Mishra ◽

Md Wasi Ul Kabir ◽

S. E. Blake-Gatto ◽

Rishav Rajendra ◽

...

Keyword(s):

Cross Validation ◽

Hierarchical Classification ◽

Future Research ◽

Support Vector ◽

Challenging Problem ◽

Fine Grain ◽

Average Accuracy ◽

Vector Machines ◽

Essential Problem ◽

Fold Cross Validation

File fragment classification is an essential problem in digital forensics. Although several attempts have been made to solve this challenging problem, a general solution has not been found. In this work, we propose a hierarchical machine-learning-based approach with optimized support vector machines (SVMs) as the base classifiers for file fragment classification. This approach consists of more general classifiers at the top level and more specialized fine-grain classifiers at the lower levels of the hierarchy. We also propose a primitive taxonomy for file types that can be used to perform hierarchical classification. We evaluate our model on a dataset of 14 file types, with 1000 fragments of 512 bytes from each file type, derived from a subset of the publicly available Digital Corpora (the govdocs1 corpus). Our experiment shows results comparable to the present literature, with an average accuracy of 67.78% and an F1-measure of 65% using 10-fold cross-validation. We then improve on the hierarchy and obtain better results, with an increase in the F1-measure of 1%. Finally, we present our assessment and observations, and conclude the paper by discussing the scope of future research.
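The two-level routing described above (a general classifier first, then a specialized one) can be sketched as follows. The toy rules stand in for trained SVMs and are purely hypothetical, as are the group names and file types; the paper's actual taxonomy and features are not given in the abstract.

```python
def classify_hierarchically(fragment, coarse_classifier, fine_classifiers):
    """Route a fragment through a coarse classifier, then a group-specific one."""
    group = coarse_classifier(fragment)
    return group, fine_classifiers[group](fragment)


# Toy stand-ins for trained classifiers (hypothetical rules, illustration only).
def toy_coarse(fragment):
    """Guess 'text' when over 90% of the bytes are printable ASCII or whitespace."""
    printable = sum(32 <= b < 127 or b in (9, 10, 13) for b in fragment)
    return "text" if printable / len(fragment) > 0.9 else "binary"


def toy_text(fragment):
    """Fine-grain rule for the hypothetical 'text' group."""
    return "html" if b"<" in fragment and b">" in fragment else "txt"


def toy_binary(fragment):
    """Fine-grain rule for the hypothetical 'binary' group."""
    return "jpg" if b"\xff\xd8" in fragment else "other"


FINE = {"text": toy_text, "binary": toy_binary}
```

A call such as `classify_hierarchically(b"<p>hello</p>", toy_coarse, FINE)` returns the group and the fine-grain label. The sketch assumes non-empty fragments.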

Multiclass Kernel Function Evaluation

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.542-543.1438 ◽

2012 ◽

Vol 542-543 ◽

pp. 1438-1442

Author(s):

Ting Hua Wang ◽

Cai Yun Cai ◽

Yan Liao

Keyword(s):

Cross Validation ◽

Selection Criterion ◽

Feature Space ◽

Function Evaluation ◽

Support Vector ◽

Computationally Efficient ◽

Computational Overhead ◽

Vector Machines ◽

Validation Technique ◽

Fold Cross Validation

The kernel is a key component of support vector machines (SVMs) and other kernel methods. Based on the data distributions of classes in the feature space, this paper proposes a model selection criterion to evaluate the goodness of a kernel in a multiclass classification scenario. The criterion is computationally efficient and differentiable with respect to the kernel parameters. Compared with the k-fold cross-validation technique, which is often regarded as a benchmark, the criterion is found to yield about the same performance with much less computational overhead.
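The abstract does not state the criterion itself, so the sketch below does not reproduce it. It only illustrates the general idea of measuring class separation in the feature space using kernel evaluations alone: the squared distance between two class means, which follows from expanding the inner products. The RBF kernel, the `gamma` default, and the function names are assumptions chosen for illustration.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel; gamma is the kernel parameter being tuned."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mean_kernel(A, B, kernel):
    """Average kernel value over all pairs (a, b) with a in A and b in B."""
    return sum(kernel(a, b) for a in A for b in B) / (len(A) * len(B))


def class_mean_distance_sq(A, B, kernel):
    """Squared feature-space distance between the means of classes A and B.

    ||mu_A - mu_B||^2 = mean K(A, A) - 2 * mean K(A, B) + mean K(B, B)
    """
    return (mean_kernel(A, A, kernel)
            - 2 * mean_kernel(A, B, kernel)
            + mean_kernel(B, B, kernel))
```

A quantity like this costs one pass over the kernel matrix, whereas k-fold cross-validation retrains the classifier k times for every candidate parameter value.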

Music Performers Classification by Using Multifractal Features: A Case Study

Archives of Acoustics ◽

10.1515/aoa-2017-0025 ◽

2017 ◽

Vol 42 (2) ◽

pp. 223-233 ◽

Cited By ~ 1

Author(s):

Natasa Reljin ◽

David Pokrajac

Keyword(s):

Cross Validation ◽

Classification Performance ◽

Support Vector ◽

Mel Frequency Cepstral Coefficients ◽

Vector Machines ◽

Characteristic Points ◽

Fold Cross Validation ◽

F Measure ◽

Better Than

In this paper, we investigate whether it is possible to classify different performers playing the same melodies in the same manner, where the performances are subjectively quite similar and very difficult to distinguish even for musically skilled listeners. To resolve this problem, we propose the use of multifractal (MF) analysis, which has proven to be an efficient method for describing and quantifying complex natural structures, phenomena, and signals. We found experimentally that parameters associated with some characteristic points within the MF spectrum can be used as music descriptors, permitting accurate discrimination of music performers. Our approach is tested on a dataset containing the same songs performed by the music group ABBA and by the actors in the movie Mamma Mia. Support vector machines were used as the classifier, and classification performance was evaluated using four-fold cross-validation. The results of the proposed method were compared with those obtained using mel-frequency cepstral coefficients (MFCCs) as descriptors. For the considered two-class problem, an overall accuracy and F-measure higher than 98% were obtained with the MF descriptors, considerably better than with the MFCC descriptors, for which the best results were below 77%.
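For a two-class problem like this one, overall accuracy and F-measure can be computed from the true and predicted labels as sketched below. The function name and label values are hypothetical, and the F-measure here is the usual harmonic mean of precision and recall for a chosen positive class; the abstract does not say which averaging the authors used.

```python
def binary_metrics(y_true, y_pred, positive):
    """Return (accuracy, F-measure) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    return accuracy, 2 * precision * recall / (precision + recall)
```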

Special Issue on Using Machine Learning Algorithms in the Prediction of Kyphosis Disease: A Comparative Study

Applied Sciences ◽

10.3390/app9163322 ◽

2019 ◽

Vol 9 (16) ◽

pp. 3322 ◽

Cited By ~ 2

Author(s):

Stephen Dankwa ◽

Wenfeng Zheng

Keyword(s):

Machine Learning ◽

Support Vector Machines ◽

Cross Validation ◽

Machine Learning Algorithms ◽

Support Vector ◽

Grid Search ◽

Baseline Model ◽

Vector Machines ◽

ANN Models ◽

Fold Cross Validation

Machine learning (ML) is the technology that allows a computer system to learn from its environment through iterative processes and to improve itself from experience. Recently, machine learning has gained massive attention across numerous fields, and it makes it possible to model data very well without requiring strong assumptions about the modeled system. The rise of machine learning has been shown to describe data better by providing both engineering solutions and an important benchmark. In this work, we applied three machine learning algorithms, Random Forest (RF), Support Vector Machines (SVM), and Artificial Neural Network (ANN), to predict kyphosis disease from biomedical data. In the initial stage of the experiments, we performed 5- and 10-fold cross-validation using Logistic Regression as a baseline model to compare with our ML models, without performing grid search. We then evaluated the models and compared their performance based on 5- and 10-fold cross-validation after running grid search on the ML models. For the Support Vector Machines, we experimented with three kernels: linear, Radial Basis Function (RBF), and polynomial. After running grid search, we observed overall model accuracies of 79% to 85% with 5-fold cross-validation and 77% to 86% with 10-fold cross-validation. Under both 5- and 10-fold cross-validation, the RF, SVM-RBF, and ANN models achieved accuracies above 80%. The RF, SVM-RBF, and ANN models outperformed the baseline model under 10-fold cross-validation with grid search. Overall, in terms of accuracy, the ANN model outperformed all the other ML models, achieving 85.19% and 86.42% with 5- and 10-fold cross-validation, respectively. We propose that the RF, SVM-RBF, and ANN models be used to detect and predict kyphosis disease after a patient has undergone surgery.
We suggest that machine learning be adopted as an essential tool across the widest possible range of biomedical questions.
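The grid search step mentioned in this abstract amounts to scoring every combination of candidate hyperparameter values and keeping the best one. A minimal sketch follows; the parameter names and the `score_fn` callback are hypothetical, and in the study's setting the score would be a cross-validated accuracy.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Try every combination in param_grid and return (best_params, best_score).

    param_grid maps a parameter name to a list of candidate values.
    score_fn(params) is a hypothetical callback returning a score to maximize.
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:  # ties keep the earliest combination
            best_params, best_score = params, score
    return best_params, best_score
```

The number of evaluations is the product of the list lengths, so the cost grows quickly as parameters are added to the grid.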

iMPTCE-Hnetwork: A Multilabel Classifier for Identifying Metabolic Pathway Types of Chemicals and Enzymes with a Heterogeneous Network

Computational and Mathematical Methods in Medicine ◽

10.1155/2021/6683051 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Yuanyuan Zhu ◽

Bin Hu ◽

Lei Chen ◽

Qi Dai

Keyword(s):

Metabolic Pathway ◽

Metabolic Pathways ◽

Heterogeneous Network ◽

Cross Validation ◽

Polynomial Kernel ◽

Support Vector ◽

Exact Match ◽

Living Organisms ◽

A Chain ◽

Fold Cross Validation

Metabolic pathways are an important type of biological pathway. They produce the essential molecules and energy that sustain living organisms. Each metabolic pathway consists of a chain of chemical reactions, which always require enzymes. Chemicals and enzymes are thus the two major components of each metabolic pathway. Although several metabolic pathways have been uncovered, the metabolic pathway system is still far from complete, and some chemicals or enzymes belonging to a given pathway remain undiscovered. Besides traditional experiments for detecting hidden chemicals or enzymes, an alternative approach is to design efficient computational methods. In this study, we propose a multilabel classifier, called iMPTCE-Hnetwork, to uniformly assign chemicals and enzymes to the metabolic pathway types reported in KEGG. The classifier adopts embedding features derived from a heterogeneous network, in which chemicals and enzymes are nodes and the interactions between them are edges, using the network embedding algorithm Mashup. The RAndom k-labELsets (RAKEL) algorithm was employed to construct the classifier, with a support vector machine (polynomial kernel) as the base classifier. Ten-fold cross-validation results indicate that the classifier performs well, with accuracy higher than 0.800 and exact match higher than 0.750. Several comparisons were carried out to demonstrate the superiority of iMPTCE-Hnetwork.
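The two multilabel measures reported here can be sketched as follows. The definitions are assumptions, since the abstract does not spell them out: exact match is taken as the fraction of samples whose predicted label set equals the true set, and accuracy as the mean ratio of the intersection to the union of the two sets.

```python
def exact_match(true_sets, pred_sets):
    """Fraction of samples whose predicted label set equals the true set."""
    hits = sum(set(t) == set(p) for t, p in zip(true_sets, pred_sets))
    return hits / len(true_sets)


def multilabel_accuracy(true_sets, pred_sets):
    """Mean of |true & pred| / |true | pred| over samples (assumed definition)."""
    total = 0.0
    for t, p in zip(true_sets, pred_sets):
        t, p = set(t), set(p)
        union = t | p
        # Two empty label sets agree completely, so count them as 1.0.
        total += len(t & p) / len(union) if union else 1.0
    return total / len(true_sets)
```

Exact match is the stricter of the two: a prediction that gets one of two labels right scores 0 on exact match but 0.5 on this accuracy.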

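Comparison of Naive Bayes and Support Vector Machine Algorithms for Sentiment Analysis of Film Reviews (original title: KOMPARASI ALGORITMA NAIVE BAYES DAN SUPPORT VECTOR MACHINE UNTUK ANALISA SENTIMEN REVIEW FILM)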

Jurnal Pilar Nusa Mandiri ◽

10.33480/pilar.v14i2.918 ◽

2018 ◽

Vol 14 (2) ◽

pp. 175

Author(s):

Elly Indrayuni

Keyword(s):

Support Vector Machine ◽

Support Vector Machines ◽

Cross Validation ◽

Opinion Mining ◽

Naive Bayes ◽

Confusion Matrix ◽

Naïve Bayes ◽

Support Vector ◽

Vector Machines ◽

Fold Cross Validation

Film is a subject of interest to a large number of people in social network communities, whose opinions or sentiments differ significantly. Sentiment analysis, or opinion mining, is one solution to the problem of automatically grouping opinions or reviews into positive and negative. The techniques used in this study are Naive Bayes and Support Vector Machines (SVM). Naive Bayes has the advantages of being simple, fast, and highly accurate, while SVM is able to identify a separating hyperplane that maximizes the margin between two different classes. The sentiment classification results in this study consist of two class labels, positive and negative. The resulting accuracy serves as the benchmark for finding the best testing model for sentiment classification. Evaluation was performed using 10-fold cross-validation. Accuracy was measured with a confusion matrix and the ROC curve. The results show an accuracy of 84.50% for the Naive Bayes algorithm, while the Support Vector Machine (SVM) algorithm achieved a higher accuracy of 90.00%.

Combination of Support Vector Machine and K-Fold cross-validation for prediction of long-term degradation of the compressive strength of marine concrete

International Journal of Computational Physics Series ◽

10.29167/a1i1p120-130 ◽

2018 ◽

Vol 1 (1) ◽

pp. 120-130 ◽

Cited By ~ 1

Author(s):

Chunxiang Qian ◽

Wence Kang ◽

Hao Ling ◽

Hua Dong ◽

Chengyao Liang ◽

...

Keyword(s):

Support Vector Machine ◽

Environmental Factors ◽

Cross Validation ◽

Concrete Strength ◽

Simulation Method ◽

Support Vector ◽

SVM Model ◽

Artificial Neural Network (ANN) ◽

Influence Degree ◽

Fold Cross Validation

A Support Vector Machine (SVM) model optimized by K-fold cross-validation was built to predict and evaluate the degradation of concrete strength in a complicated marine environment. Several other models, such as an Artificial Neural Network (ANN) and a Decision Tree (DT), were also built and compared with the SVM to determine which made the most accurate predictions. Both material factors and environmental factors that influence the results were considered. The material factors were mainly the original concrete strength and the amounts of cement replaced by fly ash and slag. The environmental factors were the concentrations of Mg²⁺, SO₄²⁻, and Cl⁻, the temperature, and the exposure time. The prediction results indicated that the optimized SVM model performed better than the other models in predicting concrete strength. Based on the SVM model, a simulation method of variable limitation was used to determine the sensitivity of the various factors and their degree of influence on the degradation of concrete strength.

Performance analysis of support vector machines with polynomial kernel for sentiment polarity identification: A case study in lecturer’s performance questionnaire

Journal of Physics Conference Series ◽

10.1088/1742-6596/1810/1/012033 ◽

2021 ◽

Vol 1810 (1) ◽

pp. 012033

Author(s):

G A Pradnyana ◽

I G M Darmawiguna ◽

D K S Suditresna Jaya ◽

A Sasmita

Keyword(s):

Support Vector Machines ◽

Performance Analysis ◽

Polynomial Kernel ◽

Support Vector ◽

Vector Machines

Persian Handwritten Number Recognition Using Adapted Framing Feature and Support Vector Machines

International Journal of Computational Intelligence and Applications ◽

10.1142/s1469026816500048 ◽

2016 ◽

Vol 15 (01) ◽

pp. 1650004 ◽

Cited By ~ 3

Author(s):

Hedieh Sajedi ◽

Mehran Bahador

Keyword(s):

Support Vector Machines ◽

Recognition Rate ◽

Nearest Neighbors ◽

Polynomial Kernel ◽

Support Vector ◽

K Nearest Neighbors ◽

New Approach ◽

Number Recognition ◽

Vector Machines

In this paper, a new approach for the segmentation and recognition of Persian handwritten numbers is presented. The method combines the framing feature technique with an outer profile feature; we call the result the adapted framing feature. In the proposed approach, segmentation of numbers into digits is carried out automatically. In the classification stage, Support Vector Machines (SVM) and k-Nearest Neighbors (k-NN) are used. Experiments were conducted on the IFHCDB database, consisting of 17,740 numeral images, and the HODA database, consisting of 102,352 numeral images. At the isolated digit level, a recognition rate of 99.27% was achieved on IFHCDB using an SVM with a polynomial kernel, and a recognition rate of 99.07% was achieved on HODA using an SVM with a polynomial kernel. The experiments show that the proposed method yields higher accuracy than previous work.
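The polynomial kernel behind the best results above has a simple closed form, K(x, y) = (gamma * <x, y> + coef0)^degree. The sketch below uses that standard form; the parameter names and default values are assumptions, since the abstract does not report the degree or the other settings used.

```python
def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


def gram_matrix(samples, kernel):
    """Kernel values for every pair of samples, as a list of rows."""
    return [[kernel(a, b) for b in samples] for a in samples]
```

An SVM never needs the expanded polynomial feature vectors themselves, only kernel values such as those in `gram_matrix(samples, polynomial_kernel)`.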

Microbiological Quality Assessment of Chicken Thigh Fillets Using Spectroscopic Sensors and Multivariate Data Analysis

Foods ◽

10.3390/foods10112723 ◽

2021 ◽

Vol 10 (11) ◽

pp. 2723

Author(s):

Evgenia D. Spyrelli ◽

Christina Papachristou ◽

George-John E. Nychas ◽

Efstathios Z. Panagou

Keyword(s):

Support Vector Machines ◽

Discriminant Analysis ◽

Microbiological Quality ◽

Poultry Meat ◽

Support Vector ◽

Classification Models ◽

Pseudomonas spp. ◽

Vector Machines ◽

FT-IR

Fourier transform infrared spectroscopy (FT-IR) and multispectral imaging (MSI) were evaluated for predicting the microbiological quality of poultry meat via regression and classification models. Chicken thigh fillets (n = 402) were subjected to spoilage experiments at eight isothermal and two dynamic temperature profiles. Samples were analyzed microbiologically (total viable counts (TVCs) and Pseudomonas spp.), while MSI and FT-IR spectra were acquired simultaneously. The organoleptic quality of the samples was also evaluated by a sensory panel, establishing a TVC spoilage threshold of 6.99 log CFU/cm². Partial least squares regression (PLS-R) models were employed to assess TVCs and Pseudomonas spp. counts on the chicken surface. Furthermore, classification models (linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), support vector machines (SVMs), and quadratic support vector machines (QSVMs)) were developed to discriminate the samples into two quality classes (fresh vs. spoiled). PLS-R models developed on MSI data predicted TVCs and Pseudomonas spp. counts satisfactorily, with root mean squared error (RMSE) values of 0.987 and 1.215 log CFU/cm², respectively. The SVM model coupled with MSI data exhibited the highest performance, with an overall accuracy of 94.4%, while for FT-IR the best classification was obtained with the QDA model (overall accuracy 71.4%). These results confirm the efficacy of MSI and FT-IR as rapid methods for assessing the quality of poultry products.
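The RMSE figures and the two quality classes above rest on two small calculations, sketched here. The threshold value comes from the abstract; treating a count exactly at the threshold as spoiled is an assumption, as are the function names.

```python
import math

SPOILAGE_THRESHOLD = 6.99  # log CFU/cm², the TVC threshold stated in the abstract


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted counts."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def quality_class(tvc, threshold=SPOILAGE_THRESHOLD):
    """Label a sample by its TVC (assumption: at or above threshold is spoiled)."""
    return "spoiled" if tvc >= threshold else "fresh"
```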
