A Prediction based Cloud Resource Provisioning using SVM

Aim: To develop a machine-learning prediction model based on the Support Vector Machine (SVM). Background: Predicting workload in a cloud environment is one of the primary tasks in provisioning resources. How well future workload requirements are forecast depends on the competency of the prediction technique, which can maximize resource usage in a cloud computing environment. Objective: To reduce the training time of the SVM model. Methods: First, K-Means clustering is applied to the training dataset to form ‘n’ clusters. Then, for every tuple in a cluster, the tuple’s class label is compared with its cluster label. If the two labels are identical, the tuple is correctly classified and would contribute little to the SVM training process, which formulates the separating hyperplane with the lowest generalization error. Otherwise, the tuple is added to the reduced training dataset. This selective addition of tuples for SVM training is carried out for all clusters. The support vectors are a few of the samples in the reduced training dataset that determine the optimal separating hyperplane. Results: On the Google Cluster Trace dataset, the proposed model showed a reduction in training time and Root Mean Square Error, and a marginal increase in the R2 score, compared with the traditional SVM. The model was also tested on Los Alamos National Laboratory’s Mustang and Trinity cluster traces. Conclusion: CloudSim’s CPU utilization (VM and Cloudlet utilization) was measured and found to increase when the same set of tasks was run through the proposed model.
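
The selection step described under Methods can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the abstract does not say how a cluster receives its label, so the sketch assumes the majority class of its members, and the function names, the caller-supplied initial centers and the toy data are all hypothetical. The SVM training that would follow on the reduced set is not shown.

```python
def kmeans(points, centers, iters=20):
    """Plain Lloyd's k-means from caller-supplied initial centers."""
    centers = [tuple(c) for c in centers]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        assign = [
            min(range(len(centers)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; leave empty clusters alone.
        for c in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign


def reduce_training_set(points, labels, centers):
    """Return indices of tuples whose class label differs from their cluster label."""
    assign = kmeans(points, centers)
    cluster_label = {}
    for c in set(assign):
        members = [l for l, a in zip(labels, assign) if a == c]
        # Assumption: a cluster's label is the majority class of its members.
        cluster_label[c] = max(sorted(set(members)), key=members.count)
    return [i for i, (l, a) in enumerate(zip(labels, assign)) if l != cluster_label[a]]


# Hypothetical toy data: two well-separated groups, one off-class tuple in each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = [0, 0, 0, 1, 1, 1, 1, 0]
reduced = reduce_training_set(points, labels, centers=[points[0], points[4]])
print(reduced)  # [3, 7]
```

Only the two tuples that disagree with their cluster's label survive, so an SVM trained on this reduced set would see 2 samples instead of 8.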

A computational model for predicting transmembrane regions of retroviruses

Journal of Bioinformatics and Computational Biology ◽

10.1142/s021972001750010x ◽

2017 ◽

Vol 15 (03) ◽

pp. 1750010 ◽

Cited By ~ 1

Author(s):

Ze Liu ◽

Hongqiang Lv ◽

Jiuqiang Han ◽

Ruiling Liu

Keyword(s):

Selection Criterion ◽

Training Dataset ◽

Evolutionary Information ◽

Support Vector ◽

Hydrophobic Residues ◽

Proposed Model ◽

Svm Model ◽

Feature Importance ◽

Covalently Linked ◽

Env Glycoprotein

The transmembrane region (TR) is a conserved region of the transmembrane (TM) subunit in the envelope (env) glycoprotein of retroviruses. Evidence has shown that the TR is responsible for anchoring the env glycoprotein on the lipid bilayer, and that replacing the TR with a covalently linked lipid anchor abrogates fusion. However, universal software cannot achieve sufficient accuracy, because the TM subunit in env also has several motifs, such as the signal peptide, fusion peptide and immunosuppressive domain, composed largely of hydrophobic residues. In this paper, a support vector machine (SVM)-based model is proposed to identify TRs in retroviruses. First, physicochemical and evolutionary information properties were extracted as original features. Then, feature importance was analyzed with the minimum Redundancy Maximum Relevance (mRMR) feature selection criterion. Our model achieved an Sn of 0.955, Sp of 0.998, ACC of 0.995 and MCC of 0.954 using 10-fold cross-validation on the training dataset. These results suggest that the proposed model can be used to predict TRs in unannotated retroviruses; 11917, 3344, 2, 289 and 6 new putative TRs were found in HERV, HIV, HTLV, SIV and MLV, respectively.
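
For readers unfamiliar with the four reported measures, the sketch below shows how sensitivity (Sn), specificity (Sp), accuracy (ACC) and the Matthews correlation coefficient (MCC) are conventionally computed from a binary confusion matrix. The counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts."""
    sn = tp / (tp + fn)                      # sensitivity: true positive rate
    sp = tn / (tn + fp)                      # specificity: true negative rate
    acc = (tp + tn) / (tp + fp + tn + fn)    # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "ACC": acc, "MCC": mcc}


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=90, fp=20, tn=880, fn=10)
print(metrics)
```

MCC is informative here because positives are rare relative to negatives: ACC and Sp can stay high even when many positives are missed, whereas MCC drops.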

A Forecast Model of the Number of Containers for Containership Voyage

Algorithms ◽

10.3390/a11120193 ◽

2018 ◽

Vol 11 (12) ◽

pp. 193

Author(s):

Yuchuang Wang ◽

Guoyou Shi ◽

Xiaotong Sun

Keyword(s):

Container Terminal ◽

Forecast Model ◽

Gray Relational Analysis ◽

Support Vector ◽

Model Parameters ◽

Container Ship ◽

Kernel Support Vector Machine ◽

Proposed Model ◽

Svm Model ◽

Pass Through

Container ships must pass through multiple ports of call during a voyage. Therefore, forecasting container volume information at the port of origin followed by sending such information to subsequent ports is crucial for container terminal management and container stowage personnel. Numerous factors influence container allocation to container ships for a voyage, and the degree of influence varies, engendering a complex nonlinearity. Therefore, this paper proposes a model based on gray relational analysis (GRA) and mixed kernel support vector machine (SVM) for predicting container allocation to a container ship for a voyage. First, in this model, the weights of influencing factors are determined through GRA. Then, the weighted factors serve as the input of the SVM model, and SVM model parameters are optimized through a genetic algorithm. Numerical simulations revealed that the proposed model could effectively predict the number of containers for container ship voyage and that it exhibited strong generalization ability and high accuracy. Accordingly, this model provides a new method for predicting container volume for a voyage.
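
The GRA weighting step can be illustrated with the textbook form of gray relational analysis. This is a sketch under stated assumptions and not the paper's procedure: min-max normalization, the distinguishing coefficient rho = 0.5, and normalizing the relational grades into weights are common choices that the abstract does not confirm, and the data are hypothetical. It assumes no series is constant and that at least one factor differs from the reference.

```python
def min_max(series):
    """Scale a series to [0, 1]; assumes the series is not constant."""
    lo, hi = min(series), max(series)
    return [(v - lo) / (hi - lo) for v in series]


def gra_weights(reference, factors, rho=0.5):
    """Gray relational grade of each factor against the reference, normalized to sum to 1."""
    ref = min_max(reference)
    diffs = [[abs(r - f) for r, f in zip(ref, min_max(col))] for col in factors]
    d_min = min(min(row) for row in diffs)
    d_max = max(max(row) for row in diffs)
    # Relational coefficient per observation, averaged into one grade per factor.
    grades = [
        sum((d_min + rho * d_max) / (d + rho * d_max) for d in row) / len(row)
        for row in diffs
    ]
    total = sum(grades)
    return [g / total for g in grades]


# Hypothetical data: container counts and two candidate influencing factors.
containers = [10, 20, 30, 40]
factor_a = [1, 2, 3, 4]      # tracks the reference exactly after scaling
factor_b = [4, 1, 3, 2]      # only loosely related
weights = gra_weights(containers, [factor_a, factor_b])
print(weights)
```

The factor that follows the reference more closely receives the larger weight; in the paper's pipeline the weighted factors would then be fed to the SVM.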

Deep Learning Based Biomedical Literature Classification Using Criteria of Scientific Rigor

Electronics ◽

10.3390/electronics9081253 ◽

2020 ◽

Vol 9 (8) ◽

pp. 1253

Author(s):

Muhammad Afzal ◽

Beom Joo Park ◽

Maqbool Hussain ◽

Sungyoung Lee

Keyword(s):

Deep Learning ◽

Clinical Decision Making ◽

Clinical Decision ◽

Biomedical Literature ◽

Training Dataset ◽

Support Vector ◽

Scientific Rigor ◽

Cochrane Reviews ◽

Cumulative Score ◽

Proposed Model

A major obstacle to supporting evidence-based clinical decision-making is accurately and efficiently recognizing appropriate and scientifically rigorous studies in the biomedical literature. We trained a multi-layer perceptron (MLP) model on a dataset with two textual features, title and abstract. The dataset, consisting of 7958 PubMed citations classified into two classes (scientific rigor and non-rigor), is used to train the proposed model. We compare our model with other promising machine learning models such as Support Vector Machine (SVM), Decision Tree, Random Forest, and Gradient Boosted Tree (GBT) approaches. Based on its higher cumulative score, deep learning was chosen and was tested on test datasets obtained by running a set of domain-specific queries. On the training dataset, the proposed deep learning model obtained significantly higher accuracy and AUC (97.3% and 0.993, respectively) than the competitors, but its recall of 95.1% was slightly lower than that of GBT. The trained model sustained its performance on the testing datasets. Unlike previous approaches, the proposed model does not require a human expert to create fresh annotated data; instead, we used studies cited in Cochrane reviews as a surrogate for quality studies in a clinical topic. We find that deep learning methods are beneficial for biomedical literature classification. Not only do such methods minimize the workload in feature engineering, but they also show better performance on large and noisy data.

Predicting Tunnel Squeezing Using Multiclass Support Vector Machines

Advances in Civil Engineering ◽

10.1155/2018/4543984 ◽

2018 ◽

Vol 2018 ◽

pp. 1-12 ◽

Cited By ~ 3

Author(s):

Yang Sun ◽

Xianda Feng ◽

Lingqiang Yang

Keyword(s):

Predictive Accuracy ◽

Average Error ◽

Support Vector ◽

Proposed Model ◽

Svm Model ◽

Vector Machines ◽

Multiclass Support Vector Machine ◽

Multiclass Svm ◽

Tunnel Instability ◽

Weak Rock Masses

Tunnel squeezing is one of the major geological disasters that often occur during the construction of tunnels in weak rock masses subjected to high in situ stresses. It could cause shield jamming, budget overruns, and construction delays and could even lead to tunnel instability and casualties. Therefore, accurate prediction or identification of tunnel squeezing is extremely important in the design and construction of tunnels. This study presents a modified application of a multiclass support vector machine (SVM) to predict tunnel squeezing based on four parameters, that is, diameter (D), buried depth (H), support stiffness (K), and rock tunneling quality index (Q). We compiled a database from the literature, including 117 case histories obtained from different countries such as India, Nepal, and Bhutan, to train the multiclass SVM model. The proposed model was validated using 8-fold cross validation, and the average error percentage was approximately 11.87%. Compared with existing approaches, the proposed multiclass SVM model yields a better performance in predictive accuracy. More importantly, one could estimate the severity of potential squeezing problems based on the predicted squeezing categories/classes.
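
The validation protocol (8 folds over 117 cases, averaging a per-fold error) can be sketched as below. The shuffling, the seed and the `evaluate` callback are assumptions made for illustration; the abstract does not describe how the folds were formed, and the classifier itself is left to the caller.

```python
import random


def kfold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(n, k, evaluate, seed=0):
    """Average the error returned by evaluate(train_idx, test_idx) over k folds."""
    folds = kfold_indices(n, k, seed)
    errors = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th forms the training set.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        errors.append(evaluate(train_idx, test_idx))
    return sum(errors) / k


folds = kfold_indices(117, 8)
fold_sizes = sorted(len(f) for f in folds)
print(fold_sizes)  # [14, 14, 14, 15, 15, 15, 15, 15]
```

Since 117 is not divisible by 8, five folds hold 15 cases and three hold 14. A caller would pass an `evaluate` function that trains the multiclass SVM on `train_idx` and returns the error percentage on `test_idx`.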

Forecasting Dry Bulk Freight Index with Improved SVM

Mathematical Problems in Engineering ◽

10.1155/2014/460684 ◽

2014 ◽

Vol 2014 ◽

pp. 1-12 ◽

Cited By ~ 1

Author(s):

Qianqian Han ◽

Bo Yan ◽

Guobao Ning ◽

B. Yu

Keyword(s):

Wavelet Transform ◽

Support Vector ◽

Data Series ◽

Shipping Industry ◽

Combined Model ◽

Final Model ◽

Proposed Model ◽

Svm Model ◽

The Impact ◽

Short Term Trend

This paper presents an improved SVM model to forecast the dry bulk freight index (BDI), which is a powerful tool for operators and investors to manage the market trend and avoid price risk in the shipping industry. The BDI is influenced by many factors, especially random incidents in the dry bulk market, which makes it difficult to forecast. Therefore, to eliminate the impact of random incidents in the dry bulk market, a wavelet transform is adopted to denoise the BDI data series. A combined model of wavelet transform and support vector machine is then developed to forecast the BDI. The BDI data from 2005 to 2012 are used to test the proposed model. The first 84 consecutive monthly BDI values are the inputs of the model, and the last 12 monthly BDI values are the outputs. The parameters of the model are optimized by a genetic algorithm, and the final model is confirmed through SVM training. The forecasting result of the proposed method is compared with those of three other forecasting methods. The results show that the proposed method has higher accuracy and could be used to forecast the short-term trend of the BDI.
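
The denoise-then-split step can be sketched as follows. The abstract does not name the wavelet family, decomposition level or thresholding rule, so this sketch assumes a one-level Haar transform with soft thresholding. It also assumes the series is denoised before being split. The series values are synthetic.

```python
import math


def haar_denoise(series, threshold):
    """One-level Haar transform, soft-threshold the detail coefficients, then invert."""
    s = math.sqrt(2.0)
    out = []
    for i in range(0, len(series) - 1, 2):
        approx = (series[i] + series[i + 1]) / s
        detail = (series[i] - series[i + 1]) / s
        # Soft thresholding: shrink the detail toward zero by `threshold`.
        detail = math.copysign(max(abs(detail) - threshold, 0.0), detail)
        out += [(approx + detail) / s, (approx - detail) / s]
    if len(series) % 2:
        out.append(series[-1])  # an unpaired last sample is passed through unchanged
    return out


# Synthetic stand-in for 96 monthly values (2005 to 2012): a trend plus alternating noise.
monthly = [1000 + 10 * i + (50 if i % 2 else -50) for i in range(96)]
smooth = haar_denoise(monthly, threshold=100.0)

# 84 months as model inputs, the final 12 months as targets.
inputs, targets = smooth[:84], smooth[84:]
print(len(inputs), len(targets))  # 84 12
```

With a threshold above every detail coefficient, each pair of months collapses to its mean, which removes the alternating noise while keeping the trend. A threshold of 0 reproduces the original series.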

An Effective K-Means Clustering Based SVM Algorithm

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.333-335.1344 ◽

2013 ◽

Vol 333-335 ◽

pp. 1344-1348

Author(s):

Yu Kai Yao ◽

Yang Liu ◽

Zhao Li ◽

Xiao Yun Chen

Keyword(s):

Support Vector ◽

Svm Classifier ◽

Small Subset ◽

Training Set ◽

Data Mining Algorithms ◽

Svm Algorithm ◽

Svm Model ◽

Separating Hyperplane ◽

Regression Problems ◽

Mining Algorithms

Support Vector Machine (SVM) is one of the most popular and effective data mining algorithms. It can be used to solve classification or regression problems and has attracted much attention in recent years. SVM finds the optimal separating hyperplane between classes, which gives it outstanding generalization ability. Usually all the labeled records are used as the training set. However, the optimal separating hyperplane depends only on a few crucial samples (Support Vectors, SVs), so the SVM model need not be trained on the whole training set. In this paper a novel SVM model based on K-means clustering is presented, in which only a small subset of the original training set is selected to constitute the final training set, and the SVM classifier is built by training on these selected samples. This greatly decreases the scale of the training set and effectively reduces the training and prediction cost of SVM, while preserving its generalization performance.

A Novel Hybrid LE and SVM with CV in Intrusion Detection

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.644-650.2572 ◽

2014 ◽

Vol 644-650 ◽

pp. 2572-2576

Author(s):

Qing Liu ◽

Yun Kai Zhang ◽

Qing Ru Li

Keyword(s):

Intrusion Detection ◽

False Positive Rate ◽

Likelihood Estimation ◽

Support Vector ◽

Training Time ◽

Detection Algorithms ◽

Lower False Positive Rate ◽

Proposed Model ◽

Positive Rate ◽

Rbf Kernel

A support vector machine (SVM) model that combines Laplacian Eigenmaps (LE) with Cross Validation (CV) is proposed for intrusion detection. In the proposed model, a classifier is adopted to estimate whether an action is an attack or not. Maximum Likelihood Estimation (MLE) is used to estimate the intrinsic dimension, and LE is used as a preprocessor for the SVM to reduce the dimension of the feature vectors so that training time is shortened. To improve the performance of the SVM, CV is used to optimize the parameters of the SVM with the RBF kernel function. Compared with other detection algorithms, the experimental results show that the proposed model has the following advantages: shorter training time, higher accuracy and a lower false positive rate.
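
As a reminder of what the tuned parameter controls, the RBF (Gaussian) kernel in its usual form is K(x, y) = exp(-gamma * ||x - y||^2). The sketch below evaluates it for a few hypothetical gamma values of the kind a cross-validation search might try. The candidate values are illustrative and are not the paper's grid.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel value exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Hypothetical candidate values that a cross-validation search could compare.
x, y = (0.0, 0.0), (1.0, 1.0)
for gamma in (0.01, 0.5, 10.0):
    print(gamma, rbf_kernel(x, y, gamma))
```

A small gamma makes distant points look similar (a smoother decision boundary), while a large gamma makes the kernel very local, which is why this parameter is usually chosen by cross-validation together with the penalty parameter C.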

Open-Source Essential Protein Prediction Model by Integrating Chi-Square and Support Vector Machine

International Journal of Open Source Software and Processes ◽

10.4018/ijossp.2020070103 ◽

2020 ◽

Vol 11 (3) ◽

pp. 38-56

Author(s):

S. R. Mani Sekhar ◽

Siddesh G. M. ◽

Sunilkumar S. Manvi

Keyword(s):

Support Vector Machine ◽

Open Source ◽

Vital Role ◽

Support Vector ◽

Chi Square ◽

Essential Proteins ◽

Topological Features ◽

Proposed Model ◽

Protein Prediction ◽

Svm Model

Identification and analysis of proteins play a vital role in drug design and disease prediction. Several open-source applications have been developed for identifying essential proteins based on biological or topological features. These techniques infer the likelihood that a protein is essential by using network topology and feature selection, which can ignore some features to reduce complexity and, consequently, results in lower accuracy. In this paper, the authors used the Selenium driver to scrape the dataset. The authors then integrated the chi-square method with a support vector machine for the prediction of essential proteins in baker's yeast. Here, chi-square is a test of dissimilarity used for altering the record, and afterward the support vector machine is used to classify the test dataset. The results show that the proposed Chi-SVM model achieves an accuracy of 99.56%, whereas BC and CC achieved accuracies of 84.0% and 86.0%. Finally, the proposed model is validated using statistical performance measures such as PPA, NPA, SA, and STA.
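
The abstract does not spell out how chi-square is applied, so the following shows only the generic Pearson chi-square statistic on a contingency table, as it is commonly used to score how strongly a feature is associated with a class label. The table layout (feature present or absent versus essential or non-essential) and the counts are hypothetical.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows = feature present/absent, columns = essential/non-essential.
associated = chi_square([[30, 10], [10, 30]])
independent = chi_square([[20, 20], [20, 20]])
print(associated, independent)  # 20.0 0.0
```

A larger statistic indicates a stronger dependence between the feature and the class, so features could be ranked by this score before the SVM is trained.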

Apple quality identification and classification by image processing based on convolutional neural networks

Scientific Reports ◽

10.1038/s41598-021-96103-2 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Yanfei Li ◽

Xianying Feng ◽

Yandong Liu ◽

Xingchang Han

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Support Vector ◽

Svm Classifier ◽

Test Accuracy ◽

Training Time ◽

Proposed Model ◽

Specific Complex ◽

Occurrence Matrix ◽

Apple Quality

This work researched apple quality identification and classification from real images containing complicated disturbance information (the background was similar to the surface of the apples). This paper proposed a novel model based on convolutional neural networks (CNN) aimed at accurate and fast grading of apple quality. Specific, complex, and useful image characteristics for detection and classification were captured by the proposed model. Compared with existing methods, the proposed model could better learn high-order features of two adjacent layers that were not in the same channel but were closely related. The proposed model was trained and validated, with best training and validation accuracies of 99% and 98.98% at the 2590th and 3000th steps, respectively. The overall accuracy of the proposed model, tested on an independent dataset of 300 apples, was 95.33%. The results showed that the training accuracy, overall test accuracy and training time of the proposed model were better than those of the Google Inception v3 model and of a traditional image processing method based on histogram of oriented gradient (HOG) and gray level co-occurrence matrix (GLCM) feature merging with a support vector machine (SVM) classifier. The proposed model has great potential in apple quality detection and classification.

Prediction of Classification of Rock Burst Risk Based on Genetic Algorithms with SVM

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.628.383 ◽

2014 ◽

Vol 628 ◽

pp. 383-389 ◽

Cited By ~ 6

Author(s):

Ya Hui Peng ◽

Kang Peng ◽

Jian Zhou ◽

Zhi Xiang Liu

Keyword(s):

Support Vector Machine ◽

Genetic Algorithms ◽

Rock Burst ◽

Support Vector ◽

Proposed Model ◽

Svm Model ◽

Rock Burst Risk ◽

Rock Burst Prediction ◽

Optimization Search

Due to the complex features of rock burst hazard assessment systems, this study established a support vector machine (SVM) model for predicting rock burst classification, based on SVM theory and the actual characteristics of the project. The main factors of rock burst, such as coal seam, dip, buried depth, structure situation, change of pitch angle, change of coal thickness, gas concentration, roof management, pressure relief and shooting, were defined as the criterion indices for rock burst prediction in the proposed model. To determine the SVM parameters reasonably and efficiently, an appropriate fitness function for the genetic algorithm (GA) was first determined, and the optimal parameters of the SVM model were then selected by a real-coded GA; in this way, the genetic algorithm and support vector machine (GSVM) model was established. The GSVM model was obtained by training on 23 sets of measured data, and cross-validation was introduced to verify the stability of the GSVM model; the misdiscrimination ratio was 0. Moreover, the proposed model was used to predict rock burst for 12 new samples; the correct rate of the predictions was 91.6667%, and the predictions are consistent with the actual situation. The results show that the genetic algorithm can speed up the SVM parameter optimization search and that the proposed model has high credibility in rock burst risk classification and can be applied to practical engineering.
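
A real-coded GA for parameter search can be sketched as below. The operators (elitist selection of the fitter half, arithmetic crossover, clipped Gaussian mutation), the population settings and the search bounds are assumptions for illustration. The fitness function is a hypothetical stand-in for cross-validated SVM accuracy, which the paper would compute by actually training the model.

```python
import random


def real_coded_ga(fitness, bounds, pop_size=20, generations=30, seed=0):
    """Maximize `fitness` over real-valued genes constrained to `bounds`."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]  # the fitter half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            w = rng.random()
            # Arithmetic crossover: a random convex combination of two parents.
            child = [w * a + (1 - w) * b for a, b in zip(p1, p2)]
            # Gaussian mutation, clipped back into the search bounds.
            child = [
                min(hi, max(lo, g + rng.gauss(0.0, 0.1 * (hi - lo))))
                for g, (lo, hi) in zip(child, bounds)
            ]
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)


def stand_in_fitness(genes):
    """Hypothetical surrogate for cross-validated accuracy, peaking at log10(C)=1, log10(gamma)=-2."""
    log_c, log_gamma = genes
    return -((log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2)


# Genes are log10(C) and log10(gamma); the bounds are illustrative.
bounds = [(-2.0, 3.0), (-4.0, 1.0)]
best = real_coded_ga(stand_in_fitness, bounds)
print(best, stand_in_fitness(best))
```

Because the fitter half is carried over unchanged, the best fitness never decreases from one generation to the next. Searching in log space is a common convention for C and gamma, not something the abstract states.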
