PSO-tuned support vector machine metamodels for assessment of turbulent flows in pipe bends

Purpose Computational fluid dynamics (CFD) is the most commonly used numerical approach for simulating fluid flow behaviour. Owing to their computationally cost-intensive nature, CFD models may not be easily and quickly deployable. In this regard, this study aims to present a support vector machine (SVM)-based metamodelling approach that can be easily trained and quickly deployed for carrying out large-scale studies. Design/methodology/approach A radial basis function and the ε-insensitive loss function are used as the kernel function and loss function, respectively. To prevent overfitting of the model, the five-fold cross-validation root mean squared error is used while training the SVM metamodel. Rather than blindly choosing the SVM tuning parameters, particle swarm optimisation (PSO) is used to fine-tune them. The developed SVM metamodel is tested using various error metrics on disjoint test data. Findings Using the SVM metamodel and the CFD simulation data set, a parametric study is conducted to understand the effect of various factors influencing the behaviour of the turbulent fluid flow in the pipe bend. Based on the parametric study, the diametric position has the greatest effect on the dimensionless axial velocity, whereas the Reynolds number has the least effect. Originality/value This paper provides an effective PSO-tuned SVM metamodelling approach, which may be used as a significant cost-saving approach to quickly and accurately estimate fluid flow characteristics that, in general, require the use of expensive CFD models.
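
The tuning loop described in this abstract can be sketched as follows. This is a minimal global-best PSO under stated assumptions, not the authors' implementation: the swarm settings, the search bounds and the function `placeholder_cv_rmse` are all hypothetical. The placeholder is a smooth bowl that stands in for "five-fold cross-validation RMSE of an SVM trained with these parameters", so the example runs without any SVM library.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_cog=1.5, c_soc=1.5, seed=0):
    """Minimise objective(position) inside box bounds with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # personal best positions
    best_val = [objective(p) for p in pos]       # personal best scores
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]   # global best so far
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_cog * r1 * (best_pos[i][d] - pos[i][d])
                             + c_soc * r2 * (g_pos[d] - pos[i][d]))
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def placeholder_cv_rmse(params):
    # Hypothetical stand-in for a cross-validated RMSE; its minimum is placed
    # arbitrarily at log10(C) = 1, log10(gamma) = -1, epsilon = 0.1.
    log_c, log_gamma, eps = params
    return ((log_c - 1.0) ** 2 + (log_gamma + 1.0) ** 2 + (eps - 0.1) ** 2) ** 0.5


# Assumed search ranges for log10(C), log10(gamma) and epsilon.
bounds = [(-2.0, 3.0), (-4.0, 1.0), (0.0, 1.0)]
best_params, best_score = pso_minimize(placeholder_cv_rmse, bounds)
```

In an actual study the placeholder would be replaced by a routine that trains the RBF-kernel SVM on four folds, scores it on the fifth, and returns the RMSE averaged over the five splits; searching C and gamma on a log scale is a common convention rather than something the abstract states.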

Large-Scale Twin Parametric Support Vector Machine Using Pinball Loss Function

IEEE Transactions on Systems Man and Cybernetics Systems ◽

10.1109/tsmc.2019.2896642 ◽

2019 ◽

pp. 1-17 ◽

Cited By ~ 4

Author(s):

Sweta Sharma ◽

Reshma Rastogi ◽

Suresh Chandra

Keyword(s):

Support Vector Machine ◽

Loss Function ◽

Large Scale ◽

Support Vector ◽

Pinball Loss

Artificial bee colony algorithm for feature selection and improved support vector machine for text classification

Information Discovery and Delivery ◽

10.1108/idd-09-2018-0045 ◽

2019 ◽

Vol 47 (3) ◽

pp. 154-170

Author(s):

Janani Balakumar ◽

S. Vijayarani Mohan

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Text Classification ◽

Support Vector ◽

Data Sets ◽

Selection Algorithm ◽

Data Set ◽

Content Type ◽

Benchmark Data ◽

Bee Colony

Purpose Owing to the huge volume of documents available on the internet, text classification has become a necessary task for handling these documents. To achieve optimal text classification results, feature selection, an important stage, is used to curtail the dimensionality of text documents by choosing suitable features. The main purpose of this research work is to classify personal computer documents based on their content. Design/methodology/approach This paper proposes a new algorithm for feature selection based on artificial bee colony (ABCFS) to enhance text classification accuracy. The proposed algorithm (ABCFS) is evaluated on real and benchmark data sets and compared with existing feature selection approaches such as information gain and the χ² statistic. To justify the efficiency of the proposed algorithm, the support vector machine (SVM) and an improved SVM classifier are used in this paper. Findings The experiment was conducted on real and benchmark data sets. The real data set was collected in the form of documents that were stored on a personal computer, and the benchmark data set was collected from the Reuters and 20 Newsgroups corpora. The results demonstrate the performance of the proposed feature selection algorithm, which enhances text document classification accuracy. Originality/value This paper proposes a new ABCFS algorithm for feature selection, evaluates the efficiency of the ABCFS algorithm and improves the support vector machine. In this paper, the ABCFS algorithm is used to select the features from text (unstructured) documents. Although there is no text feature selection algorithm in the existing work, the ABCFS algorithm is used to select the data (structured) features. The proposed algorithm will classify the documents automatically based on their content.
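
For orientation, the χ² statistic named above as one of the existing feature selection approaches can be sketched as below. This is the standard 2x2 contingency form of the score for one term and one class; it illustrates the baseline only and is not the ABCFS algorithm. The function name and the document counts are hypothetical.

```python
def chi_square(in_class_with, out_class_with, in_class_without, out_class_without):
    """Chi-square score of one term for one class from a 2x2 table of document counts.

    in_class_with:     documents of the class that contain the term
    out_class_with:    documents outside the class that contain the term
    in_class_without:  documents of the class that lack the term
    out_class_without: documents outside the class that lack the term
    """
    a, b, c, d = in_class_with, out_class_with, in_class_without, out_class_without
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # a row or column is empty, so the term carries no evidence
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical counts: a term concentrated in one class versus a term spread evenly.
discriminative_score = chi_square(40, 10, 10, 40)
uninformative_score = chi_square(25, 25, 25, 25)
```

A filter method of this kind ranks terms by score and keeps the highest-scoring ones, whereas a swarm-based method searches over feature subsets.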

Stochastic Subgradient for Large-Scale Support Vector Machine Using the Generalized Pinball Loss Function

Symmetry ◽

10.3390/sym13091652 ◽

2021 ◽

Vol 13 (9) ◽

pp. 1652

Author(s):

Wanida Panup ◽

Rabian Wangkeeree

Keyword(s):

Support Vector Machine ◽

Loss Function ◽

Gradient Descent ◽

Large Scale ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Support Vector ◽

Hinge Loss ◽

Gradient Descent Algorithm ◽

Pinball Loss

In this paper, we propose a stochastic gradient descent algorithm, called stochastic gradient descent method-based generalized pinball support vector machine (SG-GPSVM), to solve data classification problems. This approach was developed by replacing the hinge loss function in the conventional support vector machine (SVM) with a generalized pinball loss function. We show that SG-GPSVM is convergent and that it approximates the conventional generalized pinball support vector machine (GPSVM). Further, the symmetric kernel method was adopted to evaluate the performance of SG-GPSVM as a nonlinear classifier. Our suggested algorithm surpasses existing methods in terms of noise insensitivity, resampling stability, and accuracy for large-scale data scenarios, according to the experimental results.
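
The idea of swapping the hinge loss for a pinball-type loss and training by stochastic subgradient steps can be sketched as follows. This is a simplified illustration under stated assumptions, not SG-GPSVM itself: it uses the standard pinball loss rather than the generalized one, a linear model with the bias folded in as a constant feature, and a 1/(λt) step size chosen here for convenience. All names, parameter values and data are hypothetical.

```python
import random


def hinge_loss(margin):
    """Hinge loss max(0, 1 - margin), where margin = y * f(x)."""
    return max(0.0, 1.0 - margin)


def pinball_loss(margin, tau):
    """Standard pinball loss on u = 1 - margin: u if u >= 0, otherwise -tau * u."""
    u = 1.0 - margin
    return u if u >= 0 else -tau * u


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def sgd_pinball_svm(X, y, tau=0.5, lam=0.1, n_iters=2000, seed=0):
    """Stochastic subgradient descent on lam/2 * ||w||^2 + pinball(1 - y * w.x)."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for t in range(1, n_iters + 1):
        i = rng.randrange(len(X))                    # draw one training sample
        u = 1.0 - y[i] * dot(w, X[i])
        slope = 1.0 if u >= 0 else -tau              # subgradient of the loss w.r.t. u
        eta = 1.0 / (lam * t)                        # decaying step size
        # Since du/dw = -y * x, the subgradient of the objective is lam * w - slope * y * x.
        w = [wj - eta * (lam * wj - slope * y[i] * xj) for wj, xj in zip(w, X[i])]
    return w


# Toy data: one real feature plus a constant 1.0 that plays the role of the bias.
X = [[2.0, 1.0], [3.0, 1.0], [-2.0, 1.0], [-3.0, 1.0]]
y = [1, 1, -1, -1]
weights = sgd_pinball_svm(X, y)
```

Unlike the hinge loss, the pinball loss also penalises samples with margin above 1 (with slope tau), which is the usual explanation for its reduced sensitivity to noise near the decision boundary.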

Cluster Reduction Support Vector Machine for Large-Scale Data Set Classification

2008 IEEE Pacific-Asia Workshop on Computational Intelligence and Industrial Application ◽

10.1109/paciia.2008.43 ◽

2008 ◽

Author(s):

Guangxi Chen ◽

Yan Cheng ◽

Jian Xu

Keyword(s):

Support Vector Machine ◽

Large Scale ◽

Support Vector ◽

Data Set ◽

Large Scale Data ◽

Scale Data ◽

Cluster Reduction

A Novel Surface Reconstruction Method for Noisy Cloud Points Based on Support Vector Machine

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.950.139 ◽

2014 ◽

Vol 950 ◽

pp. 139-144

Author(s):

Da Li Yu

Keyword(s):

Support Vector Machine ◽

Theoretical Analysis ◽

Surface Reconstruction ◽

Large Scale ◽

Support Vector ◽

Reconstruction Method ◽

Data Set ◽

Running Time ◽

Cloud Points ◽

Surface Construction

This study proposes a novel surface reconstruction method. Surface reconstruction based on the support vector machine (SVM) is a hot topic in the field of 3D surface construction, but the method is difficult to apply to noisy cloud points and its running time is very long. In this paper, fuzzy c-means (FCM) is first used to delete the large-scale noise; a feature-preserving non-uniform simplification method for cloud points is then presented, which simplifies the data set to remove redundancy while retaining the features of the model. Finally, the surface is reconstructed from the simplified data using SVM. Both theoretical analysis and experimental results show that, after the simplification, the performance of SVM-based surface reconstruction is greatly improved and the details of the surface are well preserved.

A hybrid firefly and support vector machine classifier for phishing email detection

Kybernetes ◽

10.1108/k-07-2014-0129 ◽

2016 ◽

Vol 45 (6) ◽

pp. 977-994 ◽

Cited By ~ 10

Author(s):

Oluyinka Aderemi Adewumi ◽

Ayobami Andronicus Akinyelu

Keyword(s):

Support Vector Machine ◽

False Positive Rate ◽

False Negative ◽

False Negative Rate ◽

Support Vector ◽

Data Set ◽

Content Type ◽

Phishing Attacks ◽

Positive Rate ◽

E Mail

Purpose – Phishing is one of the major challenges faced by the world of e-commerce today. Owing to phishing attacks, billions of dollars have been lost by many companies and individuals. The global impact of phishing attacks will continue to increase, and thus a more efficient phishing detection technique is required. The purpose of this paper is to investigate and report the use of a nature-inspired machine learning (ML) approach in the classification of phishing e-mails. Design/methodology/approach – ML-based techniques have been shown to be efficient in detecting phishing attacks. In this paper, the firefly algorithm (FFA) was integrated with the support vector machine (SVM) with the primary aim of developing an improved phishing e-mail classifier (known as FFA_SVM), capable of accurately detecting new phishing patterns as they occur. From a data set consisting of 4,000 phishing and ham e-mails, a set of features suitable for phishing e-mail detection was extracted and used to construct the hybrid classifier. Findings – The FFA_SVM was applied to a data set consisting of up to 4,000 phishing and ham e-mails. Simulation experiments were performed to evaluate and compare the performance of the classifier. The tests yielded a classification accuracy of 99.94 percent, a false positive rate of 0.06 percent and a false negative rate of 0.04 percent. Originality/value – To the best of the authors’ knowledge, the hybrid algorithm has not previously been applied, as in this work, to the classification and detection of phishing e-mail.
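
The three rates reported above follow from a confusion matrix in the usual way. The sketch below assumes that phishing is the positive class and uses hypothetical counts; it does not reproduce the paper's data.

```python
def classification_rates(tp, fp, tn, fn):
    """Accuracy, false positive rate and false negative rate from confusion counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "false_positive_rate": fp / (fp + tn),   # ham e-mails flagged as phishing
        "false_negative_rate": fn / (fn + tp),   # phishing e-mails that were missed
    }


# Hypothetical counts for illustration only.
rates = classification_rates(tp=95, fp=10, tn=890, fn=5)
```

The false positive and false negative rates are each normalised by their own class size, so they need not sum with the accuracy to 100 percent.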

Induction machine stator short-circuit fault detection using support vector machine

COMPEL The International Journal for Computation and Mathematics in Electrical and Electronic Engineering ◽

10.1108/compel-06-2020-0208 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Saddam Bensaoucha ◽

Youcef Brik ◽

Sandrine Moreau ◽

Sid Ahmed Bessedik ◽

Aissa Ameur

Keyword(s):

Neural Networks ◽

Support Vector Machine ◽

Short Circuit ◽

Induction Machine ◽

Support Vector ◽

Data Set ◽

Content Type ◽

Three Phase ◽

Learning Machine ◽

Short Circuit Fault

Purpose This paper presents a study on detecting and locating inter-turn short-circuit (ITSC) faults in a three-phase induction motor (IM) using the support vector machine (SVM). The characteristics extracted from the analysis of the phase shifts between the stator currents and their corresponding voltages are used as inputs to train the SVM. The latter automatically decides on the IM state, either a healthy motor or a short-circuit fault on one of its three phases. Design/methodology/approach To evaluate the performance of the SVM, three supervised machine learning algorithms, namely, multi-layer perceptron neural networks (MLPNNs), radial basis function neural networks (RBFNNs) and the extreme learning machine (ELM), are used along with the SVM in this study. Thus, all classifiers (SVM, MLPNN, RBFNN and ELM) are tested and the results are compared on the same data set. Findings The obtained results showed that the SVM outperforms the MLPNN, RBFNN and ELM in diagnosing the health status of the IM. In particular, the SVM provides excellent performance because it is able to detect a fault of two short-circuited turns (early detection) when the IM is operating under a low load. Originality/value The originality of this work lies in using the SVM algorithm, with the phase shifts between the stator currents and their voltages as inputs, to detect and locate the ITSC fault.

A Comparative Study of Energy Big Data Analysis for Product Management in a Smart Factory

Journal of Organizational and End User Computing ◽

10.4018/joeuc.291559 ◽

2022 ◽

Vol 34 (2) ◽

pp. 1-17

Author(s):

Rahman A. B. M. Salman ◽

Lee Myeongbae ◽

Lim Jonghyun ◽

Yongyun Cho ◽

Shin Changsun

Keyword(s):

Economic Growth ◽

Support Vector Machine ◽

Coefficient Of Variation ◽

Prediction Models ◽

Mean Squared Error ◽

Training Data ◽

Smart Factory ◽

Support Vector ◽

Product Management ◽

Data Set

Energy is regarded as one of the key inputs for a country's economic growth and social development. Analysis and modeling of industrial energy are currently a time-intensive process because more and more energy is consumed for economic growth in a smart factory. This study aims to present and analyse the predictive models of the data-driven system to be used by appliances and to find the most significant product item. With repeated cross-validation, three statistical models were trained and then tested on a test set: 1) General Linear Regression Model (GLM), 2) Support Vector Machine (SVM), and 3) Boosting Tree (BT). The performance of the prediction models was measured by R2, Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Coefficient of Variation (CV). The best model in the study is the Support Vector Machine (SVM), which provided an R2 of 0.86 for the training data set and 0.85 for the testing data set with a low coefficient of variation, and the most significant product of this smart factory is Skelp.
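
The four performance measures listed above can be computed as in the sketch below. The abstract does not define its coefficient of variation, so the form used here (RMSE divided by the mean observed value) is an assumption, and the sample values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE and CV (assumed here to be RMSE / mean of y_true)."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    sse = sum(r * r for r in residuals)                  # residual sum of squares
    sst = sum((t - mean_true) ** 2 for t in y_true)      # total sum of squares
    rmse = math.sqrt(sse / n)
    return {
        "r2": 1.0 - sse / sst,
        "rmse": rmse,
        "mae": sum(abs(r) for r in residuals) / n,
        "cv": rmse / mean_true,
    }


# Hypothetical observed and predicted energy values.
metrics = regression_metrics([2.0, 4.0, 6.0, 8.0], [3.0, 4.0, 5.0, 8.0])
```

Reporting R2 on both the training and the testing data, as the abstract does, gives a quick check on overfitting: a large gap between the two would suggest the model memorised the training set.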

What books will be your bestseller? A machine learning approach with Amazon Kindle

The Electronic Library ◽

10.1108/el-08-2020-0234 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Seungpeel Lee ◽

Honggeun Ji ◽

Jina Kim ◽

Eunil Park

Keyword(s):

Machine Learning ◽

Random Forest ◽

User Satisfaction ◽

Large Scale ◽

Recommendation Systems ◽

Feature Representation ◽

Support Vector ◽

Data Set ◽

Content Type ◽

Online Stores

Purpose With the rapid increase in internet use, most people tend to purchase books through online stores. Several such stores also provide book recommendations for buyer convenience, and both collaborative and content-based filtering approaches have been widely used for building these recommendation systems. However, both approaches have significant limitations, including cold start and data sparsity. To overcome these limitations, this study aims to investigate whether user satisfaction can be predicted based on easily accessible book descriptions. Design/methodology/approach The authors collected a large-scale Kindle Books data set containing book descriptions and ratings, and calculated whether a specific book will receive a high rating. For this purpose, several feature representation methods (bag-of-words, term frequency–inverse document frequency [TF-IDF] and Word2vec) and machine learning classifiers (logistic regression, random forest, naive Bayes and support vector machine) were used. Findings The used classifiers show substantial accuracy in predicting reader satisfaction. Among them, the random forest classifier combined with the TF-IDF feature representation method exhibited the highest accuracy at 96.09%. Originality/value This study revealed that user satisfaction can be predicted based on book descriptions and shed light on the limitations of existing recommendation systems. Further, both practical and theoretical implications have been discussed.
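
Of the feature representation methods listed above, TF-IDF is the simplest to show in a few lines. The sketch uses one common variant (relative term frequency multiplied by the natural log of the inverse document frequency); the exact weighting and tokenisation used in the study are not stated in the abstract, and the three descriptions are invented for illustration.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: weight} dictionary per document.

    weight = (count of term in doc / doc length) * ln(number of docs / docs containing term)
    """
    tokenised = [doc.lower().split() for doc in documents]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term appears at least once.
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical book descriptions.
descriptions = [
    "a thrilling mystery novel",
    "a gentle romance novel",
    "a thrilling space adventure",
]
vectors = tfidf(descriptions)
```

A term that occurs in every description, such as "a" here, receives weight zero, while a term unique to one description receives the highest weight; the resulting vectors are what a classifier such as a random forest would take as input.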

A Computational Method for the Identification of Endolysins and Autolysins

Protein and Peptide Letters ◽

10.2174/0929866526666191002104735 ◽

2020 ◽

Vol 27 (4) ◽

pp. 329-336 ◽

Cited By ~ 1

Author(s):

Lei Xu ◽

Guangmin Liang ◽

Baowen Chen ◽

Xu Tan ◽

Huaikun Xiang ◽

...

Keyword(s):

Support Vector Machine ◽

Cell Wall ◽

Experimental Results ◽

Computational Method ◽

Lytic Enzyme ◽

Support Vector ◽

Lytic Enzymes ◽

Data Set ◽

Optimal Feature ◽

Better Than

Background: Cell lytic enzymes are highly evolved proteins that can destroy the cell structure and kill bacteria. Compared with antibiotics, cell lytic enzymes do not cause a serious problem of drug resistance in pathogenic bacteria. Thus, the study of cell wall lytic enzymes aims at finding an efficient way of curing bacterial infections. With the use of antibiotics, the problem of drug resistance becomes more serious. Therefore, using cell lytic enzymes is a good choice for curing bacterial infections. Cell lytic enzymes include endolysins and autolysins, and the difference between them is the purpose for which the cell wall is broken. Identifying the type of a cell lytic enzyme is meaningful for the study of cell wall enzymes. Objective: In this article, our motivation is to predict the type of cell lytic enzyme. Cell lytic enzymes are helpful for killing bacteria, so it is meaningful to study their type. However, it is time consuming to detect the type of cell lytic enzyme by experimental methods. Thus, an efficient computational method for predicting the type of cell lytic enzyme is proposed in our work. Method: We propose a computational method for the prediction of endolysins and autolysins. First, a data set containing 27 endolysins and 41 autolysins is built. Then each protein is represented by its tripeptide composition. The features with a larger confidence degree are selected. Finally, the classifier is trained on the labeled vectors using a support vector machine. The learned classifier is used to predict the type of cell lytic enzyme. Results: Following the proposed method, the experimental results show that the overall accuracy can attain 97.06% when 44 features are selected. Compared with Ding's method, our method improves the overall accuracy by nearly 4.5% ((97.06-92.9)/92.9). The performance of our proposed method is stable when the number of selected features is between 40 and 70.
The overall accuracy of the tripeptide optimal feature set is 94.12%, and the overall accuracy of Chou's amphiphilic PseAAC method is 76.2%. The experimental results also demonstrate that the overall accuracy is improved by nearly 18 percentage points when using the tripeptide optimal feature set. Conclusion: The paper proposes an efficient method for identifying endolysins and autolysins. In this paper, a support vector machine is used to predict the type of cell lytic enzyme. The experimental results show that the overall accuracy of the proposed method is 94.12%, which is better than some existing methods. In conclusion, the selected 44 features can improve the overall accuracy of identifying the type of cell lytic enzyme. The support vector machine performs better than other classifiers when using the selected feature set on the benchmark data set.
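
The tripeptide composition mentioned in the Method part can be sketched as follows, together with the relative-improvement arithmetic quoted in the Results. This is an illustration under stated assumptions: the 20-letter amino acid alphabet, overlapping windows and the rule of skipping windows that contain other symbols are choices made here, not details confirmed by the abstract, and the example sequence is invented.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (assumed alphabet)
TRIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]  # 8000 features
INDEX = {tripeptide: i for i, tripeptide in enumerate(TRIPEPTIDES)}


def tripeptide_composition(sequence):
    """Frequency of each tripeptide among the overlapping 3-residue windows."""
    vector = [0.0] * len(TRIPEPTIDES)
    windows = [sequence[i:i + 3] for i in range(len(sequence) - 2)]
    valid = [w for w in windows if w in INDEX]  # skip windows with non-standard symbols
    for window in valid:
        vector[INDEX[window]] += 1.0 / len(valid)
    return vector


def relative_gain(new_accuracy, old_accuracy):
    """Relative improvement, as in (97.06 - 92.9) / 92.9 from the abstract."""
    return (new_accuracy - old_accuracy) / old_accuracy


features = tripeptide_composition("ACDACD")   # hypothetical toy sequence
gain = relative_gain(97.06, 92.9)             # roughly 0.0448, i.e. nearly 4.5%
```

With 8000 candidate features and only 68 proteins in the data set, some form of feature selection is needed before training; the abstract reports that 44 selected features gave the best overall accuracy.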
