Support Vector Machine Optimized by Genetic Algorithm for Data Analysis of Near-Infrared Spectroscopy Sensors

Near-infrared (NIR) spectral sensors deliver the spectral response of the light absorbed by materials for quantification, qualification or identification. Spectral analysis technology based on NIR sensors has been a useful tool for complex information processing and high-precision identification in the tobacco industry. In this paper, a novel method based on the support vector machine (SVM) is proposed to discriminate tobacco cultivation regions using NIR sensors, where a genetic algorithm (GA) is employed for input subset selection to identify the effective principal components (PCs) for the SVM model. With the same number of PCs as inputs to the SVM model, a number of comparative experiments were conducted between the effective PCs selected by GA and the leading PCs taken in order starting from the first one. Model performance was evaluated in terms of prediction accuracy and four assessment criteria (true positive rate, true negative rate, positive predictive value and F1 score). The results show that some PCs carrying less information may contribute more to discriminating cultivation regions and can therefore be considered more effective PCs, and that the SVM model with the effective PCs selected by GA has a superior discrimination capacity. The proposed GA-SVM model can effectively learn the relationship between tobacco cultivation regions and tobacco NIR sensor data.
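The idea of letting a GA pick a fixed-size subset of PCs can be illustrated with a minimal sketch. It is not the authors' implementation: the operators (binary tournament, union-based crossover, swap mutation), the function name `ga_select`, and the toy fitness are all assumptions made for illustration. In practice the fitness would be something like the cross-validated accuracy of an SVM trained on the selected PCs; here a hypothetical set of "informative" PC indices stands in for it so the example runs with the standard library only.

```python
import random

# Hypothetical stand-in for "cross-validated SVM accuracy on these PCs":
# counts how many selected PCs fall in an assumed informative set.
INFORMATIVE = {1, 4, 7}

def toy_fitness(subset):
    return len(set(subset) & INFORMATIVE)

def ga_select(n_features, k, fitness, pop_size=30, generations=40,
              p_mut=0.2, seed=0):
    """Search for a k-element subset of range(n_features) with high fitness."""
    if not 0 < k < n_features:
        raise ValueError("k must satisfy 0 < k < n_features")
    rng = random.Random(seed)
    pop = [tuple(sorted(rng.sample(range(n_features), k)))
           for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def pick(population):
        # Binary tournament: the fitter of two random individuals wins.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best]  # elitism: the best subset always survives
        while len(children) < pop_size:
            p1, p2 = pick(pop), pick(pop)
            # Crossover: draw k distinct indices from the union of both parents.
            child = rng.sample(sorted(set(p1) | set(p2)), k)
            if rng.random() < p_mut:
                # Mutation: swap one selected index for an unselected one.
                unused = [i for i in range(n_features) if i not in child]
                child[rng.randrange(k)] = rng.choice(unused)
            children.append(tuple(sorted(child)))
        pop = children
        best = max(pop, key=fitness)
    return best

selected = ga_select(n_features=10, k=3, fitness=toy_fitness)
print("selected PC indices:", selected)
```

Because the subset size `k` is fixed, the GA-selected PCs can be compared against the first `k` PCs with the same number of SVM inputs, which mirrors the comparison described in the abstract.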


Discrimination of Gentiana and Its Related Species Using IR Spectroscopy Combined with Feature Selection and Stacked Generalization

Molecules ◽ 10.3390/molecules25061442 ◽ 2020 ◽ Vol 25 (6) ◽ pp. 1442 ◽ Cited By ~ 2

Author(s): Tao Shen ◽ Hong Yu ◽ Yuan-Zhong Wang

Keyword(s): Genetic Algorithm ◽ Support Vector Machine ◽ Feature Selection ◽ Related Species ◽ Predictive Accuracy ◽ Classification Model ◽ Venn Diagram ◽ Support Vector ◽ Stacked Generalization ◽ SVM Model

Gentiana is one of the largest genera of Gentianoideae; most of its species have potential pharmaceutical value and are applied in local traditional medical treatment. Because of the phytochemical diversity and the differences in bioactive compounds among species, it is crucial to accurately identify authentic Gentiana species. In this paper, the feasibility of using infrared spectroscopy combined with chemometric analysis to identify Gentiana and its related species was studied. A total of 180 batches of raw spectral fingerprints were obtained from 18 species of Gentiana and Tripterospermum by near-infrared (NIR: 10,000–4000 cm⁻¹) and Fourier transform mid-infrared (MIR: 4000–600 cm⁻¹) spectroscopy. Firstly, principal component analysis (PCA) was utilized to explore the natural grouping of the 180 samples. Secondly, random forest (RF), support vector machine (SVM), and K-nearest neighbors (KNN) models were built using the full spectra (1487 NIR variables and 1214 FT-MIR variables, respectively). Based on the results for the calibration and prediction sets, the MIR-SVM model had a higher classification accuracy than the other models. Five feature selection strategies, VIP (variable importance in the projection), Boruta, GARF (genetic algorithm combined with random forest), GASVM (genetic algorithm combined with support vector machine), and Venn diagram calculation, were used to reduce the number of variables for modeling. As a result, 101 NIR and 73 FT-MIR bands were selected as the feature variables, respectively. Thirdly, stacking models were built based on the optimal spectral dataset. Most of the stacking models performed better than the full spectra-based models. RF and SVM as base learners, combined with an SVM meta-classifier, was the optimal stacked generalization strategy. For the SG-Ven-MIR-SVM model, the accuracy (ACC) of the calibration set and validation set were both 100%. Sensitivity (SE), specificity (SP), efficiency (EFF), Matthews correlation coefficient (MCC), and Cohen’s kappa coefficient (K) were all 1, which showed that the model had the optimal authenticity identification performance. These parameters indicated that stacked generalization combined with feature selection is probably an important technique for improving the predictive accuracy of classification models and avoiding overfitting. The results can provide a valuable reference for the safety and effectiveness of the clinical application of medicinal Gentiana.
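Several of the reported criteria can be computed directly from a confusion matrix. The sketch below uses the standard binary definitions of sensitivity, specificity, accuracy, MCC and Cohen's kappa; for a multi-class problem such as species identification it would be applied one class versus the rest. The function name `binary_metrics` and the counts in the usage lines are illustrative assumptions, not values from the paper, and EFF is left out because its exact definition is not given in the abstract.

```python
import math

def binary_metrics(tp, fn, fp, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    n = tp + fn + fp + tn
    se = tp / (tp + fn)                      # sensitivity (true positive rate)
    sp = tn / (tn + fp)                      # specificity (true negative rate)
    acc = (tp + tn) / n                      # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    # Cohen's kappa: agreement corrected for the agreement expected by chance.
    p_e = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    kappa = (acc - p_e) / (1 - p_e) if p_e != 1 else 1.0
    return {"SE": se, "SP": sp, "ACC": acc, "MCC": mcc, "K": kappa}

# Illustrative counts only: a perfect one-vs-rest split of 180 samples.
perfect = binary_metrics(tp=10, fn=0, fp=0, tn=170)
# Illustrative counts only: an imperfect classifier on 20 samples.
imperfect = binary_metrics(tp=8, fn=2, fp=1, tn=9)
print(perfect)
print(imperfect)
```

With no misclassified samples every metric equals 1, which is the pattern reported for the SG-Ven-MIR-SVM model.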


Study on the Quantitative Method of Oversaturated Intersection

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.587-589.2100 ◽ 2014 ◽ Vol 587-589 ◽ pp. 2100-2104

Author(s): Qin Liu ◽ Jian Min Xu ◽ Kai Lu

Keyword(s): Genetic Algorithm ◽ Support Vector Machine ◽ Travel Speed ◽ Urban Traffic ◽ Support Vector ◽ Model Parameters ◽ Guangzhou City ◽ Traffic System ◽ Traffic Conditions ◽ SVM Model

Oversaturation often occurs in modern urban traffic. To describe the degree of oversaturation, a set of intersection oversaturation indexes is put forward, including dissipation time, stranded queue, overflow queue and travel speed. On the basis of the selected indexes, a genetic algorithm support vector machine (GA-SVM) model is proposed to quantify the degree of oversaturation, in which the genetic algorithm is used to select the model parameters. Using traffic volume data from intersections in Guangzhou city, the method is implemented and simulated through programming. The simulation results show that the GA-SVM method is effective and that its accuracy is higher than that of a plain support vector machine (SVM). This method provides a theoretical basis for the analysis of traffic systems under oversaturated traffic conditions.


A Novel Method of Pattern Recognition for Honey Source Based on Visible/Near Infrared Spectroscopy: Genetic Algorithm Combined with Support Vector Machine

2010 International Conference on Artificial Intelligence and Computational Intelligence ◽ 10.1109/aici.2010.114 ◽ 2010 ◽ Cited By ~ 1

Author(s): Yan Yang ◽ Peng-Cheng Nie ◽ Wei Zhang ◽ Yong He

Keyword(s): Genetic Algorithm ◽ Pattern Recognition ◽ Support Vector Machine ◽ Infrared Spectroscopy ◽ Near Infrared Spectroscopy ◽ Near Infrared ◽ Support Vector ◽ Novel Method


Forest Fire Disaster Area Prediction Based on Genetic Algorithm and Support Vector Machine

Advanced Materials Research ◽ 10.4028/www.scientific.net/amr.446-449.3037 ◽ 2012 ◽ Vol 446-449 ◽ pp. 3037-3041 ◽ Cited By ~ 1

Author(s): Fang Xiao

Keyword(s): Genetic Algorithm ◽ Support Vector Machine ◽ Forest Fire ◽ Optimal Solution ◽ Jiangxi Province ◽ Support Vector ◽ Disaster Area ◽ Fire Disaster ◽ SVM Model ◽ Forest Fire Disaster

This paper presents a method for forest fire disaster area prediction based on a genetic algorithm and support vector machine. The genetic algorithm is used to select appropriate parameters for the support vector machine, searching for the optimal solution through a series of iterative computations. Forest fire disaster area data from Jiangxi Province from 1970 to 1997 are used as the research data. A comparison of the forecasting results of the proposed GA-SVM model and the SVM model indicates that the proposed GA-SVM model yields better forest fire disaster area forecasts than the SVM model.
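The iterative parameter search can be sketched as a small real-coded GA. This is a generic illustration under stated assumptions, not the paper's algorithm: the operators (truncation selection, arithmetic crossover, Gaussian mutation), the name `ga_minimize`, the search bounds, and the loss are all hypothetical. In a real GA-SVM the loss would be the validation error of an SVM trained with the candidate parameters; here a toy quadratic over (log10 C, log10 gamma) stands in for it.

```python
import random

# Assumed search ranges for log10(C) and log10(gamma).
BOUNDS = [(-3.0, 3.0), (-5.0, 1.0)]

def toy_validation_error(genes):
    # Hypothetical stand-in for SVM validation error, smallest at
    # log10(C) = 1 and log10(gamma) = -2.
    log_c, log_gamma = genes
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2

def ga_minimize(loss, bounds, pop_size=20, generations=50, sigma=0.1, seed=0):
    """Real-coded GA: truncation selection, arithmetic crossover, Gaussian mutation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=loss)
        parents = pop[: pop_size // 2]       # keep the better half
        children = [parents[0][:]]           # elitism: copy the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            w = rng.random()
            # Arithmetic crossover: a random convex combination of two parents.
            child = [w * x + (1 - w) * y for x, y in zip(a, b)]
            # Gaussian mutation, clipped back into the search bounds.
            child = [min(hi, max(lo, g + rng.gauss(0.0, sigma * (hi - lo))))
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        pop = children
    return min(pop, key=loss)

best = ga_minimize(toy_validation_error, BOUNDS)
print("log10(C), log10(gamma):", best)
```

Searching in log space is a common choice for SVM parameters because useful values of C and gamma typically span several orders of magnitude.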


Integration of Genetic Algorithm and Support Vector Machine to Predict Rail Track Degradation

MATEC Web of Conferences ◽ 10.1051/matecconf/201925902007 ◽ 2019 ◽ Vol 259 ◽ pp. 02007 ◽ Cited By ~ 1

Author(s): Amir Falamarzi ◽ Sara Moridpour ◽ Majidreza Nazem ◽ Reyhaneh Hesami

Keyword(s): Genetic Algorithm ◽ Support Vector Machine ◽ Preventive Maintenance ◽ Mean Squared Error ◽ Study Data ◽ Transport Infrastructure ◽ Coefficient Of Determination ◽ Ride Quality ◽ Support Vector ◽ SVM Model

Gradual deviation in the track gauge of tram systems resulting from tram traffic is unavoidable. Gauge deviation is considered an important contributor to poor ride quality and to the risk of derailment. To reduce the potential problems associated with excessive gauge deviation, preventive maintenance activities are necessary. Preventive maintenance is a key factor in the development of sustainable rail transport infrastructure, and track degradation prediction modelling is the basic prerequisite for developing efficient preventive maintenance strategies for a tram system. In this study, data sets from the Melbourne tram network are used and straight rail track sections are examined. Two model types, a plain Support Vector Machine (SVM) and an SVM optimised by a Genetic Algorithm (GA-SVM), are applied to the case study data. Two assessment indexes, Mean Squared Error (MSE) and the coefficient of determination (R²), are employed to evaluate the performance of the proposed models. Based on the results, the GA-SVM model produces more accurate outcomes than the plain SVM model.
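The two assessment indexes have simple closed forms, shown here as a minimal sketch using the standard definitions of MSE and R². The function names and the sample values are hypothetical and are not taken from the Melbourne data.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical observed and predicted gauge deviations (arbitrary units).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 3.0, 3.5]
print("MSE:", mse(observed, predicted))
print("R2:", r2(observed, predicted))
```

A lower MSE and an R² closer to 1 both indicate a better fit, so a model comparison such as plain SVM versus GA-SVM can be read off these two numbers.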


A Genetic Algorithm Based Support Vector Machine Model for Blood-Brain Barrier Penetration Prediction

BioMed Research International ◽ 10.1155/2015/292683 ◽ 2015 ◽ Vol 2015 ◽ pp. 1-13 ◽ Cited By ~ 5

Author(s): Daqing Zhang ◽ Jianfeng Xiao ◽ Nannan Zhou ◽ Mingyue Zheng ◽ Xiaomin Luo ◽ ...

Keyword(s): Genetic Algorithm ◽ Support Vector Machine ◽ Blood Brain Barrier ◽ Subset Selection ◽ Feature Subset Selection ◽ Brain Barrier ◽ Support Vector ◽ Feature Subset ◽ SVM Model ◽ Kernel Parameters

The blood-brain barrier (BBB) is a highly complex physical barrier that determines which substances are allowed to enter the brain. The support vector machine (SVM) is a kernel-based machine learning method that is widely used in QSAR studies. For a successful SVM model, the kernel parameters and feature subset selection are the most important factors affecting prediction accuracy. In most studies, they are treated as two independent problems, but it has been proven that they can affect each other. We designed and implemented a genetic algorithm (GA) to optimize kernel parameters and feature subset selection for SVM regression and applied it to BBB penetration prediction. The results show that our GA/SVM model is more accurate than other currently available logBB models. Therefore, optimizing both SVM parameters and the feature subset simultaneously with a genetic algorithm is a better approach than methods that treat the two problems separately. Analysis of our logBB model suggests that the carboxylic acid group, polar surface area (PSA)/hydrogen-bonding ability, lipophilicity, and molecular charge play important roles in BBB penetration. Among the properties relevant to BBB penetration, lipophilicity could enhance BBB penetration, while all the others are negatively correlated with it.
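One way to optimize kernel parameters and the feature subset at the same time is to put both in a single chromosome. The sketch below assumes a layout of two real genes in [0, 1] (mapped log-uniformly to C and gamma) followed by one bit per feature. This encoding, the parameter ranges, and the names `decode` and `mutate` are illustrative assumptions, not the scheme used in the paper.

```python
import random

def decode(chromosome, c_log10=(-2.0, 2.0), gamma_log10=(-4.0, 0.0)):
    """Split a mixed chromosome into (C, gamma, selected feature indices)."""
    g_c, g_gamma, *mask = chromosome
    # Real genes in [0, 1] are mapped log-uniformly onto the assumed ranges.
    c = 10 ** (c_log10[0] + g_c * (c_log10[1] - c_log10[0]))
    gamma = 10 ** (gamma_log10[0] + g_gamma * (gamma_log10[1] - gamma_log10[0]))
    features = [i for i, bit in enumerate(mask) if bit]
    return c, gamma, features

def mutate(chromosome, rng, sigma=0.1, p_flip=0.1):
    """Perturb the real genes and flip mask bits in the same operation."""
    g_c, g_gamma, *mask = chromosome
    reals = [min(1.0, max(0.0, g + rng.gauss(0.0, sigma)))
             for g in (g_c, g_gamma)]
    bits = [1 - b if rng.random() < p_flip else b for b in mask]
    return reals + bits

chromosome = [0.5, 0.0, 1, 0, 1, 1]
print(decode(chromosome))
print(mutate(chromosome, random.Random(0)))
```

Because every candidate carries its own parameters and its own feature mask, the fitness of a feature subset is always evaluated with parameters that evolved alongside it, which is the interaction the abstract argues separate optimization misses.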


A New Time Series Regression Method Based on Support Vector Machine Plus and Genetic Algorithm

Advanced Materials Research ◽ 10.4028/www.scientific.net/amr.201-203.2277 ◽ 2011 ◽ Vol 201-203 ◽ pp. 2277-2280

Author(s): Wei Sun ◽ Guo Xiang Meng ◽ Qian Ye ◽ Jian Zheng Zhang ◽ Li Weng Zhang

Keyword(s): Genetic Algorithm ◽ Time Series ◽ Support Vector Machine ◽ Regression Method ◽ Experimental Result ◽ Support Vector ◽ Hidden Information ◽ Time Series Regression ◽ SVM Model ◽ New Time

Support vector machine (SVM) is gaining popularity in time series analysis owing to its advanced theoretical foundation. Introducing hidden information into SVM yields the support vector machine plus (SVM+). However, the hidden information, which provides something closely associated with the time series, increases the difficulty of training the SVM model. In this paper, a new time series regression method, GA-RSVM+, is put forward, in which a Genetic Algorithm (GA) is used to search for the optimal combination of free parameters. The experimental results show that GA-RSVM+ can accurately determine the parameters on its own and achieve the best regression precision. This method has a clear advantage in the regression analysis of time series.


Classification and Location of Transformer Winding Deformations using Genetic Algorithm and Support Vector Machine

Recent Advances in Electrical & Electronic Engineering (Formerly Recent Patents on Electrical & Electronic Engineering) ◽ 10.2174/2352096514666211026142216 ◽ 2021 ◽ Vol 14

Author(s): Zhenhua Li ◽ Junjie Cheng ◽ A. Abu-Siada

Keyword(s): Genetic Algorithm ◽ Support Vector Machine ◽ Fault Diagnosis ◽ Fault Location ◽ Classification Model ◽ Fault Classification ◽ Support Vector ◽ Fault Type ◽ SVM Model ◽ Transformer Winding

Background: Winding deformation is one of the most common faults that a power transformer experiences over its operational life. Thus, it is essential to detect and rectify such faults at early stages to avoid potentially catastrophic consequences for the transformer. At present, methods published in the literature for transformer winding fault diagnosis focus mainly on identifying the fault type and quantifying its extent, without giving much attention to identifying the fault location. Methods: This paper presents a method based on a genetic algorithm and support vector machine (GA-SVM) to improve the classification of power transformer faults in terms of type and location. A sinusoidal sweep signal in the frequency range of 600 kHz to 1 MHz is applied to one terminal of the transformer winding. A mathematical index of the induced current at the head and end of the transformer winding under various fault conditions is used to extract unique features that are fed to a support vector machine (SVM) model for training. The parameters of the SVM model are optimized using a genetic algorithm (GA). Results: The effectiveness of the mathematical indicators for extracting fault type characteristics, and of the proposed fault classification model for fault diagnosis, is demonstrated through extensive simulation analysis of various transformer winding faults at different locations. Conclusion: The proposed model can effectively identify different fault types and determine their location within the transformer winding; the diagnostic rates for fault type and fault location are 100% and 90%, respectively.


Mapping Mineral Prospectivity Using a Hybrid Genetic Algorithm–Support Vector Machine (GA–SVM) Model

ISPRS International Journal of Geo-Information ◽ 10.3390/ijgi10110766 ◽ 2021 ◽ Vol 10 (11) ◽ pp. 766

Author(s): Xishihui Du ◽ Kefa Zhou ◽ Yao Cui ◽ Jinlin Wang ◽ Shuguang Zhou

Keyword(s): Genetic Algorithm ◽ Support Vector Machine ◽ Hybrid Genetic Algorithm ◽ Geochemical Data ◽ Support Vector ◽ Adaptive Optimization ◽ Spatial Efficiency ◽ SVM Model ◽ Mineral Prospectivity ◽ Optimization Search

Machine learning (ML), as a powerful data-driven method, is widely used for mineral prospectivity mapping. This study employs a hybrid of the genetic algorithm (GA) and support vector machine (SVM) model to map prospective areas for Au deposits in Karamay, northwest China. In the proposed method, GA is used as an adaptive optimization search method to optimize the SVM parameters that result in the best fitness. After obtaining evidence layers from geological and geochemical data, GA–SVM models trained using different training datasets were applied to discriminate between prospective and non-prospective areas for Au deposits, and to produce prospectivity maps for mineral exploration. The F1 score and spatial efficiency of classification were calculated to objectively evaluate the performance of each prospectivity model. The best model predicted 95.83% of the known Au deposits within prospective areas, occupying 35.68% of the study area. The results demonstrate the effectiveness of the GA–SVM model as a tool for mapping mineral prospectivity.


Combination of Support Vector Machine and K-Fold cross-validation for prediction of long-term degradation of the compressive strength of marine concrete

International Journal of Computational Physics Series ◽ 10.29167/a1i1p120-130 ◽ 2018 ◽ Vol 1 (1) ◽ pp. 120-130 ◽ Cited By ~ 1

Author(s): Chunxiang Qian ◽ Wence Kang ◽ Hao Ling ◽ Hua Dong ◽ Chengyao Liang ◽ ...

Keyword(s): Support Vector Machine ◽ Environmental Factors ◽ Cross Validation ◽ Concrete Strength ◽ Simulation Method ◽ Support Vector ◽ SVM Model ◽ Artificial Neural Network (ANN) ◽ Influence Degree ◽ Fold Cross Validation

A Support Vector Machine (SVM) model optimized by K-fold cross-validation was built to predict and evaluate the degradation of concrete strength in a complicated marine environment. Several other models, such as an Artificial Neural Network (ANN) and a Decision Tree (DT), were also built and compared with the SVM to determine which could make the most accurate predictions. Both the material factors and the environmental factors that influence the results were considered. The material factors mainly involved the original concrete strength and the amounts of cement replaced by fly ash and slag. The environmental factors consisted of the concentrations of Mg²⁺, SO₄²⁻ and Cl⁻, the temperature, and the exposure time. The prediction results indicated that the optimized SVM model performed better than the other models in predicting the concrete strength. Based on the SVM model, a simulation method of variable limitation was used to determine the sensitivity of the various factors and the degree of their influence on the degradation of concrete strength.
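K-fold cross-validation splits the samples into k folds and uses each fold once for validation while training on the rest. The following minimal sketch shows only the index bookkeeping; the name `k_fold_indices` is hypothetical, and no shuffling is done here, although in practice the samples are usually shuffled before splitting.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, validation_indices) pairs, one per fold."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    # The first (n_samples % k) folds get one extra sample each.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    folds, start = [], 0
    for size in sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, validation))
        start += size
    return folds

for train, validation in k_fold_indices(n_samples=10, k=3):
    print("train:", train, "validation:", validation)
```

Averaging the validation error over the k folds gives the score that a parameter search would compare across candidate SVM settings.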
