Applications of Support Vector Machine in Genomic Prediction in Pig and Maize Populations

2020 ◽  
Vol 11 ◽  
Author(s):  
Wei Zhao ◽  
Xueshuang Lai ◽  
Dengying Liu ◽  
Zhenyang Zhang ◽  
Peipei Ma ◽  
...  

Genomic prediction (GP) has revolutionized animal and plant breeding. However, better statistical models that can improve the accuracy of GP are required. For this reason, in this study, we explored the genomic prediction performance of a popular machine learning method, the Support Vector Machine (SVM) model. We selected the most suitable kernel function and hyperparameters for the SVM model in eight published genomic data sets on pigs and maize. Next, we compared the SVM model with the RBF and linear kernel functions to the two most commonly used genome-enabled prediction models (GBLUP and BayesR) in terms of prediction accuracy, time, and memory used. The results showed that the SVM model had the best prediction performance in two of the eight data sets, but in general, the predictions of the models were similar. In terms of time, the SVM model was better than BayesR but worse than GBLUP. In terms of memory, the SVM model was better than GBLUP and worse than BayesR in the pig data but the same as BayesR in the maize data. According to the results, SVM is a competitive method in animal and plant breeding, and there is no universal prediction model.

2020 ◽  
Vol 10 (11) ◽  
pp. 4083-4102
Author(s):  
Abelardo Montesinos-López ◽  
Humberto Gutierrez-Pulido ◽  
Osval Antonio Montesinos-López ◽  
José Crossa

Due to the ever-increasing data collected in genomic breeding programs, there is a need for genomic prediction models that can deal better with big data. For this reason, here we propose a Maximum a posteriori Threshold Genomic Prediction (MAPT) model for ordinal traits that is more efficient than the conventional Bayesian Threshold Genomic Prediction model for ordinal traits. The MAPT performs the predictions of the Threshold Genomic Prediction model by using the maximum a posteriori estimation of the parameters, that is, the values of the parameters that maximize the joint posterior density. We compared the prediction performance of the proposed MAPT to the conventional Bayesian Threshold Genomic Prediction model, the multinomial Ridge regression and support vector machine on 8 real data sets. We found that the proposed MAPT was competitive with regard to the multinomial and support vector machine models in terms of prediction performance, and slightly better than the conventional Bayesian Threshold Genomic Prediction model. With regard to the implementation time, we found that in general the MAPT and the support vector machine were the best, while the slowest was the multinomial Ridge regression model. However, it is important to point out that the successful implementation of the proposed MAPT model depends on the informative priors used to avoid underestimation of variance components.
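
The maximum a posteriori idea described in this abstract can be written generically as follows. This is only the textbook definition of a MAP estimate, not the authors' full threshold model; the symbols $\boldsymbol{\theta}$ (all model parameters) and $\mathbf{y}$ (the observed ordinal responses) are notation assumed here for illustration.

```latex
\hat{\boldsymbol{\theta}}_{\mathrm{MAP}}
  = \arg\max_{\boldsymbol{\theta}} \; p(\boldsymbol{\theta} \mid \mathbf{y})
  = \arg\max_{\boldsymbol{\theta}} \; p(\mathbf{y} \mid \boldsymbol{\theta}) \, p(\boldsymbol{\theta})
```

The second equality holds because the marginal density $p(\mathbf{y})$ does not depend on $\boldsymbol{\theta}$. It also suggests why the choice of prior $p(\boldsymbol{\theta})$ matters for the variance components, as the abstract cautions.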


2016 ◽  
Vol 2016 ◽  
pp. 1-7 ◽  
Author(s):  
Bin Zhang ◽  
Jinke Gong ◽  
Wenhua Yuan ◽  
Jun Fu ◽  
Yi Huang

In order to effectively predict the sieving efficiency of a vibrating screen, experiments to investigate the sieving efficiency were carried out. The relation between sieving efficiency and other working parameters of a vibrating screen, such as mesh aperture size, screen length, inclination angle, vibration amplitude, and vibration frequency, was analyzed. Based on the experiments, a least squares support vector machine (LS-SVM) model was established to predict the sieving efficiency, and an adaptive genetic algorithm and a cross-validation algorithm were used to optimize the parameters of the LS-SVM. Examination of the testing points shows that the prediction performance of the LS-SVM is better than that of the existing formula and the neural network, and its average relative error is only 4.2%.


2020 ◽  
Author(s):  
Zhanyou Xu ◽  
Andreomar Kurek ◽  
Steven B. Cannon ◽  
Williams D. Beavis

Selection of markers linked to alleles at quantitative trait loci (QTL) for tolerance to Iron Deficiency Chlorosis (IDC) has not been successful. Genomic selection has been advocated for continuous numeric traits such as yield and plant height. For ordinal data types such as IDC, genomic prediction models have not been systematically compared. The objectives of the research reported in this manuscript were to evaluate the most commonly used genomic prediction method, ridge regression, and its equivalent logistic ridge regression method, against algorithmic modeling methods including random forest, gradient boosting, support vector machine, K-nearest neighbors, Naïve Bayes, and artificial neural network, using the usual comparator metric of prediction accuracy. In addition, we compared the methods using metrics of greater importance for decisions about selecting and culling lines for use in variety development and genetic improvement projects. These metrics include specificity, sensitivity, precision, decision accuracy, and area under the receiver operating characteristic curve. We found that Support Vector Machine provided the best specificity for culling IDC susceptible lines, while Random Forest GP models provided the best combined set of decision metrics for retaining IDC tolerant and culling IDC susceptible lines.
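
As a rough illustration of the decision metrics named in this abstract, the sketch below computes sensitivity, specificity, and precision from a binary retain/cull confusion matrix. It is a minimal example under stated assumptions, not the authors' evaluation code: the function name `decision_metrics`, the 1 = tolerant / 0 = susceptible coding, and the toy labels are all hypothetical, and the `accuracy` entry is plain classification accuracy, which may differ from the paper's "decision accuracy".

```python
def decision_metrics(y_true, y_pred, positive=1):
    """Confusion-matrix metrics for a binary retain/cull decision."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Return NaN instead of raising when a class is absent.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # share of tolerant lines retained
        "specificity": ratio(tn, tn + fp),  # share of susceptible lines culled
        "precision": ratio(tp, tp + fp),    # share of retained lines that are tolerant
        "accuracy": ratio(tp + tn, len(pairs)),
    }


# Hypothetical labels: 1 = tolerant line (retain), 0 = susceptible line (cull).
observed = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = decision_metrics(observed, predicted)
print(metrics)
```

High specificity matters when the goal is culling susceptible lines, while sensitivity and precision matter more when the goal is retaining tolerant ones.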


2013 ◽  
Vol 16 (5) ◽  
pp. 973-988 ◽  
Author(s):  
Xiao-Li Li ◽  
Haishen Lü ◽  
Robert Horton ◽  
Tianqing An ◽  
Zhongbo Yu

An accurate and real-time flood forecast is a crucial nonstructural step to flood mitigation. A support vector machine (SVM) is based on the principle of structural risk minimization and has a good generalization capability. The ensemble Kalman filter (EnKF) is a proven method with the capability of handling nonlinearity in a computationally efficient manner. In this paper, a type of SVM model is established to simulate the rainfall–runoff (RR) process. Then, a coupling model of SVM and EnKF (SVM + EnKF) is used for RR simulation. The impact of the assimilation time scale on the SVM + EnKF model is also studied. A total of four different combinations of the SVM and EnKF models are studied in the paper. The Xinanjiang RR model is employed to evaluate the SVM and the SVM + EnKF models. The study area is the Luo River Basin, Guangdong Province, China, and the study period covers nine years, from 1994 to 2002. Compared to SVM, the SVM + EnKF model substantially improves the accuracy of flood prediction, and the Xinanjiang RR model also performs better than the SVM model. The simulated result for the assimilation time scale of 5 days is better than the results for the other cases.


2020 ◽  
Author(s):  
Ya-feng Ji ◽  
Le-Bao Song ◽  
Hao Yuan ◽  
Wen Peng ◽  
Hua-Ying Li ◽  
...  

In order to enhance the prediction accuracy of the strip crown and improve the quality of the final product in hot strip rolling, an optimized model based on the support vector machine (SVM) is first proposed. To enrich the data and ensure data quality, actual data from a hot rolling plant were collected to establish the prediction model, and the prediction performance of the models was evaluated using multiple indicators. In addition, the traditional SVM model and combined prediction models using the particle swarm optimization (PSO) and the cuckoo search (CS) optimization algorithms are also proposed. Furthermore, the prediction performance of the three methods is compared and validated. The results show that the CS-SVM has the highest prediction accuracy of the three methods: the root mean squared error (RMSE) of the proposed CS-SVM is 2.05 μm, and 98.11% of the predictions have an absolute error below 4.5 μm. The results also demonstrate that the CS-SVM not only has faster convergence and higher prediction accuracy but can also be applied well to actual hot strip rolling production.
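
The two error indicators quoted in this abstract, RMSE and the share of predictions whose absolute error falls below a tolerance, can be sketched as below. This is an illustrative example only: the helper names `rmse` and `share_within` and the crown values (in μm) are hypothetical and are not data from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def share_within(actual, predicted, tolerance):
    """Fraction of predictions whose absolute error is below `tolerance`."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) < tolerance)
    return hits / len(actual)


# Hypothetical strip crown values in micrometres.
actual_um = [40.0, 45.0, 50.0, 55.0]
predicted_um = [41.0, 44.0, 52.0, 50.0]

error_rmse = rmse(actual_um, predicted_um)
error_share = share_within(actual_um, predicted_um, tolerance=4.5)
print(error_rmse, error_share)
```

Reporting both numbers is useful because RMSE summarizes the typical error size, while the within-tolerance share shows how often a prediction would be acceptable in production.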


2011 ◽  
Vol 71-78 ◽  
pp. 4155-4159
Author(s):  
Hai Xia Wei ◽  
Jie Zhu

Based on the nonlinear regression theory of the Support Vector Machine (SVM), an SVM model was put forward to predict blasting vibration velocity, using monitoring data obtained at the blasting site as training samples. By comparing the results of two prediction models, the improved Sadaovsk model and the SVM model, the feasibility of the SVM model as a new learning method was verified, which provides a new way to predict and control the intensity of blasting vibration. The best way to select the parameters of the SVM needs to be further explored.


2020 ◽  
Vol 51 (5) ◽  
pp. 942-958 ◽  
Author(s):  
Jianzhu Li ◽  
Siyao Zhang ◽  
Lingmei Huang ◽  
Ting Zhang ◽  
Ping Feng

Drought is an important factor that limits economic and social development due to its frequent occurrence and profound influence. Therefore, it is of great significance to make accurate predictions of drought for early warning and disaster alleviation. In this paper, SPEI-1 was confirmed to classify drought grades in the Guanzhong Area, and autoregressive integrated moving average (ARIMA), random forest (RF) and support vector machine (SVM) models were established. Meteorological data and remote sensing data were used to derive the prediction models. The results showed the following. (1) The SVM model performed the best when the models were developed using meteorological data, remote sensing data and a combination of meteorological and remote sensing data, but the corresponding kernel functions differ: linear, polynomial and Gaussian radial basis kernel functions, respectively. (2) The RF model driven by the remote sensing data and the SVM model driven by the combined meteorological and remote sensing data were found to perform better than the same models driven by the other data in the Guanzhong Area. It is difficult to accurately measure drought with meteorological data alone. Only by considering the combined factors can we more accurately monitor and predict drought. This study can provide an important scientific basis for regional drought warnings and predictions.


2014 ◽  
Vol 509 ◽  
pp. 38-43
Author(s):  
Zhong Jie Fan ◽  
Yan Qiu Leng ◽  
Yong Long Xu ◽  
Zheng Jiang Meng ◽  
Ji Wei Xu

Based on an analysis of the factors influencing saturated sand, this paper expounds the limitations of traditional liquefaction evaluation and introduces the support vector machine (SVM) criterion, which is based on the principle of structural risk minimization. According to the main factors influencing sand liquefaction, an SVM discriminant model of sand liquefaction with different kernel functions is established. By learning from small-sample data, this model can establish a nonlinear mapping between the influence factors and the liquefaction type. On the basis of seismic data, a radial basis kernel function is selected to predict the sand liquefaction type. The results show that the predicted magnitude is identical to the actual result, which indicates that this SVM model is effective for evaluating the level of sand liquefaction.


Author(s):  
Osval Antonio Montesinos López ◽  
Abelardo Montesinos López ◽  
Jose Crossa

In this chapter, support vector machine (SVM) methods are studied. We first point out the origin and popularity of these methods and then define the hyperplane concept, which is the key to building them. We derive methods related to the SVM: the maximum margin classifier and the support vector classifier. We describe the derivation of the SVM along with some kernel functions that are fundamental for building the different kernel methods that are allowed in SVM. We explain how the SVM for binary response variables can be extended to categorical response variables and give examples of SVM for binary and categorical response variables with plant breeding data for genomic selection. Finally, general issues for adopting the SVM methodology for continuous response variables are provided, and some examples of SVM for continuous response variables for genomic prediction are described.


2016 ◽  
Vol 2 (1) ◽  
pp. 16 ◽  
Author(s):  
Motoki Sakai

Heart rate (HR) is one of the vital signs used to assess our physical condition; it would be beneficial if HR could easily be obtained without special medical instruments. In this study, a feature of vocal frequency was used to estimate HR, because it can easily be recorded with a common device such as a smartphone. Previous studies proposed that a support vector machine (SVM) that adopted the inner product as the kernel function was efficient for estimating HR to a certain extent. However, these studies did not examine the effectiveness of other kernel functions, such as the hyperbolic tangent function. Therefore, this study identified a combination of kernel functions for kernel ridge regression (KRR). In addition, features of vocal frequency that effectively estimate HR were investigated. To evaluate the effectiveness, experiments were conducted with two subjects. In the experiment, 60 sets of HRs and voice data were measured per subject. To identify the most effective kernel function, four kernel functions (the inner product, Gaussian function, polynomial function, and hyperbolic tangent function) were compared. Moreover, effective features of vocal frequency were selected with the sequential feature selection (SFS) method. As a result, the hyperbolic tangent function worked best, and high-frequency components of voice were efficient. However, the results of this research indicated that the vocal spectrum components that are effective for estimating HR differ depending on the prediction model.
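
For reference, the four kernel families compared in this abstract can be sketched in their common textbook forms. This is a minimal illustration, not the study's implementation: the function names and the parameters `gamma`, `coef0`, and `degree` (with their default values) are assumptions chosen for the example, and the input vectors are arbitrary.

```python
import math


def dot(x, z):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, z))


def linear_kernel(x, z):
    # Inner-product (linear) kernel: k(x, z) = <x, z>
    return dot(x, z)


def gaussian_kernel(x, z, gamma=0.5):
    # Gaussian (RBF) kernel: k(x, z) = exp(-gamma * ||x - z||^2)
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def polynomial_kernel(x, z, gamma=1.0, coef0=1.0, degree=2):
    # Polynomial kernel: k(x, z) = (gamma * <x, z> + coef0) ** degree
    return (gamma * dot(x, z) + coef0) ** degree


def tanh_kernel(x, z, gamma=0.5, coef0=0.0):
    # Hyperbolic tangent (sigmoid) kernel: k(x, z) = tanh(gamma * <x, z> + coef0)
    return math.tanh(gamma * dot(x, z) + coef0)


# Arbitrary example vectors.
x = [1.0, 2.0]
z = [2.0, 0.0]
for kernel in (linear_kernel, gaussian_kernel, polynomial_kernel, tanh_kernel):
    print(kernel.__name__, kernel(x, z))
```

Each function returns a similarity score for a pair of feature vectors. A kernel method such as SVM or KRR only needs these pairwise values, which is why swapping the kernel changes the model without changing the fitting procedure.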

