Multivariate information fusion for identifying antifungal peptides with Hilbert-Schmidt Independence Criterion

2021 ◽  
Vol 16 ◽  
Author(s):  
Haohao Zhou ◽  
Hao Wang ◽  
Yijie Ding ◽  
Jijun Tang

Background: Antifungal peptides (AFPs) have been found to be effective against many fungal infections. Objective: However, AFPs are difficult to identify, so identifying them from sequence information with machine learning methods is of great practical significance. Method: In this study, a Multi-Kernel Support Vector Machine (MKSVM) with the Hilbert-Schmidt Independence Criterion (HSIC) is proposed. Proteins are encoded with five types of features (188-bit, AAC, ASDC, CKSAAP, DPC), and kernels are then constructed using the Gaussian kernel function. HSIC is used to combine the kernels, and a multi-kernel SVM model is built. Results: Our model performed well on three AFP datasets, and its performance is better than or comparable to that of other state-of-the-art predictive models. Conclusion: Our method will be a useful tool for identifying antifungal peptides.
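The abstract does not say how HSIC turns the five kernels into one, so the following is only a minimal sketch of one plausible reading: each Gaussian kernel is scored by its empirical HSIC against an ideal label kernel, and the scores are normalized into weights. The function names, the `gamma` value, the label-kernel choice, and the toy data are all assumptions for illustration, not the authors' procedure.

```python
import math


def gaussian_kernel(X, gamma=1.0):
    """Return the n x n Gaussian kernel matrix exp(-gamma * ||xi - xj||^2)."""
    n = len(X)
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(X[i], X[j])))
             for j in range(n)] for i in range(n)]


def center(K):
    """Double-center a kernel matrix, i.e. compute H K H with H = I - 11^T / n."""
    n = len(K)
    row = [sum(r) / n for r in K]
    col = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row) / n
    return [[K[i][j] - row[i] - col[j] + total for j in range(n)] for i in range(n)]


def hsic(K, L):
    """Biased empirical HSIC estimate: trace(K H L H) / (n - 1)^2."""
    n = len(K)
    Kc, Lc = center(K), center(L)
    return sum(Kc[i][j] * Lc[i][j] for i in range(n) for j in range(n)) / (n - 1) ** 2


def hsic_weights(kernels, labels):
    """Assumed weighting rule: weight each kernel by its HSIC with the label kernel y y^T."""
    label_kernel = [[yi * yj for yj in labels] for yi in labels]
    scores = [max(hsic(K, label_kernel), 0.0) for K in kernels]
    total = sum(scores)
    if total == 0:
        return [1.0 / len(kernels)] * len(kernels)  # fall back to uniform weights
    return [s / total for s in scores]


# Hypothetical toy data: feature A separates the two classes, feature B does not.
labels = [1, 1, 1, -1, -1, -1]
features_a = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
features_b = [[0.0], [1.0], [0.1], [1.1], [0.2], [1.2]]
weights = hsic_weights([gaussian_kernel(features_a), gaussian_kernel(features_b)], labels)
print(weights)
```

In this toy case the kernel built from the class-separating feature receives the larger weight, which is the behavior one would want from a dependence-based weighting.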

Author(s):  
DANIEL T. H. LAI ◽  
REZAUL BEGG ◽  
MARIMUTHU PALANISWAMI

Trip-related falls are a major problem in the elderly population and research in the area has received much attention recently. The focus has been on devising ways of identifying individuals at risk of sustaining such falls. The main aim of this work is to explore the effectiveness of models based on Support Vector Machines (SVMs) for the automated recognition of gait patterns that exhibit falling behavior. Minimum toe clearance (MTC) during continuous walking on a treadmill was recorded for 10 healthy elderly subjects and 10 elderly subjects with balance problems and a history of tripping falls. Statistical features obtained from MTC histograms were used as inputs to the SVM model to classify between the healthy and balance-impaired subjects. The leave-one-out technique was utilized for training the SVM model in order to find the optimal model parameters. Tests were conducted with various kernels (linear, Gaussian and polynomial) and with a change in the regularization parameter, C, in an effort to identify the optimum model for this gait data. The receiver operating characteristic (ROC) plots of sensitivity and specificity were further used to evaluate the diagnostic performance of the model. The maximum accuracy was found to be 90% using a Gaussian kernel with σ² = 10 and the maximum ROC area 0.98 (80% sensitivity and 100% specificity), when all statistical features were used by the SVM models to diagnose gait patterns of healthy and balance-impaired individuals. This accuracy was further improved by using a feature selection method in order to reduce the effect of redundant features. It was found that two features (standard deviation and maximum value) were adequate to give an improved accuracy of 95% (90% sensitivity and 100% specificity) using a polynomial kernel of degree 2. These preliminary results are encouraging and could be useful not only for diagnostic applications but also for evaluating improvements in gait function in the clinical/rehabilitation contexts.
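The reported accuracies can be cross-checked against the reported sensitivity and specificity, given the two groups of 10 subjects. The sketch below does that arithmetic and also shows how leave-one-out splits are enumerated. It assumes that each sensitivity/specificity pair describes the same operating point as the accuracy quoted next to it; the function names are illustrative.

```python
def accuracy_from_rates(sensitivity, specificity, n_positive, n_negative):
    """Overall accuracy implied by sensitivity, specificity and the group sizes."""
    true_positives = sensitivity * n_positive
    true_negatives = specificity * n_negative
    return (true_positives + true_negatives) / (n_positive + n_negative)


def leave_one_out_splits(n_samples):
    """Yield (training indices, held-out index) for every sample in turn."""
    for held_out in range(n_samples):
        train = [i for i in range(n_samples) if i != held_out]
        yield train, held_out


# 10 balance-impaired and 10 healthy subjects, as stated in the abstract.
all_features = accuracy_from_rates(0.80, 1.00, 10, 10)   # Gaussian kernel, all features
two_features = accuracy_from_rates(0.90, 1.00, 10, 10)   # polynomial kernel, two features
n_folds = sum(1 for _ in leave_one_out_splits(20))
print(all_features, two_features, n_folds)
```

Under that assumption, the two pairs give 0.90 and 0.95, matching the 90% and 95% accuracies in the abstract, and leave-one-out over 20 subjects gives 20 train/test rounds.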


Algorithms ◽  
2018 ◽  
Vol 11 (12) ◽  
pp. 193
Author(s):  
Yuchuang Wang ◽  
Guoyou Shi ◽  
Xiaotong Sun

Container ships must pass through multiple ports of call during a voyage. Therefore, forecasting container volume information at the port of origin followed by sending such information to subsequent ports is crucial for container terminal management and container stowage personnel. Numerous factors influence container allocation to container ships for a voyage, and the degree of influence varies, engendering a complex nonlinearity. Therefore, this paper proposes a model based on gray relational analysis (GRA) and mixed kernel support vector machine (SVM) for predicting container allocation to a container ship for a voyage. First, in this model, the weights of influencing factors are determined through GRA. Then, the weighted factors serve as the input of the SVM model, and SVM model parameters are optimized through a genetic algorithm. Numerical simulations revealed that the proposed model could effectively predict the number of containers for container ship voyage and that it exhibited strong generalization ability and high accuracy. Accordingly, this model provides a new method for predicting container volume for a voyage.
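A minimal sketch of how gray relational analysis can turn influencing factors into weights is given below. It uses the commonly cited form of the gray relational coefficient with a distinguishing coefficient of 0.5 and min-max normalization; the abstract does not state which variant or settings the authors used, and the series values are invented for illustration.

```python
def min_max(series):
    """Scale a series to [0, 1]; assumes the series is not constant."""
    lo, hi = min(series), max(series)
    return [(v - lo) / (hi - lo) for v in series]


def gray_relational_grades(reference, factors, rho=0.5):
    """Gray relational grade of each factor series against the reference series."""
    ref = min_max(reference)
    deltas = [[abs(r - f) for r, f in zip(ref, min_max(factor))] for factor in factors]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        return [1.0] * len(factors)  # every factor tracks the reference exactly
    return [sum((d_min + rho * d_max) / (d + rho * d_max) for d in row) / len(row)
            for row in deltas]


def grades_to_weights(grades):
    """Normalize grades so that they sum to one."""
    total = sum(grades)
    return [g / total for g in grades]


# Hypothetical data: container volume per voyage and two candidate influencing factors.
container_volume = [10, 20, 30, 40]
factors = [[1, 2, 3, 4],    # moves in step with the reference
           [4, 1, 3, 2]]    # moves largely independently of it
grades = gray_relational_grades(container_volume, factors)
factor_weights = grades_to_weights(grades)
print(grades, factor_weights)
```

The factor that tracks the reference series gets a grade of 1 and therefore the larger weight; in the pipeline described above, each input column would be scaled by its weight before being passed to the SVM.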


2013 ◽  
Vol 16 (5) ◽  
pp. 973-988 ◽  
Author(s):  
Xiao-Li Li ◽  
Haishen Lü ◽  
Robert Horton ◽  
Tianqing An ◽  
Zhongbo Yu

An accurate and real-time flood forecast is a crucial nonstructural step to flood mitigation. A support vector machine (SVM) is based on the principle of structural risk minimization and has a good generalization capability. The ensemble Kalman filter (EnKF) is a proven method with the capability of handling nonlinearity in a computationally efficient manner. In this paper, a type of SVM model is established to simulate the rainfall–runoff (RR) process. Then, a coupling model of SVM and EnKF (SVM + EnKF) is used for RR simulation. The impact of the assimilation time scale on the SVM + EnKF model is also studied. A total of four different combinations of the SVM and EnKF models are studied in the paper. The Xinanjiang RR model is employed to evaluate the SVM and the SVM + EnKF models. The study area is located in the Luo River Basin, Guangdong Province, China, during a nine-year period from 1994 to 2002. Compared to SVM, the SVM + EnKF model substantially improves the accuracy of flood prediction, and the Xinanjiang RR model also performs better than the SVM model. The simulated result for the assimilation time scale of 5 days is better than the results for the other cases.
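To make the coupling idea concrete, here is a rough sketch of a single EnKF analysis step for a scalar state. It assumes the state is the simulated runoff itself (so the observation operator is the identity) and uses the perturbed-observation form of the filter; the abstract does not describe the authors' state vector or configuration, and all numbers are invented.

```python
import random


def enkf_update(ensemble, observation, obs_variance, rng):
    """One perturbed-observation EnKF analysis step for a scalar state.

    Assumes at least two ensemble members and a non-zero total variance.
    """
    n = len(ensemble)
    mean = sum(ensemble) / n
    forecast_variance = sum((x - mean) ** 2 for x in ensemble) / (n - 1)
    gain = forecast_variance / (forecast_variance + obs_variance)  # scalar Kalman gain
    obs_std = obs_variance ** 0.5
    # Each member is nudged toward its own noisy copy of the observation.
    return [x + gain * (observation + rng.gauss(0.0, obs_std) - x) for x in ensemble]


# Hypothetical ensemble of simulated runoff values and one gauged observation.
forecast = [95.0, 100.0, 105.0, 110.0]
analysis = enkf_update(forecast, observation=120.0, obs_variance=4.0,
                       rng=random.Random(0))
print(analysis)
```

The gain shows the trade-off directly: when the observation error variance is small relative to the ensemble spread, the members are pulled almost all the way to the observation, and when it is large they stay close to the forecast.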


2020 ◽  
Author(s):  
Resheng PAN ◽  
Hui Li ◽  
Zhidong WANG ◽  
Dong PENG ◽  
Lang ZHAO ◽  
...  

Background: Due to the influence of power market reform policies, the conversion of power loads has become increasingly complicated. Current load forecasting methods have long calculation times and forecast volatile loads inaccurately. Forecasting is becoming more difficult, while accurate prediction of the electrical load is becoming more important. Against this background, this paper proposes applying collaborative knowledge mining (CKM) and sequential minimal optimization (SMO) to solve a prediction model based on the hypersphere support vector machine (CKM/SMO-SVM). Methods: This study first analyzes the impact of historical data on samples and different parameters. In power load prediction, the sample data and the various parameters have a significant impact on the prediction results. Second, weak entropy theory is applied for collaborative knowledge mining to preprocess the sample data and historical information. Third, a short-term power load forecasting system based on the hypersphere support vector machine model is established, and the problem is solved by SMO. Finally, the SVM model and the BP model are selected for comparison to verify the new model. Results: Our research shows that the RMS relative error of the CKM/SMO-SVM model is only 2.32%, which is 0.67% and 1.56% lower than that of the SVM and BP models, respectively, and that its optimization speed is faster. Conclusions: The model proposed in this paper uses a hypersphere SVM, which is suitable for the Gaussian kernel function, to achieve faster and more accurate load forecasting, which can provide more accurate services for energy spot transactions and energy scheduling plans.
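The headline metric is an RMS relative error. The abstract does not define it, so the sketch below uses the usual definition, the square root of the mean squared relative deviation, as an assumption; the load values are made up.

```python
import math


def rms_relative_error(actual, predicted):
    """Root mean square of (predicted - actual) / actual; assumes no actual value is zero."""
    terms = [((p - a) / a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(terms) / len(terms))


# Hypothetical loads: each forecast is off by 10% of the true value.
actual_load = [100.0, 200.0]
predicted_load = [110.0, 180.0]
error = rms_relative_error(actual_load, predicted_load)
print(f"{error:.2%}")
```

Because each deviation is divided by the actual load, errors at low-load and high-load hours contribute on the same scale, which is why this kind of metric is often preferred over an absolute error for load curves.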


2020 ◽  
Vol 17 (4) ◽  
pp. 302-310
Author(s):  
Yijie Ding ◽  
Feng Chen ◽  
Xiaoyi Guo ◽  
Jijun Tang ◽  
Hongjie Wu

Background: DNA-binding proteins are important in multiple biomolecular functions. However, traditional experimental methods for identifying DNA-binding proteins are still time-consuming and extremely expensive. Objective: In the past several years, various computational methods have been developed to detect DNA-binding proteins. However, most of them do not integrate multiple sources of information. Methods: In this study, we propose a novel computational method that predicts DNA-binding proteins from sequence information using a two-step Multiple Kernel Support Vector Machine (MK-SVM). First, we extract several features and construct multiple kernels. Then, the kernels are linearly combined by Multiple Kernel Learning (MKL). Finally, an SVM model is built on the combined kernel to predict DNA-binding proteins. Results: The proposed method is tested on two benchmark data sets. Compared with other existing methods, our approach is comparable, and even better on some data sets. Conclusion: We conclude that MK-SVM is more suitable than a common SVM as the classifier for identifying DNA-binding proteins.
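The linear combination step can be sketched as a weighted sum of precomputed kernel matrices. The weights below are placeholders chosen by hand; in the method described above they would come from the MKL step, whose details the abstract does not give. The function name and the two small matrices are illustrative only.

```python
def combine_kernels(kernels, weights):
    """Return sum_m weights[m] * kernels[m] for equally sized square kernel matrices."""
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
            for i in range(n)]


# Two hypothetical 2 x 2 kernel matrices built from different feature sets.
kernel_one = [[1.0, 0.5], [0.5, 1.0]]
kernel_two = [[1.0, 0.1], [0.1, 1.0]]
combined = combine_kernels([kernel_one, kernel_two], [0.75, 0.25])
print(combined)
```

With non-negative weights the combined matrix remains a valid (positive semidefinite) kernel, so it can be handed to any SVM implementation that accepts a precomputed kernel matrix.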


2019 ◽  
Vol 2019 ◽  
pp. 1-20 ◽  
Author(s):  
Dalian Yang ◽  
Jingjing Miao ◽  
Fanyu Zhang ◽  
Jie Tao ◽  
Guangbin Wang ◽  
...  

Bearings are important mechanical components that easily fail in harsh working environments. Support vector machines can be used to diagnose bearing faults; however, the recognition ability of the model is greatly affected by the kernel function and its parameters. Unfortunately, optimal parameters are difficult to select. To address these limitations, an escape mechanism and adaptive convergence conditions were introduced to the ALO algorithm. The resulting EALO method was applied to select SVM model parameters more accurately. To assess the model, the vibration acceleration signals of normal, inner ring fault, outer ring fault, and ball fault bearings were collected at different rotation speeds (1500 r/min, 1800 r/min, 2100 r/min, and 2400 r/min). The vibration signals were decomposed using the variational mode decomposition (VMD) method. The features were extracted through the kernel function to fuse the energy value of each VMD component. In these experiments, the two most important parameters for the support vector machine (the Gaussian kernel parameter σ and the penalty factor C) were optimized using the EALO algorithm, ALO algorithm, genetic algorithm (GA), and particle swarm optimization (PSO) algorithm. The performance of these four methods in optimizing the two parameters was then compared and analyzed, with the EALO method performing best. The recognition rates for bearing faults under the different tested rotation speeds improved when the SVM model parameters optimized by EALO were used.


2015 ◽  
Vol 2015 ◽  
pp. 1-10 ◽  
Author(s):  
Hui Wang ◽  
Feng Qin ◽  
Qi Liu ◽  
Liu Ruan ◽  
Rui Wang ◽  
...  

Stripe rust and leaf rust, which have similar symptoms, are two important wheat diseases. In this study, to investigate a method to identify and assess the two diseases, canopy hyperspectral data were collected for healthy wheat, wheat in the incubation period, and wheat in the diseased period of each disease. After data preprocessing, three support vector machine (SVM) models for disease identification and six support vector regression (SVR) models for disease index (DI) inversion were built. The results showed that the SVM model based on wavelet packet decomposition coefficients, with an overall identification accuracy of 99.67% on the training set and 82.00% on the testing set, was better than the other two models. To improve the identification accuracy, it was suggested that a combination model could be constructed with one SVM model and two models built using the K-nearest neighbors (KNN) method. Using the DI inversion SVR models, satisfactory results were obtained for the two diseases. The results demonstrated that identification and DI inversion of stripe rust and leaf rust can be implemented based on hyperspectral data at the canopy level.


2020 ◽  
Vol 2020 ◽  
pp. 1-7
Author(s):  
Shuang Pan ◽  
Jianguo Wei ◽  
Hao Pan

Accurate evaluation of the risk level and operation performance of P2P online lending platforms is conducive not only to better functioning of information intermediaries but also to effective protection of investors’ interests. This paper proposes a genetic algorithm (GA) improved hybrid kernel support vector machine (SVM) with an index system to construct such an evaluation model. A hybrid kernel consisting of a polynomial function and a radial basis function is improved by the GA method, which offers excellent global optimization and rapid convergence; specifically, the GA tunes the kernel parameters and the weights of the two kernels. Empirical testing based on cross-sectional data from the Chinese P2P lending market demonstrates the superiority of the improved hybrid kernel SVM model. Its classification accuracy for credit risk level and operation quality is higher than that of the single kernel SVM model as well as the hybrid kernel model with empirical parameter values.
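A hybrid kernel of this kind can be sketched as a weighted sum of a polynomial kernel and a radial basis function kernel. The convex-combination form, the parameter names, and the default values below are assumptions for illustration; the abstract only says that the kernel parameters and the weights of the two kernels are tuned by the GA.

```python
import math


def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """Polynomial kernel (x . z + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, z)) + coef0) ** degree


def rbf_kernel(x, z, gamma=0.5):
    """Radial basis function kernel exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def hybrid_kernel(x, z, weight, degree=2, coef0=1.0, gamma=0.5):
    """Assumed hybrid form: weight * polynomial + (1 - weight) * RBF, with 0 <= weight <= 1."""
    return (weight * polynomial_kernel(x, z, degree, coef0)
            + (1.0 - weight) * rbf_kernel(x, z, gamma))


# Hypothetical two-feature sample compared with itself.
sample = [1.0, 2.0]
value = hybrid_kernel(sample, sample, weight=0.25)
print(value)
```

In a GA search over this form, a candidate solution would encode `weight`, `degree`, and `gamma` (plus the SVM penalty parameter), and its fitness would be the cross-validated accuracy of an SVM using the resulting kernel.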

