An Improved Training Algorithm of Support Vector Machines Based on Three Data Points Iteration

Author(s):  
Li Cunhe ◽  
Liu Kangwei ◽  
Zhu Lina
2021 ◽  
Vol 26 (1) ◽  
pp. 1-21
Author(s):  
Sebastian Schlag ◽  
Matthias Schmitt ◽  
Christian Schulz

The time complexity of support vector machines (SVMs) prohibits training on huge datasets with millions of data points. Recently, multilevel approaches to train SVMs have been developed to allow for time-efficient training on huge datasets. While regular SVMs perform the entire training in a single, time-consuming optimization step, multilevel SVMs first build a hierarchy of problems decreasing in size that resemble the original problem and then train an SVM model for each hierarchy level, benefiting from the solved models of previous levels. We present a faster multilevel support vector machine that uses a label propagation algorithm to construct the problem hierarchy. Extensive experiments indicate that our approach is up to orders of magnitude faster than the previous fastest algorithm while having comparable classification quality. For example, one of our sequential solvers alone is on average a factor of 15 faster than the parallel ThunderSVM algorithm, while having similar classification quality.


2011 ◽  
Vol 301-303 ◽  
pp. 677-681
Author(s):  
Liang Qin ◽  
Hong Wei Yin ◽  
Xian Jun Shi ◽  
Zhi Cai Xiao

To address the deficiencies of SVMs on large sample sets, this paper studies the nature of support vectors (SVs). An improved incremental training algorithm is put forward based on the dimensionality of the samples. The method uses a selection factor obtained from density and distance criteria. It decreases the number of training samples while keeping the spatial information, so training speed improves without reducing precision. Simulation confirms the efficiency of the method.


Biosensors ◽  
2020 ◽  
Vol 10 (10) ◽  
pp. 140
Author(s):  
Nathan Meyer ◽  
Jean-Marc Janot ◽  
Mathilde Lepoitevin ◽  
Michaël Smietana ◽  
Jean-Jacques Vasseur ◽  
...  

A single nanopore is a powerful platform to detect, discriminate and identify biomacromolecules. Among the different devices, the conical nanopores obtained by the track-etched technique on a polymer film are stable and easy to functionalize. However, these advantages are offset by their high aspect ratio, which prevents the discrimination of similar samples. Using machine learning, we demonstrate an improved resolution that allows short single- and double-stranded DNA (10- and 40-mers) to be identified. We have characterized each current blockade event by its relative intensity, dwell time, surface area, and both the right and left slopes. We show an overlap of the relative current blockade amplitudes and dwell time distributions that prevents identification of the samples from these distributions alone. We define the different parameters that characterize the events as features and the type of DNA sample as the target. By applying support-vector machines to discriminate each sample, we show accuracy between 50% and 72% when using two features that distinctly classify the data points. Finally, we achieved an increased accuracy (up to 82%) when five features were used.


2020 ◽  
Author(s):  
Lewis Mervin ◽  
Avid M. Afzal ◽  
Ola Engkvist ◽  
Andreas Bender

In the context of bioactivity prediction, the question of how to calibrate a score produced by a machine learning method into a reliable probability of binding to a protein target is not yet satisfactorily addressed. In this study, we compared the performance of three such methods, namely Platt Scaling, Isotonic Regression and Venn-ABERS, in calibrating prediction scores for ligand-target prediction with the Naïve Bayes, Support Vector Machines and Random Forest algorithms, using bioactivity data available at AstraZeneca (40 million data points, i.e. compound-target pairs, across 2112 targets). Performance was assessed using Stratified Shuffle Split (SSS) and Leave 20% of Scaffolds Out (L20SO) validation.
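
As a rough illustration of what score calibration means, the sketch below implements isotonic regression, one of the three methods named above, using the standard pool-adjacent-violators procedure in plain Python. It is not the authors' implementation: the function name `fit_isotonic_calibrator` and the toy scores are hypothetical, and the abstract does not describe how the study's data or models were set up.

```python
from itertools import groupby


def fit_isotonic_calibrator(scores, labels):
    """Fit a non-decreasing step function mapping raw scores to probabilities.

    Hypothetical helper for illustration only; uses pool-adjacent-violators.
    `labels` are 0/1 outcomes (e.g. inactive/active), `scores` are raw
    classifier outputs such as SVM decision values.
    """
    pairs = sorted(zip(scores, labels))
    blocks = []  # each block: [sum of labels, count, largest score covered]
    for score, group in groupby(pairs, key=lambda p: p[0]):
        ys = [y for _, y in group]  # pool tied scores into one block first
        blocks.append([float(sum(ys)), len(ys), score])
        # Merge backwards while the previous block's mean exceeds this one's,
        # which restores the non-decreasing (isotonic) constraint.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper

    thresholds = [b[2] for b in blocks]
    values = [b[0] / b[1] for b in blocks]

    def predict(score):
        # Return the mean label of the first block whose upper bound covers
        # the score; scores above all bounds get the last block's value.
        for upper, value in zip(thresholds, values):
            if score <= upper:
                return value
        return values[-1]

    return predict


# Toy example: raw scores that are not monotone in the observed labels.
calibrate = fit_isotonic_calibrator([0.1, 0.2, 0.3, 0.4], [0, 1, 0, 1])
```

Here the scores 0.2 and 0.3 are pooled because their labels violate monotonicity, so both map to the same calibrated probability. Platt Scaling would instead fit a sigmoid to the scores, and Venn-ABERS builds on isotonic fits to produce probability intervals.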

