Sparse Reductions for Fixed-Size Least Squares Support Vector Machines on Large Scale Data

Author(s):  
Raghvendra Mall ◽  
Johan A. K. Suykens
Author(s):  
Denali Molitor ◽  
Deanna Needell

Abstract In today’s data-driven world, storing, processing and gleaning insights from large-scale data are major challenges. Data compression is often required in order to store large amounts of high-dimensional data, and thus, efficient inference methods for analyzing compressed data are necessary. Building on a recently designed simple framework for classification using binary data, we demonstrate that one can improve classification accuracy of this approach through iterative applications whose output serves as input to the next application. As a side consequence, we show that the original framework can be used as a data preprocessing step to improve the performance of other methods, such as support vector machines. For several simple settings, we showcase the ability to obtain theoretical guarantees for the accuracy of the iterative classification method. The simplicity of the underlying classification framework makes it amenable to theoretical analysis.


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yixue Zhu ◽  
Boyue Chai

With the development of increasingly advanced information and electronic technology, especially physical information systems, cloud computing systems, and social services, big data is becoming widely visible, creating benefits for people while also posing huge challenges. With the advent of the big data era, data sets keep growing in scale. Traditional data analysis methods can no longer cope with large-scale data sets, and mining the hidden information behind big data, especially in the field of e-commerce, has become a key factor in competition among enterprises. We use a support vector machine method based on parallel computing to analyze the data. First, the training samples are divided into several working subsets through the SOM self-organizing neural network classification method. The training results of each working set are then merged, so as to quickly deal with the problem of massive data prediction and analysis. This paper proposes that big data offers flexible expansion and a quality assessment system, so it is meaningful to replace the double-sidedness of quality assessment with big data. Finally, considering the excellent performance of parallel support vector machines in data mining and analysis, we apply this method to the big data analysis of e-commerce. The research results show that parallel support vector machines can solve the problem of processing large-scale data sets. The emergence of dirty-data problems has increased the effective rate by at least 70%.


2015 ◽  
Vol 82 (12) ◽  
Author(s):  
Matthias Richter ◽  
Thomas Längle ◽  
Jürgen Beyerer

Abstract Hyperspectral sensors are becoming cheaper, faster and more readily available. Apart from industry applications, manufacturers push to bring compact devices into the end-consumer market. This development gives rise to many interesting applications such as the identification of counterfeit pharmaceutical products or the classification of foodstuffs. These applications require precise models of the underlying classes. However, building these models from expert knowledge is not feasible. In this paper, we propose to use machine learning techniques to infer a model of many classes from an annotated dataset instead. We investigate the use of three popular methods: support vector machines, random forest classifiers and partial least squares. In contrast to similar approaches using support vector machines, we restrict ourselves to the linear formulation and train the classifiers by solving the primal, instead of the dual, optimization problem. Our experiments on a large dataset show that the support vector machine approach is superior to random forests and partial least squares in classification accuracy as well as training time.
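
As a rough illustration of what training a linear support vector machine in the primal can look like, the following sketch minimises the regularised hinge loss directly over the weight vector with stochastic subgradient descent. The abstract does not state which solver the authors use, so the solver, the function names `train_linear_svm_primal` and `predict`, the parameters `lam`, `lr` and `epochs`, and the toy data are all assumptions made for this example.

```python
import random


def train_linear_svm_primal(X, y, lam=0.01, lr=0.1, epochs=200, seed=0):
    """Fit (w, b) by stochastic subgradient descent on the primal objective
    lam/2 * ||w||^2 + hinge loss, with labels y[i] in {-1, +1}.
    Hypothetical helper: an illustrative solver, not the paper's."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            margin = y[i] * score
            # Subgradient of the L2 regulariser: shrink the weights.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Subgradient of the hinge loss: active only when margin < 1.
            if margin < 1.0:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy, linearly separable data (invented for illustration).
X = [[2.0, 2.0], [3.0, 3.0], [2.0, 3.0],
     [-2.0, -2.0], [-3.0, -3.0], [-2.0, -3.0]]
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm_primal(X, y)
```

Working in the primal keeps the number of optimisation variables equal to the number of features rather than the number of training samples, which is one common reason to prefer it for linear models on large datasets.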


2011 ◽  
Vol 216 ◽  
pp. 738-741
Author(s):  
Yue E Chen ◽  
Bai Li Ren

SVM has achieved very good results on classification, regression and density estimation problems in machine learning, and has been successfully applied to practical problems such as text recognition and speech classification, but its long training time is a major drawback. A new reduction strategy is proposed for training support vector machines. The method converges quickly without degrading the learning machine’s generalization performance, and simulation experiments show its feasibility and effectiveness.

