Divide and conquer algorithm based support vector machine for massive data analysis

2021 ◽  
Vol 32 (3) ◽  
pp. 463-473
Author(s):  
Sungwan Bang ◽  
Seokwon Han ◽  
Jaeoh Kim
2019 ◽  
Vol 1 (1) ◽  
pp. 483-491 ◽  
Author(s):  
Makhamisa Senekane

The ubiquity of data, including multimedia data such as images, makes such data easy to mine and analyze. However, the analysis may involve sensitive data such as medical records (including radiological images) and financial records. Privacy-preserving machine learning aims to analyze such data without compromising privacy. Privacy-preserving data analysis approaches include k-anonymity, l-diversity, t-closeness and Differential Privacy (DP). Currently, DP is the gold standard of privacy-preserving data analysis because of its robustness against background knowledge attacks. In this paper, we report a scheme for privacy-preserving image classification using Support Vector Machine (SVM) and DP. SVM is chosen as the classification algorithm because, unlike variants of artificial neural networks, it converges to a global optimum. The SVM kernels used are linear and Radial Basis Function (RBF), and ϵ-differential privacy is the DP framework used. The proposed scheme achieved an accuracy of up to 98%. The results underline the utility of SVM and DP for privacy-preserving image classification.
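The abstract does not say which mechanism realizes ϵ-differential privacy, so the following is only a minimal sketch of one standard option, the Laplace mechanism, applied to a vector of numbers such as learned weights. The function names, the choice of mechanism, and the idea of perturbing an output vector are assumptions for illustration, not the paper's scheme.

```python
import random


def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Noise scale b = sensitivity / epsilon used by the Laplace mechanism."""
    if epsilon <= 0 or sensitivity < 0:
        raise ValueError("epsilon must be positive and sensitivity non-negative")
    return sensitivity / epsilon


def sample_laplace(scale: float, rng: random.Random) -> float:
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    if scale == 0:
        return 0.0
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def privatize(values, sensitivity, epsilon, rng):
    """Add independent Laplace noise to each coordinate of `values`.

    `sensitivity` is assumed to be the L1 sensitivity of the whole vector;
    deriving it for a concrete SVM is outside the scope of this sketch.
    """
    scale = laplace_scale(sensitivity, epsilon)
    return [v + sample_laplace(scale, rng) for v in values]


# Hypothetical weight vector; smaller epsilon means more noise.
noisy_weights = privatize([0.8, -1.2, 0.3], sensitivity=1.0, epsilon=0.5,
                          rng=random.Random(0))
```

Smaller values of ϵ give a larger noise scale and therefore stronger privacy at the cost of accuracy, which is the trade-off behind the reported accuracy figure.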


2014 ◽  
Vol 548-549 ◽  
pp. 1265-1269
Author(s):  
Yun Sik Hwang ◽  
Byeong Joo Jun ◽  
Tae Seon Yoon

As bioinformatics has advanced, the classification of pathogens has improved. This research concerns the genetic singularity of HCV (Hepatitis C Virus), and our objective is to analyze features of HCV amino acid sequences using the Support Vector Machine (SVM) algorithm. The HCV data used in our experiment comprise 10 kinds of sequences and 257 kinds of data. The data analysis revealed some peculiar genetic patterns in the linearity of HCV that were at odds with earlier neural network and C5.0 results.


2010 ◽  
Vol 82 (16) ◽  
pp. 7000-7007 ◽  
Author(s):  
Patrick W. T. Krooshof ◽  
Bülent Üstün ◽  
Geert J. Postma ◽  
Lutgarde M. C. Buydens

Author(s):  
K. Nafees Ahmed ◽  
T. Abdul Razak

Information extraction from data is one of the key necessities for data analysis. The unsupervised nature of data leads to complex computational methods for analysis. This paper presents Noise Reduced DBSCAN (NRDBSCAN), a modified variant of DBSCAN that integrates density-based spatial clustering with a one-class Support Vector Machine (SVM), a machine learning technique, for noise reduction. Analysis of DBSCAN shows that it depends heavily on accurate thresholds, without which it yields suboptimal results. However, identifying accurate threshold settings is unattainable. Noise is one of the major side effects of this threshold gap. The proposed work reduces noise by integrating a machine learning classifier into the operational structure of DBSCAN. Experimental results indicate high homogeneity levels in the clustering process.
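To make the threshold sensitivity concrete, here is a minimal sketch of plain DBSCAN in standard-library Python. It covers only the baseline algorithm; the one-class SVM step of NRDBSCAN is not described in enough detail in the abstract to reproduce, so it is omitted. The sample points and parameter values are invented for illustration.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE            # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep growing the cluster
    return labels


# Two tight groups plus one isolated point (made-up data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 5)]
well_tuned = dbscan(points, eps=1.5, min_pts=3)   # only the isolated point is noise
too_strict = dbscan(points, eps=0.5, min_pts=3)   # every point is labeled noise
```

With the same data, shrinking `eps` from 1.5 to 0.5 turns every point into noise, which is the kind of threshold gap the abstract attributes the noise problem to.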


2015 ◽  
Vol 13 ◽  
pp. 127-132 ◽  
Author(s):  
P. Jansen ◽  
D. Vergossen ◽  
D. Renner ◽  
W. John ◽  
J. Götze

An alternative method for determining the state of charge (SOC) of lithium iron phosphate cells by impedance spectra classification is given. Methods based on the electric equivalent circuit diagram (ECD), such as the Kalman filter, the extended Kalman filter and the state space observer, have reached their limits for this cell chemistry. The new method dispenses with the open circuit voltage curve and the parameters of the electric ECD. Impedance spectra classification is implemented by a Support Vector Machine (SVM). The classes for the SVM algorithm are represented by all the impedance spectra that correspond to the SOC (the SOC classes) for defined temperature and aging states. A divide and conquer based search algorithm on a binary search tree makes it possible to grade measured impedances using the SVM method. Statistical analysis is used to verify the concept by grading every single impedance from each impedance spectrum corresponding to the SOC by class with different magnitudes of charged error.
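The divide and conquer search over ordered SOC classes can be sketched as repeated halving, where each tree node asks a binary classifier whether a measurement belongs to the lower or the upper group of classes. In this sketch the binary decision is a simple threshold stub standing in for a trained SVM, and the measurement is a hypothetical scalar feature rather than a real impedance; both are assumptions made only to keep the example runnable.

```python
from typing import Callable, Sequence

# Hypothetical signature: decide(x, lower, upper) returns True when x
# belongs to the upper group of classes rather than the lower group.
Decider = Callable[[float, Sequence[int], Sequence[int]], bool]


def classify_soc(x: float, classes: Sequence[int], decide: Decider) -> int:
    """Narrow an ordered, non-empty class list by halving until one class remains."""
    lo, hi = 0, len(classes)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if decide(x, classes[lo:mid], classes[mid:hi]):
            lo = mid                     # continue in the upper half
        else:
            hi = mid                     # continue in the lower half
    return classes[lo]


def threshold_decider(x, lower, upper):
    """Stand-in for a trained binary SVM: split midway between the two groups."""
    return x >= (lower[-1] + upper[0]) / 2


# SOC classes in percent; the feature value 37 is invented for illustration.
soc_classes = list(range(0, 101, 10))
predicted = classify_soc(37, soc_classes, threshold_decider)
```

With 11 classes, at most four binary decisions are needed per measurement instead of one decision per class, which is the practical benefit of organizing the classifiers as a binary search tree.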


Author(s):  
Z. Ghaemi ◽  
M. Farnaghi ◽  
A. Alimohammadi

The critical impact of air pollution on human health and the environment on the one hand, and the complexity of pollutant concentration behavior on the other, have led scientists to look for advanced techniques for monitoring and predicting urban air quality. Additionally, recent developments in data measurement techniques have led to the collection of various types of air quality data. Such data are extremely voluminous and, to be useful, must be processed at high velocity. Due to the complexity of big data analysis, especially for dynamic applications, online forecasting of pollutant concentration trends within a reasonable processing time is still an open problem. The purpose of this paper is to present an online forecasting approach based on Support Vector Machine (SVM) to predict air quality one day in advance. To meet the computational requirements of large-scale data analysis, distributed computing based on the Hadoop platform has been employed to leverage the processing power of multiple processing units. The MapReduce programming model is adopted for massively parallel processing in this study. Based on the online algorithm and the Hadoop framework, an online forecasting system is designed to predict the air pollution of Tehran for the next 24 hours. The results have been assessed on the basis of processing time and efficiency. Reasonably accurate predictions of air pollutant indicator levels within an acceptable processing time indicate that the presented approach is well suited to large-scale air pollution prediction problems.
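The abstract does not describe how the SVM computation is split into map and reduce phases, so the sketch below shows just one plausible decomposition: each map call computes hinge-loss subgradient sums on its own data partition, and the reduce step adds them before a single weight update. It runs in one process with Python's built-in `map` and `functools.reduce` rather than on Hadoop, and it uses a linear classifier although a forecasting system would more likely use a regression variant. All names and the toy data are assumptions.

```python
from functools import reduce


def map_partition(partition, w, b):
    """Map step: hinge-loss subgradient sums for one data partition."""
    gw = [0.0] * len(w)
    gb = 0.0
    for x, y in partition:                       # y is +1 or -1
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        if margin < 1:                           # margin violation contributes
            gw = [g - y * xi for g, xi in zip(gw, x)]
            gb -= y
    return gw, gb, len(partition)


def reduce_partials(left, right):
    """Reduce step: add two partial results component-wise."""
    return ([p + q for p, q in zip(left[0], right[0])],
            left[1] + right[1],
            left[2] + right[2])


def train(partitions, dim, lam=0.01, lr=0.1, epochs=100):
    """Subgradient descent on the L2-regularized hinge loss."""
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        partials = map(lambda part: map_partition(part, w, b), partitions)
        gw, gb, n = reduce(reduce_partials, partials)
        w = [wi - lr * (lam * wi + g / n) for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


# Toy data split into two partitions (invented values).
partitions = [
    [((2.0, 2.0), 1), ((3.0, 1.0), 1)],
    [((-2.0, -1.0), -1), ((-1.0, -3.0), -1)],
]
w, b = train(partitions, dim=2)
```

Because the reduce step is a plain component-wise sum, the partial results can be combined in any order, which is what makes this kind of update a natural fit for MapReduce-style parallelism.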

