Distributed and Robust Support Vector Machine

Author(s):  
Yangwei Liu ◽  
Hu Ding ◽  
Ziyun Huang ◽  
Jinhui Xu

In this paper, we consider the distributed version of Support Vector Machine (SVM) under the coordinator model, where all input data (i.e., points in [Formula: see text] space) of SVM are arbitrarily distributed among [Formula: see text] nodes in some network with a coordinator that can communicate with all nodes. We investigate two variants of this problem, with and without outliers. For distributed SVM without outliers, we prove a lower bound on the communication complexity and give a distributed [Formula: see text]-approximation algorithm that reaches this lower bound, where [Formula: see text] is a user-specified small constant. For distributed SVM with outliers, we present a [Formula: see text]-approximation algorithm to explicitly remove the influence of outliers. Our algorithm is based on a deterministic distributed top [Formula: see text] selection algorithm with communication complexity of [Formula: see text] in the coordinator model.
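To make the coordinator model concrete, here is a minimal sketch of top-k selection in which every node sends its local top k values and the coordinator merges them. This is a naive baseline and not the deterministic algorithm from the paper, whose communication bound is not reproduced in this listing. The function names and the sample data are hypothetical.

```python
import heapq

def local_top_k(values, k):
    """Run on one node: return its k largest values, largest first."""
    return heapq.nlargest(k, values)

def coordinator_top_k(node_data, k):
    """Coordinator: merge each node's local top-k into the global top-k.

    Returns (global_top_k, values_communicated). Naive baseline only:
    each node sends up to k values, which is an assumption of this
    sketch and not the communication bound proved in the paper.
    """
    messages = [local_top_k(values, k) for values in node_data]
    communicated = sum(len(m) for m in messages)
    merged = heapq.nlargest(k, (v for m in messages for v in m))
    return merged, communicated

# Hypothetical data held by three nodes.
nodes = [[5, 1, 9], [7, 3], [8, 2, 6, 4]]
top, cost = coordinator_top_k(nodes, k=3)
print(top, cost)
```

The merge is correct because any value in the global top k must also be in the top k of the node that holds it, so no candidate is lost by discarding the rest locally.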

Symmetry ◽  
2021 ◽  
Vol 13 (2) ◽  
pp. 212
Author(s):  
Yu-Wei Liu ◽  
Huan Feng ◽  
Heng-Yi Li ◽  
Ling-Ling Li

Accurate prediction of photovoltaic power is conducive to the application of clean energy and sustainable development. An improved whale algorithm is proposed to optimize the Support Vector Machine model. The model needs less training data to adapt symmetrically to the prediction conditions of different weather, and it has high prediction accuracy in different weather conditions. This study aims to (1) select light intensity, ambient temperature and relative humidity, which are strictly related to photovoltaic output power, as the input data; (2) apply wavelet soft threshold denoising to preprocess the input data, reducing the noise it contains and symmetrically enhancing the adaptability of the prediction model in different weather conditions; (3) improve the whale algorithm by using tent chaotic mapping, nonlinear disturbance and a differential evolution algorithm; (4) apply the improved whale algorithm to optimize the Support Vector Machine model in order to improve its prediction accuracy. The experiment proves that the short-term photovoltaic power prediction model based on the symmetry concept achieves ideal accuracy in different weather conditions. The systematic method for output power prediction of renewable energy is conducive to reducing the workload of predicting the output power and to promoting the application of clean energy and sustainable development.
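Two of the building blocks named above have compact standard definitions, sketched here under stated assumptions. The soft threshold rule is the usual shrinkage applied to wavelet coefficients. The tent map uses a skewed breakpoint `a=0.7`, which is an assumption of this sketch; the abstract does not give the map parameters or say how the chaotic sequence enters the whale algorithm.

```python
import math

def soft_threshold(coeff, threshold):
    """Shrink a wavelet coefficient toward zero by `threshold`.

    Coefficients with magnitude below the threshold become zero.
    """
    return math.copysign(max(abs(coeff) - threshold, 0.0), coeff)

def tent_map_sequence(x0, n, a=0.7):
    """Generate n values of a skewed tent map, each in [0, 1].

    The breakpoint a=0.7 is an assumption: the classic a=0.5 map tends
    to collapse to 0 in binary floating point after a few dozen steps.
    """
    seq, x = [], x0
    for _ in range(n):
        x = x / a if x < a else (1.0 - x) / (1.0 - a)
        seq.append(x)
    return seq

print(soft_threshold(3.0, 1.0), soft_threshold(-0.5, 1.0))
print(tent_map_sequence(0.3, 5))
```

A sequence like this is commonly used to spread an initial population over the search range, though that use is an assumption here.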


2013 ◽  
Vol 706-708 ◽  
pp. 613-617
Author(s):  
Fu Cheng Liu ◽  
Zhao Hui Liu ◽  
Wen Liu ◽  
Dong Sheng Liang ◽  
Kai Cui ◽  
...  

A navigation star catalog (NSC) selection algorithm via support vector machine (SVM) is proposed in this paper. The sphere spiral method is utilized to generate the sampling boresight directions so as to obtain uniform sampling data. Regression analysis is then adopted to extract the NSC, and an evenly distributed, small-capacity NSC is obtained. Two criteria, a global criterion and a local criterion, are defined as the uniformity criteria to test the performance of the generated NSC. Simulations show that, compared with MFM, the magnitude weighted method (MWM) and the self-organizing algorithm (S-OA), the Boltzmann entropy (B.e) of the SVM selection algorithm (SVM-SA) is the minimum, at 0.00207. Under the same conditions, such as the same field of view (FOV) and elimination of the hole, both the number of guide stars (NGS) and the standard deviation (std) of SVM-SA are the lowest, at 7668 and 2.17 respectively. Consequently, the SVM-SA is optimal in terms of the NGS and the uniform distribution, and it also has strong adaptability.
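As an illustration of spiral sampling on a sphere, the sketch below places near-uniform unit vectors along a golden-angle spiral. The abstract does not specify which spiral construction the authors use, so the golden-angle variant and the function name `spiral_directions` are assumptions.

```python
import math

def spiral_directions(n):
    """Return n near-uniform unit vectors on the sphere.

    Points are spaced evenly in z and rotated by the golden angle,
    one common spiral construction (assumed, not taken from the paper).
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n      # evenly spaced, strictly inside (-1, 1)
        r = math.sqrt(1.0 - z * z)         # radius of the circle at height z
        theta = golden_angle * i
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points

directions = spiral_directions(100)
print(directions[0], directions[-1])
```

Each returned vector could serve as one simulated boresight direction when counting the stars visible in a given field of view.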


2020 ◽  
Vol 202 ◽  
pp. 15004
Author(s):  
Aditya Tegar Satria ◽  
Mustafid ◽  
Dinar Mutiara Kusumo Nugraheni

Nowadays, the Internet of Things (IoT) is commonly used in the tourism industry, including aviation, where passengers of flight services can rate their satisfaction with the products and services they use by writing text reviews on many popular websites. These passenger reviews are potential big data and can be analyzed to extract meaningful information. Some text mining algorithms are already in common use, including the Bayes formula and Support Vector Machine methods. This research proposes an implementation of the Bayes and SVM methods in which the algorithms operate independently yet are integrated with other modules, such as data input and text pre-processing, and present the output concisely in a single information system. The proposed system was given 1000 documents of passenger reviews as input data. After pre-processing, the Bayes formula was used to classify the review documents into 5 categories: plane condition, flight comfort, staff service, food and entertainment, and price. Simultaneously, the positive and negative sentiment contained in each review document was analyzed with the SVM method, which achieved an accuracy of 83.6% for a training-to-testing ratio of 50:50, 82.75% for the 60:40 ratio, and 83.3% for the 70:30 ratio. This research shows that two different text mining algorithms can be implemented simultaneously in an effective and efficient way while still providing accurate and satisfactory performance in one integrated information system.


2019 ◽  
Vol 47 (3) ◽  
pp. 154-170
Author(s):  
Janani Balakumar ◽  
S. Vijayarani Mohan

Purpose: Owing to the huge volume of documents available on the internet, text classification has become a necessary task for handling these documents. To achieve optimal text classification results, feature selection, an important stage, is used to reduce the dimensionality of text documents by choosing suitable features. The main purpose of this research work is to classify personal computer documents based on their content. Design/methodology/approach: This paper proposes a new algorithm for feature selection based on artificial bee colony (ABCFS) to enhance text classification accuracy. The proposed algorithm (ABCFS) is evaluated on real and benchmark data sets and compared with existing feature selection approaches such as information gain and the χ2 statistic. To justify the efficiency of the proposed algorithm, the support vector machine (SVM) and an improved SVM classifier are used in this paper. Findings: The experiment was conducted on real and benchmark data sets. The real data set was collected in the form of documents stored on a personal computer, and the benchmark data set was collected from the Reuters and 20 Newsgroups corpora. The results demonstrate the performance of the proposed feature selection algorithm, which enhances text document classification accuracy. Originality/value: This paper proposes a new ABCFS algorithm for feature selection, evaluates the efficiency of the ABCFS algorithm and improves the support vector machine. In this paper, the ABCFS algorithm is used to select features from text (unstructured) documents. Although existing work uses the ABCFS algorithm to select features from structured data, it offers no algorithm for text feature selection. The proposed algorithm will classify documents automatically based on their content.


2014 ◽  
Vol 493 ◽  
pp. 337-342 ◽  
Author(s):  
Achmad Widodo ◽  
I. Haryanto ◽  
T. Prahasto

This paper deals with the implementation of an intelligent system for fault diagnostics of rolling element bearings. The proposed intelligent system was built on a support vector machine (SVM) because of its excellent performance in classification tasks. Moreover, the SVM was modified by introducing a wavelet function as the kernel for mapping input data into feature space. The input data were vibration signals acquired from bearings through a standard data acquisition process. Statistical features were then calculated from the bearing signals, and salient features were extracted using component analysis. The results of fault diagnostics are shown by observing the classification of bearing conditions, which gives plausible accuracy in testing of the proposed system.
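The statistical feature step can be sketched as follows. The abstract does not list which features were calculated, so the choice of mean, RMS, kurtosis and crest factor is an assumption based on features commonly used for vibration signals. The sample signal is synthetic.

```python
import math

def statistical_features(signal):
    """Compute a few time-domain features of a vibration signal.

    Assumes a non-constant signal, so the variance and RMS are nonzero.
    """
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    m2 = sum((x - mean) ** 2 for x in signal) / n   # population variance
    m4 = sum((x - mean) ** 4 for x in signal) / n   # fourth central moment
    peak = max(abs(x) for x in signal)
    return {
        "mean": mean,
        "rms": rms,
        "kurtosis": m4 / (m2 * m2),     # non-excess kurtosis
        "crest_factor": peak / rms,
    }

# Synthetic square-wave-like signal, for illustration only.
features = statistical_features([1.0, -1.0, 1.0, -1.0])
print(features)
```

A feature dictionary like this, computed per signal segment, would form one row of the input to a component analysis step before classification.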


2020 ◽  
Author(s):  
Amit Thakur ◽  
Rajesh Singh ◽  
Anita Gehlot ◽  
Shaik Vaseem Akram ◽  
Prabin Kumar Das

BACKGROUND: COVID-19 is a disease that is spreading at a rapid pace across the world. The present study addresses the outbreak of COVID-19 in India and estimates the rise in cases there. It also addresses the present health infrastructure and the infected health workforce, with statistics. OBJECTIVE: The current study aims to analyze and estimate developments in the near future with reference to COVID-19 in India. The research also looks at the preparedness of the Indian government for this outbreak. The scope of the study is narrowed to building prediction models for the Indian region, using SVMs for time-series-based prediction methods that are easily built and readable under these critical conditions. The study does not cover the COVID-19 outbreak in any other country. METHODS: Support Vector Machine and Linear Regression are implemented in this study for predicting the expected cases. For the purpose of modelling, the input data on the number of cases are taken from March 15th, 2020. With the input data, the two models are trained to predict the cases. The results show that the support vector machine and linear regression give good accuracy for prediction. RESULTS: 1. Considering the change in slope of both curves in the graph, it can be concluded that the trained model gives a quite good range of accuracy. 2. The graph shows the plot of the predicted values and the actual values fed during the testing of the model. CONCLUSIONS: The present work presents observations and predictions about the COVID-19 outbreak in the Indian region. Although the rate of growth in India is not equal to the rate of growth at world level, the situation appears dangerous, as India is heading towards exponential growth. According to two separate time series forecasting models, the expected number of patients will reach the millions in the next 30 days. Given the poor health facilities, it is going to be difficult to combat the outbreak without the government taking effective measures. In contrast to a strict lockdown, social distancing, isolation, patient testing and medical care need to be implemented on a war footing to combat the pandemic in India. The forecasts in this study are still at an early stage, as the historical data are too limited to create a reliable model. Moreover, the rise in cases in India dates only from the last 10 days, so the training of the model may not be accurate; however, the prediction model can be improved over existing models as more medical and demographic data become available. Furthermore, even if the predictions are only 60-70 percent correct, the nation will still face very hard days.
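The linear regression half of the method can be sketched as an ordinary least-squares line fitted to day index versus case count. The data below are synthetic and chosen only to make the arithmetic easy to follow; they are not the study's case counts, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(day, slope, intercept):
    """Extrapolate the fitted line to a future day index."""
    return slope * day + intercept

# Synthetic series: day index and cumulative cases (illustrative only).
days = [0, 1, 2, 3, 4]
cases = [1, 3, 5, 7, 9]
slope, intercept = fit_line(days, cases)
print(slope, intercept, predict(5, slope, intercept))
```

A straight line cannot follow exponential growth for long, which is one reason such forecasts become unreliable beyond a short horizon.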


2016 ◽  
Vol 2016 ◽  
pp. 1-10 ◽  
Author(s):  
Qin Miao ◽  
Justin Derbas ◽  
Aya Eid ◽  
Hariharan Subramanian ◽  
Vadim Backman

Partial wave spectroscopy (PWS) enables quantification of the statistical properties of cell structures at the nanoscale, which has been used to identify patients harboring premalignant tumors by interrogating easily accessible sites distant from the location of the lesion. Due to its high sensitivity, cells that are well preserved need to be selected from the smear images for further analysis. To date, such cell selection has been done manually. This is time-consuming, labor-intensive, vulnerable to bias, and subject to considerable inter- and intraoperator variability. In this study, we developed a classification scheme to identify and remove the corrupted cells or debris that are of no diagnostic value from raw smear images. The slide of the smear sample is digitized by acquiring and stitching low-magnification transmission images. Objects are then extracted from these images through segmentation algorithms. A training set is created by manually classifying objects as suitable or unsuitable. A feature set is created by quantifying a large number of features for each object. The training set and feature set are used to train a selection algorithm using Support Vector Machine (SVM) classifiers. We show that the selection algorithm achieves an error rate of 93% with a sensitivity of 95%.
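For reference, the standard definitions of the reported quantities can be written in terms of confusion-matrix counts. The counts used below are hypothetical and are not taken from the study; the function name is also an assumption.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion counts.

    tp/fn count suitable objects classified correctly/incorrectly;
    tn/fp count unsuitable objects classified correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),   # fraction of suitable objects kept
        "specificity": tn / (tn + fp),   # fraction of unsuitable objects removed
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
    }

# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=90, fp=20, tn=80, fn=10)
print(metrics)
```

Under these definitions, accuracy and error rate always sum to one.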

