Learning Complexity vs Communication Complexity

This paper has two main focal points. We first consider an important class of machine learning algorithms: large margin classifiers, such as Support Vector Machines. The notion of margin complexity quantifies the extent to which a given class of functions can be learned by large margin classifiers. We prove that up to a small multiplicative constant, margin complexity is equal to the inverse of discrepancy. This establishes a strong tie between seemingly very different notions from two distinct areas. In the same way that matrix rigidity is related to rank, we introduce the notion of rigidity of margin complexity. We prove that sign matrices with small margin complexity rigidity are very rare. This leads to the question of proving lower bounds on the rigidity of margin complexity. Quite surprisingly, this question turns out to be closely related to basic open problems in communication complexity, e.g., whether PSPACE can be separated from the polynomial hierarchy in communication complexity. Communication is a key ingredient in many types of learning. This explains the relations between the field of learning theory and that of communication complexity [6, 10, 16, 26]. The results of this paper constitute another link in this rich web of relations. These new results have already been applied toward the solution of several open problems in communication complexity [18, 20, 29].
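
The central claim of this abstract can be written out symbolically. The block below is a sketch using the standard textbook definitions of margin, margin complexity and discrepancy for an $m \times n$ sign matrix $A = (a_{ij})$; the abstract itself does not spell these out, so the notation ($m(A)$, $mc(A)$, $\mathrm{disc}(A)$) and the unspecified constants $c_1, c_2$ are assumptions made for illustration.

```latex
% Margin: best normalized separation over all vector realizations of A
m(A) = \sup_{\substack{x_1,\dots,x_m,\; y_1,\dots,y_n \\ \operatorname{sign}\langle x_i, y_j\rangle = a_{ij}}}
       \; \min_{i,j} \frac{|\langle x_i, y_j\rangle|}{\|x_i\|\,\|y_j\|},
\qquad
mc(A) = \frac{1}{m(A)}.

% Discrepancy: P ranges over probability distributions on the entries of A,
% and S x T ranges over combinatorial rectangles
\mathrm{disc}(A) = \min_{P} \; \max_{S \subseteq [m],\, T \subseteq [n]}
       \Bigl|\, \sum_{i \in S} \sum_{j \in T} p_{ij}\, a_{ij} \Bigr|.

% "Equal up to a small multiplicative constant":
% for absolute constants 0 < c_1 \le c_2,
\frac{c_1}{\mathrm{disc}(A)} \;\le\; mc(A) \;\le\; \frac{c_2}{\mathrm{disc}(A)}.
```

Read this way, a matrix with small discrepancy (hard for communication protocols) is exactly one with large margin complexity (hard for large margin classifiers), which is the tie between the two areas that the abstract describes.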

A comparative study of multi-class support vector machines in the unifying framework of large margin classifiers

Applied Stochastic Models in Business and Industry ◽

10.1002/asmb.534 ◽

2005 ◽

Vol 21 (2) ◽

pp. 199-214 ◽

Cited By ~ 5

Author(s):

Yann Guermeur ◽

André Elisseeff ◽

Dominique Zelus

Keyword(s):

Support Vector Machines ◽

Comparative Study ◽

Support Vector ◽

Large Margin ◽

Vector Machines ◽

Large Margin Classifiers

Heart disease prediction using machine learning techniques : a survey

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.8.10557 ◽

2018 ◽

Vol 7 (2.8) ◽

pp. 684 ◽

Cited By ~ 12

Author(s):

V V. Ramalingam ◽

Ayantan Dandapath ◽

M Karthik Raja

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Complex Data ◽

Learning Techniques ◽

Vector Machines ◽

Supervised Learning Algorithms ◽

Life Threatening

Heart-related diseases, or cardiovascular diseases (CVDs), have been the main cause of a huge number of deaths over the last few decades and have emerged as the most life-threatening diseases, not only in India but in the whole world. There is therefore a need for a reliable, accurate and feasible system to diagnose such diseases in time for proper treatment. Machine learning algorithms and techniques have been applied to various medical datasets to automate the analysis of large and complex data. In recent times, many researchers have been using machine learning techniques to help the health care industry and its professionals diagnose heart-related diseases. This paper presents a survey of various models based on such algorithms and techniques and analyzes their performance. Models based on supervised learning algorithms such as Support Vector Machines (SVM), K-Nearest Neighbour (KNN), Naïve Bayes, Decision Trees (DT), Random Forest (RF) and ensemble models are found to be very popular among researchers.

Statistical and Electrical Features Evaluation for Electrical Appliances Energy Disaggregation

Sustainability ◽

10.3390/su11113222 ◽

2019 ◽

Vol 11 (11) ◽

pp. 3222 ◽

Cited By ~ 15

Author(s):

Pascal Schirmer ◽

Iosif Mporas

Keyword(s):

Random Forest ◽

Machine Learning Algorithms ◽

Support Vector ◽

Random Forest Regression ◽

Nearest Neighbours ◽

Energy Disaggregation ◽

Vector Machines ◽

Non Linear ◽

Load Monitoring ◽

Sinusoidal Current

In this paper we evaluate several well-known and widely used machine learning algorithms for regression in the energy disaggregation task. Specifically, the Non-Intrusive Load Monitoring approach was considered and the K-Nearest-Neighbours, Support Vector Machines, Deep Neural Networks and Random Forest algorithms were evaluated across five datasets using seven different sets of statistical and electrical features. The experimental results demonstrated the importance of selecting both appropriate features and regression algorithms. Analysis on device level showed that linear devices can be disaggregated using statistical features, while for non-linear devices the use of electrical features significantly improves the disaggregation accuracy, as non-linear appliances have non-sinusoidal current draw and thus cannot be well parametrized only by their active power consumption. The best performance in terms of energy disaggregation accuracy was achieved by the Random Forest regression algorithm.

Two-phased DEA-MLA approach for predicting efficiency of NBA players

Yugoslav journal of operations research ◽

10.2298/yjor140430030r ◽

2014 ◽

Vol 24 (3) ◽

pp. 347-358 ◽

Cited By ~ 4

Author(s):

Sandro Radovanovic ◽

Milan Radojicic ◽

Gordana Savic

Keyword(s):

Linear Regression ◽

High Reliability ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

Efficiency Frontier ◽

Function Form ◽

Vector Machines ◽

Challenging Tasks ◽

Dea Analysis

In sports, calculating efficiency is considered one of the most challenging tasks. In this paper, DEA is used to evaluate the efficiency of NBA players based on multiple inputs and multiple outputs. The efficiency is evaluated for 26 NBA players at the guard position based on existing data. However, generating the efficiency for a new player would require re-running the DEA analysis. Therefore, machine learning algorithms are applied to predict the efficiency of a new player. The DEA results are incorporated as an input for the learning algorithms, thereby defining an efficiency frontier function form with high reliability. In this paper, linear regression, neural networks, and support vector machines are used to predict an efficiency frontier. The results show that neural networks can predict the efficiency with an error of less than 1%, and linear regression with an error of less than 2%.

Research on Parallel Support Vector Machine Based on Spark Big Data Platform

Scientific Programming ◽

10.1155/2021/7998417 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Yao Huimin

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Big Data ◽

Support Vector Machines ◽

Cross Validation ◽

Machine Learning Algorithms ◽

Support Vector ◽

Lambda Architecture ◽

Vector Machines ◽

Data Platform

With the development of cloud computing and distributed cluster technology, the concept of big data has been expanded and extended in terms of capacity and value, and machine learning technology has also received unprecedented attention in recent years. Traditional machine learning algorithms cannot solve the problem of effective parallelization, so a parallelized support vector machine based on the Spark big data platform is proposed. Firstly, the big data platform is designed with the Lambda architecture, which is divided into three layers: Batch Layer, Serving Layer, and Speed Layer. Secondly, in order to improve the training efficiency of support vector machines on large-scale data, when merging two support vector machines, the “special points” other than support vectors are considered, that is, the non-support vectors in one subset that violate the training results of the other subset, and a cross-validation merging algorithm is proposed. Then, a parallelized support vector machine based on cross-validation is proposed, and the parallelization process of the support vector machine is realized on the Spark platform. Finally, experiments on different datasets verify the effectiveness and stability of the proposed method. Experimental results show that the proposed parallelized support vector machine has outstanding performance in speed-up ratio, training time, and prediction accuracy.
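
The merging step described in this abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes linear models given as a hypothetical `(w, b)` pair, labels in {-1, +1}, and it interprets "violating the training results of the other subset" as a margin violation `y * f(x) < 1`. The function names are likewise invented for illustration.

```python
from typing import List, Sequence, Set, Tuple

Point = Tuple[Sequence[float], int]      # (feature vector, label in {-1, +1})
Model = Tuple[Sequence[float], float]    # hypothetical linear model (w, b)


def decision_value(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Linear decision function f(x) = <w, x> + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def special_points(points: List[Point], support_idx: Set[int],
                   other_model: Model) -> List[Point]:
    """Non-support vectors of one subset that violate the other subset's model.

    Assumption: "violate" means falling inside or on the wrong side of the
    other model's margin, i.e. y * f(x) < 1.
    """
    w, b = other_model
    return [
        (x, y)
        for i, (x, y) in enumerate(points)
        if i not in support_idx and y * decision_value(w, b, x) < 1.0
    ]


def merged_training_set(points_a: List[Point], sv_a: Set[int], model_a: Model,
                        points_b: List[Point], sv_b: Set[int], model_b: Model
                        ) -> List[Point]:
    """Support vectors of both subsets plus the special points of each side."""
    support = [points_a[i] for i in sorted(sv_a)] + [points_b[i] for i in sorted(sv_b)]
    extra = special_points(points_a, sv_a, model_b) + special_points(points_b, sv_b, model_a)
    return support + extra
```

In a cascade-style parallel scheme, the merged set would then be used to retrain a single model; keeping the special points is meant to avoid discarding samples that the other partition's model handles badly.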

Comparative Performance of Machine Learning Algorithms for Cryptocurrency Forecasting

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v11.i3.pp1121-1128 ◽

2018 ◽

Vol 11 (3) ◽

pp. 1121 ◽

Cited By ~ 4

Author(s):

Nor Azizah Hitam ◽

Amelia Ritahani Ismail

Keyword(s):

Machine Learning ◽

Time Series Data ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Series Data ◽

Support Vector ◽

Small Range ◽

Accuracy Rate ◽

Comparative Performance ◽

Vector Machines

Machine learning is a part of artificial intelligence that can make forecasts based on previous experience. Methods have been proposed to construct models, including machine learning algorithms such as Neural Networks (NN), Support Vector Machines (SVM) and Deep Learning. This paper presents a comparative performance of machine learning algorithms for cryptocurrency forecasting. Specifically, this paper concentrates on forecasting of time series data. SVM has several advantages over the other models in forecasting, and previous research revealed that SVM provides a result that is close to the actual result while also improving the accuracy of the result itself. However, recent research has shown that, due to the small range of samples and data manipulation by inadequate evidence and professional analyzers, the overall status and accuracy rate of the forecasting need to be improved in further studies. Thus, further research on the accuracy rate of the forecasted price is needed.

Combination with Machine Learning Algorithms for the Classification in E-Bussiness

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.230-232.625 ◽

2011 ◽

Vol 230-232 ◽

pp. 625-628

Author(s):

Lei Shi ◽

Xin Ming Ma ◽

Xiao Hong Hu

Keyword(s):

Machine Learning ◽

Support Vector Machines ◽

Set Theory ◽

Rough Set ◽

Rough Set Theory ◽

Machine Learning Algorithms ◽

Classification Model ◽

Support Vector ◽

Mathematical Tool ◽

Vector Machines

E-business has grown rapidly in the last decade, and a massive amount of data on customer purchases, browsing patterns and preferences has been generated. Classification of electronic data plays a pivotal role in mining valuable information and has thus become one of the most important applications of E-business. Support Vector Machines are popular and powerful machine learning techniques, and they offer state-of-the-art performance. Rough set theory is a formal mathematical tool for dealing with incomplete or imprecise information, and one of its important applications is feature selection. In this paper, rough set theory and support vector machines are combined to construct a classification model that classifies E-business data effectively.

Linear Support Vector Machines for Prediction of Student Performance in School-Based Education

Mathematical Problems in Engineering ◽

10.1155/2020/4761468 ◽

2020 ◽

Vol 2020 ◽

pp. 1-7

Author(s):

Nalindren Naicker ◽

Timothy Adeliyi ◽

Jeanette Wing

Keyword(s):

Machine Learning ◽

Support Vector Machines ◽

Student Performance ◽

State Of The Art ◽

Learning Algorithms ◽

The State ◽

Machine Learning Algorithms ◽

Superior Performance ◽

Support Vector ◽

Vector Machines

Educational Data Mining (EDM) is a rich research field in computer science. Tools and techniques in EDM are useful for predicting student performance, which gives practitioners useful insights for developing appropriate intervention strategies to improve pass rates and increase retention. The performance of state-of-the-art machine learning classifiers depends heavily on the task at hand. Support vector machines have been used extensively in classification problems; however, the extant literature shows a gap in the application of linear support vector machines as a predictor of student performance. The aim of this study was to compare the performance of linear support vector machines with the performance of state-of-the-art classical machine learning algorithms in order to determine the algorithm that would improve prediction of student performance. In this quantitative study, an experimental research design was used. Experiments were set up using feature selection on a publicly available dataset of 1000 alpha-numeric student records. Linear support vector machines, benchmarked against ten categorical machine learning algorithms, showed superior performance in predicting student performance. The results of this research showed that features like race, gender, and lunch influence performance in mathematics, whilst access to lunch was the primary factor influencing reading and writing performance.

Assessment of Interventions in Fuel Management Zones Using Remote Sensing

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi9090533 ◽

2020 ◽

Vol 9 (9) ◽

pp. 533 ◽

Cited By ~ 2

Author(s):

Ricardo Afonso ◽

André Neves ◽

Carlos Viegas Damásio ◽

João Moura Pires ◽

Fernando Birra ◽

...

Keyword(s):

Satellite Images ◽

Vegetation Indices ◽

Nearest Neighbors ◽

Machine Learning Algorithms ◽

Support Vector ◽

Fuel Management ◽

K Nearest Neighbors ◽

Management Zones ◽

Vector Machines ◽

Sentinel 2

Every year, wildfires strike the Portuguese territory and are a concern for public entities and the population. To prevent wildfire progression and minimize its impact, Fuel Management Zones (FMZs) have been stipulated, by law, around buildings and settlements, along national roads, and around other infrastructures. FMZs require monitoring of the vegetation condition so that maintenance and cleaning of these zones can proceed promptly. To improve FMZ monitoring, this paper proposes the use of satellite images, such as those from Sentinel-1 and Sentinel-2, along with vegetation indices and extracted temporal characteristics (max, min, mean and standard deviation) associated with the vegetation within and outside the FMZs, to determine whether the zones were treated. These characteristics feed machine learning algorithms such as XGBoost, Support Vector Machines, K-nearest neighbors and Random Forest. The results show that it is possible to detect an intervention in an FMZ with high accuracy, namely with an F1-score ranging from 90% up to 94% and a Kappa ranging from 0.80 up to 0.89.
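
The temporal characteristics named in this abstract (max, min, mean and standard deviation of a vegetation index over time) can be sketched in a few lines. The abstract does not say which vegetation indices are used or whether the standard deviation is the population or sample form; the example below assumes NDVI and the population standard deviation, and the function names are hypothetical.

```python
from statistics import fmean, pstdev
from typing import Dict, Sequence


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red).

    Assumption: NDVI stands in for whichever indices the paper uses.
    Returns 0.0 when both reflectances are zero to avoid dividing by zero.
    """
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom


def temporal_features(series: Sequence[float]) -> Dict[str, float]:
    """Summarize one index time series as max, min, mean and std.

    Assumption: population standard deviation (pstdev) is used.
    """
    if not series:
        raise ValueError("series must contain at least one observation")
    return {
        "max": max(series),
        "min": min(series),
        "mean": fmean(series),
        "std": pstdev(series),
    }
```

One such feature dictionary per zone and per index, computed inside and outside the FMZ, would then form a row of the feature table passed to a classifier.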

Mobile Money Fraud Prediction—A Cross-Case Analysis on the Efficiency of Support Vector Machines, Gradient Boosted Decision Trees, and Naïve Bayes Algorithms

Information ◽

10.3390/info11080383 ◽

2020 ◽

Vol 11 (8) ◽

pp. 383

Author(s):

Francis Effirim Botchey ◽

Zhen Qin ◽

Kwesi Hughes-Lartey

Keyword(s):

Developing Countries ◽

Support Vector Machines ◽

Decision Tree ◽

Naive Bayes ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Support Vector ◽

Mobile Money ◽

Vector Machines ◽

Boosted Decision Tree

The onset of COVID-19 has re-emphasized the importance of FinTech, especially in developing countries, as the major powers of the world are already enjoying the advantages that come with the adoption of FinTech. Handling of physical cash has been established as a means of transmitting the novel coronavirus. Research has also established that being unbanked raises the potential of sinking into abject poverty. Over the years, developing countries have been piloting various forms of FinTech, but the one that has come to stay is Mobile Money Transactions (MMT). As mobile money transactions attempt to gain a foothold, they face several problems, the most important of which is mobile money fraud. This paper seeks to provide a solution to this problem by looking at machine learning algorithms based on support vector machines (kernel-based), gradient boosted decision trees (tree-based) and Naïve Bayes (probabilistic-based), taking into consideration the imbalanced nature of the dataset. Our experiments showed that the gradient boosted decision tree holds great potential for combating mobile money fraud, as it was able to produce near-perfect results.
