Combination with Machine Learning Algorithms for the Classification in E-Bussiness

E-business has grown rapidly over the last decade, generating massive amounts of data on customer purchases, browsing patterns, and preferences. Classifying this electronic data plays a pivotal role in mining valuable information and has thus become one of the most important applications in E-business. Support vector machines are popular and powerful machine learning techniques that offer state-of-the-art performance. Rough set theory is a formal mathematical tool for dealing with incomplete or imprecise information, and one of its important applications is feature selection. In this paper, rough set theory and support vector machines are combined to construct a model that classifies E-business data effectively.
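The feature-selection role of rough set theory mentioned above can be illustrated with a minimal sketch. It assumes a greedy, dependency-based reduct search (often called QuickReduct) over discrete attributes; the abstract does not say which reduction algorithm the authors used, and the toy customer table, attribute meanings, and function names below are hypothetical.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def dependency(rows, labels, attrs):
    """Rough-set dependency degree: share of rows in the positive region,
    i.e. rows whose equivalence class carries a single class label."""
    positive = 0
    for block in partition(rows, attrs):
        if len({labels[i] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)

def quick_reduct(rows, labels):
    """Greedily add the attribute that raises the dependency degree most,
    until the subset is as informative as the full attribute set."""
    n_attrs = len(rows[0])
    full = dependency(rows, labels, list(range(n_attrs)))
    selected = []
    while dependency(rows, labels, selected) < full:
        remaining = [a for a in range(n_attrs) if a not in selected]
        best = max(remaining,
                   key=lambda a: dependency(rows, labels, selected + [a]))
        selected.append(best)
    return selected

# Hypothetical toy data: (device, returning customer, basket size) -> outcome
rows = [
    ("mobile", "yes", "small"),
    ("mobile", "no", "small"),
    ("desktop", "yes", "large"),
    ("desktop", "no", "large"),
    ("mobile", "yes", "large"),
    ("desktop", "no", "small"),
]
labels = ["buy", "skip", "buy", "skip", "buy", "skip"]

reduct = quick_reduct(rows, labels)  # indices of the attributes to keep
```

In a hybrid model of the kind described, only the columns in `reduct` would then be passed to the support vector machine, for example scikit-learn's `SVC` after encoding the categorical values.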

A Hybrid Detecting Fraudulent Financial Statements Model Using Rough Set Theory and Support Vector Machines

Cybernetics & Systems ◽

10.1080/01969722.2016.1158553 ◽

2016 ◽

Vol 47 (4) ◽

pp. 261-276 ◽

Cited By ~ 9

Author(s):

Ching-Chiang Yeh ◽

Der-Jang Chi ◽

Tzu-Yu Lin ◽

Sheng-Hsiung Chiu

Keyword(s):

Support Vector Machines ◽

Set Theory ◽

Rough Set ◽

Rough Set Theory ◽

Financial Statements ◽

Support Vector ◽

Vector Machines ◽

Fraudulent Financial Statements

Boosting Accuracy of Classical Machine Learning Antispam Classifiers in Real Scenarios by Applying Rough Set Theory

Scientific Programming ◽

10.1155/2016/5945192 ◽

2016 ◽

Vol 2016 ◽

pp. 1-10 ◽

Cited By ~ 2

Author(s):

N. Pérez-Díaz ◽

D. Ruano-Ordás ◽

F. Fdez-Riverola ◽

J. R. Méndez

Keyword(s):

Machine Learning ◽

Set Theory ◽

Rough Set ◽

Rough Set Theory ◽

Naive Bayes ◽

Naïve Bayes ◽

Support Vector ◽

Wide Range ◽

Vector Machines ◽

Bayes Algorithm

Nowadays, spam deliveries are a major obstacle to benefiting from the wide range of Internet-based communication forms. Despite the existence of several well-known intelligent techniques for fighting spam, only some specific implementations of the Naïve Bayes algorithm are ultimately used in real environments, for performance reasons. Because some of these algorithms suffer from a large number of false positive errors, in this work we propose a rough set postprocessing approach able to significantly improve their accuracy. To demonstrate the advantages of the proposed method, we carried out a straightforward study based on a publicly available standard corpus (SpamAssassin), comparing the performance of previously successful, well-known antispam classifiers (i.e., Support Vector Machines, AdaBoost, Flexible Bayes, and Naïve Bayes) with and without our technique. The results clearly show the suitability of our rough set postprocessing approach for increasing the accuracy of these antispam classifiers when working in real scenarios.

Emotion Recognition from Text based on the Rough Set Theory and the Support Vector Machines

2007 International Conference on Natural Language Processing and Knowledge Engineering ◽

10.1109/nlpke.2007.4368008 ◽

2007 ◽

Cited By ~ 7

Author(s):

Zhi Teng ◽

Fuji Ren ◽

Shingo Kuroiwa

Keyword(s):

Support Vector Machines ◽

Emotion Recognition ◽

Set Theory ◽

Rough Set ◽

Rough Set Theory ◽

Support Vector ◽

Vector Machines

Landslide susceptibility mapping based on rough set theory and support vector machines: A case of the Three Gorges area, China

Geomorphology ◽

10.1016/j.geomorph.2013.08.013 ◽

2014 ◽

Vol 204 ◽

pp. 287-301 ◽

Cited By ~ 114

Author(s):

Ling Peng ◽

Ruiqing Niu ◽

Bo Huang ◽

Xueling Wu ◽

Yannan Zhao ◽

...

Keyword(s):

Support Vector Machines ◽

Set Theory ◽

Rough Set ◽

Landslide Susceptibility ◽

Rough Set Theory ◽

Landslide Susceptibility Mapping ◽

Support Vector ◽

Three Gorges ◽

The Three Gorges ◽

Vector Machines

A Hybrid Classifier Based on Rough Set Theory and Support Vector Machines

Fuzzy Systems and Knowledge Discovery - Lecture Notes in Computer Science ◽

10.1007/11539506_162 ◽

2005 ◽

pp. 1287-1296 ◽

Cited By ~ 10

Author(s):

Gexiang Zhang ◽

Zhexin Cao ◽

Yajun Gu

Keyword(s):

Support Vector Machines ◽

Set Theory ◽

Rough Set ◽

Rough Set Theory ◽

Support Vector ◽

Hybrid Classifier ◽

Vector Machines

Analyzing foreign exchange rates by rough set theory and directed acyclic graph support vector machines

Expert Systems with Applications ◽

10.1016/j.eswa.2010.02.006 ◽

2010 ◽

Vol 37 (8) ◽

pp. 5993-5998 ◽

Cited By ~ 7

Author(s):

Ping-Feng Pai ◽

Shi-Yu Chen ◽

Chao-Wei Huang ◽

Ya-Hsin Chang

Keyword(s):

Support Vector Machines ◽

Exchange Rates ◽

Set Theory ◽

Rough Set ◽

Directed Acyclic Graph ◽

Rough Set Theory ◽

Support Vector ◽

Foreign Exchange Rates ◽

Acyclic Graph ◽

Vector Machines

Research on Parallel Support Vector Machine Based on Spark Big Data Platform

Scientific Programming ◽

10.1155/2021/7998417 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Yao Huimin

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Big Data ◽

Support Vector Machines ◽

Cross Validation ◽

Machine Learning Algorithms ◽

Support Vector ◽

Lambda Architecture ◽

Vector Machines ◽

Data Platform

With the development of cloud computing and distributed cluster technology, the concept of big data has been expanded and extended in terms of capacity and value, and machine learning technology has also received unprecedented attention in recent years. Traditional machine learning algorithms cannot solve the problem of effective parallelization, so a parallelized support vector machine based on the Spark big data platform is proposed. First, the big data platform is designed with the Lambda architecture, which is divided into three layers: Batch Layer, Serving Layer, and Speed Layer. Second, to improve the training efficiency of support vector machines on large-scale data, the merging of two support vector machines also considers the “special points” other than support vectors, that is, the non-support vectors in one subset that violate the training results of the other subset, and a cross-validation merging algorithm is proposed. Then, a parallelized support vector machine based on cross-validation is proposed, and its parallelization process is realized on the Spark platform. Finally, experiments on different datasets verify the effectiveness and stability of the proposed method. Experimental results show that the proposed parallelized support vector machine has outstanding performance in speed-up ratio, training time, and prediction accuracy.
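The merging idea in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's algorithm: it assumes linear models given as `(w, b)`, treats a point as a support vector when `y * f(x) <= 1`, and reads "violates the training results of the other subset" as a margin violation `y * f(x) < 1` under the other model. The two models and the data points are hand-picked, hypothetical values rather than trained results.

```python
def decision(model, x):
    """Linear decision value f(x) = w . x + b for a model given as (w, b)."""
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def support_vectors(model, points, tol=1e-9):
    """Points on or inside the margin of the model: y * f(x) <= 1."""
    return [(x, y) for x, y in points if y * decision(model, x) <= 1 + tol]

def violators(own_model, other_model, points, tol=1e-9):
    """Non-support vectors of own_model that violate other_model's margin.
    These are the assumed 'special points' carried into the merge."""
    return [(x, y) for x, y in points
            if y * decision(own_model, x) > 1 + tol
            and y * decision(other_model, x) < 1 - tol]

def merge_training_set(model_a, subset_a, model_b, subset_b):
    """Training set for the merged SVM: both sets of support vectors
    plus the cross-checked violators from each subset."""
    merged = support_vectors(model_a, subset_a)
    merged += support_vectors(model_b, subset_b)
    merged += violators(model_a, model_b, subset_a)
    merged += violators(model_b, model_a, subset_b)
    return merged

# Hypothetical 1-D example: f_a(x) = x, f_b(x) = x - 2
model_a = ([1.0], 0.0)
model_b = ([1.0], -2.0)
subset_a = [((1.0,), 1), ((-1.0,), -1), ((2.5,), 1), ((5.0,), 1)]
subset_b = [((3.0,), 1), ((1.0,), -1), ((0.5,), -1), ((-4.0,), -1)]

merged = merge_training_set(model_a, subset_a, model_b, subset_b)
```

Here the point `(2.5, +1)` is not a support vector of model A but falls inside model B's margin, so it is kept, while `(5.0, +1)` satisfies both models and is dropped. A merged model would then be retrained on `merged` only, which is where the training-time saving would come from.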

Linear Support Vector Machines for Prediction of Student Performance in School-Based Education

Mathematical Problems in Engineering ◽

10.1155/2020/4761468 ◽

2020 ◽

Vol 2020 ◽

pp. 1-7

Author(s):

Nalindren Naicker ◽

Timothy Adeliyi ◽

Jeanette Wing

Keyword(s):

Machine Learning ◽

Support Vector Machines ◽

Student Performance ◽

State Of The Art ◽

Learning Algorithms ◽

The State ◽

Machine Learning Algorithms ◽

Superior Performance ◽

Support Vector ◽

Vector Machines

Educational Data Mining (EDM) is a rich research field in computer science. Tools and techniques in EDM are useful for predicting student performance, which gives practitioners insights for developing appropriate intervention strategies to improve pass rates and increase retention. The performance of state-of-the-art machine learning classifiers depends heavily on the task at hand. Support vector machines have been used extensively in classification problems; however, the extant literature shows a gap in the application of linear support vector machines as a predictor of student performance. The aim of this study was to compare the performance of linear support vector machines with that of state-of-the-art classical machine learning algorithms in order to determine the algorithm that would improve prediction of student performance. In this quantitative study, an experimental research design was used. Experiments were set up using feature selection on a publicly available dataset of 1000 alphanumeric student records. Linear support vector machines, benchmarked against ten categorical machine learning algorithms, showed superior performance in predicting student performance. The results showed that features such as race, gender, and lunch influence performance in mathematics, whilst access to lunch was the primary factor influencing reading and writing performance.

Twitter sentiment analysis for the estimation of voting intention in the 2017 Chilean elections

Intelligent Data Analysis ◽

10.3233/ida-194768 ◽

2020 ◽

Vol 24 (5) ◽

pp. 1141-1160

Author(s):

Tomás Alegre Sepúlveda ◽

Brian Keith Norambuena

Keyword(s):

Machine Learning ◽

Support Vector Machines ◽

Sentiment Analysis ◽

Classification Model ◽

Machine Learning Techniques ◽

Support Vector ◽

Traditional Methods ◽

Actual Result ◽

Learning Techniques ◽

Vector Machines

In this paper, we apply sentiment analysis methods in the context of the first round of the 2017 Chilean elections. The purpose of this work is to estimate the voting intention associated with each candidate in order to contrast this with the results from classical methods (e.g., polls and surveys). The data are collected from Twitter, because of its high usage in Chile and in the sentiment analysis literature. We obtained tweets associated with the three main candidates: Sebastián Piñera (SP), Alejandro Guillier (AG) and Beatriz Sánchez (BS). For each candidate, we estimated the voting intention and compared it to the traditional methods. To do this, we first acquired the data and labeled the tweets as positive or negative. Afterward, we built a model using machine learning techniques. The classification model had an accuracy of 76.45% using support vector machines, which yielded the best model for our case. Finally, we use a formula to estimate the voting intention from the number of positive and negative tweets for each candidate. For the last period, we obtained a voting intention of 35.84% for SP, compared to a range of 34–44% according to traditional polls and 36% in the actual elections. For AG we obtained an estimate of 37%, compared with a range of 15.40% to 30.00% for traditional polls and 20.27% in the elections. For BS we obtained an estimate of 27.77%, compared with the range of 8.50% to 11.00% given by traditional polls and an actual result of 22.70% in the elections. These results are promising, in some cases providing an estimate closer to reality than traditional polls. Some differences can be explained due to the fact that some candidates have been omitted, even though they held a significant number of votes.

LOCAL DESCRIPTOR MATCHING WITH SUPPORT VECTOR MACHINES

International Journal of Information Acquisition ◽

10.1142/s0219878910002051 ◽

2010 ◽

Vol 07 (01) ◽

pp. 59-80

Author(s):

D. CHENG ◽

S. Q. XIE ◽

E. HÄMMERLE

Keyword(s):

Machine Learning ◽

Support Vector Machines ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

Local Descriptor ◽

Local Descriptors ◽

Vector Machines ◽

Three Stages ◽

Image Transformations

Local descriptor matching is the most overlooked of the three stages of the local descriptor process, and this paper proposes a new method for matching local descriptors based on support vector machines. Experimental results show that the developed method is more robust for matching local descriptors under all image transformations considered. The method can be integrated with different local descriptor methods and with different machine learning algorithms, which shows that the approach is sufficiently robust and versatile.
