Study of SVM algorithm for Data Mining in Big Data

Data mining is the process of finding useful patterns in large amounts of data. A large number of data mining algorithms are now available. The Support Vector Machine (SVM) plays a crucial role among them, as it offers methods that obtain results efficiently and with a high level of quality. In this paper, we examine the role of the SVM algorithm in big data from the viewpoint of data mining tasks such as classification, clustering, prediction, forecasting and other applications. In the current scenario, the world is made up of "big data". The principal aim of this paper is to clearly understand the basis of SVM techniques in different areas. We have reviewed the number of research publications that have appeared in various reputed journals on data mining applications, and we also suggest a number of potential issues with SVM.

Performance Analysis of Data Mining Algorithms

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2019.8260 ◽

2019 ◽

Vol 16 (9) ◽

pp. 3849-3853

Author(s):

Dar Masroof Amin ◽

Atul Garg

Keyword(s):

Data Mining ◽

Big Data ◽

Future Trend ◽

Easy Access ◽

Support Vector ◽

Linear Discriminant ◽

Data Mining Algorithms ◽

Data Files ◽

Using Data ◽

Mining Algorithms

The globalisation of the Internet is creating an enormous amount of data on servers. The data created during the last two years alone is equivalent to the data created in all previous years. This exponential growth is due to easy access to devices based on the Internet of Things. This information has become a source for predictive analysis of future events. The versatile use of computing devices is creating data of a diverse nature, and analysts are predicting future trends using data from their respective domains. The technology used to analyse the data has become a bottleneck over time. The main reason is that data is being created at a far higher rate than the technology used to access it can handle. Various mining techniques are used to extract useful information. This research presents a detailed analysis of how data is used and perceived by various data mining algorithms. Mining algorithms such as Naïve Bayes, Support Vector Machines, Linear Discriminant Analysis, Artificial Neural Networks, C4.5, C5.0, and K-Nearest Neighbour are analysed. The input data used for these algorithms consists of big data files. This research mainly focuses on how existing data mining algorithms interact with big data files. The research was carried out on Twitter comments.

Air Temperature Prediction Using Different Datamining Approaches In Sulaymaniyah City In Iraq

10.24271/psr.21 ◽

2021 ◽

Vol 3 (2) ◽

pp. 1-9

Author(s):

Yosra Mohammed ◽

Sherko Murad ◽

Brzu Tahir

Keyword(s):

Climate Change ◽

Data Mining ◽

Support Vector Machine ◽

Air Temperature ◽

Significant Feature ◽

Support Vector ◽

Temperature Prediction ◽

Data Mining Algorithms ◽

Air Temperature Prediction ◽

Mining Algorithms

Climate change has had a historic impact at global and local levels over the past era. It is one of the greatest challenges worldwide for meteorological research. Air temperature estimation, in particular, has been regarded as a significant feature in studies of weather impact on the industrial, environmental, ecological, and agricultural sectors. Accurate air temperature prediction helps in assessing lifestyle conditions and plays a key role for government, industry, and the public in development activities. In this paper, we investigate the use of various data mining approaches, such as Support Vector Machine (SVM), Decision Tree (DT), and Naïve Bayes, for air temperature prediction in Sulaymaniyah City in Kurdistan, Iraq. The meteorological data was collected from the local Weather Forecast Department in the city for the period 2013 to 2018 inclusive. A dataset was built from the meteorological data and used to train the data mining algorithms. The proposed data mining algorithms were tested on the dataset to predict the air temperature, and their performance was compared using standard performance metrics. The support vector machine achieved promising performance among the algorithms used.

Assessment of land subsidence susceptibility in Semnan plain (Iran): a comparison of support vector machine and weights of evidence data mining algorithms

Natural Hazards ◽

10.1007/s11069-019-03785-z ◽

2019 ◽

Vol 99 (2) ◽

pp. 951-971 ◽

Cited By ~ 11

Author(s):

Majid Mohammady ◽

Hamid Reza Pourghasemi ◽

Mojtaba Amiri

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Land Subsidence ◽

Weights Of Evidence ◽

Support Vector ◽

Data Mining Algorithms ◽

Mining Algorithms

A Review Study on Data Mining Algorithms for Prediction Diseases

International Journal for Research in Engineering Application & Management ◽

10.35291/2454-9150.2020.0340 ◽

2020 ◽

pp. 504-507

Keyword(s):

Neural Network ◽

Data Mining ◽

Support Vector Machine ◽

Support Vector ◽

Healthcare Industry ◽

Network Support ◽

Data Mining Algorithms ◽

Review Study ◽

Using Data ◽

Mining Algorithms

The healthcare industry collects a massive volume of healthcare data that can be turned into useful information. In everyday life, several factors affect human diseases. Hospitals produce large amounts of information related to patients. This paper describes various data mining algorithms, such as neural networks, support vector machines, KNN, and decision trees, and provides a brief overview of the existing work. The major advantage of using data mining is its ability to identify structures in the data.

Osteoporosis Risk Prediction Using Data Mining Algorithms

Journal of Community Health Research ◽

10.18502/jchr.v9i2.3401 ◽

2020 ◽

Author(s):

Efat Jabarpour ◽

Amin Abedini ◽

Abbasali Keshtkar

Keyword(s):

Data Mining ◽

Personal Information ◽

Disease Diagnosis ◽

Support Vector ◽

Data Mining Algorithms ◽

Industry Standard ◽

Disease Information ◽

Increased Risk ◽

Using Data ◽

Mining Algorithms

Introduction: Osteoporosis is a disease that reduces bone density and degrades the quality of bone microstructure, leading to an increased risk of fractures. It is one of the major causes of disability and death in elderly people. The current study aims to determine the factors influencing the incidence of osteoporosis and to provide a predictive model for diagnosing the disease, in order to increase diagnostic speed and reduce diagnostic costs. Methods: Individuals' data, including personal information, lifestyle, and disease information, were reviewed. A new model is presented based on the Cross-Industry Standard Process (CRISP) methodology. Support Vector Machine (SVM) and Bayes methods (Tree Augmented Naïve Bayes (TAN)) were used, with Clementine12 as the data mining tool. Results: Several features were found to affect this disease. Rules were extracted that can be used as a pattern for predicting patients' status. Classification precision was calculated to be 88.39% for SVM and 91.29% for TAN; the precision of TAN is higher than that of the other methods. Conclusion: The most influential factors concerning osteoporosis were identified and can be used, for a new sample with defined characteristics, to predict the possibility of osteoporosis in a person.
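The abstract above reports "classification precision" without defining it. As a minimal sketch, assuming the common definition of precision as true positives divided by all predicted positives (the study may instead mean overall accuracy), both quantities can be computed as follows. The function names and the toy labels are hypothetical and are not taken from the study.

```python
def precision(y_true, y_pred, positive=1):
    """Fraction of predicted positives that are truly positive: TP / (TP + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    if tp + fp == 0:
        return 0.0  # no positive predictions at all
    return tp / (tp + fp)


def accuracy(y_true, y_pred):
    """Fraction of all samples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy labels for illustration only (1 = disease present, 0 = absent).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

print(f"precision = {precision(y_true, y_pred):.2%}")
print(f"accuracy  = {accuracy(y_true, y_pred):.2%}")
```

Reporting both values side by side makes it clear which one a headline figure such as 88.39% refers to.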

Applying data mining algorithms to real estate appraisals: a comparative study

International Journal of Housing Markets and Analysis ◽

10.1108/ijhma-07-2020-0080 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Thiago Cesar de Oliveira ◽

Lúcio de Medeiros ◽

Daniel Henrique Marco Detzel

Keyword(s):

Data Mining ◽

Real Estate ◽

Support Vector ◽

Predictive Capacity ◽

Content Type ◽

Data Mining Algorithms ◽

Wide Range ◽

Very Large Databases ◽

Mining Algorithms ◽

Statistical Results

Purpose: Real estate appraisals are becoming an increasingly important means of backing up financial operations based on the values of these kinds of assets. However, in very large databases, predictive capacity is reduced when traditional methods, such as multiple linear regression (MLR), are used. This paper aims to determine whether in these cases the application of data mining algorithms can achieve superior statistical results. First, real estate appraisal databases from five towns and cities in the State of Paraná, Brazil, were obtained from Caixa Econômica Federal bank. Design/methodology/approach: After initial validations, additional databases were generated with real, transformed and nominal values, in clean and raw data. Each was then processed with a wide range of data mining algorithms (multilayer perceptron, support vector regression, K-star, M5Rules and random forest), either isolated or combined (regression by discretization – logistic, bagging and stacking), using 10-fold cross-validation in Weka software. Findings: The results showed more varied incremental statistical results with the use of algorithms than those obtained by MLR, especially when combined algorithms were used. The largest increments were obtained in databases with a large amount of data and in those where minor initial data cleaning was carried out. The paper also conducts a further analysis, including an algorithmic ranking based on the number of significant results obtained. Originality/value: The authors did not find similar studies or research studies conducted in Brazil.
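The 10-fold cross-validation mentioned in the abstract above can be illustrated with a minimal Python sketch. It does not reproduce the Weka procedure used in the study: the single-predictor least-squares model, the mean absolute error metric, the synthetic area/price data and all function names are assumptions chosen only to show how the folds are formed and evaluated.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices 0..n_samples-1 and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b


def cross_val_mae(xs, ys, k=10, seed=0):
    """Average held-out mean absolute error over k folds (needs len(xs) >= k)."""
    fold_errors = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Fit on the other k-1 folds, evaluate on the held-out fold only.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        mae = sum(abs(ys[i] - (a + b * xs[i])) for i in held_out) / len(held_out)
        fold_errors.append(mae)
    return sum(fold_errors) / len(fold_errors)


# Synthetic data for illustration only: price as a noisy linear function of area.
rng = random.Random(42)
areas = [rng.uniform(40, 200) for _ in range(100)]
prices = [1000 + 30 * a + rng.gauss(0, 50) for a in areas]

print("10-fold cross-validated MAE:", round(cross_val_mae(areas, prices), 2))
```

Because every sample is held out exactly once, the averaged error estimates how the model behaves on unseen appraisals rather than on the data it was fitted to.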

Big Data Mining Algorithms for Predicting Dynamic Product Price by Online Analysis

Advances in Intelligent Systems and Computing - Computational Intelligence in Data Mining ◽

10.1007/978-981-13-8676-3_59 ◽

2019 ◽

pp. 701-708

Author(s):

Manjushree Nayak ◽

Bhavana Narain

Keyword(s):

Data Mining ◽

Big Data ◽

Product Price ◽

Online Analysis ◽

Big Data Mining ◽

Data Mining Algorithms ◽

Mining Algorithms

Acoustic Signature Based Weld Quality Monitoring for SMAW Process Using Data Mining Algorithms

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.813-814.1104 ◽

2015 ◽

Vol 813-814 ◽

pp. 1104-1113 ◽

Cited By ~ 5

Author(s):

A. Sumesh ◽

Dinu Thomas Thekkuden ◽

Binoy B. Nair ◽

K. Rameshkumar ◽

K. Mohandas

Keyword(s):

Neural Network ◽

Data Mining ◽

Welding Process ◽

Machine Learning Algorithms ◽

Steel Plates ◽

Support Vector ◽

Welding Parameters ◽

Process Data ◽

Data Mining Algorithms ◽

Mining Algorithms

The quality of a weld depends on the welding parameters and the environmental conditions to which it is exposed. Improper selection of welding process parameters is one of the main reasons for the occurrence of weld defects. In this work, arc sound signals are captured during the welding of carbon steel plates. Statistical features of the sound signals are extracted during the welding process. Data mining algorithms such as Naive Bayes, Support Vector Machines and Neural Networks were used to classify the weld conditions according to the features of the sound signal. Two weld conditions were considered in this study: good weld, and weld with defects (lack of fusion and burn-through). The classification efficiencies of the machine learning algorithms were compared. The neural network was found to produce better classification efficiency than the other algorithms considered in this study.

Parallel Primitives for Vendor-Agnostic Implementation of Big Data Mining Algorithms

2018 32nd International Conference on Advanced Information Networking and Applications Workshops (WAINA) ◽

10.1109/waina.2018.00118 ◽

2018 ◽

Author(s):

Cesare Bandirali ◽

Stefano Lodi ◽

Gianluca Moro ◽

Andrea Pagliarani ◽

Claudio Sartori

Keyword(s):

Data Mining ◽

Big Data ◽

Big Data Mining ◽

Data Mining Algorithms ◽

Mining Algorithms

An Effective K-Means Clustering Based SVM Algorithm

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.333-335.1344 ◽

2013 ◽

Vol 333-335 ◽

pp. 1344-1348

Author(s):

Yu Kai Yao ◽

Yang Liu ◽

Zhao Li ◽

Xiao Yun Chen

Keyword(s):

Support Vector ◽

SVM Classifier ◽

Small Subset ◽

Training Set ◽

Data Mining Algorithms ◽

SVM Algorithm ◽

SVM Model ◽

Separating Hyperplane ◽

Regression Problems ◽

Mining Algorithms

Support Vector Machine (SVM) is one of the most popular and effective data mining algorithms. It can be used to solve classification or regression problems and has attracted much attention in recent years. SVM finds the optimal separating hyperplane between classes, which gives it outstanding generalization ability. Usually, all the labeled records are used as the training set. However, the optimal separating hyperplane depends only on a few crucial samples (Support Vectors, SVs), so we need not train the SVM model on the whole training set. In this paper, a novel SVM model based on K-means clustering is presented, in which only a small subset of the original training set is selected to constitute the final training set, and the SVM classifier is built by training on these selected samples. This greatly decreases the scale of the training set and effectively reduces the training and prediction cost of SVM, while preserving its generalization performance.
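The abstract above does not say how the small subset is selected, so the following is only a sketch of the general idea under stated assumptions, not the authors' method. It assumes that K-means is run separately on each class, that the real sample nearest to each centroid is kept, and that a linear SVM is then trained on those samples with a Pegasos-style sub-gradient method standing in for a full SVM solver. All function names, parameters and the synthetic two-blob data are hypothetical.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns the final centroids."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Recompute each centroid; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(c) / len(m) for c in zip(*m)) if m else centroids[j]
            for j, m in enumerate(clusters)
        ]
    return centroids


def select_subset(X, y, k_per_class=5, seed=0):
    """Assumed selection rule: per class, keep the sample nearest each centroid."""
    X_small, y_small = [], []
    for label in sorted(set(y)):
        members = [x for x, t in zip(X, y) if t == label]
        for c in kmeans(members, min(k_per_class, len(members)), seed=seed):
            X_small.append(min(members, key=lambda p: sq_dist(p, c)))
            y_small.append(label)
    return X_small, y_small


def train_linear_svm(X, y, lam=0.1, epochs=200, seed=0):
    """Pegasos-style sub-gradient descent on the hinge loss; labels are -1/+1.
    A constant feature is appended so the (regularized) bias is part of w."""
    rng = random.Random(seed)
    Xa = [tuple(x) + (1.0,) for x in X]
    w = [0.0] * len(Xa[0])
    order = list(range(len(Xa)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, Xa[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (regularization)
            if margin < 1.0:  # sample violates the margin: move towards it
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, Xa[i])]
    return w


def predict(w, x):
    score = sum(wj * xj for wj, xj in zip(w, tuple(x) + (1.0,)))
    return 1 if score >= 0 else -1


# Synthetic, well-separated two-class data for illustration only.
rng = random.Random(1)
X = [(rng.gauss(2, 0.4), rng.gauss(2, 0.4)) for _ in range(100)]
X += [(rng.gauss(-2, 0.4), rng.gauss(-2, 0.4)) for _ in range(100)]
y = [1] * 100 + [-1] * 100

X_small, y_small = select_subset(X, y, k_per_class=5)
w = train_linear_svm(X_small, y_small)
acc = sum(predict(w, x) == t for x, t in zip(X, y)) / len(X)
print(f"trained on {len(X_small)} of {len(X)} samples, accuracy on all = {acc:.2f}")
```

On data this cleanly separated, a handful of representatives is enough to place the hyperplane. On harder data the selection rule matters far more, since samples near the class boundary are the ones that become support vectors.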
