Study of SVM algorithm for Data Mining in Big Data

2021 ◽  
Vol 58 (1) ◽  
pp. 4296-4301
Author(s):  
V. Nanda Kumar, Vinoth N. A. S., Mohamed Sanjarkhan

Data mining is a process that finds useful patterns in large amounts of data. A great many data mining algorithms are available today. Support Vector Machine (SVM) plays a crucial role among them, as it offers techniques that produce results efficiently and with a high level of quality. In this paper, we examine the role of the SVM algorithm in big data from the viewpoint of data mining tasks such as classification, clustering, prediction, forecasting and other applications. Today's world is made up of "big data". The main aim of this paper is to clearly understand the basis of SVM techniques in different areas. We have assessed the number of research publications that have appeared in various reputed journals on data mining applications and also suggest a number of potential issues with SVM.

2019 ◽  
Vol 16 (9) ◽  
pp. 3849-3853
Author(s):  
Dar Masroof Amin ◽  
Atul Garg

The globalisation of the Internet is creating an enormous amount of data on servers. The data created during the last two years alone is equivalent to the data created in all the preceding years. This exponential growth is due to easy access to devices based on the Internet of Things. This information has become a source of predictive analysis for future events. The versatile use of computing devices is creating data of a diverse nature, and analysts are predicting future trends using data from their respective domains. The technology used to analyse the data has become a bottleneck over time, mainly because data is being created much faster than the technology used to access it can handle. Various mining techniques are used to extract useful information. This research presents a detailed analysis of how data is used and perceived by various data mining algorithms. Mining algorithms such as Naïve Bayes, Support Vector Machines, Linear Discriminant Analysis, Artificial Neural Networks, C4.5, C5.0 and K-Nearest Neighbour are analysed. The input data for these algorithms consists of big data files. The research focuses mainly on how existing data mining algorithms interact with big data files, and was carried out on Twitter comments.


2021 ◽  
Vol 3 (2) ◽  
pp. 1-9
Author(s):  
Yosra Mohammed ◽  
Sherko Murad ◽  
Brzu Tahir

Climate change has had a historic impact at global and local levels over the past era, and it is one of the most challenging issues in meteorological research worldwide. Air temperature estimation, in particular, is regarded as a significant factor in studies of weather impact on the industrial, environmental, ecological, and agricultural sectors. Accurately predicting air temperature helps to assess living conditions and plays a key role for government, industry, and the public in development activities. In this paper, we investigate the use of various data mining approaches such as Support Vector Machine (SVM), Decision Tree (DT), and Naïve Bayes for air temperature prediction in Sulaymaniyah City in Kurdistan, Iraq. The meteorological data was collected from the local Weather Forecast Department in the city for the years 2013 to 2018 inclusive. A dataset was developed from the meteorological data and used to train the data mining algorithms. The algorithms were then tested on the dataset to predict the air temperature, and their performance was compared using standard performance metrics. Support Vector Machine achieved promising performance among the algorithms used.


The healthcare industry collects a massive volume of healthcare data that can be turned into useful information. In everyday life, several factors affect human diseases, and hospitals produce large amounts of information about their patients. This paper describes various data mining algorithms, such as neural networks, support vector machines, KNN and decision trees, and provides a brief overview of the existing work. The major advantage of using data mining is its ability to identify structures in the data.


Author(s):  
Efat Jabarpour ◽  
Amin Abedini ◽  
Abbasali Keshtkar

Introduction: Osteoporosis is a disease that reduces bone density and degrades the quality of bone microstructure, leading to an increased risk of fractures. It is one of the major causes of disability and death in elderly people. The current study aims to determine the factors influencing the incidence of osteoporosis and to provide a predictive model for diagnosing the disease, in order to increase diagnostic speed and reduce diagnostic costs. Methods: Individuals' data, including personal information, lifestyle, and disease information, were reviewed. A new model is presented based on the Cross-Industry Standard Process (CRISP) methodology. In addition, Support Vector Machine (SVM), a Bayesian method (Tree Augmented Naïve Bayes, TAN) and Clementine12 were used as data mining tools. Results: Some features were found to affect the disease. Rules were extracted that can be used as a pattern for predicting patients' status. Classification precision was calculated to be 88.39% for SVM and 91.29% for TAN, so the precision of TAN is higher than that of the other methods. Conclusion: The most influential factors in osteoporosis were identified and can be used to predict the likelihood of osteoporosis for a new sample with defined characteristics.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Thiago Cesar de Oliveira ◽  
Lúcio de Medeiros ◽  
Daniel Henrique Marco Detzel

Purpose: Real estate appraisals are becoming an increasingly important means of backing up financial operations based on the values of these kinds of assets. However, in very large databases, predictive capacity is reduced when traditional methods, such as multiple linear regression (MLR), are used. This paper aims to determine whether, in these cases, the application of data mining algorithms can achieve superior statistical results. First, real estate appraisal databases from five towns and cities in the State of Paraná, Brazil, were obtained from the Caixa Econômica Federal bank. Design/methodology/approach: After initial validations, additional databases were generated with real, transformed and nominal values, in clean and raw data. A wide range of data mining algorithms was applied to each (multilayer perceptron, support vector regression, K-star, M5Rules and random forest), either isolated or combined (regression by discretization – logistic, bagging and stacking), using 10-fold cross-validation in Weka software. Findings: The results showed more varied incremental statistical results with the use of algorithms than those obtained by MLR, especially when combined algorithms were used. The largest increments were obtained in databases with a large amount of data and in those where only minor initial data cleaning was carried out. The paper also conducts a further analysis, including an algorithmic ranking based on the number of significant results obtained. Originality/value: The authors did not find similar studies conducted in Brazil.
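The 10-fold cross-validation mentioned in this abstract can be sketched in a few lines. The following is only an illustration in plain Python, not the authors' Weka setup: the helper names (`k_fold_indices`, `cross_val_mae`, `fit_simple_linear`, `fit_mean`), the mean-absolute-error metric, and the synthetic area/price data are all assumptions made for the example.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

def cross_val_mae(xs, ys, fit, k=10):
    """Average the mean absolute error of a model over k held-out folds."""
    errors = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        predict = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        fold_err = sum(abs(predict(xs[i]) - ys[i]) for i in test_idx) / len(test_idx)
        errors.append(fold_err)
    return sum(errors) / len(errors)

def fit_simple_linear(xs, ys):
    """Least squares with a single predictor (a stand-in for MLR)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def fit_mean(xs, ys):
    """Baseline that always predicts the mean training price."""
    mean = sum(ys) / len(ys)
    return lambda x: mean

# Synthetic toy data (hypothetical): floor area and a perfectly linear price.
areas = [float(a) for a in range(40, 140, 2)]
prices = [1000.0 + 30.0 * a for a in areas]

mae_linear = cross_val_mae(areas, prices, fit_simple_linear)
mae_mean = cross_val_mae(areas, prices, fit_mean)
print(mae_linear < mae_mean)
```

Any other regressor can be compared the same way by passing a different `fit` function that returns a prediction callable.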


2015 ◽  
Vol 813-814 ◽  
pp. 1104-1113
Author(s):  
A. Sumesh ◽  
Dinu Thomas Thekkuden ◽  
Binoy B. Nair ◽  
K. Rameshkumar ◽  
K. Mohandas

The quality of a weld depends on the welding parameters and the environmental conditions to which it is exposed. Improper selection of welding process parameters is one of the important reasons for the occurrence of weld defects. In this work, arc sound signals are captured during the welding of carbon steel plates. Statistical features of the sound signals are extracted during the welding process. Data mining algorithms such as Naive Bayes, Support Vector Machines and Neural Networks were used to classify the weld conditions according to the features of the sound signal. Two weld conditions were considered in this study: good weld and weld with defects, namely lack of fusion and burn-through. The classification efficiencies of the machine learning algorithms were compared. The neural network was found to produce better classification efficiency than the other algorithms considered in this study.
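As a rough illustration of what extracting statistical features from a sound signal can look like, here is a minimal Python sketch. The abstract does not say which features were used, so the particular set below (mean, standard deviation, RMS, skewness, kurtosis, peak) and the function name `statistical_features` are assumptions, and the input is a toy sequence rather than real arc sound data.

```python
import math

def statistical_features(signal):
    """Summarise one signal window as a small dictionary of statistics."""
    n = len(signal)
    mean = sum(signal) / n
    var = sum((x - mean) ** 2 for x in signal) / n  # population variance
    std = math.sqrt(var)
    rms = math.sqrt(sum(x * x for x in signal) / n)
    # Standardised third and fourth moments; defined as 0 for a constant signal.
    skew = sum((x - mean) ** 3 for x in signal) / (n * std ** 3) if std else 0.0
    kurt = sum((x - mean) ** 4 for x in signal) / (n * var ** 2) if std else 0.0
    return {
        "mean": mean,
        "std": std,
        "rms": rms,
        "skewness": skew,
        "kurtosis": kurt,
        "peak": max(abs(x) for x in signal),
    }

# Toy window (hypothetical samples, not measured data).
features = statistical_features([1.0, -1.0, 1.0, -1.0])
print(features)
```

A feature vector of this kind, computed per window, is the sort of input a Naive Bayes, SVM or neural network classifier would then receive.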


2013 ◽  
Vol 333-335 ◽  
pp. 1344-1348
Author(s):  
Yu Kai Yao ◽  
Yang Liu ◽  
Zhao Li ◽  
Xiao Yun Chen

Support Vector Machine (SVM) is one of the most popular and effective data mining algorithms. It can be used to solve classification or regression problems and has attracted much attention in recent years. SVM finds the optimal separating hyperplane between classes, which gives it outstanding generalization ability. Usually all the labeled records are used as the training set. However, since the optimal separating hyperplane depends only on a few crucial samples (Support Vectors, SVs), the SVM model need not be trained on the whole training set. In this paper a novel SVM model based on K-means clustering is presented, in which only a small subset of the original training set is selected to constitute the final training set, and the SVM classifier is built by training on these selected samples. This greatly decreases the scale of the training set and effectively reduces the training and prediction cost of SVM, while guaranteeing its generalization performance.
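The idea of shrinking the training set with K-means before fitting an SVM can be sketched as follows. The abstract does not give the paper's selection rule, so this example assumes one simple variant: cluster each class separately and keep the real sample closest to each centroid. The function names, the deterministic centroid initialisation and the toy data are all hypothetical, and the SVM training step itself is left out.

```python
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; centroids start at evenly spaced input points."""
    step = max(1, len(points) // k)
    centroids = [points[i * step] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids

def reduce_training_set(samples, labels, k_per_class):
    """Keep, for each class, the real sample closest to each k-means centroid."""
    reduced = []
    for cls in sorted(set(labels)):
        members = [s for s, y in zip(samples, labels) if y == cls]
        k = min(k_per_class, len(members))
        for centroid in kmeans(members, k):
            rep = min(members, key=lambda s: math.dist(s, centroid))
            if (rep, cls) not in reduced:  # avoid duplicate representatives
                reduced.append((rep, cls))
    return reduced

# Toy 2-D data (hypothetical): two classes, each made of two small groups.
samples = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (1.0, 1.1), (1.2, 0.9), (0.9, 1.0),
           (3.0, 3.0), (3.1, 2.9), (2.9, 3.2), (4.0, 4.1), (4.2, 3.9), (3.9, 4.0)]
labels = [0] * 6 + [1] * 6

reduced = reduce_training_set(samples, labels, k_per_class=2)
print(len(samples), "->", len(reduced))
```

The `reduced` list of `(sample, label)` pairs would then be passed to any SVM trainer in place of the full training set; how well generalization is preserved depends on the selection rule and on the number of clusters chosen.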

