Analysis of rainfall classification over Tanah Laut district based on global climate indicators using support vector machine method

Abstract. The Support Vector Machine (SVM) classification method can be applied in many fields, including rainfall forecasting in meteorology and climatology. This study classifies rainfall in the Tanah Laut Regency to examine the relationship between global phenomena and rainfall and to evaluate the results of SVM classification. The analysis uses multiclass SVM with four rainfall categories: low, medium, high, and extreme. The SVM uses the RBF kernel, with the parameters Cost (C) and Gamma (γ) each tuned over the values 1, 5, 10, and 15. The datasets are formed by annual period, climatic condition, and season. The Spearman rank correlation test gives correlations between global phenomena and rainfall ranging from −0.1456 to 0.43144 across the entire dataset. With C = 10 and γ ≥ 5, the SVM classifier reached its highest accuracy of 100% on the training data. On the test data, the accuracy was still good: 78.00% for La Nina and 81.38% for the seasonal periods.
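
The Spearman rank correlation mentioned in the abstract can be illustrated with a minimal sketch. The function names and the sample values are hypothetical and are not taken from the study. The code ranks both series (averaging the ranks of tied values) and then computes the Pearson correlation of the ranks.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sx * sy)


# Hypothetical climate-index values and rainfall totals (illustrative only).
index_values = [0.5, -0.3, 1.2, 0.1, -0.8]
rainfall_mm = [210.0, 150.0, 320.0, 180.0, 90.0]
rho = spearman(index_values, rainfall_mm)
print(rho)
```

Because the measure depends only on ranks, any monotonic relationship between the two series gives a coefficient of magnitude 1, whether or not the relationship is linear.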

Detection Of Spam Comments On Instagram Using Complementary Naïve Bayes

IJCCS (Indonesian Journal of Computing and Cybernetics Systems) ◽

10.22146/ijccs.47046 ◽

2019 ◽

Vol 13 (3) ◽

pp. 263

Author(s):

Nur Azizul Haqimi ◽

Nur Rokhman ◽

Sigit Priyanta

Keyword(s):

Social Media ◽

Support Vector Machine ◽

Test Data ◽

Training Data ◽

Classification Method ◽

Support Vector ◽

Test Results ◽

Imbalanced Dataset ◽

Web Based ◽

F Measure

Instagram (IG) is a web-based and mobile social media application where users can share photos or videos using the available features. Uploaded photos or videos carry captions that describe them, and these posts can attract spam comments. Spam comments are comments that are not relevant to the caption or the photo. A problem in identifying spam is that non-spam comments far outnumber spam comments, which leads to an imbalanced dataset. Dataset imbalance can affect the performance of a classification method. This research focuses on applying the Complementary Naïve Bayes (CNB) method to imbalanced datasets for detecting Instagram spam comments. The study used TF-IDF weighting, with Support Vector Machine (SVM) as a comparison classifier. In tests with 2500 training samples and 100 test samples on the imbalanced dataset (25% spam and 75% non-spam), CNB achieved 92% accuracy, 86% precision, and 93% F-measure, whereas SVM achieved 87% accuracy, 79% precision, and 88% F-measure. In conclusion, the CNB method is more suitable for detecting spam comments on imbalanced datasets.
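
To see why class imbalance complicates evaluation, consider a minimal sketch that computes accuracy, precision, recall, and F-measure from confusion-matrix counts. The counts are hypothetical: they assume a 100-comment test set with the 25% spam and 75% non-spam split described above, and they are not results from the study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f_measure) for the positive class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return accuracy, precision, recall, f_measure


# A trivial classifier that labels every comment as non-spam.
baseline = binary_metrics(tp=0, fp=0, fn=25, tn=75)

# A hypothetical classifier that finds 20 of the 25 spam comments
# and raises 5 false alarms.
example = binary_metrics(tp=20, fp=5, fn=5, tn=70)

print(baseline)
print(example)
```

The all-non-spam baseline reaches 75% accuracy while detecting no spam at all, which is why precision, recall, and F-measure are reported alongside accuracy on imbalanced data.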

Efficient Data-Mining Algorithm for Predicting Heart Disease Based on an Angiographic Test

Malaysian Journal of Medical Sciences ◽

10.21315/mjms2021.28.5.12 ◽

2021 ◽

Vol 28 (5) ◽

pp. 118-129

Author(s):

Alabi Waheed Banjoko ◽

Kawthar Opeyemi Abdulazeez

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Heart Disease ◽

Cross Validation ◽

Classification Method ◽

Support Vector ◽

Data Mining Algorithm ◽

Machine Method ◽

Mining Algorithm ◽

Splitting Ratio

Background: The computerised classification and prediction of heart disease can help medical personnel make fast and accurate diagnoses. This study presents an efficient classification method for predicting heart disease using a data-mining algorithm. Methods: The algorithm uses the weighted support vector machine method to classify heart disease based on a binary response that indicates the presence or absence of heart disease as the result of an angiographic test. The optimal values of the support vector machine and Radial Basis Function kernel parameters were determined via 10-fold cross-validation. The heart disease data were partitioned into training and testing sets using different splitting ratios. Each training set was used to train the classification method, and the predictive power of the method was evaluated on the corresponding test set using the Monte-Carlo cross-validation resampling technique. The effect of the splitting ratio on the method was also observed. Results: The misclassification error rate was used to compare the method with three selected machine learning methods, and the proposed method performed best in all cases considered. Conclusion: The results illustrate that the classification algorithm presented can effectively predict the heart disease status of an individual based on the results of an angiographic test.
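
Monte-Carlo cross-validation, as used in the Methods above, repeatedly draws a random train/test split at a given ratio and averages the test error over the repeats. The sketch below is a minimal illustration under stated assumptions: the labels are synthetic, the function name is hypothetical, and a majority-class predictor stands in for the weighted SVM, which the abstract does not specify in detail.

```python
import random


def monte_carlo_cv(labels, train_fraction, n_repeats, seed=0):
    """Mean misclassification rate of a majority-class predictor over random splits."""
    n_train = int(round(train_fraction * len(labels)))
    if n_train <= 0 or n_train >= len(labels):
        raise ValueError("the split must leave both sets non-empty")
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    errors = []
    for _ in range(n_repeats):
        rng.shuffle(indices)  # a fresh random partition on every repeat
        train, test = indices[:n_train], indices[n_train:]
        # Stand-in model: predict the most frequent label in the training set.
        train_labels = [labels[i] for i in train]
        majority = max(sorted(set(train_labels)), key=train_labels.count)
        wrong = sum(1 for i in test if labels[i] != majority)
        errors.append(wrong / len(test))
    return sum(errors) / len(errors)


# Synthetic binary labels: 30 positive cases and 70 negative cases.
labels = [1] * 30 + [0] * 70
for fraction in (0.6, 0.7, 0.8):
    print(fraction, round(monte_carlo_cv(labels, fraction, n_repeats=200), 3))
```

Unlike k-fold cross-validation, the test sets of different repeats may overlap, and the splitting ratio can be chosen independently of the number of repeats.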

Analysis Sentiment Based on IMDB Aspects from Movie Reviews using SVM

SinkrOn ◽

10.33395/sinkron.v7i1.11204 ◽

2022 ◽

Vol 7 (1) ◽

pp. 39-45

Author(s):

Nur Ghaniaviyanto Ramadhan ◽

Teguh Ikhlas Ramadhan

Keyword(s):

Support Vector Machine ◽

Support Vector ◽

The Internet ◽

Science And Engineering ◽

Machine Method ◽

Support Vector Machine Method ◽

Svm Classification ◽

Star Rating ◽

Svm Model ◽

Classification Pattern

Watching a movie is a form of entertainment for leisure time. Many movies can now be watched via the internet or at the cinema. Movies on the internet sometimes require payment, so potential viewers often read comments from users who have already watched a movie before watching it themselves. The website most often used to view movie comments today is IMDB. Comments on the IMDB website are numerous and varied, and they can be viewed by star rating, which makes it difficult for users to analyze other users' comments. This study therefore aims to analyze the sentiment of comments from IMDB users based on the star rating aspect and to classify them using the support vector machine (SVM) method. Sentiment analysis is a classification process for understanding the opinions, interactions, and emotions expressed in a document or text. SVM is very efficient for many applications in science and engineering, especially for classification (pattern recognition) problems. In addition to SVM, the TF-IDF technique is used to convert each document into weighted terms. The SVM model achieved 79% accuracy, 75% precision, and 87% recall. The SVM classification also outperformed the other method tested, logistic regression.
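
The TF-IDF step can be sketched in a few lines. This is a minimal illustration of one common variant (relative term frequency multiplied by the logarithm of the inverse document frequency). The abstract does not state which variant or tokenizer the study used, so the function name, the whitespace tokenizer, and the sample reviews are assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical review snippets (illustrative only).
reviews = ["great movie great cast", "boring movie", "great plot"]
for vector in tf_idf(reviews):
    print(vector)
```

Terms that occur in every document receive a weight of zero, so the resulting vectors emphasize the words that distinguish one review from another before they are passed to a classifier.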

A Multi-Classification Method of Improved SVM-based Information Fusion for Traffic Parameters Forecasting

PROMET - Traffic&Transportation ◽

10.7307/ptt.v28i2.1643 ◽

2016 ◽

Vol 28 (2) ◽

pp. 117-124 ◽

Cited By ~ 2

Author(s):

Hongzhuan Zhao ◽

Dihua Sun ◽

Min Zhao ◽

Senlin Cheng

Keyword(s):

Support Vector Machine ◽

Information Fusion ◽

Binary Tree ◽

Classification Method ◽

Traffic Information ◽

Support Vector ◽

Cyber Physical System ◽

Svm Classification ◽

Forecasting Performance ◽

Multi Classification

With the enrichment of perception methods, the modern transportation system contains many physical objects whose states are influenced by many information factors, which makes it a typical Cyber-Physical System (CPS). Traffic information is therefore generally multi-sourced, heterogeneous, and hierarchical. Existing research shows that accurately classifying multi-sourced traffic information during information fusion leads to better parameter forecasting performance. To classify traffic information accurately, this paper analyses the characteristics of multi-sourced traffic information, uses a redefined binary tree to overcome the shortcomings of the original Support Vector Machine (SVM) classification in information fusion, and proposes a multi-classification method using improved SVM in information fusion for traffic parameter forecasting. An experiment was conducted to examine the performance of the proposed scheme, and the results show that the method produces more accurate and practical outcomes.
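
The general idea of a binary-tree multi-class classifier can be sketched as follows: each internal node holds one binary decision, and each leaf holds a class label, so K classes need K − 1 binary classifiers. The abstract does not describe the paper's redefined binary tree, so this is only a generic illustration. The class labels, the speed thresholds, and the `linear_rule` helper (a stand-in for a trained binary SVM) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union


@dataclass
class Leaf:
    label: str


@dataclass
class Node:
    decide: Callable[[Sequence[float]], bool]  # True means "take the left branch"
    left: Union["Node", Leaf]
    right: Union["Node", Leaf]


def classify(tree, x):
    """Walk from the root to a leaf, applying one binary decision per level."""
    while isinstance(tree, Node):
        tree = tree.left if tree.decide(x) else tree.right
    return tree.label


def linear_rule(weights, bias):
    """Stand-in for a trained binary SVM: the sign of w.x + b."""
    return lambda x: sum(w * v for w, v in zip(weights, x)) + bias >= 0.0


# Hypothetical three-class tree over a single feature (speed in km/h).
tree = Node(
    decide=linear_rule([1.0], -60.0),       # speed >= 60 -> "free"
    left=Leaf("free"),
    right=Node(
        decide=linear_rule([1.0], -30.0),   # speed >= 30 -> "slow"
        left=Leaf("slow"),
        right=Leaf("congested"),
    ),
)

for speed in (80.0, 45.0, 10.0):
    print(speed, classify(tree, [speed]))
```

A sample is classified with at most K − 1 binary evaluations, and often fewer, which is one reason tree-structured decompositions are used as an alternative to one-vs-one voting.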

DNS Tunneling Detection Method Based on Multilabel Support Vector Machine

Security and Communication Networks ◽

10.1155/2018/6137098 ◽

2018 ◽

Vol 2018 ◽

pp. 1-9 ◽

Cited By ~ 8

Author(s):

Ahmed Almusawi ◽

Haleh Amintoosi

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Detection Method ◽

Binary Classification ◽

Experimental Results ◽

Classification Method ◽

Support Vector ◽

Class Label ◽

Svm Classification ◽

Different Types

DNS tunneling is a method used by malicious users who intend to bypass the firewall to send or receive commands and data. It can lead to classified information being revealed or released. Several researchers have examined the use of machine learning for detecting DNS tunneling. However, these studies have treated DNS tunneling as a binary classification problem in which the class label is either legitimate or tunnel. In fact, there are different types of DNS tunneling, such as FTP-DNS tunneling, HTTP-DNS tunneling, HTTPS-DNS tunneling, and POP3-DNS tunneling. There is therefore a need not only to detect DNS tunneling but also to classify the type of tunnel. This study proposes a multilabel support vector machine to detect and classify DNS tunneling. The proposed method was evaluated on a benchmark dataset that contains numerous DNS queries and was compared with a multilabel Bayesian classifier based on the number of correctly classified DNS tunneling instances. Experimental results demonstrate the efficacy of the proposed SVM classification method, which obtained an f-measure of 0.80.

Separation of Convective and Stratiform Precipitation Using Polarimetric Radar Data with A Support Vector Machine Method

10.5194/amt-2019-324 ◽

2019 ◽

Author(s):

Yadong Wang ◽

Lin Tang ◽

Pao-Liang Chang ◽

Yu-Shuang Tang

Keyword(s):

Support Vector Machine ◽

Radar Data ◽

Training Data ◽

Precipitation Event ◽

Support Vector ◽

Polarimetric Radar ◽

Machine Method ◽

Support Vector Machine Method ◽

Separation Index ◽

Stratiform Precipitation

Abstract. A precipitation separation approach using a support vector machine method was developed and tested on a C-band polarimetric radar located in Taiwan (RCMK). Unlike some existing separation methods that require a whole volume of radar data, the proposed approach uses the polarimetric radar data from the lowest tilt to classify precipitation echoes as either stratiform or convective. The support vector machine takes radar reflectivity, differential reflectivity, and the separation index as inputs to the classification. The feature vector and weight vector in the support vector machine were optimized using well-classified training data. The proposed approach was tested with multiple precipitation events, including two widespread events with a mixture of stratiform and convective precipitation, a tropical typhoon precipitation event, and a stratiform precipitation event. In the evaluation, the results from the multi-radar-multi-sensor (MRMS) precipitation classification approach were used as the ground truth, and the performance of the proposed approach was further compared with that of an approach using the separation index only with different thresholds. It was found that the proposed method can accurately identify convective cells within stratiform storms using radar data from the lowest scanning tilt only, and that it produces better results than using the separation index alone.

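Analisis Kinerja Support Vector Machine dalam Mengidentifikasi Komentar Perundungan pada Jejaring Sosial [Performance Analysis of Support Vector Machine in Identifying Bullying Comments on Social Networks]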

JURNAL MEDIA INFORMATIKA BUDIDARMA ◽

10.30865/mib.v5i2.2923 ◽

2021 ◽

Vol 5 (2) ◽

pp. 475

Author(s):

Ade Clinton Sitepu ◽

Wanayumini Wanayumini ◽

Zakarias Situmorang

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Confusion Matrix ◽

Process Research ◽

Training Data ◽

Support Vector ◽

Media Technology ◽

Svm Classification ◽

Svm Algorithm ◽

Using Data

Cyberbullying is the same as bullying, but it is carried out through media technology. Bullying has become frequent as social media technology has developed in society. Techniques are needed to filter out bullying comments because they indirectly affect the psychological condition of the reader, and even more so that of the person they are aimed at. By using data mining techniques, the system is expected to be able to classify information circulating in the community. This research uses Support Vector Machine (SVM) classification because the algorithm performs well in classification tasks. The research uses a dataset of about 1000 comments. The data are first labelled manually as "bully" or "not bully" and are then divided into training data and test data. To test the capability of the system, the results are analyzed using a confusion matrix. The results showed that the SVM algorithm classified the comments with an accuracy of 87.75%, a precision of 89%, and a recall of 91%. On the training data, the SVM algorithm reached an accuracy of 98.3%.
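
A confusion matrix of the kind used in this evaluation can be built directly from the true and predicted labels. The sketch below is a minimal illustration: the function name and the five sample labels are hypothetical and do not come from the study's data.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels and columns are predicted labels, both in `labels` order."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


# Hypothetical labels for five comments (illustrative only).
labels = ["bully", "not bully"]
y_true = ["bully", "bully", "not bully", "not bully", "bully"]
y_pred = ["bully", "not bully", "not bully", "bully", "bully"]

matrix = confusion_matrix(y_true, y_pred, labels)
# Accuracy is the share of samples on the main diagonal.
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(y_true)

print(matrix)
print(accuracy)
```

The off-diagonal cells separate the two kinds of error (bullying comments that were missed and harmless comments that were flagged), which a single accuracy figure cannot show.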

Support Vector Machine (SVM) Classification: Comparison of Linkage Techniques Using a Clustering-Based Method for Training Data Selection

GIScience & Remote Sensing ◽

10.2747/1548-1603.46.4.411 ◽

2009 ◽

Vol 46 (4) ◽

pp. 411-423 ◽

Cited By ~ 13

Author(s):

Lihong Su ◽

Yuxia Huang

Keyword(s):

Support Vector Machine ◽

Training Data ◽

Data Selection ◽

Support Vector ◽

Svm Classification ◽

Training Data Selection

Smooth Support Vector Machine for Suicide-Related Behaviours Prediction

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v8i5.pp3399-3406 ◽

2018 ◽

Vol 8 (5) ◽

pp. 3399

Author(s):

G. Indrawan ◽

I K P Sudiarsa ◽

K. Agustini ◽

Sariyasa Sariyasa

Keyword(s):

Support Vector Machine ◽

Psychiatric Hospital ◽

Medical Records ◽

Training Data ◽

Support Vector ◽

Accuracy Evaluation ◽

Machine Method ◽

Support Vector Machine Method ◽

Further Development ◽

Smooth Support Vector Machine

Suicide-related behaviours need to be prevented in psychiatric patients. Predicting those behaviours from patient medical records would be very useful for prevention by the psychiatric hospital. This research focused on developing such a prediction at the only psychiatric hospital in Bali Province using the Smooth Support Vector Machine method, a further development of the Support Vector Machine. The method used 30,660 patient medical records from the last five years. Data cleaning yielded 2665 relevant records, including 111 patients who had suicide-related behaviours and were under active treatment. The cleaned data were then transformed into ten predictor variables and a response variable. The transformed data were split into training and testing data to build the model and evaluate its accuracy. In the experiment, the best average accuracy of 63% was obtained by using 30% of the relevant data as testing data and by using training data with a one-to-one ratio between patients with suicide-related behaviours and patients without such behaviours. In future work, the accuracy improvement needs to be confirmed using the Reduced Support Vector Machine method, a further development of the Smooth Support Vector Machine.
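
The data preparation described above (a 30% hold-out for testing and a one-to-one class ratio in the training data) can be sketched as follows. The abstract does not say how the one-to-one ratio was obtained, so random undersampling of the larger class is an assumption here, and the function name and the synthetic records are hypothetical.

```python
import random


def split_and_balance(records, test_fraction=0.3, seed=0):
    """records: list of (features, label) pairs with label 1 (positive) or 0 (negative)."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(round(test_fraction * len(shuffled)))
    test, train = shuffled[:n_test], shuffled[n_test:]
    positives = [r for r in train if r[1] == 1]
    negatives = [r for r in train if r[1] == 0]
    # Assumption: undersample the larger class to reach a one-to-one ratio.
    n = min(len(positives), len(negatives))
    balanced = rng.sample(positives, n) + rng.sample(negatives, n)
    rng.shuffle(balanced)
    return balanced, test


# Synthetic data: 20 positive and 180 negative records (illustrative only).
records = [((i,), 1) for i in range(20)] + [((i,), 0) for i in range(20, 200)]
train, test = split_and_balance(records, test_fraction=0.3, seed=42)
print(len(train), len(test))
```

Only the training data are balanced in this sketch. The test set keeps the original class proportions so that the measured accuracy reflects the distribution the model would face in practice.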

Optimization of Support Vector Machine Method Using Feature Selection to Improve Classification Results

JISA(Jurnal Informatika dan Sains) ◽

10.31326/jisa.v4i1.881 ◽

2021 ◽

Vol 4 (1) ◽

pp. 22-27

Author(s):

Saikin Saikin ◽

Sofiansyah Fadli ◽

Maulana Ashari ◽

...

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Feature Selection ◽

Training Data ◽

Support Vector ◽

Feature Selection Technique ◽

Validation Data ◽

Machine Method ◽

Testing Data ◽

The Impact

The performance of organizations and companies depends on the qualities of their employees. Good or bad employee performance affects productivity and the profits obtained by the company. Support Vector Machine (SVM) is a machine learning method based on statistical learning theory that can handle highly non-linear problems, regression, and other tasks. In machine learning, model optimization is used to improve the accuracy of the model on the data. Several techniques are available, one of which is feature selection, which reduces the dimensionality of the data and thereby the computation required for modelling. This study aims to apply machine learning to employee data from the Bank Rakyat Indonesia (BRI) company. The method used is SVM, with accuracy improved through a feature selection technique based on a wrapper algorithm. In the classification test, with a combination of SVM and cross-validation, the average accuracy obtained is 72 percent, with a precision of 71 percent and a recall rounded to 72 percent. The data were obtained from Kaggle and consist of training data (30 columns and 22005 rows) and testing data (29 columns and 6000 rows). The results of this study give a classification score of 82 percent, with a precision rounded to 82 percent, a recall of 86 percent, and an f1-score of 81 percent.
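
A wrapper approach to feature selection scores candidate feature subsets by evaluating a model on each of them. The abstract does not name the wrapper algorithm used, so the sketch below shows greedy forward selection as one common example. The feature names and the `toy_score` function are hypothetical; in practice the score would be something like the cross-validated accuracy of an SVM trained on the subset.

```python
def forward_selection(features, score, max_features=None):
    """Greedy wrapper: keep adding the feature that most improves score(subset)."""
    selected = []
    best = score(selected)
    remaining = list(features)
    limit = max_features if max_features is not None else len(remaining)
    while remaining and len(selected) < limit:
        candidates = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(candidates, key=lambda item: item[0])
        if top_score <= best:
            break  # no remaining feature improves the score
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best


# Hypothetical per-feature usefulness; a stand-in for a real model evaluation.
usefulness = {"tenure": 0.10, "age": 0.02, "training_hours": 0.06, "region": 0.00}


def toy_score(subset):
    # Each extra feature costs 0.03, mimicking the price of added dimensions.
    return 0.5 + sum(usefulness[f] for f in subset) - 0.03 * len(subset)


selected, best = forward_selection(list(usefulness), toy_score)
print(selected, round(best, 2))
```

Because every candidate subset requires a model evaluation, wrapper methods are more expensive than filter methods, but they select features with respect to the classifier that will actually be used.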
