Deteksi Penyakit Dengue Hemorrhagic Fever dengan Pendekatan One Class Classification

Abstract: A two-class classification problem maps inputs to two target classes. In some cases, training data is available for only one class, as in the detection of Dengue Hemorrhagic Fever (DHF), where generally only data from positive patients is available and data from healthy people (negative data) is not specifically collected. This paper reports the construction of a classification model for detecting DHF using a One Class Classification (OCC) approach. The data consists of laboratory blood test results from patients with dengue fever. The OCC methods compared are One-Class Support Vector Machine and One-Class K-Means. The SVM method obtained precision = 1.0, recall = 0.993, F1 score = 0.997, and an accuracy of 99.7%, while the K-Means method obtained precision = 0.901, recall = 0.973, F1 score = 0.936, and an accuracy of 93.3%. This indicates that the SVM method is slightly superior to K-Means for One-Class Classification of DHF patients.

Keywords: Dengue Hemorrhagic Fever, K-Means, One Class Classification, OSVM
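
The metrics reported above follow the usual definitions, and the F1 score is the harmonic mean of precision and recall. The sketch below is only an illustration of those definitions, not the authors' evaluation code; the function names `f1_score` and `metrics` are hypothetical. It recomputes F1 from the rounded K-Means precision and recall quoted in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Precision, recall, F1 and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
        "accuracy": accuracy,
    }


# Precision and recall as reported for K-Means in the abstract (rounded values).
kmeans_f1 = f1_score(0.901, 0.973)
print(round(kmeans_f1, 3))
```

Because the inputs are already rounded to three decimals, a value recomputed this way can differ from a published figure in the last digit.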

Analisis Sentimen Twitter untuk Teks Berbahasa Indonesia dengan Maximum Entropy dan Support Vector Machine

IJCCS (Indonesian Journal of Computing and Cybernetics Systems) ◽

10.22146/ijccs.3499 ◽

2014 ◽

Vol 8 (1) ◽

pp. 91 ◽

Cited By ~ 5

Author(s):

Noviah Dwi Putranti ◽

Edi Winarko

Keyword(s):

Support Vector Machine ◽

Maximum Entropy ◽

Social Networking Site ◽

Training Data ◽

Classification Model ◽

Support Vector ◽

Public Sentiment ◽

Pos Tagger ◽

Negative Sentiment ◽

Bahasa Indonesia

Abstract: Sentiment analysis in this research is the process of classifying textual documents into two classes, positive and negative sentiment. Opinion data was obtained from the social networking site Twitter using queries in Bahasa Indonesia. The study aims to determine public sentiment toward a particular object as expressed on Twitter in Indonesian, in order to support market research on public opinion. The collected data is preprocessed and POS-tagged, and a classification model is then generated through a training process. Sentiment-bearing words were collected using a dictionary-based approach, which yielded 18,069 words in this study. The Maximum Entropy algorithm is used for the POS tagger, and the Support Vector Machine algorithm is used to build the classification model from the training data. The features are unigrams with TF-IDF weighting. The classifier achieved an accuracy of 86.81% under 7-fold cross-validation with the sigmoid kernel. Manual class labeling with the POS tagger yielded an accuracy of 81.67%.

Keywords: sentiment analysis, classification, maximum entropy POS tagger, support vector machine, Twitter.
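
As a rough illustration of unigram features with TF-IDF weighting, the sketch below weights each word by its count in a tweet multiplied by log(N / df), where N is the number of tweets and df is the number of tweets containing the word. This is one common TF-IDF variant and an assumption here, since the abstract does not state which formula was used. The function name `tfidf` and the sample tweets are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    # Unigram tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append(
            {term: count * math.log(n_docs / df[term]) for term, count in tf.items()}
        )
    return weights


# Hypothetical toy tweets, for illustration only.
tweets = ["pelayanan bagus sekali", "pelayanan buruk", "produk bagus"]
for vector in tfidf(tweets):
    print(vector)
```

With this variant, a word that occurs in every tweet gets weight zero, while words specific to a few tweets get larger weights. The resulting vectors would then be passed to the classifier.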

Peringkasan dan Support Vector Machine pada Klasifikasi Dokumen

JURNAL INFOTEL ◽

10.20895/infotel.v9i4.312 ◽

2017 ◽

Vol 9 (4) ◽

pp. 416 ◽

Cited By ~ 1

Author(s):

Nelly Indriani Widiastuti ◽

Ednawati Rainarli ◽

Kania Evita Dewi

Keyword(s):

Support Vector Machine ◽

Naïve Bayes ◽

Training Data ◽

Support Vector ◽

Good Reputation ◽

Multiclass Support Vector Machine ◽

Simple Logistic ◽

Better Than

Classification is the process of grouping objects that share features or characteristics into classes. Automatic document classification uses the frequency of words appearing in the training data as features, so a large number of documents increases the number of words used as features. Summaries are therefore used to reduce the number of words used in classification. The classification uses the multiclass Support Vector Machine (SVM) method, which is considered to have a good reputation in classification. This research tests the effect of using summaries as feature selection in document classification. The summaries reduce the text to 50%. The results show that the summaries did not affect the classification accuracy for documents classified with SVM, but they improved the accuracy of the Simple Logistic classifier. The classification tests show that Naïve Bayes Multinomial (NBM) achieves better accuracy than SVM.

An one-class classification support vector machine model by interval-valued training data

Knowledge-Based Systems ◽

10.1016/j.knosys.2016.12.022 ◽

2017 ◽

Vol 120 ◽

pp. 43-56 ◽

Cited By ~ 14

Author(s):

Lev V. Utkin ◽

Yulia A. Zhuk

Keyword(s):

Support Vector Machine ◽

Support Vector Machine Model ◽

Training Data ◽

Support Vector ◽

Machine Model ◽

One Class Classification ◽

Interval Valued

Support Vector Machine Classifier with WHM Offset for Unbalanced Data

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2008.p0094 ◽

2008 ◽

Vol 12 (1) ◽

pp. 94-101 ◽

Cited By ~ 3

Author(s):

Boyang Li ◽

Jinglu Hu ◽

Kotaro Hirasawa

Keyword(s):

Support Vector Machine ◽

Real World ◽

Support Vector Machine Classifier ◽

Classification Problem ◽

Training Data ◽

Support Vector ◽

Svm Classifier ◽

Real World Data ◽

Unbalanced Classification ◽

Improved Support Vector Machine

We propose an improved support vector machine (SVM) classifier by introducing a new offset, for solving the real-world unbalanced classification problem. The new offset is calculated based on the unbalanced support vectors resulting from the unbalanced training data. We developed a weighted harmonic mean (WHM) algorithm to further reduce the effects of noise on offset calculation. We apply the proposed approach to classify real-world data. Results of simulation demonstrate the effectiveness of our proposed approach.

An improved one-class support vector machine classifier for outlier detection

Proceedings of the Institution of Mechanical Engineers Part C Journal of Mechanical Engineering Science ◽

10.1177/0954406214537475 ◽

2014 ◽

Vol 229 (3) ◽

pp. 580-588 ◽

Cited By ~ 7

Author(s):

Wenjuan An ◽

Mangui Liang ◽

He Liu

Keyword(s):

Outlier Detection ◽

Support Vector Machine Classifier ◽

Dimensional Space ◽

Classification Problem ◽

Training Data ◽

Support Vector ◽

Svm Classifier ◽

Higher Dimensional ◽

Real World Datasets ◽

One Class Classification

Outlier detection, as a type of one-class classification problem, is one of the important research topics in data mining and machine learning. Its task is to identify sample points that deviate markedly from the normal data. A reliable outlier detector needs to build a model that encloses the normal data tightly. In this paper, an improved one-class SVM (OC-SVM) classifier is proposed for outlier detection problems. We name this method OC-SVM with minimum within-class scatter (OC-WCSSVM), which exploits the inner-class structure of the training set by minimizing the within-class scatter of the training data. This can construct a more accurate hyperplane for outlier detection, such that the margin between the training data and the origin in a higher dimensional space is as large as possible, while at the same time the decision boundary around the normal data is as tight as possible. Experimental results on a synthetic dataset and 10 real-world datasets demonstrate that our proposed OC-WCSSVM algorithm is effective and superior to the compared algorithms.

Multi-Class Support Vector Machine via Maximizing Multi-Class Margins

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/440 ◽

2017 ◽

Cited By ~ 8

Author(s):

Jie Xu ◽

Xianglong Liu ◽

Zhouyuan Huo ◽

Cheng Deng ◽

Feiping Nie ◽

...

Keyword(s):

Support Vector Machine ◽

Binary Classification ◽

Training Data ◽

Classification Model ◽

Support Vector ◽

Frobenius Norm ◽

Great Success ◽

Svm Model ◽

Binary Classifiers ◽

And Training

Support Vector Machine (SVM) was originally proposed as a binary classification model and has achieved great success in different applications. In practice, problems with more than two classes are more common, so it is natural to extend SVM to a multi-class classifier. Many works construct a multi-class classifier based on binary SVM, such as the one-versus-all strategy, the one-versus-one strategy, and Weston's multi-class SVM. The one-versus-all and one-versus-one strategies split the multi-class problem into multiple binary classification subproblems, each requiring its own binary classifier. Weston's multi-class SVM is formed by ensuring risk constraints and imposing a specific regularization, such as the Frobenius norm; it is not derived by maximizing the margin between the hyperplane and the training data, which is the motivation of SVM. In this paper, we propose a multi-class SVM model from the perspective of maximizing the margin between training points and the hyperplane, and analyze the relation between our model and other related methods. Experiments show that our model achieves better or comparable results relative to other related methods.
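
To make the two decomposition strategies mentioned above concrete, the sketch below only enumerates the binary subproblems each one creates: one-versus-all needs K classifiers for K classes, and one-versus-one needs K(K-1)/2. It does not train any SVM and is not the model proposed in the paper; the function names and class labels are hypothetical.

```python
from itertools import combinations


def one_vs_all_tasks(classes):
    """One binary task per class: that class against all remaining classes."""
    return [(c, [other for other in classes if other != c]) for c in classes]


def one_vs_one_tasks(classes):
    """One binary task per unordered pair of classes."""
    return list(combinations(classes, 2))


# Hypothetical class labels, for illustration only.
labels = ["A", "B", "C", "D"]
print(len(one_vs_all_tasks(labels)))  # K tasks
print(len(one_vs_one_tasks(labels)))  # K * (K - 1) / 2 tasks
```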

Phishing Website Classification using Least Square Twin Support Vector Machine

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.a3905.119119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 2063-2068

Keyword(s):

Support Vector Machine ◽

Cyber Security ◽

Classification Accuracy ◽

Personal Information ◽

Classification Problem ◽

Least Square ◽

Twin Support Vector Machine ◽

Support Vector ◽

Security Issue

Phishing is one of the luring techniques attackers use to abuse the personal details of clients. It is a serious cyber security issue that involves imitating a legitimate website to deceive online users and steal their personal information. Phishing can be viewed as a special type of classification problem in which the classifier is built from a substantial number of website features, so identifying the best features is required to improve classifier accuracy. This study highlights the important website features used to distinguish phishing websites from legitimate ones, and presents a Decision Tree Least Square Twin Support Vector Machine (DT-LST-SVM) scheme for classifying phishing websites. The UCI public-domain benchmark website phishing dataset was used to evaluate the proposed classifier with different kernel functions and to calculate the classification accuracy. Computational results show that the DT-LST-SVM scheme yields better classification accuracy on the phishing website classification dataset.

Classification of Spam Text using SVM

Journal of University of Shanghai for Science and Technology ◽

10.51201/jusst/21/08437 ◽

2021 ◽

Vol 23 (08) ◽

pp. 616-624

Author(s):

Gaddam Akhil Reddy ◽

Dr. B. Indira Reddy

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Text Messages ◽

Training Data ◽

Support Vector ◽

Spam Detection ◽

Advantages And Disadvantages ◽

Machine Learning Model ◽

Model Support

The necessity for spam detection is particularly pertinent nowadays, as there is no quality control over social media, and users have the ability to distribute unverified material, therefore facilitating fraud and deceit. Spam detection can aid in the prevention of such fraud. This scenario has developed mostly as a result of the distribution of disparate, unconfirmed information via shopping websites, emails, and text messages (SMS). There are several ways of categorising and identifying spam. Each of them has certain advantages and disadvantages. The machine learning model “Support Vector Machine” is employed to detect spam in this case. SVM is a basic concept: the method proposes a line or hyperplane to classify the data. The model can categorise any type of text into a given category after being fed a set of labelled training data for each category.
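
The "line or hyperplane" idea can be sketched as a linear decision function: a message is assigned to one side or the other according to the sign of w·x + b. The example below assumes a model that has already been trained. The weights, bias, feature meanings and function names are hypothetical values chosen for illustration, not results from the paper.

```python
def decision_value(weights, bias, features):
    """Signed score w . x + b; its sign tells which side of the hyperplane x is on."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def classify(weights, bias, features):
    """Label a feature vector by the side of the hyperplane it falls on."""
    return "spam" if decision_value(weights, bias, features) >= 0 else "ham"


# Hypothetical trained parameters for two features, e.g. counts of two words.
w = [1.5, -2.0]
b = -0.5

print(classify(w, b, [3.0, 0.0]))  # score 4.0, positive side
print(classify(w, b, [0.0, 2.0]))  # score -4.5, negative side
```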

ANALISIS SENTIMEN PUBLIK PADA MEDIA SOSIAL TWITTER TERHADAP PELAKSANAAN PILKADA SERENTAK MENGGUNAKAN ALGORITMA SUPPORT VECTOR MACHINE

CCIT Journal ◽

10.33050/ccit.v10i2.539 ◽

2017 ◽

Vol 10 (2) ◽

pp. 197-206

Author(s):

Atika Rahmawati ◽

Aris Marjuni ◽

Junta Zeniarja

Keyword(s):

Social Media ◽

Support Vector Machine ◽

Training Data ◽

Support Vector ◽

Public Response ◽

The People ◽

Word Detection ◽

Twitter Users ◽

Media Sentiment

Pilkada Serentak (simultaneous regional elections) is a very important event for the future of regions and the country. Through this election, people can cast their votes and elect their representatives. Public response can be expressed through the social media platform Twitter, and sentiment analysis of Twitter data can then be used to assess the public response to the implementation of the simultaneous elections. In this study, responses are classified from the text of tweets, because text is easy to obtain and process. The study classifies responses in Indonesian-language text and aims to increase accuracy by using SVM. Tweets are classified into two basic classes: positive and negative. The data consists of 3000 Indonesian tweets. Labeling is not done manually but with a clustering method that divides the 3000 tweets into two groups, with Cluster 1 as the group of positive tweets and Cluster 2 as the group of negative tweets. Of these, 2700 are used as training data and 300 as test data. The preprocessing stage includes tokenization, case normalization, stop word detection, and stemming. Classification uses a Support Vector Machine (SVM). SVM achieved the highest accuracy, 91%, compared with 82% for k-means clustering.
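
The preprocessing stages named in the abstract can be sketched as a small pipeline. Everything specific below is an assumption made for illustration: the stop word list is a tiny sample, and the `stem` function only strips a few particle and possessive suffixes, which is far cruder than a real Indonesian stemmer. The function names and the sample tweet are hypothetical.

```python
import re

# Tiny illustrative stop word list (assumption, not the list used in the study).
STOP_WORDS = {"yang", "dan", "di", "ini", "itu"}

# Crude suffix stripping (assumption): particles and possessives only.
SUFFIXES = ("nya", "lah", "kah")


def stem(token):
    """Strip one known suffix if at least three characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(tweet):
    """Case normalization, tokenization, stop word removal, then stemming."""
    lowered = tweet.lower()                     # case normalization
    tokens = re.findall(r"[a-z]+", lowered)     # tokenization on letter runs
    kept = [t for t in tokens if t not in STOP_WORDS]
    return [stem(t) for t in kept]


# Hypothetical sample tweet.
print(preprocess("Pilkadanya berjalan AMAN dan lancar!"))
```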

Organ-Based Medical Image Classification Using Support Vector Machine

International Journal of Synthetic Emotions ◽

10.4018/ijse.2017010102 ◽

2017 ◽

Vol 8 (1) ◽

pp. 18-30 ◽

Cited By ~ 13

Author(s):

Monali Y. Khachane

Keyword(s):

Support Vector Machine ◽

Image Classification ◽

Medical Image ◽

Brain Mri ◽

Texture Features ◽

Classification Problem ◽

Training Data ◽

Support Vector ◽

Medical Image Classification ◽

Occurrence Matrix

Computer-Aided Detection/Diagnosis (CAD) through artificial intelligence is an emerging area in medical image processing and health care, aimed at making expert systems more intelligent. The aim of this paper is to analyze the performance of different feature extraction techniques for the medical image classification problem. Brain MRI and knee MRI medical images are classified. Gray Level Co-occurrence Matrix (GLCM) based texture features, DWT and DCT transform features, and invariant moments are used to classify the data. Experimental results show that the proposed system produced better results even though the training data is smaller than the testing data. The Support Vector Machine classifier with a linear kernel produced an accuracy of 100% when used with texture features.
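
A Gray Level Co-occurrence Matrix counts how often a pixel with gray level i has a neighbor with gray level j at a fixed offset, and texture features such as contrast are computed from those counts. The sketch below assumes a horizontal offset of one pixel and a tiny 3-level image. Both choices, along with the function names, are illustrative assumptions, since the abstract does not give the offsets or quantization used.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count co-occurrences of gray levels (i, j) for pixel pairs at offset (dy, dx)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts


def contrast(counts):
    """GLCM contrast: sum of p(i, j) * (i - j)^2 over the normalized matrix."""
    total = sum(sum(row) for row in counts)
    if total == 0:
        return 0.0
    n = len(counts)
    weighted = sum(counts[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))
    return weighted / total


# Hypothetical 3x3 image with gray levels 0..2.
image = [
    [0, 0, 1],
    [1, 2, 2],
    [0, 1, 2],
]
matrix = glcm(image, levels=3)
print(matrix)
print(contrast(matrix))
```

Feature values like this contrast score, computed per image, would form the texture feature vector given to the classifier.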
