Pengelompokan Dan Klasifikasi Pada Data Hepatitis Dengan Menggunakan Support Vector Machine (SVM), Classification And Regression Tree (CART) Dan Regresi Logistik Biner

Hepatitis is inflammation of the liver caused by toxins, such as chemicals or drugs, or by infectious agents. Hepatitis lasting less than 6 months is called "acute hepatitis"; hepatitis lasting more than 6 months is called "chronic hepatitis". Hepatitis is usually caused by a virus, most often one of the five hepatitis viruses: A, B, C, D, or E. It can also result from other viral infections, such as infectious mononucleosis, yellow fever, and cytomegalovirus infection. The main non-viral causes of hepatitis are alcohol and drugs. In this study, tests were carried out on 155 patients, with the response being died or survived. Data mining is applied to this case using one of its techniques, data classification: the available testing data are analyzed and compared with the training data to predict whether a patient dies or survives. The classification accuracy between the training and testing data is 79.4% with logistic regression analysis and 80% with SVM. Clustering with K-Means and Kernel K-Means yields different clustering accuracies. This indicates that the hepatitis data cluster well. The Kernel K-Means clustering results were then compared with the actual data classified using logistic regression, SVM, and CART; the data resulting from Kernel K-Means had better classification accuracy than the classification results on the actual data.
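
The abstract above reports a clustering accuracy, that is, how well cluster assignments agree with the actual died/survived labels. Because cluster ids are arbitrary, one common way to score this is to try every mapping from cluster ids to class labels and keep the best agreement. The following is a minimal sketch of that idea, not the authors' procedure; the function name and the toy labels are hypothetical.

```python
from itertools import permutations


def clustering_accuracy(actual, clusters):
    """Best agreement between cluster ids and class labels over all relabelings."""
    classes = sorted(set(actual))
    cluster_ids = sorted(set(clusters))
    if len(cluster_ids) > len(classes):
        raise ValueError("more clusters than classes; no one-to-one mapping exists")
    best_hits = 0
    # Try every one-to-one assignment of cluster ids to class labels.
    for perm in permutations(classes, len(cluster_ids)):
        mapping = dict(zip(cluster_ids, perm))
        hits = sum(mapping[c] == a for a, c in zip(actual, clusters))
        best_hits = max(best_hits, hits)
    return best_hits / len(actual)


# Hypothetical toy data: six patients, two clusters (not from the study).
actual = ["live", "live", "die", "live", "die", "live"]
clusters = [0, 0, 1, 0, 0, 0]
accuracy = clustering_accuracy(actual, clusters)
print(f"clustering accuracy: {accuracy:.3f}")
```

Trying all permutations is only practical for a handful of clusters, which is the case for a two-class outcome such as this one.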

Quad-polarized synthetic aperture radar and multispectral data classification using classification and regression tree and support vector machine–based data fusion system

Journal of Applied Remote Sensing ◽

10.1117/1.jrs.11.016007 ◽

2017 ◽

Vol 11 (1) ◽

pp. 016007 ◽

Cited By ~ 2

Author(s):

Behnaz Bigdeli ◽

Parham Pahlavani

Keyword(s):

Support Vector Machine ◽

Data Fusion ◽

Synthetic Aperture Radar ◽

Regression Tree ◽

Classification And Regression Tree ◽

Synthetic Aperture ◽

Support Vector ◽

Fusion System ◽

Multispectral Data ◽

Classification And Regression

Penerapan Metode Machine Learning dalam Klasifikasi Risiko Kejadian Berat Badan Lahir Rendah di Indonesia

Matrik Jurnal Manajemen Teknik Informatika dan Rekayasa Komputer ◽

10.30812/matrik.v20i2.1174 ◽

2021 ◽

Vol 20 (2) ◽

pp. 417-426

Author(s):

Pardomuan Robinson Sihombing ◽

Istiqomatul Fajriyah Yuliati

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Random Forest ◽

Naive Bayes ◽

Regression Tree ◽

Imbalanced Data ◽

Classification And Regression Tree ◽

Support Vector ◽

Classification And Regression ◽

Sensitivity Specificity

This study examines the application of several machine learning methods, taking imbalanced data into account, in classification modeling to determine the risk of low birth weight (LBW; BBLR in Indonesian) births, which is expected to help reduce LBW births in Indonesia. The machine learning methods used are Classification and Regression Tree (CART), Naïve Bayes, Random Forest, and Support Vector Machine (SVM). Classification modeling with resampling techniques on imbalanced and large datasets proved able to improve classification accuracy, particularly for the minority class, as shown by higher sensitivity than on the original data (without treatment). Furthermore, among the five classification models tested, the random forest model performed best, with the highest sensitivity, specificity, G-mean, and AUC. The most important variables in classifying LBW risk are birth interval and birth order, antenatal check-ups, and maternal age.
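
The abstract above evaluates models by sensitivity, specificity, and G-mean rather than plain accuracy, because accuracy can look good on imbalanced data even when the minority class is mostly missed. The sketch below shows how these metrics follow from confusion-matrix counts; the counts and the function name are hypothetical and are not taken from the study.

```python
import math


def imbalance_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, G-mean and accuracy from confusion counts."""
    sensitivity = tp / (tp + fn)  # share of actual positives (minority class) found
    specificity = tn / (tn + fp)  # share of actual negatives correctly rejected
    g_mean = math.sqrt(sensitivity * specificity)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, g_mean, accuracy


# Hypothetical counts: 40 positive cases and 200 negative cases.
sens, spec, g_mean, acc = imbalance_metrics(tp=30, fn=10, tn=160, fp=40)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} G-mean={g_mean:.3f} accuracy={acc:.3f}")
```

G-mean drops toward zero whenever either class is poorly recognized, which is why it is a common summary for imbalanced problems.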

PENERAPAN MODEL KLASIFIKASI REGRESI LOGISTIK, SUPPORT VECTOR MACHINE, CLASSIFICATION AND REGRESSION TREE TERHADAP DATA KEJADIAN DIFTERI DI PROVINSI JAWA BARAT

Euclid ◽

10.33603/e.v5i2.1121 ◽

2018 ◽

Vol 5 (2) ◽

pp. 20

Author(s):

Hilman Dwi Anggana

Keyword(s):

Support Vector Machine ◽

Regression Tree ◽

Classification And Regression Tree ◽

Support Vector ◽

Support Vector Machine Classification ◽

Classification And Regression

Prediction of Body Weight of Turkish Tazi Dogs using Data Mining Techniques: Classification and Regression Tree (CART) and Multivariate Adaptive Regression Splines (MARS)

Pakistan Journal of Zoology ◽

10.17582/journal.pjz/2018.50.2.575.583 ◽

2018 ◽

Vol 50 (2) ◽

Cited By ~ 4

Author(s):

Senol Celik ◽

Orhan Yilmaz

Keyword(s):

Data Mining ◽

Body Weight ◽

Regression Tree ◽

Multivariate Adaptive Regression Splines ◽

Classification And Regression Tree ◽

Regression Splines ◽

Adaptive Regression ◽

Classification And Regression ◽

Using Data ◽

Adaptive Regression Splines

A Novel Design of Classification of Coronary Artery Disease Using Deep Learning and Data Mining Algorithms

Revue d'Intelligence Artificielle ◽

10.18280/ria.350304 ◽

2021 ◽

Vol 35 (3) ◽

pp. 209-215

Author(s):

Pratibha Verma ◽

Vineet Kumar Awasthi ◽

Sanat Kumar Sahu

Keyword(s):

Neural Network ◽

Data Mining ◽

Deep Learning ◽

Regression Tree ◽

Principal Component ◽

Classification And Regression Tree ◽

Support Vector ◽

Data Mining Algorithms ◽

R Programming ◽

Hidden Layer

Data mining techniques are combined with ensemble learning and deep learning for classification. The classification methods used are a single C5.0 tree (C5.0), Classification and Regression Tree (CART), a kernel-based Support Vector Machine (SVM) with a linear kernel, an ensemble (CART, SVM, C5.0), a single-hidden-layer neural network (NN), neural networks with Principal Component Analysis (PCA-NN), the deep-learning-based H2OBinomialModel-Deeplearning (HBM-DNN), and Enhanced H2OBinomialModel-Deeplearning (EHBM-DNN). In this study, experiments were conducted on pre-processed datasets using R programming and the 10-fold cross-validation technique. The findings show that the ensemble model (CART, SVM, and C5.0) and EHBM-DNN classify more accurately than the other methods.
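
The abstract above relies on 10-fold cross-validation: the data are split into ten parts, and each part serves once as the test set while the other nine are used for training. The study used R; the Python sketch below only illustrates the partitioning step, with a hypothetical function name and sample count, and it omits the shuffling that is normally applied first.

```python
def kfold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical dataset of 25 records.
folds = list(kfold_indices(25, k=10))
for i, (train, test) in enumerate(folds, start=1):
    print(f"fold {i}: {len(train)} training records, {len(test)} test records")
```

A model would be fitted on each training part and scored on the matching test part, and the ten scores would then be averaged.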

Determination of Seroprevalence of Contagious Caprine Pleuropneumonia and Associated Risk Factors in Goats and Sheep Using Classification and Regression Tree

Animals ◽

10.3390/ani11041165 ◽

2021 ◽

Vol 11 (4) ◽

pp. 1165

Author(s):

Abdelfattah Selim ◽

Ameer Megahed ◽

Sahar Kandeel ◽

Abdullah D. Alanazi ◽

Hamdan I. Almohammed

Keyword(s):

Risk Factors ◽

Data Mining ◽

Logistic Regression ◽

Regression Tree ◽

Classification And Regression Tree ◽

Flock Size ◽

Factors Associated ◽

Contagious Caprine Pleuropneumonia ◽

Classification And Regression ◽

Communal Feeding

Classification and Regression Tree (CART) analysis is a potentially powerful tool for identifying risk factors associated with contagious caprine pleuropneumonia (CCPP) and the important interactions between them. Our objective was therefore to determine the seroprevalence of CCPP and identify its associated risk factors using CART data mining modeling in the governorates most densely populated with sheep and goats. A cross-sectional study was conducted in 2019 on 620 animals (390 sheep, 230 goats) distributed over four governorates in the Nile Delta of Egypt. Randomly selected sheep and goats from different geographical study areas were serologically tested for CCPP, and information on the animals was obtained from flock men and farm owners. Six variables (geographic location, species, flock size, age, gender, and communal feeding and watering) were used for risk analysis. Multiple stepwise logistic regression and CART modeling were used for data analysis. A total of 124 (20%) serum samples were serologically positive for CCPP. The highest prevalence of CCPP was found in older animals (>4 y; 48.7%), in flocks of ≥200 animals (100%), and with communal feeding and watering (28.2%). Based on logistic regression modeling (area under the curve, AUC = 0.89; 95% CI 0.86 to 0.91), communal feeding and watering showed the highest prevalence odds ratio (POR) of CCPP (POR = 3.7, 95% CI 1.9 to 7.3), followed by age (POR = 2.1, 95% CI 1.6 to 2.8) and flock size (POR = 1.1, 95% CI 1.0 to 1.2). However, CART modeling, which had higher accuracy (AUC = 0.92, 95% CI 0.90 to 0.95), showed that a flock size >100 animals is the most important risk factor (importance score = 8.9), followed by age >4 y (5.3) and communal feeding and watering (3.1). Our results strongly suggest that CCPP is most likely to be found in animals older than 4 y that are raised in flocks of >100 animals with communal feeding and watering. Additionally, sheep seem to have an important role in CCPP epidemiology. The CART data mining modeling showed better accuracy than the traditional logistic regression.
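
The abstract above reports prevalence odds ratios (POR) with 95% confidence intervals. Those values come from a multivariable logistic regression, but the quantity itself can be illustrated with a crude 2x2 table: the odds of being seropositive among exposed animals divided by the odds among unexposed animals, with an interval from the standard error of the log odds ratio. The sketch below assumes hypothetical counts and a hypothetical function name; it does not reproduce the study's figures.

```python
import math


def prevalence_odds_ratio(a, b, c, d, z=1.96):
    """Crude odds ratio and approximate 95% CI from a 2x2 table.

    a: exposed and positive      b: exposed and negative
    c: unexposed and positive    d: unexposed and negative
    """
    por = (a * d) / (b * c)
    # Standard error of ln(POR), the usual log-based (Woolf) approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(por) - z * se_log)
    upper = math.exp(math.log(por) + z * se_log)
    return por, lower, upper


# Hypothetical counts: exposure = communal feeding and watering.
por, lower, upper = prevalence_odds_ratio(a=40, b=60, c=20, d=80)
print(f"POR = {por:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An interval that excludes 1 suggests an association between the exposure and seropositivity in the crude table; adjusted estimates from a regression model can differ.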

TEXT SUMMARIZER USING CLUSTERING TECHNIQUES AND ANOMALIES DETECTION ON SVM

EPRA International Journal of Research & Development (IJRD) ◽

10.36713/epra3758 ◽

2019 ◽

pp. 1-7

Author(s):

Oyinloye Oghenerukevwe Elohor ◽

Adesoji Susan ◽

Akinbohun Folake

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Clustering Algorithm ◽

Text Summarization ◽

Support Vector ◽

Trade Off ◽

Clustering Techniques ◽

Svm Classification ◽

Knowledge Based ◽

Current Affairs

The study aims to develop a text summarizer using clustering and anomaly detection with SVM classification. It proposes a text summarization approach that uses the SVM clustering algorithm. The proposed project can be used to summarize articles from fields as diverse as politics, sports, current affairs, and finance, as well as any other explanatory document. However, it does involve a trade-off between domain independence and a knowledge-based summary, which would provide data in a form more easily understandable to the user. A bundle of libraries and software was used to summarize alphanumeric input properly. Keywords: anomalies detection, SVM (support vector machine), clustering, text summarization, data mining

Classification and Regression Trees

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch031 ◽

2011 ◽

pp. 192-195 ◽

Cited By ~ 1

Author(s):

Johannes Gehrke

Keyword(s):

Data Mining ◽

Linear Models ◽

Regression Tree ◽

Classification Problem ◽

Classification And Regression Tree ◽

Construction Methods ◽

Classification And Regression ◽

Log Linear ◽

Mining Model ◽

Decision Tables

It is the goal of classification and regression to build a data mining model that can be used for prediction. To construct such a model, we are given a set of training records, each having several attributes. These attributes can be either numerical (for example, age or salary) or categorical (for example, profession or gender). There is one distinguished attribute, the dependent attribute; the other attributes are called predictor attributes. If the dependent attribute is categorical, the problem is a classification problem. If the dependent attribute is numerical, the problem is a regression problem. It is the goal of classification and regression to construct a data mining model that predicts the (unknown) value for a record where the value of the dependent attribute is unknown. (We call such a record an unlabeled record.) Classification and regression have a wide range of applications, including scientific experiments, medical diagnosis, fraud detection, credit approval, and target marketing (Hand, 1997). Many classification and regression models have been proposed in the literature. Among the more popular models are neural networks, genetic algorithms, Bayesian methods, linear and log-linear models and other statistical methods, decision tables, and tree-structured models, the focus of this chapter (Breiman, Friedman, Olshen, & Stone, 1984). Tree-structured models, so-called decision trees, are easy to understand; they are non-parametric and thus do not rely on assumptions about the data distribution; and they have fast construction methods even for large training datasets (Lim, Loh, & Shih, 2000). Most data mining suites include tools for classification and regression tree construction (Goebel & Gruenwald, 1999).
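
The abstract above describes tree-structured models built from training records with predictor attributes and a categorical dependent attribute. As a rough illustration of how a classification tree chooses one split, the sketch below scores candidate thresholds on a single numerical predictor by weighted Gini impurity, the criterion commonly associated with CART. The records, attribute meaning, and function names are hypothetical, and a real tree builder would repeat this search recursively over all attributes.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels (0.0 means pure)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best binary split on one attribute."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values cannot be separated by a threshold
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical training records: age as predictor, credit approval as dependent attribute.
ages = [22, 25, 30, 35, 40, 52]
approved = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(ages, approved)
print(f"split at age <= {threshold} with weighted Gini {impurity:.3f}")
```

For a regression problem the same search would typically minimize the variance of the dependent attribute in each branch instead of Gini impurity.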
