Clustering and Classification of Hepatitis Data Using Support Vector Machine (SVM), Classification and Regression Tree (CART), and Binary Logistic Regression

2017 ◽  
Vol 1 (3) ◽  
pp. 183
Author(s):  
Gede Suwardika

Hepatitis is an inflammation of the liver caused by toxins, such as chemicals or drugs, or by infectious agents. Hepatitis lasting less than 6 months is called "acute hepatitis", and hepatitis lasting more than 6 months is called "chronic hepatitis". Hepatitis is usually caused by a virus, most often one of the five hepatitis viruses: A, B, C, D, or E. It can also result from other viral infections, such as infectious mononucleosis, yellow fever, and cytomegalovirus infection. The main non-viral causes of hepatitis are alcohol and drugs. In this study, 155 patients were tested, with a response of either died or survived. Data mining is applied to this case using one of its techniques, data classification: the available testing data are analyzed and compared with the training data to predict whether a patient dies or survives. The classification accuracy between the training data and the testing data is 79.4% with logistic regression analysis and 80% with SVM. Clustering with K-Means and Kernel K-Means yields different clustering accuracies. This shows that the hepatitis data cluster well. The Kernel K-Means clustering results were then compared with the actual data classified using logistic regression, SVM, and CART; the data resulting from Kernel K-Means had better classification accuracy than the classification results on the actual data.

Author(s):  
Pardomuan Robinson Sihombing ◽  
Istiqomatul Fajriyah Yuliati

This study examines the application of several machine learning methods, taking imbalanced data into account, in classification modeling to determine the risk of low birth weight (LBW; BBLR in Indonesian) in infants, which is expected to help reduce LBW births in Indonesia. The machine learning methods used are Classification and Regression Tree (CART), Naïve Bayes, Random Forest, and Support Vector Machine (SVM). Classification modeling with resampling techniques on imbalanced and large datasets proved able to improve classification accuracy, particularly for the minority class, as seen from the higher sensitivity compared with the original (untreated) data. Furthermore, of the five classification models tested, the random forest model gave the best performance, with the highest sensitivity, specificity, G-mean, and AUC. The most important variables in classifying LBW risk are birth spacing and birth order, antenatal check-ups, and maternal age.
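The evaluation measures named in this abstract (sensitivity, specificity, and G-mean) can be illustrated with a minimal Python sketch. The labels below are hypothetical and are not taken from the study; the function name `binary_metrics` is likewise an assumption made for illustration.

```python
import math

def binary_metrics(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, G-mean) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Sensitivity: share of true positives (minority class) that are detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: share of true negatives (majority class) that are detected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # G-mean: geometric mean of the two, low whenever either class is neglected.
    g_mean = math.sqrt(sensitivity * specificity)
    return sensitivity, specificity, g_mean

# Hypothetical imbalanced data: 2 positives (minority) and 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
sens, spec, g = binary_metrics(y_true, y_pred)
print(sens, spec, round(g, 3))
```

In this toy case plain accuracy is 8 out of 10, yet only half of the minority class is detected, which is why sensitivity and G-mean are the more informative measures on imbalanced data.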


2021 ◽  
Vol 35 (3) ◽  
pp. 209-215
Author(s):  
Pratibha Verma ◽  
Vineet Kumar Awasthi ◽  
Sanat Kumar Sahu

Data mining techniques are combined with ensemble learning and deep learning for classification. The methods used for classification are: single C5.0 tree (C5.0), Classification and Regression Tree (CART), kernel-based Support Vector Machine (SVM) with a linear kernel, an ensemble (CART, SVM, C5.0), a single-hidden-layer neural network (NN), neural networks with Principal Component Analysis (PCA-NN), the deep learning-based H2OBinomialModel-Deeplearning (HBM-DNN), and Enhanced H2OBinomialModel-Deeplearning (EHBM-DNN). In this study, experiments were conducted on pre-processed datasets using R programming and the 10-fold cross-validation technique. The findings show that the ensemble model (CART, SVM, and C5.0) and EHBM-DNN are more accurate for classification than the other methods.
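The two ingredients mentioned here, 10-fold cross-validation and an ensemble of three base classifiers, can be sketched in a few lines. This is only an illustration in Python, not the authors' R code: the abstract does not say how the ensemble combines its members, so majority voting is an assumption, and the prediction lists are hypothetical.

```python
from collections import Counter

def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        # The first (n_samples % k) folds receive one extra sample.
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def majority_vote(predictions):
    """Combine per-model label lists into one list by majority vote."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

# Each fold serves once as the test set; the other nine form the training set.
folds = k_fold_indices(25, k=10)

# Hypothetical predictions from three base models on four test records.
cart_pred = ["a", "b", "a", "b"]
svm_pred = ["a", "a", "a", "b"]
c50_pred = ["b", "b", "a", "a"]
ensemble = majority_vote([cart_pred, svm_pred, c50_pred])
print(ensemble)
```

With three voters and two classes a tie cannot occur; with more classes or an even number of models, a tie-breaking rule would have to be chosen.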


Animals ◽  
2021 ◽  
Vol 11 (4) ◽  
pp. 1165
Author(s):  
Abdelfattah Selim ◽  
Ameer Megahed ◽  
Sahar Kandeel ◽  
Abdullah D. Alanazi ◽  
Hamdan I. Almohammed

Classification and Regression Tree (CART) analysis is a potentially powerful tool for identifying risk factors associated with contagious caprine pleuropneumonia (CCPP) and the important interactions between them. Our objective was therefore to determine the seroprevalence and identify the risk factors associated with CCPP using CART data mining modeling in the most densely sheep- and goat-populated governorates. A cross-sectional study was conducted on 620 animals (390 sheep, 230 goats) distributed over four governorates in the Nile Delta of Egypt in 2019. The randomly selected sheep and goats from different geographical study areas were serologically tested for CCPP, and the animals’ information was obtained from flock men and farm owners. Six variables (geographic location, species, flock size, age, gender, and communal feeding and watering) were used for risk analysis. Multiple stepwise logistic regression and CART modeling were used for data analysis. A total of 124 (20%) serum samples were serologically positive for CCPP. The highest prevalence of CCPP was among older animals (>4 y; 48.7%) raised in a flock size ≥200 (100%) having communal feeding and watering (28.2%). Based on logistic regression modeling (area under the curve, AUC = 0.89; 95% CI 0.86 to 0.91), communal feeding and watering showed the highest prevalence odds ratios (POR) of CCPP (POR = 3.7, 95% CI 1.9 to 7.3), followed by age (POR = 2.1, 95% CI 1.6 to 2.8) and flock size (POR = 1.1, 95% CI 1.0 to 1.2). However, the more accurate CART model (AUC = 0.92, 95% CI 0.90 to 0.95) showed that a flock size >100 animals is the most important risk factor (importance score = 8.9), followed by age >4 y (5.3) and communal feeding and watering (3.1). Our results strongly suggest that CCPP is most likely to be found in animals raised in a flock size >100 animals and with age >4 y having communal feeding and watering. Additionally, sheep seem to have an important role in CCPP epidemiology.
The CART data mining modeling showed better accuracy than the traditional logistic regression.
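As a small worked illustration of the quantities reported above, the sketch below computes the overall seroprevalence from the counts given in the abstract and a crude prevalence odds ratio from a 2x2 table. The 2x2 counts are hypothetical, and a crude ratio is a simplification: the PORs in the abstract come from multiple logistic regression, so they are adjusted for the other variables.

```python
def prevalence(positives, total):
    """Proportion of tested animals that are seropositive."""
    return positives / total

def prevalence_odds_ratio(exp_pos, exp_neg, unexp_pos, unexp_neg):
    """Crude POR: odds of disease among exposed over odds among unexposed."""
    return (exp_pos / exp_neg) / (unexp_pos / unexp_neg)

# Counts taken from the abstract: 124 seropositive samples out of 620 animals.
overall = prevalence(124, 620)

# Hypothetical 2x2 table (NOT from the study):
# exposed (communal feeding): 30 positive, 70 negative
# unexposed:                  10 positive, 90 negative
por = prevalence_odds_ratio(30, 70, 10, 90)
print(f"prevalence = {overall:.0%}, crude POR = {por:.2f}")
```

A POR above 1 means the odds of being seropositive are higher in the exposed group than in the unexposed group.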


Author(s):  
Oyinloye Oghenerukevwe Elohor ◽  
Adesoji Susan ◽  
Akinbohun Folake

The study aims to develop a text summarizer using clustering and anomaly detection with SVM classification. A text summarization approach is proposed which uses the SVM clustering algorithm. The proposed project can be used to summarize articles from fields as diverse as politics, sports, current affairs, and finance, as well as any other explanatory document. However, it does cause a trade-off between domain independence and a knowledge-based summary, which would provide data in a form more easily understandable to the user. A bundle of libraries and software was used to summarize alphanumeric input text. KEYWORDS: anomaly detection, SVM (support vector machine), clustering, text summarization, data mining


Author(s):  
Johannes Gehrke

It is the goal of classification and regression to build a data mining model that can be used for prediction. To construct such a model, we are given a set of training records, each having several attributes. These attributes can either be numerical (for example, age or salary) or categorical (for example, profession or gender). There is one distinguished attribute, the dependent attribute; the other attributes are called predictor attributes. If the dependent attribute is categorical, the problem is a classification problem. If the dependent attribute is numerical, the problem is a regression problem. It is the goal of classification and regression to construct a data mining model that predicts the (unknown) value for a record where the value of the dependent attribute is unknown. (We call such a record an unlabeled record.) Classification and regression have a wide range of applications, including scientific experiments, medical diagnosis, fraud detection, credit approval, and target marketing (Hand, 1997). Many classification and regression models have been proposed in the literature; among the more popular models are neural networks, genetic algorithms, Bayesian methods, linear and log-linear models and other statistical methods, decision tables, and tree-structured models, the focus of this chapter (Breiman, Friedman, Olshen, & Stone, 1984). Tree-structured models, so-called decision trees, are easy to understand; they are non-parametric and thus do not rely on assumptions about the data distribution, and they have fast construction methods even for large training datasets (Lim, Loh, & Shih, 2000). Most data mining suites include tools for classification and regression tree construction (Goebel & Gruenwald, 1999).
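To make the idea of a tree-structured classifier concrete, the following minimal sketch finds the single best binary split on one numerical predictor attribute, which is the basic step a decision tree repeats recursively. The Gini impurity used here is the splitting criterion associated with CART; the abstract itself does not name a criterion, so that choice is an assumption, and the records (age as predictor, credit approval as dependent attribute) are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'.

    The threshold is None if no split improves on the unsplit node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate two equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical training records: age (predictor) and approval (dependent).
ages = [22, 25, 31, 38, 45, 52]
approved = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(ages, approved)
print(threshold, impurity)
```

A full tree builder would apply `best_split` to every predictor attribute, keep the best one, and recurse on the two resulting subsets until a stopping rule is met; for a regression tree the impurity would be replaced by a measure such as the variance of the dependent attribute.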


2013 ◽  
Author(s):  
Srimoyee Bhattacharya ◽  
Marko Maucec ◽  
Jeffrey Marc Yarus ◽  
Dwight David Fulton ◽  
Jon Matthew Orth ◽  
...  
