Implementasi Metode Naive Bayes Untuk Memprediksi Resiko Penyakit Jantung (Implementation of the Naive Bayes Method to Predict Heart Disease Risk)

2020 ◽  
Vol 4 (2) ◽  
pp. 66-70
Author(s):  
Firman Tempola

Deaths from heart disease continue to rise and affect young and old alike. The World Health Organization reports that 7.3 million people worldwide have died from heart disease, and heart disease is described as the number one deadliest disease. It is therefore important to identify the risk of heart disease by applying existing machine learning models. The aim of this study is to implement the Naive Bayes method to predict heart disease and to test the algorithm's performance by computing precision, recall, and accuracy. The criteria used in this study are age, sex, chest pain type, blood pressure, cholesterol, blood sugar level, electrocardiography, heart pressure, induced angina, old-peak, ST segment (segmen_st), fluoroscopy, and heart rate. Two classes are predicted: at risk and not at risk. The results show that the method successfully predicts, or classifies, patients as at risk or not at risk of heart disease with a precision of 90%, a recall of 100%, and an accuracy of 92.85%, which falls in the excellent classification category.
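The three metrics named in this abstract can be sketched from confusion-matrix counts. The abstract does not give the confusion matrix, so the counts below (9 true positives, 1 false positive, 0 false negatives, 4 true negatives) are hypothetical, chosen only because they are consistent with the reported figures; the function name is likewise an illustrative assumption.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute precision, recall and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)  # share of predicted "at risk" cases that are correct
    recall = tp / (tp + fn)     # share of actual "at risk" cases that are found
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # share of all cases classified correctly
    return {"precision": precision, "recall": recall, "accuracy": accuracy}


# Hypothetical counts (NOT taken from the paper), used only for illustration.
metrics = classification_metrics(tp=9, fp=1, fn=0, tn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these assumed counts the accuracy is 13/14, which agrees with the reported 92.85% up to rounding or truncation of the last digit.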

2020 ◽  
Vol 6 (2) ◽  
pp. 223-229
Author(s):  
Muhammad Dwison Alizah ◽  
Arifin Nugroho ◽  
Ummu Radiyah ◽  
Windu Gata

Abstract: Covid-19 has been declared a pandemic by the World Health Organization (WHO). Its very large impact and rapid spread are the reasons it was declared a pandemic and why mitigation efforts are needed. One measure that can be taken is a lockdown. The decision to impose a lockdown is intended to reduce the spread. A lockdown is certainly not a 100% good solution for everyone. Some people agree that a lockdown should be implemented, while others think it is better not to impose one, considering the negative impacts that can occur. This study therefore presents predictive modeling for sentiment analysis related to "lockdown", specifically on the social media platform Twitter. Tweets were labeled using Vader, features were then extracted using TF-IDF, and sentiment prediction models were built using Naïve Bayes and Support Vector Machine. The evaluation results obtained from the two algorithms exceed 80%. Keywords: Covid-19, lockdown, TF-IDF, Naïve Bayes, Support Vector Machine
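The TF-IDF extraction step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which TF-IDF variant or library was used, so the formula here (term frequency normalized by document length, multiplied by the unsmoothed log(N / df)) is an assumption, and the three short "tweets" are invented examples, not data from the study.

```python
import math
from collections import Counter


def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Return one {term: weight} mapping per tokenized document."""
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            # tf = count / document length; idf = log(N / df), no smoothing.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Invented example tweets (hypothetical, for illustration only).
tweets = [
    "lockdown keeps people safe".split(),
    "lockdown hurts small business".split(),
    "people support lockdown".split(),
]
vectors = tf_idf(tweets)
print(vectors[0])
```

Under this variant a term that appears in every document, such as "lockdown" here, receives weight zero, while rarer terms receive larger weights. Library implementations often use smoothed variants, so their numbers may differ from this sketch.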


2020 ◽  
Vol 17 (9) ◽  
pp. 3999-4002
Author(s):  
A. C. Bhavani ◽  
K. Aditya Shastry ◽  
K. Deepika ◽  
Nithya N. Shanbag ◽  
G. C. Akshatha

The World Health Organization (WHO) estimates that around 12 million people across the globe die each year from cardiovascular diseases. The risks associated with cardiovascular disease can be identified effectively using machine learning techniques. According to a survey, around 30% of patients experience no symptoms during a heart attack, yet the bloodstream carries distinctive indications of the attack for days. The medical diagnosis of a patient remains a complex task due to several factors. Accurate diagnosis of a patient's heart disease is critical, as it can save millions of lives. In this regard, automating medical diagnosis is significant. The goal of this work is to develop a system that predicts coronary artery disease in a patient with high accuracy using machine learning (ML) techniques. Several classifiers, namely Naïve Bayes (NB), Support Vector Machine (SVM), and Decision Tree (DT), were implemented to predict the disease. Extensive experiments demonstrated that Naïve Bayes outperformed DT and SVM in terms of accuracy, precision, F-measure, recall, and receiver operating characteristic (ROC) performance metrics.


2020 ◽  
Vol 6 (2) ◽  
pp. 1-9
Author(s):  
Annisa Putri Ayudhitama ◽  
Utomo Pujianto

The liver is one of the vital organs of the human body; it detoxifies, or neutralizes, toxins from everything that enters the body, keeping the body healthy. The liver can be attacked by diseases that impair this function; once liver disease sets in, toxins spread throughout the body and make it unhealthy. Liver disease is caused by viruses, alcohol, lifestyle, and other factors. According to WHO (World Health Organization) data, almost 1.2 million people per year, especially in Southeast Asia and Africa, die from liver disease. People are often unaware of liver disease, or learn of it too late, so that by the time they are examined it is already severe; it is better to treat it earlier by recognizing the symptoms. Data mining can make diagnosing liver disease easier, in particular by helping doctors determine whether a patient whose symptoms closely resemble liver disease actually has it. Diagnosis is carried out as a classification process whose output is whether or not the patient has liver disease. This study uses four data mining algorithms: Naïve Bayes, K-Nearest Neighbor (KNN), Decision Tree, and Neural Network. The dataset used is the Indian Liver Patient Dataset (ILPD) from the UCI Machine Learning Repository website. The four algorithms are compared to determine which has the best accuracy for diagnosing liver disease. The results show that Naïve Bayes has an accuracy of 55.75%, K-Nearest Neighbor 66.36%, Decision Tree 67.04%, and Neural Network 70.50%.
These accuracies are relatively low because the classes, or labels, are imbalanced: there are more liver patients than patients without liver disease, so many records are classified as liver patients. Keywords: Data Mining, Decision Tree, Classification, KNN, Liver, Naïve Bayes, Neural Network
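The class-imbalance point in this abstract can be made concrete with a majority-class baseline: the accuracy obtained by always predicting the most frequent label. The class counts below (416 liver patients, 167 non-liver patients) are an assumption taken from the public UCI description of ILPD; the abstract itself does not state them, and the study's evaluation split may differ.

```python
from collections import Counter


def majority_baseline_accuracy(labels: list[str]) -> float:
    """Accuracy of a classifier that always predicts the most frequent class."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


# Assumed class counts (from the UCI dataset description, not from the abstract).
labels = ["liver"] * 416 + ["no_liver"] * 167
baseline = majority_baseline_accuracy(labels)
print(f"majority-class baseline accuracy: {baseline:.2%}")
```

Under these assumed counts, a trivial classifier that labels every record "liver" would already be right on roughly seven records in ten, which is why accuracy alone can be a misleading yardstick on imbalanced data and is usually read alongside per-class precision and recall.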


Author(s):  
Muhammad Dwison Alizah ◽  
Arifin Nugroho ◽  
Ummu Radiyah ◽  
Windu Gata

Covid-19 has been declared a pandemic by the World Health Organization (WHO). One measure that can be taken in anticipation is a lockdown. This study presents predictive modeling for sentiment analysis related to “Lockdown” on the social media platform Twitter. Tweets were labeled using Vader, features were then extracted using TF-IDF, and sentiment prediction models were built using Naïve Bayes and Support Vector Machine. The results obtained from the two algorithms exceed 80%.


2021 ◽  
Vol 7 (1) ◽  
pp. 63-68
Author(s):  
Nurlaelatul Maulidah ◽  
Riki Supriyadi ◽  
Dwi Yuni Utami ◽  
Fuad Nur Hasan ◽  
Ahmad Fauzi ◽  
...  

Diabetes mellitus is a metabolic disease characterized by elevated blood sugar caused by disruption of the hormone insulin, which maintains the body's homeostasis by lowering blood sugar levels (American Diabetes Association, 2017). The World Health Organization (WHO) estimated that 422 million adults over 18 years of age had diabetes mellitus in 2014 (WHO, 2016:25). The prevalence of diabetes mellitus in Southeast Asia grew from 4.1% in 1980 to 8.6% in 2014. According to Ministry of Health research in 2018, the prevalence of diabetes in Indonesia was 2.0%, while in East Java Province it was 2.6% among residents over 15 years of age (KEMENKES RI, 2019). This study was developed by processing secondary data from a health database, the Diabetes Dataset taken from Kaggle, which can be accessed at https://www.kaggle.com/johndasilva/diabetes. The data consist of 2000 records with several medical predictor variables (Pregnancies, Glucose, BloodPressure, SkinThickness, Insulin, BMI, DiabetesPedigreeFunction, Age, and Outcome). The data are then processed using the Support Vector Machine and Naive Bayes methods to determine the accuracy of diabetes diagnosis. Based on the results, the Support Vector Machine method has a higher accuracy than the Naive Bayes method: 78.04% for the Support Vector Machine model versus 76.98% for Naive Bayes, a difference of 1.06 percentage points. It can therefore be concluded that applying the Support Vector Machine method yields better diabetes diagnosis accuracy than the Naive Bayes method.


2020 ◽  
Vol 4 (1) ◽  
pp. 76-85
Author(s):  
Dwi Yuni Utami ◽  
Elah Nurlelah ◽  
Noer Hikmah

Liver disease is an inflammatory disease of the liver that can leave the liver unable to function normally and can even cause death. According to WHO (World Health Organization) data, almost 1.2 million people per year, especially in Southeast Asia and Africa, die from liver disease. A common problem is the difficulty of recognizing liver disease early, even once the disease has spread. This study compares and evaluates the Naive Bayes algorithm against a Naive Bayes algorithm optimized with a Genetic Algorithm (GA) and Bagging, to find out which has the higher accuracy in predicting liver disease, using a dataset taken from the UCI (University of California, Irvine) Machine Learning Repository. Evaluation with both the confusion matrix and the ROC curve showed that Naive Bayes optimized with the Genetic Algorithm and Bagging has a higher accuracy than Naive Bayes alone. The accuracy of the Naive Bayes model is 66.66%, and the accuracy of the Naive Bayes model with attribute selection using the Genetic Algorithm and Bagging is 72.02%, a difference of 5.36 percentage points. Keywords: Liver Disease, Naïve Bayes, Genetic Algorithms, Bagging.


2020 ◽  
Vol 4 (3) ◽  
pp. 117
Author(s):  
Hardian Oktavianto ◽  
Rahman Puji Handri

Breast cancer is one of the leading causes of death among women; it ranks second after lung cancer. According to the World Health Organization, 1 million women are diagnosed with breast cancer every year and half of them die. This is generally due to delayed detection and slow treatment, so that the cancer is only detected after it has reached a late stage. In health and medicine, machine-learning-based classification has been used to help doctors and health professionals classify types of cancer and determine which treatment should be given. In this study, breast cancer classification is carried out using the Naive Bayes algorithm to group the types of cancer. The dataset used is from the Wisconsin breast cancer database. The results show that the Naive Bayes algorithm classifies breast cancer well: on average, 96.9% of the data are classified correctly and only 3.1% incorrectly. The effectiveness of classification with Naive Bayes is also high, with average precision and recall of around 0.96. The highest precision and recall, 0.974 and 0.973 respectively, are obtained when the test data use a percentage split of 40%.


2020 ◽  
Vol 2020 ◽  
pp. 1-17 ◽  
Author(s):  
Anita Ramachandran ◽  
Anupama Karuppiah

With advances in medicine and healthcare systems, the average life expectancy of human beings has increased to more than 80 years. As a result, the demographic old-age dependency ratio (people aged 65 or above relative to those aged 15–64) is expected to increase, by 2060, from ∼28% to ∼50% in the European Union and from ∼33% to ∼45% in Asia (Ageing Report European Economy, 2015). Therefore, the percentage of people who need additional care is also expected to increase. For instance, per studies conducted by the National Program for Health Care of the Elderly (NPHCE), the elderly population in India will increase to 12% of the national population by 2025, with 8%–10% requiring utmost care. Geriatric healthcare has gained a lot of prominence in recent years, with specific focus on fall detection systems (FDSs) because of their impact on public lives. According to a World Health Organization report, the frequency of falls increases with age and frailty. Older people living in nursing homes fall more often than those living in the community, and 40% of them experience recurrent falls (World Health Organization, 2007). Machine learning (ML) has found application in geriatric healthcare systems, especially in FDSs. In this paper, we examine the requirements of a typical FDS. We then present a survey of recent work in the area of fall detection systems, with a focus on the application of machine learning. We also analyze the challenges in FDSs based on the literature survey.


2021 ◽  
Vol 5 (1) ◽  
pp. 19-25
Author(s):  
Frizka Fitriana ◽  
Ema Utami ◽  
Hanif Al Fatta

The coronavirus outbreak, commonly referred to as COVID-19, has been officially designated a global pandemic by the World Health Organization (WHO). One appropriate step to minimize the impact of the virus is to develop a vaccine. However, vaccination of the Indonesian population has been controversial and has prompted many people to voice their opinions; because the spaces for expressing them are limited, people choose social media as a channel for public opinion. The Support Vector Machine algorithm performs better in terms of accuracy, precision, and recall, with values of 90.47%, 90.23%, and 90.78%, compared with 88.64%, 87.32%, and 88.13% for the Naive Bayes algorithm, a difference of 1.83 percentage points in accuracy, 2.91 in precision, and 2.65 in recall. In terms of time, the Naive Bayes algorithm performs better, at 8.1 seconds versus 11 seconds for the Support Vector Machine algorithm, a difference of 2.9 seconds. The sentiment analysis results are 8.76% neutral, 42.92% negative, and 48.32% positive for Naive Bayes, and 10.56% neutral, 41.28% negative, and 48.16% positive for SVM.


2021 ◽  
Vol 13 (4) ◽  
pp. 24-37
Author(s):  
Avijit Kumar Chaudhuri ◽  
◽  
Dilip K. Banerjee ◽  
Anirban Das

The World Health Organisation has declared breast cancer (BC) the most frequent affliction among women, accounting for 15 percent of all cancer deaths. Its accurate prediction is of utmost significance, as it not only prevents deaths but also avoids mistreatment. The conventional way of diagnosis includes estimating the tumor size as a sign of plausible cancer. Machine learning (ML) techniques have been shown to be effective at predicting disease. However, ML methods have been method-centric rather than dataset-centric. In this paper, the authors introduce a dataset-centric approach (DCA) that deploys a genetic algorithm (GA) to identify the features and an ensemble learning classifier to predict using the right features. AdaBoost is such an approach: it trains the model by assigning weights to individual records, rather than only experimenting with dataset splits and performing hyper-parameter optimization. The authors simulate the results by varying the base classifiers, i.e., logistic regression (LR), decision tree (DT), support vector machine (SVM), naive Bayes (NB), and random forest (RF), and by using 10-fold cross-validation with different training and testing splits of the dataset. The proposed DCA model with RF and 10-fold cross-validation demonstrated its potential with almost 100% performance in the classification results, which no prior research has reported. The DCA satisfies the underlying principles of data mining: the principle of parsimony, the principle of inclusion, the principle of discrimination, and the principle of optimality. This DCA is a democratic and unbiased ensemble approach, as it allows all features and methods to compete at the start but filters out the most reliable chain (of steps and combinations) that gives the highest accuracy. With fewer features and splits of 50-50, 66-34, and 10-fold cross-validation, the stacked model achieves 97% accuracy.
These values and the reduction of features improve upon prior research works. Further, the proposed classifier is compared with some state-of-the-art machine learning classifiers, namely random forest, naive Bayes, support vector machine with radial basis function kernel, and decision tree. To test the classifiers, different performance metrics have been employed (accuracy, detection rate, sensitivity, specificity, receiver operating characteristic, and area under the curve), along with statistical tests such as the Wilcoxon signed-rank test and kappa statistics, to check the strength of the proposed DCA classifier. Various splits of training and testing data (50–50%, 66–34%, 80–20%, and 10-fold cross-validation) have been incorporated in this research to test the credibility of the classification models in handling unbalanced data. Finally, the proposed DCA model demonstrated its potential with almost 100% performance in the classification results. The results have also been compared with other research on the same dataset, where the proposed classifiers were found to be best across all the performance dimensions.
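The 10-fold cross-validation used in this abstract can be sketched as an index split: each sample lands in the test fold exactly once and in the training set the other nine times. This is a minimal illustration, not the authors' procedure; the function name, the contiguous (unshuffled) folds, and the sample count of 25 are all assumptions made for the example.

```python
from typing import Iterator


def k_fold_indices(n_samples: int, k: int = 10) -> Iterator[tuple[list[int], list[int]]]:
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Spread the remainder so that fold sizes differ by at most one sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical dataset size, chosen only to show uneven fold sizes.
folds = list(k_fold_indices(n_samples=25, k=10))
for fold_number, (train, test) in enumerate(folds, start=1):
    print(f"fold {fold_number}: {len(train)} train, {len(test)} test")
```

In practice the records are usually shuffled (and often stratified by class) before splitting, which matters for unbalanced data; the sketch omits both steps to keep the mechanics visible.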

