Algoritma Naïve Bayes Untuk Memprediksi Kredit Macet Pada Koperasi Simpan Pinjam (Naïve Bayes Algorithm for Predicting Bad Credit in Savings and Loan Cooperatives)

2019 ◽  
Vol 4 (2) ◽  
Author(s):  
Diah Puspitasari ◽  
Syifa Sintia Al Khautsar ◽  
Wida Prima Mustika

Cooperatives are institutions that can help people, especially small and medium-sized communities. They play an important role in the community's economic growth, for example by keeping the prices of basic commodities relatively low, and some cooperatives also offer savings and loan services. The problem faced by the cooperative studied here is that borrowers find it difficult to repay their loan installments, which results in bad credit. The cooperative carries out its credit analysis manually: the applicant fills out a loan application form and supplies the required documents, and the cooperative conducts a field survey. Lending decisions therefore need to be evaluated. To reduce these problems, data mining is used to identify the customer criteria that predict bad loans and to determine whether or not applicants are eligible for credit. The data mining technique used is classification with the Naive Bayes method. In testing, the resulting model achieved an accuracy of 59%, a sensitivity (true positive rate, or recall) of 46.80%, a specificity (true negative rate) of 69.81%, a positive predictive value (PPV, or precision) of 57.89%, and a negative predictive value (NPV) of 59.67%.
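The five figures quoted above are all ratios taken from a binary confusion matrix. The sketch below shows how they relate. The abstract does not give the underlying counts, so the values of `tp`, `fn`, `fp` and `tn` are hypothetical, chosen only because they reproduce the reported percentages to within rounding.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Return the standard rates derived from a 2x2 confusion matrix."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate, recall
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value, precision
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts (an assumption, not taken from the paper).
metrics = confusion_metrics(tp=22, fn=25, fp=16, tn=37)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Sensitivity and specificity are conditioned on the true class, while PPV and NPV are conditioned on the predicted class, which is why the two pairs can differ noticeably on the same matrix.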

Data mining usually refers to the discovery of specific patterns in, or the analysis of, data from a large dataset. Classification is an efficient data mining technique in which the classes that records are assigned to are predefined from existing datasets. Classifying medical records by their symptoms with computerized methods, and storing the predicted information in digital form, is of great importance in the diagnosis of various diseases. This paper concentrates on finding the algorithm with the highest accuracy so that a cost-effective algorithm can be identified. The data mining classification algorithms are compared on their accuracy in matching the diagnosis report and on their execution time, to identify how fast the records are classified. The classification algorithms used in this study are the Naive Bayes classifier, the C4.5 tree classifier and K-Nearest Neighbor (KNN), and the aim is to determine which is best suited to classifying medical datasets. The Breast Cancer, Iris and Hypothyroid datasets are used to determine which of the three algorithms classifies the data with the highest accuracy in finding the records of patients with particular health problems. The experimental results, presented as tables and graphs, show the performance and the importance of the Naïve Bayes, C4.5 and K-Nearest Neighbor algorithms. In these results, the C4.5 algorithm performs considerably better than the Naïve Bayes and K-Nearest Neighbor algorithms.


Author(s):  
T R Stella Mary ◽  
Shoney Sebastian

Data mining can be defined as a process of extracting unknown, verifiable and possibly helpful data from information. Heart disease is one of the primary causes of death around the globe, so a detailed analysis is carried out using data mining. Studies often limit themselves to a minimal set of attributes for predicting whether a patient has heart disease, and in doing so miss many important attributes that are main causes of heart disease. This research therefore aims to consider almost all the important features affecting heart disease and performs the analysis step by step, from a minimal to a maximal set of attributes, using data mining techniques to predict heart disease. The classification methods used are the Naïve Bayes classifier, Random Forest and Random Tree, applied to three datasets with different numbers of attributes but a common class label. The analysis shows a gradual increase in prediction accuracy as the number of attributes increases, irrespective of the classifier used, and the Naïve Bayes and Random Forest algorithms perform comparatively better on these datasets.


2020 ◽  
Vol 6 (1) ◽  
pp. 1
Author(s):  
Irkham Widhi Saputro ◽  
Bety Wulan Sari

AMIKOM Yogyakarta University is a higher-education institution that admits thousands of new students, especially in the Informatics study program. In 2012 there were 1009 new students, and in 2013 there were 859. Unfortunately, only around 50% of these students graduate on time. This data is used to build a classification system using data mining with the Naïve Bayes method. The dataset consists of 300 records drawn from alumni data for the 2012 and 2013 cohorts, with 150 records from each. It contains 144 students who graduated on time and 156 who did not. Testing is carried out using 10-Fold Cross Validation and a Confusion Matrix. The test results show that the Naïve Bayes model has an average accuracy of 68%, precision of 61.3%, recall of 65.3%, and f1-score of 61%. The model's performance can be influenced by the dataset used to build it.
Keywords: data mining, classification, Naïve Bayes, K-Fold Cross Validation, Confusion Matrix, graduation time
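The 10-fold cross validation mentioned above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the helper name `k_fold_indices`, the shuffling seed, and the absence of stratification by class are all assumptions. Only the record count (300) and the number of folds (10) come from the abstract.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]     # k disjoint folds
    for held_out, test_idx in enumerate(folds):
        train_idx = [
            idx
            for fold_no, fold in enumerate(folds)
            if fold_no != held_out
            for idx in fold
        ]
        yield train_idx, test_idx


# 300 records as in the abstract: each fold trains on 270 and tests on 30.
splits = list(k_fold_indices(300, k=10))
for train_idx, test_idx in splits[:2]:
    print(len(train_idx), len(test_idx))
```

Each record is used for testing exactly once, and the per-fold accuracy, precision, recall and f1-score are then averaged, which is how an "average performance" figure of this kind is normally obtained.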


In this proposed research we use data mining, an automated procedure for discovering interesting patterns in large data sets by grouping them and building comprehensible predictive models. Predicting a student's academic performance is crucial, especially for universities. Educational Data Mining (EDM) is an approach for extracting useful information that could affect an institution. Student performance today is influenced by many factors. This study evaluates numerous factors suspected of affecting a student's academic performance and builds a model that classifies and forecasts the student's learning outcomes. The aim of this research is to conduct a case study of the factors that influence students' academic achievement and to determine which have the greatest impact. We evaluate academic achievement on the basis of correctly and incorrectly classified instances using the Naive Bayes and Random Forest algorithms. The paper makes a comparative assessment of the Naive Bayes and Random Forest classifiers on student data to determine the better algorithm.


2010 ◽  
Vol 5 (1-2) ◽  
pp. 229-233
Author(s):  
György Hampel ◽  
Zoltán Fabulya ◽  
Elemérné Nagy

Using a simple data mining technique, Analyze Key Influencers in the Excel 2007 Data Mining Add-ins, we searched for relationships among the seat (county and town), the form of business, the main activity, the number of employees and the annual income of Hungarian companies. This technique uses the Naive Bayes algorithm. According to this method, the seat has no influencers. Most of the main activities have no influencers, but some (82 out of 495) are related to the other criteria, mainly to the form of business. The form of business (all 30 categories), the number of employees (17 of 18 categories) and the annual income (all 9 categories) are each other's key influencers. Cramér's measure of association was used to check the results of the data mining. The Cramér contingency coefficient gave similar results to the data mining, but also indicated that the strength of the association was at most moderate in all cases. The highest associations were between the annual income and the number of employees (0.46, moderate association), the main activity and the form of business (0.36, moderate association) and the annual income and the form of business (0.27, low association).
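For readers unfamiliar with the Cramér coefficient, the sketch below computes Cramér's V from a contingency table of observed counts, using the usual definition V = sqrt(chi² / (n · (min(rows, cols) − 1))). The function name `cramers_v` and the example table are hypothetical and are not the study's data. The sketch also assumes every row and column total is non-zero.

```python
import math


def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    smaller_dim = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (smaller_dim - 1)))


# Hypothetical 2x2 table: two categorical variables with a clear association.
example = [[30, 10],
           [10, 30]]
print(round(cramers_v(example), 2))
```

V ranges from 0 (no association) to 1 (perfect association), which is what allows values such as 0.27 and 0.46 to be labelled low or moderate.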


2021 ◽  
Vol In Press (In Press) ◽  
Author(s):  
Farshid Khorasani ◽  
Ramak Roohi poor ◽  
Afsar Dastjani Farahani ◽  
Azam Orooji ◽  
Mohammad Reza Zarkesh

Background: With advances in neonatal intensive care units (NICUs), more preterm infants survive, with a greater risk of retinopathy of prematurity (ROP). Objectives: The aim of the present study was to collect ROP risk factors and apply data mining techniques to propose a predictive model for treatment-requiring ROP. Methods: A cross-sectional study was carried out in an Iranian hospital (2014 - 2018). The study population consisted of 76 preterm neonates with an ROP diagnosis. Of these, retinopathy was treated in 35 cases, and the others had not received any treatment associated with retinopathy. A pre-set questionnaire was used to extract the risk factors leading to treatment-requiring retinopathy. Software models were then designed to predict treatment-requiring ROP. To compare the performance of the data mining methods, several performance metrics were used: accuracy, precision, sensitivity, specificity, and F-measure. Results: Seventy neonates with ROP entered the study. Among the four models, Naive Bayes had the best performance, with the highest accuracy (87.14%), precision (96.43%), sensitivity (77.14%) and F-measure (85.71%). The confusion matrix for the Naive Bayes classifier showed that the positive predictive value and negative predictive value were 0.7714 and 0.9714, respectively. Overall, 87.14% of all data were correctly classified. Moreover, of all the data mining techniques, the decision tree model gave the most understandable findings, as follows: if oxygen therapy continues for more than 16 days or blood infusion is > 6 units of packed cells, then the patient needs treatment. Conclusions: The results of the present study demonstrate that data mining techniques could be effectively implemented in ROP screening programs.
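The decision-tree finding quoted in the abstract is a single two-condition rule, which can be written out directly. The sketch below only illustrates how such a rule reads as code. The function and parameter names are hypothetical, and only the two thresholds (16 days, 6 units) come from the abstract.

```python
def needs_rop_treatment(oxygen_therapy_days, packed_cell_units):
    """Apply the rule quoted in the abstract.

    Treatment is indicated if oxygen therapy lasted more than 16 days
    OR more than 6 units of packed cells were transfused.
    """
    return oxygen_therapy_days > 16 or packed_cell_units > 6


print(needs_rop_treatment(oxygen_therapy_days=20, packed_cell_units=2))
print(needs_rop_treatment(oxygen_therapy_days=10, packed_cell_units=3))
```

A rule of this form is what makes tree models easy to inspect, whereas the Naive Bayes model is summarized by its performance figures rather than by readable conditions.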


Author(s):  
Gusti Made Trisetya Putra ◽  
Muhammad Rusli

In the modern era, soccer is considered entertainment and has even become an industry that can bring great profit to club owners. One of the most important factors in building a team is the development of young players. The right youth development method can be very helpful in establishing a good team. A professional team must have a coach for both the first team and the junior team. One of a coach's duties is to determine the right position for each player in the game, and it can be hard for a coach to make the right decision. This research discusses how to design a decision support system for determining a player's position using the naive Bayes technique. Naive Bayes is used to predict a player's position based on the player's skill test results. The results indicate that a decision support system using data mining with naive Bayes can help coaches determine player positions, especially in youth player development, so that they can make the right decision effectively and efficiently.
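To make the technique concrete, here is a minimal sketch of a categorical Naive Bayes classifier with add-one (Laplace) smoothing, applied to position prediction. It is not the system described in the abstract. The class name, the three skill features (speed, passing, tackling), their "high"/"low" levels and the six training rows are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Naive Bayes over categorical features with add-one smoothing."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        # value_counts[label][feature_index][value] -> number of occurrences
        self.value_counts = defaultdict(lambda: defaultdict(Counter))
        self.feature_values = [set() for _ in rows[0]]
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.value_counts[label][i][value] += 1
                self.feature_values[i].add(value)
        return self

    def log_scores(self, row):
        """Log of prior times smoothed per-feature likelihoods, per class."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            score = math.log(count / total)
            for i, value in enumerate(row):
                seen = self.value_counts[label][i][value]
                n_values = len(self.feature_values[i])
                score += math.log((seen + 1) / (count + n_values))
            scores[label] = score
        return scores

    def predict(self, row):
        scores = self.log_scores(row)
        return max(scores, key=scores.get)


# Hypothetical skill-test results: (speed, passing, tackling).
rows = [
    ("high", "low", "low"), ("high", "high", "low"),
    ("low", "high", "low"), ("low", "high", "high"),
    ("low", "low", "high"), ("high", "low", "high"),
]
labels = ["forward", "forward", "midfielder", "midfielder",
          "defender", "defender"]

model = CategoricalNaiveBayes().fit(rows, labels)
print(model.predict(("high", "low", "low")))
```

Working in log space avoids numerical underflow when many features are multiplied together, and the smoothing keeps a feature value never seen with a class from forcing that class's probability to zero.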

