Recommender System for Term Deposit Likelihood Prediction using Cross-validated Neural Network

For enhancing the maximized profit from bank as well as customer perspective, term deposit can accelerate finance fields. This paper focuses on likelihood of term deposit subscription taken by the customers. Bank campaign efforts and customer details are influential while considering possibilities of taking term deposit subscription. An automated system is provided in this paper that approaches towards prediction of term deposit investment possibilities in advance. Neural network along with stratified 10-fold cross-validation methodology is proposed as predictive model which is later compared with other benchmark classifiers such as k-Nearest Neighbor (k-NN), Decision tree classifier (DT), and Multi-layer perceptron classifier (MLP). Experimental study concluded that proposed model provides significant prediction results over other baseline models with an accuracy of 88.32% and MSE of 0.1168.

Download Full-text

Recommender System for Term Deposit Likelihood Prediction Using Cross-Validated Neural Network

10.20944/preprints202006.0360.v1 ◽

2020 ◽

Author(s):

Shawni Dutta ◽

Samir Kumar Bandyopadhyay

Keyword(s):

Neural Network ◽

Nearest Neighbor ◽

Mean Squared Error ◽

Automated System ◽

K Nearest Neighbor ◽

Decision Tree Classifier ◽

Squared Error ◽

Proposed Model ◽

Tree Classifier ◽

Customer Perspective

For enhancing the maximized profit from bank as well as customer perspective, term deposit can accelerate finance fields. This paper focuses on likelihood of term deposit subscription taken by the customers. Bank campaign efforts and customer details are influential while considering possibilities of taking term deposit subscription. An automated system is provided in this paper that approaches towards prediction of term deposit investment possibilities in advance. Neural network(NN) along with stratified 10-fold cross-validation methodology is proposed as predictive model which is later compared with other benchmark classifiers such as k-Nearest Neighbor (k-NN), Decision tree classifier (DT), and Multi-layer perceptron classifier (MLP). Experimental study concluded that proposed model provides significant prediction results over other baseline models with an accuracy of 88.32% and Mean Squared Error (MSE) of 0.1168.

Download Full-text

Applying Convolutional-GRU for Term Deposit Likelihood Prediction

10.20944/preprints202007.0101.v1 ◽

2020 ◽

Author(s):

Shawni Dutta ◽

Samir Bandyopadhyay

Keyword(s):

Predictive Model ◽

Nearest Neighbor ◽

Automated System ◽

K Nearest Neighbor ◽

Decision Tree Classifier ◽

Saving Account ◽

Proposed Model ◽

Tree Classifier ◽

Customer Perspective ◽

Gated Recurrent Unit

Banks are normally offered two kinds of deposit accounts. It consists of deposits like current/saving account and term deposits like fixed or recurring deposits. For enhancing the maximized profit from bank as well as customer perspective, term deposit can accelerate uplifting of finance fields. This paper focuses on likelihood of term deposit subscription taken by the customers. Bank campaign efforts and customer detail analysis can influence term deposit subscription chances. An automated system is approached in this paper that works towards prediction of term deposit investment possibilities in advance. This paper proposes deep learning based hybrid model that stacks Convolutional layers and Recurrent Neural Network (RNN) layers as predictive model. For RNN, Gated Recurrent Unit (GRU) is employed. The proposed predictive model is later compared with other benchmark classifiers such as k-Nearest Neighbor (k-NN), Decision tree classifier (DT), and Multi-layer perceptron classifier (MLP). Experimental study concludes that proposed model attains an accuracy of 89.59% and MSE of 0.1041 which outperform well other baseline models.

Download Full-text

Applying Convolutional-GRU for Term Deposit Likelihood Prediction

International Journal of Engineering and Management Research ◽

10.31033/ijemr.11.3.42 ◽

2021 ◽

Vol 11 (3) ◽

Author(s):

Shawni Dutta ◽

Payal Bose ◽

Vishal Goyal ◽

Samir Kumar Bandyopadhyay

Keyword(s):

Predictive Model ◽

Nearest Neighbor ◽

Automated System ◽

K Nearest Neighbor ◽

Decision Tree Classifier ◽

Saving Account ◽

Proposed Model ◽

Tree Classifier ◽

Customer Perspective ◽

Gated Recurrent Unit

Banks are normally offered two kinds of deposit accounts. It consists of deposits like current/saving account and term deposits like fixed or recurring deposits.For enhancing the maximized profit from bank as well as customer perspective, term deposit can accelerate uplifting of finance fields. This paper focuses on likelihood of term deposit subscription taken by the customers. Bank campaign efforts and customer detail analysis caninfluence term deposit subscription chances. An automated system is approached in this paper that works towards prediction of term deposit investment possibilities in advance. This paper proposes deep learning based hybrid model that stacks Convolutional layers and Recurrent Neural Network (RNN) layers as predictive model. For RNN, Gated Recurrent Unit (GRU) is employed. The proposed predictive model is later compared with other benchmark classifiers such as k-Nearest Neighbor (k-NN), Decision tree classifier (DT), and Multi-layer perceptron classifier (MLP). Experimental study concludesthat proposed model attainsan accuracy of 89.59% and MSE of 0.1041 which outperform wellother baseline models.

Download Full-text

A hybrid cost-sensitive ensemble for heart disease prediction

10.21203/rs.2.22946/v1 ◽

2020 ◽

Author(s):

Zhenya Qi ◽

Zuoru Zhang

Keyword(s):

Heart Disease ◽

Cross Validation ◽

Nearest Neighbor ◽

Support Vector ◽

K Nearest Neighbor ◽

Misclassification Cost ◽

Proposed Model ◽

Learning Machine ◽

Fold Cross Validation ◽

Very High

Abstract Heart disease is the primary cause of morbidity and mortality in the world. It includes numerous problems and symptoms. The diagnosis of heart disease is difficult because there are too many factors to analyze. What's more, the misclassification cost could be very high. In this paper, I firstly propose a cost-sensitive ensemble model to improve the accuracy of diagnosis and reduce the misclassification cost. The proposed model contains five heterogeneous classifiers: random forest, logistic regression, support vector machine, extreme learning machine and k-nearest neighbor. Then, experiments are done on three datasets from UCI machine learning repository. The highest classification accuracy of 91.74%, highest G-mean of 90.55%, highest precision of 96.11%, highest recall of 89.61% and lowest misclassification cost of 30.32% are achieved by the proposed model according to ten-fold cross validation. The results demonstrate that the performance of the proposed model is superior to those of previously reported classification techniques.

Download Full-text

PREDIKSI TINGKAT KESUKSESAN PROMOSI BANK DENGAN ALGORITMA DNN

Jurnal Informatika ◽

10.30873/ji.v21i1.2866 ◽

2021 ◽

Vol 21 (1) ◽

pp. 23-33

Author(s):

Oscar Oscar ◽

Nurlaelatul Maulidah ◽

Annida Purnamawati ◽

Destiana Putri ◽

Hilman F Pardede

Keyword(s):

Neural Network ◽

Random Forest ◽

Success Rate ◽

Deep Neural Network ◽

Cross Validation ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Bank Marketing ◽

Fold Cross Validation ◽

Better Than

Telemarketing is one effective way for promoting products. However, it is often difficult to measure the success of telemarketing. Therefore, a way to predict the success rate of telemarketing, and hence strategies could be planned to increase the success rate. In this study, we evaluate several implementations of machine learning for prediction the success of telemarketing. The evaluated methods are Deep Neural Network (DNN), Random Forest, and K-nearest neighbor (K-NN). We validate our experiments using 10-fold cross validation and our experiments show that DNN with 3 hidden layers outperforms other methods. Accuracy of 90% is achieved with the DNN. It is better than Random Forest and KNN that achieve accuracies of algorithm and 88% and 89%.Keywords— Bank Marketing, DNN, KNN, Random Forest.

Download Full-text

KLASIFIKASI STATUS PEMBAYARAN PREMI MENGGUNAKAN ALGORITMA NEIGHBOR WEIGHTED K-NEAREST NEIGHBOR (NWKNN) (STUDI KASUS: PT. BUMIPUTERA KOTA SAMARINDA)

VARIANCE : Journal of Statistics and Its Applications ◽

10.30598/variancevol1iss2page56-63 ◽

2020 ◽

Vol 1 (2) ◽

pp. 56-63

Author(s):

Grassella Gunsyang ◽

Ika Purnamasari ◽

Fidia Deny Tisna Amijaya

Keyword(s):

Cross Validation ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Fold Cross Validation

Algoritma Neighbor Weighted K-Nearest Neighbor (NWKNN) merupakan pengembangan dari algoritma K-Nearest Neighbor (KNN), dengan memberikan bobot pada setiap kelas yang akan diklasifikasikan. Penelitian ini membahas tentang klasifikasi menggunakan algoritma NWKNN yang diaplikasikan pada data status pembayaran premi. Tujuannya untuk mengetahui nilai eksponen (E) dan nilai ketetanggaan (K) yang optimal, serta nilai akurasi dari klasifikasi data status pembayaran Premi di PT. Bumiputera Kota Samarinda. Tahapan dalam penelitian ini yaitu menentukan nilai E dan nilai K menggunakan k-fold cross validation, menghitung jarak euclidean, menghitung bobot dan skor setiap kelas, melihat nilai skor terbesar untuk menentukan hasil klasifikasi, kemudian menghitung nilai akurasi klasifikasi. Hasil penelitian menunjukkan bahwa nilai K dan nilai E yang optimal untuk klasifikasi status pembayaran premi di PT. Bumiputera Kota Samarinda menggunakan NWKNN sebesar K=3 dan E=6 dengan nilai akurasi sebesar 75%.

Download Full-text

Phishing Website Detection Using Machine Learning Classifiers Optimized by Feature Selection

Traitement du signal ◽

10.18280/ts.370403 ◽

2020 ◽

Vol 37 (4) ◽

pp. 563-569

Author(s):

Dželila Mehanović ◽

Jasmin Kevrić

Keyword(s):

Feature Selection ◽

Random Forest ◽

Cross Validation ◽

Nearest Neighbor ◽

Security Threats ◽

Selection Methods ◽

K Nearest Neighbor ◽

Machine Learning Classifiers ◽

Time To Build ◽

Fold Cross Validation

Security is one of the most actual topics in the online world. Lists of security threats are constantly updated. One of those threats are phishing websites. In this work, we address the problem of phishing websites classification. Three classifiers were used: K-Nearest Neighbor, Decision Tree and Random Forest with the feature selection methods from Weka. Achieved accuracy was 100% and number of features was decreased to seven. Moreover, when we decreased the number of features, we decreased time to build models too. Time for Random Forest was decreased from the initial 2.88s and 3.05s for percentage split and 10-fold cross validation to 0.02s and 0.16s respectively.

Download Full-text

Perbandingan Akurasi dan Waktu Proses Algoritma K-NN dan SVM dalam Analisis Sentimen Twitter

Jurnal Informatika ◽

10.31311/ji.v6i2.5129 ◽

2019 ◽

Vol 6 (2) ◽

pp. 226-235

Author(s):

Muhammad Rangga Aziz Nasution ◽

Mardhiya Hayaty

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Unsupervised Learning ◽

Supervised Learning ◽

Cross Validation ◽

Nearest Neighbor ◽

Support Vector ◽

K Nearest Neighbor ◽

Fold Cross Validation

Salah satu cabang ilmu komputer yaitu pembelajaran mesin (machine learning) menjadi tren dalam beberapa waktu terakhir. Pembelajaran mesin bekerja dengan memanfaatkan data dan algoritma untuk membuat model dengan pola dari kumpulan data tersebut. Selain itu, pembelajaran mesin juga mempelajari bagaimama model yang telah dibuat dapat memprediksi keluaran (output) berdasarkan pola yang ada. Terdapat dua jenis metode pembelajaran mesin yang dapat digunakan untuk analisis sentimen: supervised learning dan unsupervised learning. Penelitian ini akan membandingkan dua algoritma klasifikasi yang termasuk dari supervised learning: algoritma K-Nearest Neighbor dan Support Vector Machine, dengan cara membuat model dari masing-masing algoritma dengan objek teks sentimen. Perbandingan dilakukan untuk mengetahui algoritma mana lebih baik dalam segi akurasi dan waktu proses. Hasil pada perhitungan akurasi menunjukkan bahwa metode Support Vector Machine lebih unggul dengan nilai 89,70% tanpa K-Fold Cross Validation dan 88,76% dengan K-Fold Cross Validation. Sedangkan pada perhitungan waktu proses metode K-Nearest Neighbor lebih unggul dengan waktu proses 0.0160s tanpa K-Fold Cross Validation dan 0.1505s dengan K-Fold Cross Validation.

Download Full-text

Analisis Komparatif Evaluasi Performa Algoritma Klasifikasi pada Readmisi Pasien Diabetes

Jurnal Buana Informatika ◽

10.24002/jbi.v7i4.770 ◽

2016 ◽

Vol 7 (4) ◽

Author(s):

Mochammad Yusa ◽

Ema Utami ◽

Emha T. Luthfi

Keyword(s):

Data Mining ◽

Decision Tree ◽

Cross Validation ◽

Nearest Neighbor ◽

Naive Bayes ◽

Kappa Statistic ◽

Naïve Bayes ◽

Validation Dataset ◽

K Nearest Neighbor ◽

Fold Cross Validation

Abstract. Readmission is associated with quality measures on patients in hospitals. Different attributes related to diabetic patients such as medication, ethnicity, race, lifestyle, age, and others result in the calculation of quality care that tends to be complicated. Classification techniques of data mining can solve this problem. In this paper, the evaluation on three different classifiers, i.e. Decision Tree, k-Nearest Neighbor (k-NN), dan Naive Bayes with various settingparameter, is developed by using 10-Fold Cross Validation technique. The targets of parameter performance evaluated is based on term of Accuracy, Mean Absolute Error (MAE), dan Kappa Statistic. The selected dataset consists of 47 attributes and 49.735 records. The result shows that k-NN classifier with k=100 has a better performance in term of accuracy and Kappa Statistic, but Naive Bayes outperforms in term of MAE among other classifiers. Keywords: k-NN, naive bayes, diabetes, readmissionAbstrak. Proses Readmisi dikaitkan dengan perhitungan kualitas penanganan pasien di rumah sakit. Perbedaan atribut-atribut yang berhubungan dengan pasien diabetes proses medikasi, etnis, ras, gaya hidup, umur, dan lain-lain, mengakibatkan perhitungan kualitas cenderung rumit. Teknik klasifikasi data mining dapat menjadi solusi dalam perhitungan kualitas ini. Teknik klasifikasi merupakan salah satu teknik data mining yang perkembangannya cukup signifikan. Di dalam penelitian ini, model algoritma klasifikasi Decision Tree, k-Nearest Neighbor (k-NN), dan Naive Bayes dengan berbagai parameter setting akan dievaluasi performanya berdasarkan nilai performa Accuracy, Mean AbsoluteError (MAE), dan Kappa Statistik dengan metode 10-Fold Cross Validation. Dataset yang dievaluasi memiliki 47 atribut dengan 49.735 records. Hasil penelitian menunjukan bahwa performa accuracy, MAE, dan Kappa Statistik terbaik didapatkan dari Model Algoritma Naive Bayes.Kata Kunci: k-NN, naive bayes, diabetes, readmisi

Download Full-text

Analisis Perbandingan Algoritma Klasifikasi Citra Chest X-ray Untuk Deteksi Covid-19

Teknika ◽

10.34148/teknika.v10i2.331 ◽

2021 ◽

Vol 10 (2) ◽

pp. 96-103

Author(s):

Mohammad Farid Naufal ◽

Selvia Ferdiana Kusuma ◽

Kevin Christian Tanus ◽

Raynaldy Valentino Sukiwun ◽

Joseph Kristiano ◽

...

Keyword(s):

Neural Network ◽

Support Vector Machine ◽

Cross Validation ◽

Nearest Neighbor ◽

Nearest Neighbors ◽

Support Vector ◽

K Nearest Neighbor ◽

K Nearest Neighbors ◽

X Ray ◽

Chest X Ray

Kondisi pandemi global Covid-19 yang muncul diakhir tahun 2019 telah menjadi permasalahan utama seluruh negara di dunia. Covid-19 merupakan virus yang menyerang organ paru-paru dan dapat mengakibatkan kematian. Pasien Covid-19 banyak yang telah dirawat di rumah sakit sehingga terdapat data citra chest X-ray paru-paru pasien yang terjangkit Covid-19. Saat ini sudah banyak peneltian yang melakukan klasifikasi citra chest X-ray menggunakan Convolutional Neural Network (CNN) untuk membedakan paru-paru sehat, terinfeksi covid-19, dan penyakit paru-paru lainnya, namun belum ada penelitian yang mencoba membandingkan performa algoritma CNN dan machine learning klasik seperti Support Vector Machine (SVM), dan K-Nearest Neighbor (KNN) untuk mengetahui gap performa dan waktu eksekusi yang dibutuhkan. Penelitian ini bertujuan untuk membandingkan performa dan waktu eksekusi algoritma klasifikasi K-Nearest Neighbors (KNN), Support Vector Machine (SVM), dan CNN untuk mendeteksi Covid-19 berdasarkan citra chest X-Ray. Berdasarkan hasil pengujian menggunakan 5 Cross Validation, CNN merupakan algoritma yang memiliki rata-rata performa terbaik yaitu akurasi 0,9591, precision 0,9592, recall 0,9591, dan F1 Score 0,959 dengan waktu eksekusi rata-rata sebesar 3102,562 detik.

Download Full-text