Bank Deposit Prediction Using Ensemble Learning

2021 ◽  
pp. 42-51
Author(s):  
Muhammed J. A. Patwary ◽  
S. Akter ◽  
M. S. Bin Alam ◽  
A. N. M. Rezaul Karim

Bank deposits are a vital concern for any financial institution. Predicting whether a customer will become a depositor by analyzing related information is challenging. Some recent reports demonstrate that economic depression and the continuous decline of the economy negatively impact business organizations and the banking sector. During such a depression, banks struggle to attract customers' attention. Marketing is therefore a preferred tool for the banking sector to draw customers' attention to term deposits. The purpose of this paper is to study the performance of ensemble learning algorithms, a novel approach to predicting whether a new customer will take out a term deposit. Data from a Portuguese retail bank are used for our study, containing 45,211 phone contacts with 16 input attributes and one decision attribute. The data are preprocessed using discretization. 40,690 samples are used for training the classifiers, and 4,521 samples are used for testing. In this work, the performance of three widely used classification algorithms, Support Vector Machine (SVM), Neural Network (NN), and Naive Bayes (NB), is analyzed. Then the ability of ensemble methods to improve the efficiency of the basic classification algorithms is investigated and experimentally demonstrated. Experimental results show that the performance metrics of the Neural Network (Bagging) are higher than those of the other ensemble methods. Its accuracy, sensitivity, and specificity are 96.62%, 97.14%, and 99.08%, respectively. Although all input attributes are considered in the classification method, a final descriptive analysis shows that some input attributes are more important for this classification than others. Overall, ensemble methods are shown to outperform the traditional algorithms in this domain. We believe our contribution can be used as a depositor prediction system to provide additional support for bank deposit prediction.
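The bagging idea and the three reported metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation: the base learner `fit_stump`, the toy one-dimensional data, and all function names are assumptions made for illustration, whereas the paper bags SVM, NN, and NB classifiers on the bank data.

```python
import random
from collections import Counter

def fit_stump(sample):
    """Hypothetical base learner: threshold halfway between the class means.

    Assumes class 1 tends to have larger feature values than class 0.
    """
    zeros = [x for x, y in sample if y == 0]
    ones = [x for x, y in sample if y == 1]
    if not zeros or not ones:  # the resample happened to contain one class only
        label = 1 if ones else 0
        return lambda x: label
    threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: 1 if x >= threshold else 0

def bagging_fit(train, fit_base, n_models=11, seed=0):
    """Train n_models base learners, each on a bootstrap resample of train."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        resample = [rng.choice(train) for _ in train]  # sampling with replacement
        models.append(fit_base(resample))
    return models

def bagging_predict(models, x):
    """Combine the base learners by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Toy data (made up): small values belong to class 0, large values to class 1.
train = [(x, 0) for x in range(5)] + [(x, 1) for x in range(10, 15)]
models = bagging_fit(train, fit_stump)
```

Each base learner sees a slightly different resample, so the vote averages out the instability of any single model, which is the mechanism bagging relies on.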

2021 ◽  
Author(s):  
Jorge Cabrera Alvargonzalez ◽  
Ana Larranaga Janeiro ◽  
Sonia Perez ◽  
Javier Martinez Torres ◽  
Lucia Martinez Lamas ◽  
...  

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been and remains one of the major challenges humanity has faced thus far. Over the past few months, large amounts of information have been collected that are only now beginning to be assimilated. In the present work, the existence of residual information in the massive numbers of rRT-PCRs that tested positive out of the almost half a million tests that were performed during the pandemic is investigated. This residual information is believed to be highly related to a pattern in the number of cycles that are necessary to detect positive samples as such. Thus, a database of more than 20,000 positive samples was collected, and two supervised classification algorithms (a support vector machine and a neural network) were trained to temporally locate each sample based solely and exclusively on the number of cycles determined in the rRT-PCR of each individual. Finally, the results obtained from the classification show how the appearance of each wave is coincident with the surge of each of the variants present in the region of Galicia (Spain) during the development of the SARS-CoV-2 pandemic and clearly identified with the classification algorithm.


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-6
Author(s):  
Ruixia Yan ◽  
Zhijie Xia ◽  
Yanxi Xie ◽  
Xiaoli Wang ◽  
Zukang Song

Online product review text contains a large number of opinions and emotions. To identify the public’s emotional and tendentious information, we present reinforcement learning models and discuss sentiment classification algorithms for an online product review corpus. To explore the classification effect of different sentiment classification algorithms, we studied the Naive Bayesian, support vector machine, and neural network algorithms and compared them using a concrete example. The evaluation indexes of the three algorithms are compared across different sentence lengths and word vector dimensions. The results show that the neural network algorithm is effective in the sentiment classification of an online product review corpus.


2021 ◽  
Vol 8 (2) ◽  
pp. 311
Author(s):  
Mohammad Farid Naufal

Weather is an important factor considered in many decisions. Manual weather classification by humans is time-consuming and inconsistent. Computer vision is a branch of science that computers use to recognize or classify images. This can help develop autonomous machines that do not depend on an internet connection and can perform their own calculations in real time. There are several popular image classification algorithms, namely K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Convolutional Neural Network (CNN). KNN and SVM are machine learning classification algorithms, while CNN is a deep neural network classification algorithm. This study aims to compare the performance of these three algorithms so that the performance gap between them is known. The experiments use 5-fold cross-validation. Several parameters are used to configure the KNN, SVM, and CNN algorithms. In the experiments, CNN has the best performance, with an accuracy of 0.942, precision of 0.943, recall of 0.942, and F1 score of 0.942.
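The cross-validation setup mentioned in this abstract can be sketched as follows. This is a generic illustration rather than the study's code: the function name `k_fold_indices` is an assumption, and in practice the sample order would normally be shuffled before splitting.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs; every sample lands in exactly one test fold."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

# Example: 12 samples split into 5 folds of sizes 3, 3, 2, 2, 2.
folds = list(k_fold_indices(12, k=5))
```

A model is trained once per fold on `train_idx` and scored on `test_idx`; the reported metric is then typically the average over the folds.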


Author(s):  
Maria Morgan ◽  
Carla Blank ◽  
Raed Seetan

This paper investigates the capability of six existing classification algorithms (Artificial Neural Network, Naïve Bayes, k-Nearest Neighbor, Support Vector Machine, Decision Tree and Random Forest) in classifying and predicting diseases in soybean and mushroom datasets with numerical or categorical attributes. While many similar studies have been conducted on datasets of images to predict plant diseases, the main objective of this study is to suggest classification methods that can be used for disease classification and prediction in datasets that contain raw measurements instead of images. A fungus and a plant dataset, which had many differences, were chosen so that the findings in this paper could be applied to future research for disease prediction and classification in a variety of datasets which contain raw measurements. A key difference between the two datasets, other than one being a fungus and one being a plant, is that the mushroom dataset is balanced and contains only two classes, while the soybean dataset is imbalanced and contains eighteen classes. All six algorithms performed well on the mushroom dataset, while the Artificial Neural Network and k-Nearest Neighbor algorithms performed best on the soybean dataset. The findings of this paper can be applied to future research on disease classification and prediction in a variety of dataset types such as fungi, plants, humans, and animals.


Author(s):  
Donghyun Kim

In this paper, we propose methods for brain tumor detection in MRI images based on ensemble learning. We build upon prior research on ensemble methods by testing the concatenation of pre-trained models: features extracted via transfer learning are merged and segmented by classification algorithms or a stacked ensemble of those algorithms. The proposed approach achieved accuracy scores of 0.98, outperforming a benchmark VGG-16 model. Considerations regarding granular computing are given in the paper as well.


2020 ◽  
Vol 39 (5) ◽  
pp. 6073-6087
Author(s):  
Meltem Yontar ◽  
Özge Hüsniye Namli ◽  
Seda Yanik

Customer behavior prediction has recently been gaining importance in the banking sector, as in other sectors. This study aims to propose a model to predict whether credit card users will pay their debts or not. Using the proposed model, potential unpaid risks can be predicted and necessary actions can be taken in time. To predict customers’ payment status for the next month, we use Artificial Neural Network (ANN), Support Vector Machine (SVM), Classification and Regression Tree (CART) and C4.5, which are widely used artificial intelligence and decision tree algorithms. Our dataset includes 10,713 customer records obtained from a well-known bank in Taiwan. These records consist of customer information such as the amount of credit, gender, education level, marital status, age, past payment records, invoice amount and amount of credit card payments. We apply cross-validation and hold-out methods to divide our dataset into training and test sets. Then we evaluate the algorithms with the proposed performance metrics. We also optimize the parameters of the algorithms to improve prediction performance. The results show that the model built with the CART algorithm, one of the decision tree algorithms, provides high accuracy (about 86%) in predicting customers’ payment status for the next month. When the algorithm parameters are optimized, classification accuracy and performance increase.


Electronics ◽  
2019 ◽  
Vol 8 (2) ◽  
pp. 122 ◽  
Author(s):  
Maheen Zahid ◽  
Fahad Ahmed ◽  
Nadeem Javaid ◽  
Raza Abbasi ◽  
Hafiza Zainab Kazmi ◽  
...  

Short-Term Electricity Load Forecasting (STELF) through Data Analytics (DA) is an emerging and active research area. Forecasting electricity load and price reveals future trends and patterns of consumption. Losses occur in the generation and use of electricity, and multiple strategies are used to address them. Day-ahead electricity price and load forecasting is beneficial for both suppliers and consumers. In this paper, Deep Learning (DL) and data mining techniques are used for electricity load and price forecasting. XG-Boost (XGB), Decision Tree (DT), Recursive Feature Elimination (RFE) and Random Forest (RF) are used for feature selection and feature extraction. Enhanced Convolutional Neural Network (ECNN) and Enhanced Support Vector Regression (ESVR) are used as classifiers. Grid Search (GS) is used to tune the parameters of the classifiers to increase their performance. The risk of over-fitting is mitigated by adding multiple layers in ECNN. Finally, the proposed models are compared with different benchmark schemes for stability analysis. The performance metrics MSE, RMSE, MAE, and MAPE are used to evaluate the proposed models. The experimental results show that the proposed models outperformed the other benchmark schemes. ECNN performed well with a threshold of 0.08 for load forecasting, while ESVR performed better with a threshold of 0.15 for price forecasting. ECNN achieved almost 2% better accuracy than CNN. Furthermore, ESVR achieved almost 1% better accuracy than the existing scheme (SVR).
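The four error metrics named in this abstract follow standard definitions, sketched below in Python. The function name `regression_errors` and the sample load values are assumptions for illustration and are not taken from the paper.

```python
import math

def regression_errors(actual, predicted):
    """Return MSE, RMSE, MAE and MAPE (in percent) for paired forecasts.

    MAPE divides by the actual values, so it is undefined if any actual is zero.
    """
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape}

# Made-up load values and forecasts, for illustration only.
scores = regression_errors([100, 200, 400], [110, 190, 400])
```

MSE and RMSE penalize large errors more heavily than MAE, while MAPE expresses the error relative to the size of the actual value, which makes it comparable across load and price series.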


2019 ◽  
Vol 51 (1) ◽  
pp. 19-33 ◽  
Author(s):  
Jobin T. Philip ◽  
S. Thomas George

Brain-computer interfaces are sophisticated signal processing systems that operate directly on neuronal signals to identify specific human intents. These systems can be applied to overcome certain disabilities or to enhance the natural capabilities of human beings. The visual P300 mind-speller is prominent among them and has opened up tremendous possibilities in movement and communication applications. Today, many state-of-the-art visual P300 mind-speller implementations exist in the literature as a result of extensive research in this domain over the past two decades. Each of these systems can be evaluated in terms of performance metrics such as classification accuracy, information transfer rate, and processing time. The classification techniques associated with these systems, which include but are not limited to discriminant analysis, support vector machines, neural networks, distance-based classifiers, and ensembles of classifiers, play major roles in determining overall system performance. The motivation for this work is the need for a proper review of recent developments in visual P300 mind-spellers, with emphasis on their classification algorithms. This article comprises a brief introduction to P300, the concepts of visual P300 mind-spellers, and a survey of the literature with a special focus on classification algorithms, followed by a discussion of various challenges and future directions.


2018 ◽  
Vol 7 (3.6) ◽  
pp. 154
Author(s):  
S K. Sajan ◽  
M Germanus Alex

Breast cancer is a major threat that humans face irrespective of geographical limits. Awareness of breast cancer has increased during the last decade, and many preventive measures have been put into practice to detect breast cancer before symptoms are felt. Mammography is a screening methodology currently in practice. In this paper, the mammogram image is analyzed using an automated system. The automated system is designed to classify the mammogram image as normal or malignant. This process involves image enhancement and image segmentation at the preprocessing level. The histogram equalization technique is used to transform low-contrast regions of the mammogram into regions with higher contrast, and the Fuzzy C Means (FCM) algorithm is used to segment the mammogram image into regions suitable for further analysis. After enhancement and segmentation at the preprocessing level, classification is done using three classification algorithms: a decision tree classifier, a Neural Network classifier and a Support Vector Machine (SVM). The performance of the classification algorithms is evaluated using criteria such as speed, flexibility, robustness, scalability, interpretability and time complexity, and also based on accuracy, sensitivity and specificity. The results obtained in classification are compared with other classification algorithms. It is found that the neural network classifier produces better results than the other classifiers. The average diagnostic accuracy of the Neural Network classifier is around 91%. It is also found that the decision tree approach is more flexible and easier to use than the other approaches.
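The histogram equalization step described in this abstract can be sketched in Python using the standard cumulative-distribution mapping. The function name `equalize_histogram` and the flat list of 8-bit pixel values are assumptions; the paper does not give implementation details.

```python
def equalize_histogram(pixels, levels=256):
    """Remap gray levels so that their cumulative distribution becomes roughly uniform.

    pixels is a flat list of integers in the range 0 .. levels - 1.
    """
    if not pixels:
        return []
    n = len(pixels)
    # Count how often each gray level occurs.
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    # Cumulative distribution of gray levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    if n == cdf_min:  # constant image: there is no contrast to stretch
        return list(pixels)
    return [round((cdf[v] - cdf_min) * (levels - 1) / (n - cdf_min)) for v in pixels]

# Four low-contrast gray levels are spread over the full 0..255 range.
stretched = equalize_histogram([10, 20, 30, 40])
```

The mapping sends the darkest occupied level to 0 and the brightest to `levels - 1`, which is why a narrow band of gray values ends up spanning the whole intensity range.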


Author(s):  
Yu. M. Beketnova

The results of solving the problem of classifying credit organizations by their possible involvement in money laundering processes are presented. A comparative analysis of the results obtained using various modern classification algorithms is carried out. When analyzing credit institutions, Rosfinmonitoring analysts have to work with large amounts of information. The number of objects that need to be analyzed is many times greater than the capacity of the analysts. This situation requires prioritization of inspections. The heterogeneous nature of the information resources and their significant volume exclude the possibility of manual processing. It is necessary to move from successive expert examinations of individual objects to parallel, mass automated checks, taking into account modern methodological and instrumental possibilities in the context of the digital transformation of public administration. A comparative analysis is carried out of the results of processing data on the activities of credit organizations by classification methods: logistic regression, decision trees (the Two-Class Boosted Decision Forest and AdaBoost algorithms), the support vector method (the Two-Class Support Vector Machine algorithm), neural network methods (the Two-Class Neural Network algorithm), and Bayesian networks (the Two-Class Bayes Pointmachine algorithm). Of the classification algorithms considered, the most accurate results were shown by the Two-Class Boosted Decision Forest (AdaBoost) algorithm. The results obtained are of great practical importance and may allow Rosfinmonitoring analysts, as well as experts of the Bank of Russia, to identify deviant credit institutions potentially involved in money laundering processes.

