Credit Card Fraud Detection using Random Forest Algorithm

Author(s):  
Gokula Krishnan. V
Author(s):  
Prof. Teena Varma ◽  
Mahesh Poojari ◽  
Jobin Joseph ◽  
Ainsley Cardozo

This research focuses mainly on detecting credit card fraud in the real world. Credit card data sets are first collected to build a qualified data set. Queries on the user's credit card are then provided to test the data set. Next, the random forest algorithm classifies the current data set using the already evaluated data set [1]. Finally, the accuracy of the results is optimised. A number of attributes are then processed so that the factors affecting fraud detection can be identified from the graphical model representation. The technique's efficiency is measured in terms of accuracy, flexibility, specificity, and precision. The results obtained with the Random Forest Algorithm proved much more effective.
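As an illustration of the evaluation step described above, the following minimal sketch computes accuracy, specificity, and precision from binary predictions. The labels are hypothetical and the helper names (`confusion_counts`, `evaluate`) are assumptions made for this example; they are not taken from the paper.

```python
def confusion_counts(y_true, y_pred):
    """Count outcomes for binary labels (1 = fraud, 0 = legitimate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarm
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate passed
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud missed
    return tp, fp, tn, fn


def ratio(numerator, denominator):
    """Return 0.0 instead of failing when a class is absent."""
    return numerator / denominator if denominator else 0.0


def evaluate(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "specificity": ratio(tn, tn + fp),  # share of legitimate transactions passed
        "precision": ratio(tp, tp + fp),    # share of fraud alerts that are real
    }


# Hypothetical labels, for illustration only.
y_true = [1, 1, 0, 0, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 1, 0, 1, 0, 0, 0]
scores = evaluate(y_true, y_pred)
print(scores)
```

In practice `y_pred` would come from the trained random forest; the sketch only covers the scoring step.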


Author(s):  
Upasana Mukherjee ◽  
Vandana Thakkar ◽  
Shawni Dutta ◽  
Utsab Mukherjee ◽  
Samir Kumar Bandyopadhyay

The growth of regularly generated data from many financial activities has significant implications for every corner of financial modelling. This study investigates how this continuously growing data can be used by means of an automated process. Such a process can be built with machine learning techniques that analyze the data and learn from it. Many important financial domains, such as credit card fraud detection, bankruptcy detection, loan default prediction, investment prediction, and marketing, can be modelled with machine learning methods. Among these techniques, this research considers both parametric and non-parametric methods. Two parametric models, Logistic Regression and Gaussian Naive Bayes, and two non-parametric models, Random Forest and Decision Tree, are implemented. All four models are applied to credit card fraud detection, bankruptcy detection, and loan default prediction. For each domain, the classifiers are compared and the best model is identified. The performance of each classifier in each domain is evaluated with metrics such as accuracy, F1-score, and mean squared error. For credit card fraud detection, the decision tree classifier performs best with an accuracy of 99.1%; for loan default prediction and bankruptcy detection, the random forest classifier gives the best accuracy of 97% and 96.84%, respectively.
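The three metrics named above are closely related when a classifier outputs hard 0/1 labels. The sketch below, which uses hypothetical labels and is not the paper's code, shows that mean squared error then equals the misclassification rate (1 minus accuracy), while the F1-score depends only on the fraud class.

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical labels, for illustration only (1 = fraud/default, 0 = normal).
y_true = [0, 0, 1, 1, 0, 1, 0, 0]
y_pred = [0, 1, 1, 0, 0, 1, 0, 0]

acc = accuracy(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
print(acc, mse, f1)
```

Because each squared difference between 0/1 labels is 1 for an error and 0 otherwise, accuracy and mean squared error carry the same information here; the F1-score adds a view that is not dominated by the majority class.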


2021 ◽  
Vol 23 (06) ◽  
pp. 318-344
Author(s):  
Amit Pundir ◽  
Rajesh Pandey

Financial fraud is a growing problem in the financial industry with far-reaching consequences, even though many detection techniques have been developed. Data quality management with data mining has been applied effectively to data sets to automate the analysis of massive amounts of complex information. Data mining has likewise played a notable role in identifying credit card fraud in online transactions. Credit card fraud detection, a data quality management problem considered under data mining, is challenging for two important reasons: first, the profiles of normal and fraudulent behaviour change frequently, and second, credit card fraud data is highly skewed. This paper examines the performance of Decision Trees, Logistic Regression, and Random Forest on highly skewed credit card fraud data. The dataset of credit card transactions is sourced from Kaggle (a publicly accessible dataset repository) and contains 284,807 transactions. The methods are applied both to the raw data values and to the data after preprocessing. Performance is assessed in terms of accuracy, sensitivity, specificity, precision, and recall. The results indicate optimal accuracies of 90.8%, 98.5%, and 99.1% for the decision tree, logistic regression, and random forest classifiers, respectively.
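To see why accuracy alone can mislead on skewed data of this kind, the following sketch scores a trivial model that never flags fraud. The total of 284,807 transactions comes from the abstract; the fraud count of 492 is not stated there and is an assumption based on the commonly reported figure for this Kaggle dataset.

```python
N_TOTAL = 284_807  # number of transactions, from the abstract
N_FRAUD = 492      # ASSUMPTION: not stated in the abstract

# Ground truth: 1 = fraud, 0 = legitimate.
y_true = [1] * N_FRAUD + [0] * (N_TOTAL - N_FRAUD)

# Trivial baseline that labels every transaction as legitimate.
y_pred = [0] * N_TOTAL

correct = sum(t == p for t, p in zip(y_true, y_pred))
caught = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)

baseline_accuracy = correct / N_TOTAL
baseline_sensitivity = caught / N_FRAUD  # also called recall

print(f"accuracy:    {baseline_accuracy:.4%}")
print(f"sensitivity: {baseline_sensitivity:.4%}")
```

Under that assumption the baseline exceeds 99.8% accuracy while catching no fraud at all, which is why sensitivity, specificity, precision, and recall are needed alongside accuracy.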


2021 ◽  
Vol 11 (1) ◽  
pp. 34-39
Author(s):  
Chenglong Li ◽  
Ning Ding ◽  
Haoyun Dong ◽  
Yiming Zhai ◽  
...  

With the development of e-commerce, credit card fraud is increasing, and the methods used to commit it are constantly evolving. Support Vector Machine, Logistic Regression, Random Forest, Naive Bayes, and other algorithms are often used to identify credit card fraud. However, current fraud detection technology is not accurate enough and may cause significant economic losses to cardholders and banks. This paper introduces a method that optimizes the support vector machine with the cuckoo search algorithm to improve its ability to identify credit card fraud. The cuckoo search algorithm improves classification performance by optimizing the parameters (C, g) of the support vector machine kernel function. The results demonstrate that CS-SVM is superior to SVM in Accuracy, Precision, Recall, F1-score, and AUC, and superior to Logistic Regression, Random Forest, Decision Tree, and Naive Bayes, whose accuracy is 98%.
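The parameter search described above can be sketched roughly as follows. This is a simplified cuckoo search in pure Python, not the authors' implementation: the objective `surrogate_error` is a hypothetical stand-in for the cross-validated SVM error at a given (log2 C, log2 g), and the search bounds, population size, and step scale are assumptions chosen for illustration.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """One heavy-tailed step length drawn with Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma = (num / den) ** (1 / beta)
    u = rng.gauss(0, sigma)
    v = rng.gauss(0, 1) or 1e-12  # guard against division by zero
    return u / abs(v) ** (1 / beta)


def cuckoo_search(objective, bounds, n_nests=15, n_iter=200, pa=0.25,
                  alpha=0.01, seed=0):
    """Minimise `objective` inside the box `bounds` (simplified variant)."""
    rng = random.Random(seed)

    def random_nest():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    nests = [random_nest() for _ in range(n_nests)]
    scores = [objective(nest) for nest in nests]
    history = []

    for _ in range(n_iter):
        for i in range(n_nests):
            # Levy flight from nest i, clipped to the search box.
            egg = [min(hi, max(lo, x + alpha * (hi - lo) * levy_step(rng)))
                   for x, (lo, hi) in zip(nests[i], bounds)]
            egg_score = objective(egg)
            j = rng.randrange(n_nests)  # random nest to challenge
            if egg_score < scores[j]:
                nests[j], scores[j] = egg, egg_score

        # Abandon the worst fraction `pa` of nests and rebuild them at random.
        worst_first = sorted(range(n_nests), key=scores.__getitem__, reverse=True)
        for k in worst_first[:int(pa * n_nests)]:
            nests[k] = random_nest()
            scores[k] = objective(nests[k])

        history.append(min(scores))

    best = scores.index(min(scores))
    return nests[best], scores[best], history


def surrogate_error(params):
    """HYPOTHETICAL objective standing in for cross-validated SVM error."""
    log_c, log_g = params
    return (log_c - 3.0) ** 2 + (log_g + 5.0) ** 2


# ASSUMED search box for (log2 C, log2 g).
BOUNDS = [(-5.0, 15.0), (-15.0, 3.0)]
best_params, best_error, history = cuckoo_search(surrogate_error, BOUNDS)
print(best_params, best_error)
```

To tune a real SVM, `surrogate_error` would be replaced by a function that trains the classifier with the candidate (C, g) and returns its validation error. Searching in log space is a common convention for these two parameters, not something the abstract specifies.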


Mathematics ◽  
2021 ◽  
Vol 9 (21) ◽  
pp. 2683
Author(s):  
Tzu-Hsuan Lin ◽  
Jehn-Ruey Jiang

This paper proposes a method, called autoencoder with probabilistic random forest (AE-PRF), for detecting credit card fraud. The proposed AE-PRF method first uses the autoencoder to extract low-dimensional features from the high-dimensional features of credit card transaction data. It then relies on the random forest, an ensemble learning mechanism using the bootstrap aggregating (bagging) concept, with probabilistic classification to classify data as fraudulent or normal. The credit card fraud detection (CCFD) dataset is applied to AE-PRF for performance evaluation and comparison. The CCFD dataset contains large numbers of credit card transactions of European cardholders; it is highly imbalanced, since its normal transactions far outnumber its fraudulent transactions. Data resampling schemes such as the synthetic minority oversampling technique (SMOTE), adaptive synthetic (ADASYN), and Tomek link (T-Link) are applied to the CCFD dataset to balance the numbers of normal and fraudulent transactions and so improve AE-PRF performance. Experimental results show that the performance of AE-PRF does not vary much whether or not resampling schemes are applied to the dataset. This indicates that AE-PRF is naturally suitable for dealing with imbalanced datasets. When compared with related methods, AE-PRF has relatively excellent performance in terms of accuracy, the true positive rate, the true negative rate, the Matthews correlation coefficient, and the area under the receiver operating characteristic curve.
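As a rough illustration of the resampling idea mentioned above, the sketch below implements the core SMOTE step: each synthetic minority sample is a random interpolation between a minority point and one of its nearest minority neighbours. It is a simplified variant (base points are drawn at random rather than visited in turn), the function name `smote_sketch` and the 2-D sample points are hypothetical, and it is not the paper's code.

```python
import random


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def smote_sketch(minority, n_synthetic, k=3, seed=0):
    """Create synthetic minority samples by neighbour interpolation.

    Requires at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: euclidean(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# HYPOTHETICAL fraud samples in a 2-D feature space, for illustration only.
fraud_points = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_sketch(fraud_points, n_synthetic=6)
for point in new_points:
    print(point)
```

Every synthetic point lies on a line segment between two real minority points, so the oversampled class stays inside the region the original fraud samples already occupy.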

