FraudMiner: A Novel Credit Card Fraud Detection Model Based on Frequent Itemset Mining

2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
K. R. Seeja ◽  
Masoumeh Zareapoor

This paper proposes an intelligent credit card fraud detection model for detecting fraud in highly imbalanced and anonymous credit card transaction datasets. The class imbalance problem is handled by finding legal as well as fraud transaction patterns for each customer using frequent itemset mining. A matching algorithm is also proposed to determine which pattern (legal or fraud) the incoming transaction of a particular customer is closer to, and a decision is made accordingly. To handle the anonymous nature of the data, no preference is given to any attribute, and all attributes are weighted equally when finding the patterns. The proposed model is evaluated on the UCSD Data Mining Contest 2009 Dataset (anonymous and imbalanced) and is found to have a very high fraud detection rate, balanced classification rate, and Matthews correlation coefficient, and a far lower false alarm rate than other state-of-the-art classifiers.
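The abstract does not spell out the mining or matching steps, so the following is only a minimal sketch of the general idea under stated assumptions: each transaction is treated as a set of (attribute, value) items, the pattern for a class is taken to be the largest itemset that meets a support threshold, and an incoming transaction is assigned to whichever pattern it shares more items with. The function names, the brute-force search, the `min_support` value, the toy attribute names, and the rule that ties resolve to "legal" are all assumptions made for illustration, not details taken from the paper.

```python
from itertools import combinations


def largest_frequent_itemset(transactions, min_support):
    """Return the largest set of (attribute, value) items present in at least
    a min_support fraction of the transactions.

    Brute-force search, so only practical for a handful of attributes.
    Every attribute is treated equally; none is given extra weight.
    """
    if not transactions:
        return frozenset()
    item_sets = [frozenset(t.items()) for t in transactions]
    needed = min_support * len(item_sets)
    candidates = sorted(set().union(*item_sets), key=repr)
    max_size = max(len(s) for s in item_sets)
    # Try the biggest itemsets first and stop at the first frequent one.
    for size in range(max_size, 0, -1):
        for combo in combinations(candidates, size):
            combo_set = frozenset(combo)
            if sum(combo_set <= s for s in item_sets) >= needed:
                return combo_set
    return frozenset()


def classify(transaction, legal_pattern, fraud_pattern):
    """Label a transaction by the pattern it shares more items with.

    Ties go to "legal"; that tie rule is an assumption of this sketch.
    """
    items = set(transaction.items())
    legal_hits = len(items & legal_pattern)
    fraud_hits = len(items & fraud_pattern)
    return "fraud" if fraud_hits > legal_hits else "legal"


# Toy history for one hypothetical customer (anonymous attribute names).
legal_history = [
    {"amount": "low", "hour": "day", "field1": "a"},
    {"amount": "low", "hour": "day", "field1": "b"},
    {"amount": "low", "hour": "day", "field1": "c"},
]
fraud_history = [
    {"amount": "high", "hour": "night", "field1": "a"},
    {"amount": "high", "hour": "night", "field1": "b"},
]

legal_pattern = largest_frequent_itemset(legal_history, min_support=0.6)
fraud_pattern = largest_frequent_itemset(fraud_history, min_support=0.6)

incoming = {"amount": "high", "hour": "night", "field1": "c"}
label = classify(incoming, legal_pattern, fraud_pattern)
```

Because the patterns are mined separately for each class, the small number of fraud transactions does not have to compete with the much larger legal set, which is one way to read the abstract's claim about handling class imbalance.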

Author(s):  
Upasana Mukherjee ◽  
Vandana Thakkar ◽  
Shawni Dutta ◽  
Utsab Mukherjee ◽  
Samir Kumar Bandyopadhyay

The growth of regularly generated data from many financial activities has significant implications for every corner of financial modelling. This study investigates the use of this continuously growing data by means of an automated process. Such a process can be built with machine learning techniques that analyze the data and learn from it. Important financial domains such as credit card fraud detection, bankruptcy detection, loan default prediction, investment prediction, marketing, and many more can be modelled with machine learning methods. Among the many machine learning techniques, this research considers both parametric and non-parametric methods. Two parametric models, Logistic Regression and Gaussian Naive Bayes, and two non-parametric models, Random Forest and Decision Tree, are implemented in this paper. All of these models are applied to credit card fraud detection, bankruptcy detection, and loan default prediction. In each case, the classification techniques are compared and the best model is identified. The performance of each classifier in each domain is evaluated with metrics such as accuracy, F1-score, and mean squared error. For credit card fraud detection, the decision tree classifier performs best, with an accuracy of 99.1%; for loan default prediction and bankruptcy detection, the random forest classifier gives the best accuracy, 97% and 96.84% respectively.
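As a reminder of how the three reported metrics relate, here is a small sketch that computes accuracy, F1-score, and mean squared error from binary labels using their standard definitions. The function name and the toy labels are illustrative assumptions; the numbers are not results from the paper.

```python
def evaluate(y_true, y_pred):
    """Compute accuracy, F1-score and MSE for binary 0/1 labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    n = len(pairs)

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    mse = sum((t - p) ** 2 for t, p in pairs) / n
    return {"accuracy": accuracy, "f1": f1, "mse": mse}


# Toy labels: 8 negatives, 2 positives; one false alarm and one miss.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
scores = evaluate(y_true, y_pred)
```

With hard 0/1 predictions the squared error of each sample is 1 for a mistake and 0 otherwise, so MSE equals 1 minus accuracy. On skewed data such as fraud labels, accuracy can stay high even when half the positives are missed, which is why F1-score is usually read alongside it.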


2013 ◽  
Vol 10 (5) ◽  
pp. 1580-1586
Author(s):  
V.sidda Reddy ◽  
Dr T.V. Rao ◽  
Dr A. Govardhan

Data stream mining algorithms operate under constraints on the space used and the time taken, which arise from the streaming property. The relaxation in these constraints is inversely proportional to the streaming speed of the data. Since caching and mining streaming data is sensitive to these constraints, this paper devises a scalable, memory-efficient caching and frequent itemset mining model. The proposed model is an incremental approach that builds single-level, multi-node trees called bushes from each window of the streaming data; henceforth we refer to this proposed algorithm as Tree (bush) based Incremental Frequent Itemset Mining (TIFIM) over data streams.
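The abstract does not describe the bush structure, so the sketch below does not implement TIFIM. It only illustrates the window-by-window, incremental style of processing with a plain counter swapped in for the bushes: the stream is read in fixed-size windows, itemset counts are updated after each window, and the currently frequent itemsets are reported. The function names, `window_size`, `min_support`, and the `max_len` cap on itemset length are all assumptions of this sketch.

```python
from collections import Counter
from itertools import combinations, islice


def windows(stream, size):
    """Yield consecutive lists of up to `size` transactions from a stream."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def mine_stream(stream, window_size, min_support, max_len=2):
    """After each window, yield (transactions seen, frequent itemsets so far).

    A naive running counter stands in for the paper's bush structure;
    unlike the proposed model, its memory use is not bounded.
    """
    counts = Counter()
    seen = 0
    for batch in windows(stream, window_size):
        for transaction in batch:
            items = sorted(set(transaction))
            for k in range(1, max_len + 1):
                counts.update(combinations(items, k))
        seen += len(batch)
        threshold = min_support * seen
        yield seen, {iset: c for iset, c in counts.items() if c >= threshold}


# Toy stream of four transactions, processed two at a time.
stream = [["a", "b"], ["a", "c"], ["a", "b"], ["b", "c"]]
results = list(mine_stream(stream, window_size=2, min_support=0.5))
```

Each window is read once and then discarded, which matches the single-pass constraint of stream mining; a memory-efficient design such as the one the paper proposes would also have to prune or compress the counts that this sketch keeps indefinitely.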


Author(s):  
Arti Jain ◽  
Archana Purwar ◽  
Divakar Yadav

Machine learning (ML) has proven to be an emerging technology across small-scale to large-scale industries. One important industry is banking, where ML is being adopted all over the world through online banking. Online banking uses ML techniques to detect fraudulent transactions, for example in credit card fraud detection. Hence, in this chapter, a Credit card Fraud Detection (CFD) system is devised using Luhn's algorithm and k-means clustering. A CFD system is also developed using Fuzzy C-Means (FCM) clustering instead of k-means clustering. The performance of CFD under both clustering techniques is compared using precision, recall, and F-measure. FCM gives better results than k-means clustering. Further, other evaluation metrics such as fraud catching rate, false alarm rate, balanced classification rate, and Matthews correlation coefficient are also calculated to show how well the CFD system works in the presence of skewed data.
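The abstract does not say how Luhn's algorithm is wired into the CFD system; a common use is as a checksum that rejects mistyped or malformed card numbers before any further analysis. The following is a minimal sketch of the standard Luhn check. The function name and the decision to accept spaces and hyphens as separators are assumptions of this example.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    cleaned = number.replace(" ", "").replace("-", "")
    if len(cleaned) < 2 or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


# Widely used test numbers, not real card numbers.
ok = luhn_valid("79927398713")
bad = luhn_valid("79927398710")
```

A passing Luhn check only shows that the number is well formed; it says nothing about whether a transaction is fraudulent, which is the job of the clustering stage.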

