Predictive Analytics with Machine Learning for Fraud Detection

Author(s):  
A Sampath Abhishek

Abstract: Online shopping grows more popular every day. In financial year 2021, over 40 billion digital transactions worth more than a quadrillion Indian rupees were recorded across the country. As the number of credit card users rises worldwide, the opportunities for attackers to steal credit card details and subsequently commit fraud are also increasing. Since humans tend to exhibit specific behavioristic profiles, every cardholder can be represented by a set of patterns containing information about the typical purchase category, the time since the last purchase, the amount of money spent, etc. These frauds can therefore be detected through various algorithms, mainly random forest and logistic regression. AdaBoost is also added to boost performance and build a more efficient model. Keywords: Fraud detection, behavioristic profile, random forest, logistic regression, AdaBoost
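The behavioristic profile described in the abstract above can be illustrated with a minimal sketch. It assumes each transaction is a dictionary with hypothetical keys `category`, `amount` and `timestamp` (in seconds); the function names and the derived features are illustrative choices, not the authors' implementation.

```python
from collections import Counter
from statistics import mean


def build_profile(transactions):
    """Summarize a cardholder's history (assumed sorted by timestamp)."""
    categories = Counter(t["category"] for t in transactions)
    gaps = [b["timestamp"] - a["timestamp"]
            for a, b in zip(transactions, transactions[1:])]
    return {
        "typical_category": categories.most_common(1)[0][0],
        "mean_amount": mean(t["amount"] for t in transactions),
        "mean_gap": mean(gaps) if gaps else None,
        "last_timestamp": transactions[-1]["timestamp"],
    }


def deviation_features(profile, new_tx):
    """Describe how a new transaction departs from the stored profile."""
    return {
        "category_is_typical": new_tx["category"] == profile["typical_category"],
        "amount_ratio": new_tx["amount"] / profile["mean_amount"],
        "seconds_since_last": new_tx["timestamp"] - profile["last_timestamp"],
    }


# Hypothetical history: three purchases, one per day.
history = [
    {"category": "grocery", "amount": 40.0, "timestamp": 0},
    {"category": "grocery", "amount": 60.0, "timestamp": 86400},
    {"category": "fuel", "amount": 50.0, "timestamp": 172800},
]
profile = build_profile(history)

# A large purchase in an unusual category, one minute after the last one.
features = deviation_features(
    profile,
    {"category": "electronics", "amount": 500.0, "timestamp": 172860},
)
```

Features of this kind could then be passed to a classifier such as random forest or logistic regression; which features the authors actually used is not stated in the abstract.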

Author(s):  
Upasana Mukherjee ◽  
Vandana Thakkar ◽  
Shawni Dutta ◽  
Utsab Mukherjee ◽  
Samir Kumar Bandyopadhyay

The growth of regularly generated data from many financial activities has significant implications for every corner of financial modelling. This study investigates how these continuously growing data can be used by means of an automated process. Such a process can be developed using machine learning based techniques that analyze the data and gain experience from it. Important financial domains such as credit card fraud detection, bankruptcy detection, loan default prediction, investment prediction, marketing and many more can be modelled with machine learning methods. Among the many machine learning based techniques, this research considers both parametric and non-parametric methods. Two parametric models, Logistic Regression and Gaussian Naive Bayes, and two non-parametric methods, Random Forest and Decision Tree, are implemented in this paper. All of these models are developed and applied to credit card fraud detection, bankruptcy detection and loan default prediction. In each case, a comparative study of the classification techniques is carried out and the best model is identified. The performance of each classifier in each domain is evaluated using performance metrics such as accuracy, F1-score and mean squared error. In credit card fraud detection, the decision tree classifier performs best with an accuracy of 99.1%; in loan default prediction and bankruptcy detection, the random forest classifier gives the best accuracy, 97% and 96.84% respectively.


In the digital world, online shopping sites for purchasing clothes, electronic items, groceries, etc., and online transactions for transferring money are growing day by day. At the same time, criminals have become able to exploit these channels and earn money illegitimately, which is why fraud is growing. With the development of Machine Learning in the field of Computer Science and Engineering, it has found application in domains such as medicine, marketing, telecommunication and finance. Machine Learning is popular in these domains because of its high prediction accuracy, which is why it has been used in fraud detection for many years. With the advancement of technology in online transactions, fraud has become the greatest issue for businesses and is more difficult to recognize than the traditional form of this crime. Historically, the area of fraud detection is interrelated with data mining and text mining. Because of the sudden growth of fraud, which results in the loss of trillions of rupees worldwide every year, various modern fraud detection techniques have been proposed, continuously developed and applied in many business fields. According to RBI data, bank frauds worth ₹2.05 trillion occurred in the last 11 years, across 53,334 fraud cases overall. The principal purpose of this write-up is to review different methods for identifying fraud based on unusual patterns in transactions. Supervised and unsupervised machine learning algorithms will be used to identify fraud, and best-first search optimization will be analyzed to compare the results before and after optimization.


2018 ◽  
Vol 7 (2) ◽  
pp. 917
Author(s):  
S Venkata Suryanarayana ◽  
G N. Balaji ◽  
G Venkateswara Rao

With the extensive use of credit cards, fraud has become a major issue in the credit card business. It is hard to obtain figures on the impact of fraud, since companies and banks do not like to disclose the amount of their losses due to fraud. At the same time, public data are scarcely available because of confidentiality issues, leaving many questions about the best strategy unanswered. Another problem in credit card fraud loss estimation is that only the losses from detected frauds can be measured; it is not possible to assess the size of unreported or undetected frauds. Fraud patterns are changing rapidly, so fraud detection needs to move from a reactive to a proactive approach. In recent years, machine learning has gained a lot of popularity in image analysis, natural language processing and speech recognition. In this regard, implementing efficient fraud detection algorithms using machine learning techniques is key to reducing these losses and to assisting fraud investigators. In this paper, a logistic regression based machine learning approach is used to detect credit card fraud. The results show that the logistic regression based approach achieves the highest accuracy and can be used effectively by fraud investigators.


2019 ◽  
Vol 8 (4) ◽  
pp. 1477-1483

With fast-moving technological advancement, internet usage has increased rapidly in all fields. Money transactions for applications such as online shopping, banking, bill settlement in any industry, online ticket booking for travel and hotels, fee payment for educational organizations, payment for hospital treatment, supermarket payments and a variety of other applications use online credit card transactions. This leads to fraudulent use of other people's accounts and transactions, which results in loss of service and profit to the institution. With this background, this paper focuses on predicting fraudulent credit card transactions. The Credit Card Transaction dataset from the Kaggle machine learning repository is used for the prediction analysis. The analysis of fraudulent credit card transactions is carried out in four steps. First, the relationships between the variables of the dataset are identified and represented graphically. Second, the feature importance of the dataset is identified using Random Forest, AdaBoost, Logistic Regression, Decision Tree, Extra Tree, Gradient Boosting and Naive Bayes classifiers. Third, the extracted feature importance of the credit card transaction dataset is fitted to the Random Forest, AdaBoost, Logistic Regression, Decision Tree, Extra Tree, Gradient Boosting and Naive Bayes classifiers. Fourth, performance analysis is done using metrics such as Accuracy, FScore, AUC Score, Precision and Recall. The implementation is done in Python in the Anaconda Spyder Integrated Development Environment. Experimental results show that the Decision Tree classifier achieved effective prediction, with a precision of 1.0, recall of 1.0, FScore of 1.0, AUC Score of 89.09 and Accuracy of 99.92%.
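The performance metrics named in the abstract above (accuracy, precision, recall, F-score) can be computed directly from confusion-matrix counts. The following is a minimal sketch in plain Python; the label vectors are made-up illustrative data with 2 frauds in 1000 transactions, not figures from the paper, and AUC is omitted because it requires predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = fraud)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative, highly imbalanced data: 2 frauds among 1000 transactions.
y_true = [1, 1] + [0] * 998

# Baseline that labels every transaction as legitimate.
baseline = classification_metrics(y_true, [0] * 1000)

# Detector that catches one fraud, misses one, and raises one false alarm.
detector = classification_metrics(y_true, [1, 0, 1] + [0] * 997)
```

In this made-up example both predictors reach the same accuracy, while only the detector has non-zero precision, recall and F1. On imbalanced fraud data, accuracy alone can therefore look high even for a model that never flags a fraud, which is one reason abstracts like this one report several metrics together.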


Author(s):  
Samir Bandyopadhyay ◽  
Vandana Thakkar ◽  
Upasana Mukherjee ◽  
Shawni Dutta

The growth of regularly generated data from many financial activities has significant implications for every corner of financial modeling. This study investigates how these continuously growing data can be used by means of an automated process. Such a process can be developed using machine learning based techniques that analyze the data and gain experience from it. Important financial domains such as credit card fraud detection, bankruptcy detection, loan default prediction, investment prediction, marketing and many other financial models can be modeled with machine learning. Among the many machine learning based techniques, this research considers both parametric and non-parametric methods. Two parametric models, Logistic Regression and Gaussian Naive Bayes, and two non-parametric methods, Random Forest and Decision Tree, are implemented in this paper. All of these models are developed and applied to credit card fraud detection, bankruptcy detection and loan default prediction. In each case, a comparative study of the classification techniques is carried out and the best model is identified. The performance of each classifier in each domain is evaluated using performance metrics such as accuracy, recall, precision, F1-score and mean squared error. In credit card fraud detection, the decision tree classifier performs best with an accuracy of 99.1%; in loan default prediction and bankruptcy detection, the random forest classifier gives the best accuracy, 97% and 96.84% respectively.


Author(s):  
G. Bhargav Chowdari

One of the most serious ethical challenges in the credit card industry is fraud. Our paper's major goal is to identify credit card theft and offer a reasonable solution to the problem. Credit card fraud has cost customers and banks billions of dollars around the world. Fraudsters constantly attempt to come up with new ways and tricks to commit fraud, even though several measures are in place to prevent it. Fraud detection is extremely important in the banking and finance industries. For detection purposes, we will use an artificial neural network. To prevent fraud, we will develop a system that not only detects fraud but also detects it before it occurs. To detect new scams, our system will learn from previous frauds. Mining algorithms have been used to detect fraud, but they failed. In our paper, we use machine learning methods to detect fraud in credit card transactions. The research employs supervised learning methods applied to a Kaggle dataset that is severely skewed and imbalanced. We used a robust scaler to balance the set, resulting in 51 percent non-fraud cases and 49 percent fraud cases. Logistic regression, random forest, decision tree and KNN have all been implemented, with learning curves showing which algorithm performs best. Accuracy, specificity, precision and sensitivity are the evaluation criteria, and a comparative chart is created to show the comparative analysis of the supervised learning algorithms. KEYWORDS: KNN, Neural network, Logistic regression, Random forest, Decision tree
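The abstract above mentions a robust scaler and a near-balanced class split but does not say how the split was obtained. Robust scaling in its usual form (centering on the median and dividing by the interquartile range) rescales feature values and does not by itself change class proportions, so the sketch below treats the two as separate steps. Random undersampling is an assumption swapped in here purely for illustration; the function names and data are hypothetical.

```python
import random
from statistics import median, quantiles


def robust_scale(values):
    """Center on the median and divide by the interquartile range."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    med = median(values)
    return [(v - med) / iqr if iqr else 0.0 for v in values]


def undersample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes are equal in size.

    Assumption: this is one common balancing technique, not necessarily
    the one used in the paper.
    """
    rng = random.Random(seed)
    fraud = [i for i, y in enumerate(labels) if y == 1]
    legit = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(fraud), len(legit))
    kept = rng.sample(fraud, n) + rng.sample(legit, n)
    rng.shuffle(kept)
    return [rows[i] for i in kept], [labels[i] for i in kept]


# Scaling: an extreme amount stays extreme but no longer dominates the scale.
amounts = [10.0, 20.0, 30.0, 40.0, 5000.0]
scaled = robust_scale(amounts)

# Balancing: 3 frauds among 100 hypothetical rows become a 3-versus-3 sample.
rows = list(range(100))
labels = [1] * 3 + [0] * 97
balanced_rows, balanced_labels = undersample(rows, labels)
```

Scaling leaves the number of rows unchanged, while undersampling changes the class ratio; a near-even split like the one reported would come from a resampling step of some kind.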


With the advent of modern transaction technology, many people use online transactions to transfer money from one person to another. Credit card fraud, a rising problem in the financial sector, goes unnoticed most of the time. A lot of research is going on in this area. The Credit Card Fraud Detection project is developed to determine whether a new transaction is fraudulent or not using knowledge of previous data. We use various predictive models to ascertain how accurately they predict whether a transaction is abnormal or regular. Decision Tree, Logistic Regression, SVM and Naïve Bayes are the classification algorithms used to detect non-fraud and fraud transactions.


Entropy ◽  
2019 ◽  
Vol 21 (11) ◽  
pp. 1087 ◽  
Author(s):  
Diego C. Nascimento ◽  
Bruno Barbosa ◽  
André M. Perez ◽  
Daniel O. Caires ◽  
Edgar Hirama ◽  
...  

This work aimed to develop business intelligence for fraud detection using buyer-placed information combined with sound analysis of a purchase confirmation call. We used a dataset of 789 orders from 2018, provided by different e-commerce websites, with calls fulfilled from every Brazilian state. Nine acoustic index features, obtained through entropy in sound and vibration, were used to summarize the audio; these, plus 6 additional related features and 12 customer features, were used to build two different classifiers (Logistic Regression and Random Forest). The acoustic indexes did improve the accuracy of the models, showing a probability associated with the voice characteristics and helping decision-making in credit card fraud.

