Leveraging Machine Learning in Financial Fraud Forensics in the Age of Cybersecurity

2022 ◽  
pp. 220-249
Author(s):  
Md Ariful Haque ◽  
Sachin Shetty

Financial sectors are lucrative cyber-attack targets because successful attacks yield immediate financial gain. As a result, financial institutions face challenges in developing systems that can automatically identify security breaches and separate fraudulent transactions from legitimate ones. Today, organizations widely use machine learning techniques to identify fraudulent behavior in customers' transactions. However, applying these techniques is often challenging because financial institutions' confidentiality policies prevent them from sharing customer transaction data. This chapter discusses some crucial challenges of handling cybersecurity and fraud in the financial industry and of building machine learning-based models to address those challenges. The authors use an open-source e-commerce transaction dataset to illustrate the forensic process by creating a machine learning model that classifies fraudulent transactions. Overall, the chapter focuses on how machine learning models can help detect and prevent fraudulent activities in the financial sector in the age of cybersecurity.

2020 ◽  
Vol 7 (1) ◽  
pp. 205395172092655 ◽  
Author(s):  
Kristian Bondo Hansen

Machine learning models are becoming increasingly prevalent in algorithmic trading and investment management. The spread of machine learning in finance challenges existing practices of modelling and model use and creates a demand for practical solutions for how to manage the complexity pertaining to these techniques. Drawing on interviews with quants applying machine learning techniques to financial problems, the article examines how these people manage model complexity in the process of devising machine learning-powered trading algorithms. The analysis shows that machine learning quants use Ockham’s razor – things should not be multiplied without necessity – as a heuristic tool to prevent excess model complexity and secure a certain level of human control and interpretability in the modelling process. I argue that understanding the way quants handle the complexity of learning models is a key to grasping the transformation of the human’s role in contemporary data and model-driven finance. The study contributes to social studies of finance research on the human–model interplay by exploring it in the context of machine learning model use.


2020 ◽  
Vol 28 (2) ◽  
pp. 253-265 ◽  
Author(s):  
Gabriela Bitencourt-Ferreira ◽  
Amauri Duarte da Silva ◽  
Walter Filgueira de Azevedo

Background: The elucidation of the structure of cyclin-dependent kinase 2 (CDK2) made it possible to develop targeted scoring functions for virtual screening aimed at identifying new inhibitors for this enzyme. CDK2 is a protein target for the development of drugs intended to modulate cell-cycle progression and control. Such drugs have potential anticancer activities. Objective: Our goal here is to review recent applications of machine learning methods to predict ligand-binding affinity for protein targets. To assess the predictive performance of classical scoring functions and targeted scoring functions, we focused our analysis on CDK2 structures. Methods: We have experimental structural data for hundreds of binary complexes of CDK2 with different ligands, many of them with inhibition constant information. We investigate here computational methods to calculate the binding affinity of CDK2 through classical scoring functions and machine-learning models. Results: Analysis of the predictive performance of classical scoring functions available in docking programs such as Molegro Virtual Docker, AutoDock4, and AutoDock Vina indicated that these methods failed to predict binding affinity with significant correlation with experimental data. Targeted scoring functions developed through supervised machine learning techniques showed a significant correlation with experimental data. Conclusion: Here, we described the application of supervised machine learning techniques to generate a scoring function to predict binding affinity. Machine learning models showed superior predictive performance when compared with classical scoring functions. Analysis of the computational models obtained through machine learning could capture essential structural features responsible for binding affinity against CDK2.


2021 ◽  
Vol 11 (3) ◽  
pp. 1323
Author(s):  
Medard Edmund Mswahili ◽  
Min-Jeong Lee ◽  
Gati Lother Martin ◽  
Junghyun Kim ◽  
Paul Kim ◽  
...  

Cocrystals are of much interest in industrial applications as well as academic research, and screening suitable coformers for active pharmaceutical ingredients is the most crucial and challenging step in cocrystal development. Recently, machine learning techniques have been attracting researchers in many fields, including pharmaceutical research areas such as quantitative structure-activity/property relationships. In this paper, we develop machine learning models to predict cocrystal formation. We extract descriptor values from the simplified molecular-input line-entry system (SMILES) representations of compounds and compare the machine learning models in experiments on our collected data of 1476 instances. As a result, we found that an artificial neural network shows great potential, as it has the best accuracy, sensitivity, and F1 score. We also found that the model achieved comparable performance with about half of the descriptors, chosen by feature selection algorithms. We believe that this will contribute to faster and more accurate cocrystal development.
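
The three metrics named in the abstract above (accuracy, sensitivity, and F1 score) can be computed from a binary confusion matrix. The following is a minimal sketch, not the authors' evaluation code: the function name `binary_metrics` and the example labels are assumptions made for illustration, with 1 standing for "cocrystal forms" and 0 for "does not form".

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity (recall), and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"accuracy": accuracy, "sensitivity": sensitivity, "f1": f1}


# Hypothetical labels, invented for this example only.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Sensitivity is reported alongside accuracy because it measures how many true cocrystal-forming pairs the model recovers, which accuracy alone can hide when the classes are unbalanced.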


Automating OMR-based answer sheets to measure student performance as CO-PO (Course Outcome and Program Outcome) attainment plays a crucial role in student performance analysis. In the proposed work, the marks evaluation sheet is taken as the input image, and a frame-cropping technique extracts the filled marks table by subdividing it into cells, each treated as an individual image. To recognize the handwritten digit in each frame, various machine learning models are adopted and trained. Experimental results show that a convolutional neural network performs best at identifying digits from the frames. The outputs are then converted to CSV format, which is used to evaluate CO-PO attainment for each learner. Experiments were conducted with various machine learning techniques, and the results were compared to pick the optimal model.


Author(s):  
Christos Floros ◽  
Panagiotis Ballas

Crises around the world in recent decades reveal a generally unstable environment within which banks and financial institutions operate. Risk is an inherent characteristic of financial institutions and is a multifaceted phenomenon. Everyday business practice involves decisions that require information about the various types of threats involved, together with an evaluation of their impact on future performance, resulting in combinations of risk types and projected returns for decision makers to choose from. Moreover, financial institutions process a massive amount of data, collected either internally or externally, in an effort to continuously analyse trends of the economy they operate in and decode global economic conditions. Even though research has been performed in the field of accounting and finance, the authors explore the application of machine learning techniques to facilitate decision making by top management of contemporary financial institutions, improving the quality of their accounting disclosure.


2020 ◽  
Vol 9 (6) ◽  
pp. 379 ◽  
Author(s):  
Eleonora Grilli ◽  
Fabio Remondino

The use of machine learning techniques for point cloud classification has been investigated extensively in the last decade in the geospatial community, while in the cultural heritage field it has only recently started to be explored. The high complexity and heterogeneity of 3D heritage data, the diversity of the possible scenarios, and the different classification purposes that each case study might present make it difficult to realise a large training dataset for learning purposes. An important practical issue that has not been explored yet is the application of a single machine learning model across large and different architectural datasets. This paper tackles this issue by presenting a methodology able to successfully generalise a random forest model trained on a specific dataset to unseen scenarios. This is achieved by looking for the best features suitable to identify the classes of interest (e.g., wall, windows, roof and columns).


Artificial intelligence (AI) can be implemented using machine learning, which allows a computer to learn automatically and improve from its previous experience without being explicitly programmed. Computer programs developed using machine learning can access and use data. This paper focuses on applying machine learning in sports to predict the winning team of an IPL match. Cricket is a popular and uncertain sport; particularly in the T20 format, the complete course of a game can change with a single over. Millions of spectators watch the Indian Premier League (IPL) every year, so forecasting the outcome of matches becomes a real-time problem. Many aspects and features determine the result of a T20 cricket match, each with a weighted impact on the result. This paper describes all those features in detail. A multivariate regression-based approach is proposed to measure the team's points in the league. The past performance of every team determines its probability of winning a match against a particular opponent. Finally, a set of seven factors or attributes is identified that can be used for predicting the IPL match winner. Various machine learning models were trained and used to predict the winner within the time between the toss and the start of the match. The performance of the developed models is evaluated with various classification techniques, among which Random Forest and Decision Tree have given good results.


Author(s):  
Daniel Elton ◽  
Zois Boukouvalas ◽  
Mark S. Butrico ◽  
Mark D. Fuge ◽  
Peter W. Chung

We present a proof of concept that machine learning techniques can be used to predict the properties of CNOHF energetic molecules from their molecular structures. We focus on a small but diverse dataset consisting of 109 molecular structures spread across ten compound classes. Up until now, candidate molecules for energetic materials have been screened using predictions from expensive quantum simulations and thermochemical codes. We present a comprehensive comparison of machine learning models and several molecular featurization methods - sum over bonds, custom descriptors, Coulomb matrices, bag of bonds, and fingerprints. The best featurization was sum over bonds (bond counting), and the best model was kernel ridge regression. Despite having a small data set, we obtain acceptable errors and Pearson correlations for the prediction of detonation pressure, detonation velocity, explosive energy, heat of formation, density, and other properties out of sample. By including another dataset with 309 additional molecules in our training we show how the error can be pushed lower, although the convergence with number of molecules is slow. Our work paves the way for future applications of machine learning in this domain, including automated lead generation and interpreting machine learning models to obtain novel chemical insights.
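
The sum-over-bonds (bond counting) featurization that the abstract above reports as the best performer can be sketched as follows. This is an illustrative sketch only: the abstract does not specify the input format, so the representation of a molecule as a list of `(atom, atom, bond_order)` tuples, the helper names `bond_key` and `sum_over_bonds`, and the two toy molecules (each written as a single Lewis structure) are all assumptions.

```python
from collections import Counter


def bond_key(atom_a, atom_b, order):
    """Canonical label for a bond, e.g. ('O', 'N', 2) -> 'N=O'."""
    first, second = sorted((atom_a, atom_b))
    symbol = {1: "-", 2: "=", 3: "#"}[order]
    return f"{first}{symbol}{second}"


def sum_over_bonds(molecules):
    """Count each bond type per molecule and return (vocabulary, count vectors)."""
    counts = [Counter(bond_key(*bond) for bond in mol) for mol in molecules]
    vocabulary = sorted(set().union(*counts))
    # A Counter returns 0 for bond types a molecule does not contain.
    vectors = [[c[key] for key in vocabulary] for c in counts]
    return vocabulary, vectors


# Hypothetical bond lists, one Lewis structure each.
nitromethane = [("C", "H", 1)] * 3 + [("C", "N", 1), ("N", "O", 2), ("N", "O", 1)]
methyl_nitrate = [("C", "H", 1)] * 3 + [
    ("C", "O", 1), ("O", "N", 1), ("N", "O", 2), ("N", "O", 1),
]

vocabulary, vectors = sum_over_bonds([nitromethane, methyl_nitrate])
print(vocabulary)
print(vectors)
```

Each molecule becomes a fixed-length vector of bond-type counts, which can then be passed to a regression model such as the kernel ridge regression mentioned in the abstract.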


Author(s):  
Pratik Vyas ◽  
Diptangshu Pandit

The use of machine learning techniques in predictive health care is on the rise, with minimal data used for training machine learning models to derive high-accuracy predictions. In this paper, we propose such a system, which uses Heart Rate Variability (HRV) as features for training machine learning models. This paper further benchmarks the usefulness of HRV features calculated from basic heart-rate data using a window shifting method. The benchmarking has been conducted using different machine learning classifiers such as an artificial neural network, decision tree, k-nearest neighbour, and naive Bayes classifier. Empirical results using the MIT-BIH Arrhythmia database show that the proposed system can predict abnormalities in heartbeat data series with high efficiency.
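
Computing HRV features over shifting windows, as the abstract above describes, might look like the sketch below. The abstract does not say which HRV features or window settings were used, so everything specific here is an assumption: the input is taken to be RR intervals in milliseconds, the features are two common time-domain HRV measures (SDNN and RMSSD), and the window size, step, function names, and sample values are hypothetical.

```python
import math
import statistics


def hrv_features(rr_ms):
    """Time-domain HRV features for one window of RR intervals (milliseconds)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return {
        "mean_rr": statistics.fmean(rr_ms),
        "sdnn": statistics.stdev(rr_ms),  # sample standard deviation of RR intervals
        "rmssd": math.sqrt(statistics.fmean([d * d for d in diffs])),
    }


def windowed_hrv(rr_ms, size, step):
    """Slide a window of `size` intervals forward by `step` and featurize each one."""
    if size < 2:
        raise ValueError("window size must be at least 2")
    starts = range(0, len(rr_ms) - size + 1, step)
    return [hrv_features(rr_ms[i:i + size]) for i in starts]


# Hypothetical RR series with an irregular stretch near the end.
rr = [800, 810, 790, 805, 795, 1200, 600, 800]
features = windowed_hrv(rr, size=4, step=2)
for row in features:
    print(row)
```

Each row of `features` could serve as one training example for a classifier; windows that cover the irregular beats show a much larger RMSSD than the steady window at the start.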

