Money laundering transaction detection with classification tree models

2021 ◽  
Vol 16 (3) ◽  
pp. 21-25
Author(s):  
Paolo Giudici ◽  
Giulia Marini ◽  

The detection of money laundering is a very important problem, especially in the financial sector. We propose a mathematical specification of the problem in terms of a classification tree model that "automates" expert-based manual decisions. We operationally validate the model on a concrete application that originates from a large Italian bank. Applied to the data, the model shows good predictive accuracy and, even more importantly, a reduction of false positives with respect to the "manual" expert-based activity. From an interpretational viewpoint, while some drivers of suspicious laundering activity are in line with the daily business practices of the bank’s anti-money-laundering operations, others are new discoveries.

Author(s):  
Elena Ballante ◽  
Marta Galvani ◽  
Pierpaolo Uberti ◽  
Silvia Figini

Abstract In this paper, a new approach to classification models, called the Polarized Classification Tree model, is introduced. From a methodological perspective, a new index of polarization is proposed to measure the goodness of splits in the growth of a classification tree. The newly introduced measure tackles weaknesses of the classical ones used in classification trees (Gini and Information Gain) because it not only measures impurity but also reflects the distribution of each covariate in the node, i.e., it employs more discriminating covariates to split the data at each node. From a computational perspective, a new algorithm is proposed and implemented that employs the proposed measure in the growth of a tree. In order to show how our proposal works, a simulation exercise has been carried out. The results obtained in the simulation framework suggest that our proposal significantly outperforms impurity measures commonly adopted in classification tree modeling. Moreover, the empirical evidence on real data shows that Polarized Classification Tree models are competitive with, and sometimes better than, classical classification tree models.
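
For reference, the two classical split criteria named in the abstract above (Gini and Information Gain) can be sketched as follows. This is a minimal illustration of the standard impurity measures only, not of the polarization index, whose definition the abstract does not give. The function names and the toy labels are assumptions made for the example.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def split_gain(parent, left, right, impurity):
    """Impurity decrease obtained by splitting parent into left and right."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


# Toy node (hypothetical data): 5 observations of class "a", 5 of class "b".
parent = ["a"] * 5 + ["b"] * 5
left = ["a"] * 4 + ["b"]
right = ["a"] + ["b"] * 4

gini_gain = split_gain(parent, left, right, gini)     # decrease in Gini impurity
info_gain = split_gain(parent, left, right, entropy)  # information gain
```

Both criteria depend only on the class proportions in the child nodes, which is the limitation the abstract attributes to them.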


Author(s):  
Jou-An Chen ◽  
Chi-Chuan Shih ◽  
Pay-Fan Lin ◽  
Jin-Jong Chen ◽  
Kuan-Chia Lin

Abstract Health-related physical fitness has decreased with age; this is of immense concern to adolescents. School-based health intervention programs can be classified as either population-wide or high-risk approaches. Although the population-wide and risk-based approaches adopt different healthcare angles, they both need to focus resources on risk evaluation. In this paper, we describe an exploratory application of cluster analysis and the tree model to collaborative evaluation of students’ health-related physical fitness from a high school sample in Taiwan (n=742). Cluster analysis shows that physical fitness can be divided into relatively good, moderate and poor subgroups. There are significant differences in biochemical measurements among these three groups. For the tree model, we used 2004 school-year students as an experimental group and 2005 school-year students as a validation group. The results indicate that if sit-and-reach is shorter than 33 cm, BMI is >25.46 kg/m², and 1600 m run/walk is >534 s, the predicted probability of having ≥2 metabolic risk factors is 100% and the population is 41; both values are the highest. From the risk-based healthcare viewpoint, cluster analysis can sort students’ physical fitness data in a short time and then narrow down the scope to identify the subgroups. A classification tree model specifically shows the discrimination paths between the measurements of physical fitness for metabolic risk and would be helpful for self-management or proper healthcare education targeting different groups. Applying both methods to specific adolescents’ health issues could provide different angles in planning health promotion projects.
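
The highest-risk path reported in the abstract above can be written as a single rule. The sketch below encodes only the three thresholds stated there; the function and parameter names are hypothetical, and the rest of the fitted tree is not described in the abstract, so it is not reproduced.

```python
def in_highest_risk_leaf(sit_and_reach_cm, bmi_kg_m2, run_walk_1600m_s):
    """Return True if a student falls on the reported highest-risk path.

    Thresholds are those quoted in the abstract: sit-and-reach < 33 cm,
    BMI > 25.46 kg/m^2 and 1600 m run/walk time > 534 s.
    """
    return (
        sit_and_reach_cm < 33
        and bmi_kg_m2 > 25.46
        and run_walk_1600m_s > 534
    )


# Hypothetical students, for illustration only.
student_a = in_highest_risk_leaf(sit_and_reach_cm=30, bmi_kg_m2=27.0, run_walk_1600m_s=600)
student_b = in_highest_risk_leaf(sit_and_reach_cm=35, bmi_kg_m2=27.0, run_walk_1600m_s=600)
```

A student must satisfy all three conditions to reach this leaf, so `student_b`, whose sit-and-reach is above the threshold, follows a different path.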


2021 ◽  
Author(s):  
Li Lu Wei ◽  
Yu jian

Abstract Background Hypertension is a common chronic disease worldwide, and it is also a common underlying disease of cardiovascular and brain complications. Overweight and obesity are high risk factors for hypertension. In this study, three statistical methods, the classification tree model, the logistic regression model and the BP neural network, were used to screen the risk factors of hypertension in the overweight and obese population, and the interaction of risk factors was analyzed. This has a certain clinical significance for the early detection, early diagnosis and treatment of hypertension and for reducing the risk of hypertension complications. Methods The classification tree model, logistic regression model and BP neural network model were used to screen the risk factors of hypertension in overweight and obese people. The specificity, sensitivity and accuracy of the three models were evaluated by receiver operating characteristic (ROC) curves. Finally, the classification tree CRT model was used to screen the related risk factors of hypertension in overweight and obese people, and the non-conditional logistic regression multiplicative model was used to quantitatively analyze the interaction. Results The Youden indices of the ROC curves of the classification tree model, logistic regression model and BP neural network model were 39.20%, 37.02% and 34.85%, the sensitivities were 61.63%, 76.59% and 82.85%, the specificities were 77.58%, 60.44% and 52.00%, and the areas under the curve (AUC) were 0.721, 0.734 and 0.733, respectively. There was no significant difference in AUC between the three models (P>0.05). The classification tree CRT model and the logistic regression multiplicative model suggested that the interaction between NAFLD and FPG was closely related to the prevalence of hypertension in overweight and obese people. Conclusion NAFLD, FPG, age, TG, UA and LDL-C were the risk factors of hypertension in overweight and obese people. The interaction between NAFLD and FPG increased the risk of hypertension.
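
The Youden index quoted in the results above is, by its standard definition, sensitivity + specificity − 1. The short sketch below recomputes it from the sensitivities and specificities reported in the abstract; the dictionary layout and variable names are assumptions made for the example.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1 (inputs as fractions)."""
    return sensitivity + specificity - 1.0


# (sensitivity, specificity) pairs as reported in the abstract.
reported = {
    "classification tree": (0.6163, 0.7758),
    "logistic regression": (0.7659, 0.6044),
    "BP neural network": (0.8285, 0.5200),
}

# Youden index per model, expressed as a percentage.
youden_percent = {
    model: round(youden_index(se, sp) * 100, 2)
    for model, (se, sp) in reported.items()
}
```

The recomputed values are about 39.21%, 37.03% and 34.85%. They agree with the reported 39.20%, 37.02% and 34.85% to within 0.01 percentage points, a gap that presumably comes from rounding of the published sensitivities and specificities.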


2021 ◽  
Vol 0 (0) ◽  
pp. 0-0
Author(s):  
Xiaonan Cui ◽  
Marjolein A. Heuvelmans ◽  
Grigory Sidorenkov ◽  
Yingru Zhao ◽  
Shuxuan Fan ◽  
...  

Author(s):  
V. Dudnyk ◽  
O. Grishchyn ◽  
V. Netrebko ◽  
R. Prus ◽  
M. Voloshcuk

An effective mechanism is proposed for the synthesis of classification trees based on fixed initial information (in the form of a training sample) for the task of recognizing the technical condition of samples of weapons and military equipment. The constructed algorithmic classification tree (model) classifies (recognizes) without error the entire training sample (situational objects) from which the classification scheme is constructed. It has a minimal structure (structural complexity) and consists of components (modules): autonomous classification and recognition algorithms that serve as vertices of the structure (attributes of the tree). The developed method of building models of algorithm trees (classification schemes) makes it possible to work with large training samples containing different types of information (discrete type). It provides high accuracy, speed and economy of hardware resources in the process of generating the final classification scheme, and builds classification trees (models) with a predetermined accuracy. An approach is offered for synthesizing new recognition (classification) algorithms on the basis of a library (set) of already known algorithms (schemes) and methods. Based on the proposed concept of algorithmic classification trees, a set of models was built, which provided effective classification and prediction of the technical condition of samples. The paper proposes a set of general indicators (parameters) that effectively presents the general characteristics of the classification tree model; it can be used to select the best tree of algorithms from a set based on methods of random classification trees. Practical tests have confirmed the efficiency of the mathematical software and the models of algorithm trees.

