Data Mining for Targeted Inspections Against Undeclared Work by Applying the CRISP-DM Methodology

Author(s):  
Eleni Alogogianni ◽  
Maria Virvou
2021 ◽  
pp. 1-27

Addressing undeclared work is a high priority in the labor field for government policymakers, since it adversely affects all involved parties and results in significant losses in tax and social security contribution revenues. In recent years, the wide use of ICT in labor inspectorates and the considerable progress in data exchange have resulted in numerous databases dispersed across various units, yet these are not used effectively to increase the inspectorates' productivity. This study presents a detailed analysis of a data mining project, conducted per the CRISP-DM methodology, that aims to assist labor inspectorates in dealing with undeclared work and other labor law violations. It uses real past inspection data merged with companies' characteristics and their employment details, and examines the application of two Associative Classification algorithms, CBA and CBA2, in combination with two types of datasets, a binary and a four-class one. The produced models are assessed against the data mining goals and the initial business objectives, and the research concludes by proposing an innovative inspection recommendation tool shown to offer two major benefits: a mechanism for planning targeted inspections of improved efficiency, and a knowledge repository for enhancing the inspectors' understanding of the features linked with labor law violations.
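To make the idea of associative classification more concrete, the following is a minimal Python sketch, not the authors' implementation. It mines class association rules (itemset -> class) by brute force, ranks them by confidence, then support, then rule length, and classifies a new case with the first matching rule. The feature values, class labels, and thresholds are hypothetical toy data, and the sketch omits the rule pruning and coverage-based rule selection of a full CBA classifier.

```python
from collections import Counter
from itertools import combinations


def mine_rules(rows, labels, min_support=0.2, min_confidence=0.6, max_len=2):
    """Mine class association rules (itemset -> class) by brute-force counting."""
    n = len(rows)
    itemset_counts = Counter()  # how often each itemset occurs at all
    rule_counts = Counter()     # how often each itemset occurs with a given class
    for items, label in zip(rows, labels):
        for k in range(1, max_len + 1):
            for itemset in combinations(sorted(items), k):
                itemset_counts[itemset] += 1
                rule_counts[(itemset, label)] += 1
    rules = []
    for (itemset, label), count in rule_counts.items():
        support = count / n
        confidence = count / itemset_counts[itemset]
        if support >= min_support and confidence >= min_confidence:
            rules.append((itemset, label, support, confidence))
    # Rank: higher confidence first, then higher support, then shorter rules.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules


def classify(rules, items, default):
    """Return the class of the first (highest-ranked) rule whose itemset matches."""
    for itemset, label, _, _ in rules:
        if set(itemset) <= items:
            return label
    return default


# Hypothetical toy inspection records: each row is a set of "feature=value" items.
rows = [
    {"sector=construction", "size=small"},
    {"sector=construction", "size=small"},
    {"sector=construction", "size=large"},
    {"sector=retail", "size=small"},
    {"sector=retail", "size=large"},
    {"sector=construction", "size=small"},
]
labels = ["violation", "violation", "compliant", "compliant", "compliant", "violation"]

rules = mine_rules(rows, labels)
default_class = Counter(labels).most_common(1)[0][0]  # fallback when no rule fires
prediction = classify(rules, {"sector=construction", "size=small"}, default_class)
```

Because every prediction traces back to one readable rule, a ranked rule list of this kind can serve both purposes the abstract mentions: selecting inspection targets and documenting which feature combinations are associated with violations.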


2020 ◽  
Author(s):  
Mohammed J. Zaki ◽  
Wagner Meira, Jr

2010 ◽  
Vol 24 (2) ◽  
pp. 112-119 ◽  
Author(s):  
F. Riganello ◽  
A. Candelieri ◽  
M. Quintieri ◽  
G. Dolce

The purpose of the study was to identify significant changes in heart rate variability (HRV; an emerging descriptor of emotional conditions) concomitant with complex auditory stimuli with emotional value (music). In healthy controls, traumatic brain injured (TBI) patients, and subjects in the vegetative state (VS), the heartbeat was continuously recorded while the subjects passively listened to each of four music samples of different authorship. The heart rate frequency spectra (parametric and nonparametric) were computed, and the spectral descriptors were processed by data-mining procedures. Data mining identified nu_lf (the normalized-unit parameter of the spectrum's low-frequency range) as the significant descriptor by which the HRV responses to music of healthy controls, TBI patients, and VS subjects could be clustered into classes matching those defined by the subjective reports of the controls and TBI patients. These findings support the potential of HRV to reflect complex emotional stimuli and suggest that residual emotional reactions continue to occur in VS. HRV descriptors and data mining appear applicable to brain function research in the absence of consciousness.


Author(s):  
Kiran Kumar S V N Madupu

Big Data has a major influence on scientific discovery and value creation. This paper presents data mining approaches and modern Big Data technologies. It discusses the challenges of data mining in general and of data mining with Big Data in particular, and presents some technological developments in both areas.


2020 ◽  
Vol 3 (3) ◽  
pp. 187-201
Author(s):  
Sufajar Butsianto ◽  
Nindi Tya Mayangwulan

Car use in Indonesia increases every year, which drives automotive companies to compete to increase their sales. The purpose of this research is to group sales data into clusters using the K-Means Clustering data mining algorithm. The sales data are grouped by similarity, so that data with the same characteristics fall into the same cluster. The attributes used are brand and sales. The K-Means Clustering process produced three clusters: Cluster 0, with 235 members (26%), categorized as Selling Well (Laris); Cluster 1, with 604 members (67%), categorized as Selling Less Well (Kurang Laris); and Cluster 2, with 61 members (7%), categorized as Best Selling (Paling Laris). Validating this clustering with the Davies-Bouldin Index (DBI) gave a value of 0.341.
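As an illustration of the procedure this abstract describes, the sketch below runs K-Means with three clusters and then computes the Davies-Bouldin Index in plain Python. It is not the authors' code: the data points are hypothetical (a numeric brand code and a units-sold figure), and the farthest-first initialization is an assumption chosen here to keep the example deterministic, whereas K-Means is more commonly seeded at random.

```python
import math


def farthest_first(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the first
    point, then repeatedly add the point farthest from the seeds chosen so far."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(math.dist(p, s) for s in seeds)))
    return seeds


def kmeans(points, k, max_iters=100):
    """Plain K-Means: assign each point to its nearest centroid, then recompute."""
    centroids = farthest_first(points, k)
    for _ in range(max_iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]  # keep the old centroid if a cluster empties
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:  # assignments are stable: converged
            break
        centroids = new_centroids
    return centroids, clusters


def davies_bouldin(centroids, clusters):
    """Davies-Bouldin Index: mean over clusters of the worst-case ratio of
    (within-cluster scatter sum) to (between-centroid distance). Lower is better.
    Assumes every cluster has at least one member."""
    scatter = [
        sum(math.dist(p, c) for p in members) / len(members)
        for c, members in zip(centroids, clusters)
    ]
    k = len(centroids)
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k) if j != i
        )
        for i in range(k)
    ]
    return sum(worst) / k


# Hypothetical toy records: (brand code, units sold).
points = [
    (1.0, 10.0), (1.0, 12.0), (2.0, 11.0),
    (1.0, 50.0), (2.0, 52.0), (2.0, 49.0),
    (3.0, 100.0), (3.0, 98.0), (1.0, 102.0),
]
centroids, clusters = kmeans(points, k=3)
dbi = davies_bouldin(centroids, clusters)
```

A lower DBI indicates more compact, better separated clusters, which is how a value such as the 0.341 reported in the abstract would be read. On real data, features with very different scales (for example a brand code versus a sales count) would typically be normalized before clustering.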

