A Comprehensive Survey of Dynamic Data Mining Process in Knowledge Discovery from Database

Since the First KDD Workshop in 1989, when "Knowledge Mining" was recognized as one of the top 5 topics in future database research (Piatetsky-Shapiro 1991), many scientists as well as users in industry and public organizations have considered data mining highly relevant to their professional activities. We have witnessed the development of advanced data mining techniques as well as the successful implementation of knowledge discovery systems in many companies and organizations worldwide. Most of these implementations are static in the sense that they do not explicitly account for a changing environment. However, since most analyzed phenomena change over time, the respective systems should be adapted to the new environment in order to provide useful and reliable analyses. Consider, for example, a system for credit card fraud detection: we may want to segment our customers, process the stream data generated by their transactions, and finally classify them according to their fraud probability, while fraud patterns change over time. If the segmentation should group together homogeneous customers using not only their current feature values but also their trajectories, things become even more difficult, since we have to cluster vectors of functions instead of vectors of real values. An example of such a trajectory is the development of a customer's number of transactions over the past six months, which may tell us more about their behavior than a single value such as the most recent number of transactions. It is in this kind of application that dynamic data mining comes into play. Since data mining is just one step of the iterative KDD (Knowledge Discovery in Databases) process (Han & Kamber, 2001), dynamic elements should also be considered during the other steps.
The entire process consists of activities performed before data mining (such as selection, pre-processing, and transformation of data (Famili et al., 1997)), the actual data mining part, and subsequent steps (such as interpretation and evaluation of results). In the following sections we present the background of dynamic data mining by studying existing methodological approaches as well as applications, patents, and tools. We then turn to the main focus of this chapter: dynamic approaches for each step of the KDD process. Some methodological aspects of dynamic data mining are presented in more detail. After outlining future trends in dynamic data mining, we conclude the chapter.
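
The difference between comparing customers by a single current value and by a whole trajectory can be illustrated with a small sketch. It is not taken from the chapter: the customer names, the monthly transaction counts, and the helper functions `static_distance`, `trajectory_distance`, and `nearest_neighbour` are all hypothetical, and plain Euclidean distance between equal-length sequences stands in for whatever functional distance a real dynamic clustering method would use.

```python
import math

# Hypothetical monthly transaction counts over six months (illustrative data only).
trajectories = {
    "customer_a": [10, 12, 14, 16, 18, 20],  # steadily rising
    "customer_b": [30, 28, 26, 24, 22, 20],  # steadily falling
    "customer_c": [11, 12, 15, 16, 17, 21],  # rising, close to customer_a
}


def static_distance(x, y):
    """Distance that looks only at the most recent value of each trajectory."""
    return abs(x[-1] - y[-1])


def trajectory_distance(x, y):
    """Euclidean distance between two whole trajectories of equal length."""
    if len(x) != len(y):
        raise ValueError("trajectories must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def nearest_neighbour(name, distance):
    """Return the other customer that is closest to `name` under `distance`."""
    others = [other for other in trajectories if other != name]
    return min(others, key=lambda other: distance(trajectories[name], trajectories[other]))


if __name__ == "__main__":
    print("static view:    ", nearest_neighbour("customer_a", static_distance))
    print("trajectory view:", nearest_neighbour("customer_a", trajectory_distance))
```

With these made-up numbers, the static view pairs customer_a with customer_b because both end at 20 transactions, while the trajectory view pairs customer_a with customer_c because both rise over the six months. A grouping built on trajectories can therefore separate customers that a snapshot would treat as identical.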


Semantic Web in data mining and knowledge discovery: A comprehensive survey

Journal of Web Semantics ◽ 10.1016/j.websem.2016.01.001 ◽ 2016 ◽ Vol 36 ◽ pp. 1-22 ◽ Cited By ~ 127

Author(s): Petar Ristoski ◽ Heiko Paulheim

Keyword(s): Data Mining ◽ Semantic Web ◽ Knowledge Discovery ◽ Comprehensive Survey


Semantic Web in Data Mining and Knowledge Discovery: A Comprehensive Survey

SSRN Electronic Journal ◽ 10.2139/ssrn.3199217 ◽ 2016 ◽ Cited By ~ 4

Author(s): Petar Ristoski ◽ Heiko Paulheim

Keyword(s): Data Mining ◽ Semantic Web ◽ Knowledge Discovery ◽ Comprehensive Survey


Analisis Data Pembayaran Kredit Nasabah Bank Menggunakan Metode Data Mining (Analysis of Bank Customer Credit Payment Data Using Data Mining Methods)

Jurnal ULTIMA InfoSys ◽ 10.31937/si.v4i1.238 ◽ 2013 ◽ Vol 4 (1) ◽ pp. 18-27

Author(s): Ira Melissa ◽ Raymond S. Oetama

Keyword(s): Data Mining ◽ Knowledge Discovery ◽ Knowledge Discovery In Database

Data mining is the analysis or observation of large data sets with the aim of finding unexpected relationships and summarizing the data in ways that are easier to understand and more useful to the data owner. Data mining is the core process in Knowledge Discovery in Database (KDD). Data mining methods are used here to analyze the credit payment data of borrowers. From the resulting borrower payment patterns, one can identify which credit parameters are interrelated and which most strongly influence the payment of credit installments. Keywords: data mining, outlier, multicollinearity, ANOVA


The AI Delusion

10.1093/oso/9780198824305.001.0001 ◽ 2018 ◽ Cited By ~ 5

Author(s): Gary Smith

Keyword(s): Data Mining ◽ Knowledge Discovery ◽ Industrial Revolution ◽ The Real ◽ Intelligent Machines ◽ Black Boxes ◽ Real Danger ◽ The Way

We live in an incredible period in history. The Computer Revolution may be even more life-changing than the Industrial Revolution. We can do things with computers that could never be done before, and computers can do things for us that could never be done before. But our love of computers should not cloud our thinking about their limitations. We are told that computers are smarter than humans and that data mining can identify previously unknown truths, or make discoveries that will revolutionize our lives. Our lives may well be changed, but not necessarily for the better. Computers are very good at discovering patterns, but are useless in judging whether the unearthed patterns are sensible because computers do not think the way humans think. We fear that super-intelligent machines will decide to protect themselves by enslaving or eliminating humans. But the real danger is not that computers are smarter than us, but that we think computers are smarter than us and, so, trust computers to make important decisions for us. The AI Delusion explains why we should not be intimidated into thinking that computers are infallible, that data-mining is knowledge discovery, and that black boxes should be trusted.


Particularities of data mining in medicine: lessons learned from patient medical time series data analysis

EURASIP Journal on Wireless Communications and Networking ◽ 10.1186/s13638-019-1582-2 ◽ 2019 ◽ Vol 2019 (1) ◽ Cited By ~ 2

Author(s): Shadi Aljawarneh ◽ Aurea Anguera ◽ John William Atwood ◽ Juan A. Lara ◽ David Lizcano

Keyword(s): Data Mining ◽ Time Series ◽ Knowledge Discovery ◽ Time Series Data ◽ Medical Patient ◽ Lessons Learned ◽ Physiological Signals ◽ Knowledge Discovery In Databases ◽ Series Data ◽ Data Mining Techniques

Nowadays, large amounts of data are generated in the medical domain. Various physiological signals generated by different organs can be recorded to extract interesting information about patients' health. The analysis of physiological signals is a hard task that requires specific approaches such as the Knowledge Discovery in Databases process. Applying such a process in the medical domain has a series of implications and difficulties, especially regarding the application of data mining techniques to data, mainly time series, gathered from medical examinations of patients. The goal of this paper is to describe the lessons learned and the experience gathered by the authors in applying data mining techniques to real medical patient data, including time series. In this research, we carried out an exhaustive case study working on data from two medical fields: stabilometry (15 professional basketball players, 18 elite ice skaters) and electroencephalography (100 healthy patients, 100 epileptic patients). We applied a previously proposed knowledge discovery framework for classification purposes, obtaining good results in terms of classification accuracy (greater than 99% in both fields). These results are the groundwork for the lessons learned and the recommendations made in this position paper, which is intended as a guide for experts who face similar medical data mining projects.
