scholarly journals Ontology-Based Knowledge Model for Multi-View KDD Process

Author(s):  
EL Moukhtar Zemmouri ◽  
Hicham Behja ◽  
Abdelaziz Marzak ◽  
Brigitte Trousse

Knowledge Discovery in Databases (KDD) is a highly complex, iterative and interactive process that involves several types of knowledge and expertise. In this paper the authors propose to support users of a multi-view analysis (a KDD process held by several experts who analyze the same data with different viewpoints). Their objective is to enhance both the reusability of the process and coordination between users. To do so, they propose a formalization of viewpoint in KDD and a Knowledge Model that structures domain knowledge involved in a multi-view analysis. The authors’ formalization, using OWL ontologies, of viewpoint notion is based on CRISP-DM standard through the identification of a set of generic criteria that characterize a viewpoint in KDD.

Author(s):  
Edgard Benítez-Guerrero ◽  
Omar Nieva-García

The vast amounts of digital information stored in databases and other repositories represent a challenge for finding useful knowledge. Traditionalmethods for turning data into knowledge based on manual analysis reach their limits in this context, and for this reason, computer-based methods are needed. Knowledge Discovery in Databases (KDD) is the semi-automatic, nontrivial process of identifying valid, novel, potentially useful, and understandable knowledge (in the form of patterns) in data (Fayyad, Piatetsky-Shapiro, Smyth & Uthurusamy, 1996). KDD is an iterative and interactive process with several steps: understanding the problem domain, data preprocessing, pattern discovery, and pattern evaluation and usage. For discovering patterns, Data Mining (DM) techniques are applied.


Author(s):  
Maribel Yasmina Santos ◽  
Luís Alfredo Amaral

Knowledge discovery in databases is a process that aims at the discovery of associations within data sets. The analysis of geo-referenced data demands a particular approach in this process. This chapter presents a new approach to the process of knowledge discovery, in which qualitative geographic identifiers give the positional aspects of geographic data. Those identifiers are manipulated using qualitative reasoning principles, which allows for the inference of new spatial relations required for the data mining step of the knowledge discovery process. The efficacy and usefulness of the implemented system — Padrão — has been tested with a bank dataset. The results support that traditional knowledge discovery systems, developed for relational databases and not having semantic knowledge linked to spatial data, can be used in the process of knowledge discovery in geo-referenced databases, since some of this semantic knowledge and the principles of qualitative spatial reasoning are available as spatial domain knowledge.


2014 ◽  
Vol 51 ◽  
pp. 605-644 ◽  
Author(s):  
P. Nguyen ◽  
M. Hilario ◽  
A. Kalousis

Knowledge Discovery in Databases is a complex process that involves many different data processing and learning operators. Today's Knowledge Discovery Support Systems can contain several hundred operators. A major challenge is to assist the user in designing workflows which are not only valid but also -- ideally -- optimize some performance measure associated with the user goal. In this paper we present such a system. The system relies on a meta-mining module which analyses past data mining experiments and extracts meta-mining models which associate dataset characteristics with workflow descriptors in view of workflow performance optimization. The meta-mining model is used within a data mining workflow planner, to guide the planner during the workflow planning. We learn the meta-mining models using a similarity learning approach, and extract the workflow descriptors by mining the workflows for generalized relational patterns accounting also for domain knowledge provided by a data mining ontology. We evaluate the quality of the data mining workflows that the system produces on a collection of real world datasets coming from biology and show that it produces workflows that are significantly better than alternative methods that can only do workflow selection and not planning.


2008 ◽  
pp. 880-912 ◽  
Author(s):  
Maribel Yasmina Santos ◽  
Luís Alfredo Amaral

Knowledge discovery in databases is a process that aims at the discovery of associations within data sets. The analysis of geo-referenced data demands a particular approach in this process. This chapter presents a new approach to the process of knowledge discovery, in which qualitative geographic identifiers give the positional aspects of geographic data. Those identifiers are manipulated using qualitative reasoning principles, which allows for the inference of new spatial relations required for the data mining step of the knowledge discovery process. The efficacy and usefulness of the implemented system — Padrão — has been tested with a bank dataset. The results support that traditional knowledge discovery systems, developed for relational databases and not having semantic knowledge linked to spatial data, can be used in the process of knowledge discovery in geo-referenced databases, since some of this semantic knowledge and the principles of qualitative spatial reasoning are available as spatial domain knowledge.


Author(s):  
Shadi Aljawarneh ◽  
Aurea Anguera ◽  
John William Atwood ◽  
Juan A. Lara ◽  
David Lizcano

AbstractNowadays, large amounts of data are generated in the medical domain. Various physiological signals generated from different organs can be recorded to extract interesting information about patients’ health. The analysis of physiological signals is a hard task that requires the use of specific approaches such as the Knowledge Discovery in Databases process. The application of such process in the domain of medicine has a series of implications and difficulties, especially regarding the application of data mining techniques to data, mainly time series, gathered from medical examinations of patients. The goal of this paper is to describe the lessons learned and the experience gathered by the authors applying data mining techniques to real medical patient data including time series. In this research, we carried out an exhaustive case study working on data from two medical fields: stabilometry (15 professional basketball players, 18 elite ice skaters) and electroencephalography (100 healthy patients, 100 epileptic patients). We applied a previously proposed knowledge discovery framework for classification purpose obtaining good results in terms of classification accuracy (greater than 99% in both fields). The good results obtained in our research are the groundwork for the lessons learned and recommendations made in this position paper that intends to be a guide for experts who have to face similar medical data mining projects.


Sign in / Sign up

Export Citation Format

Share Document