VISUALIZATION SUPPORT FOR USER-CENTERED MODEL SELECTION IN KNOWLEDGE DISCOVERY AND DATA MINING

2001 ◽  
Vol 10 (04) ◽  
pp. 691-713 ◽  
Author(s):  
TUBAO HO ◽  
TRONGDUNG NGUYEN ◽  
DUCDUNG NGUYEN ◽  
SAORI KAWASAKI

The problem of model selection in knowledge discovery and data mining (the selection of appropriate discovered patterns/models, or of algorithms to achieve such patterns/models) is generally difficult for the user, as it requires meta-knowledge of algorithms/models and model performance metrics. Viewing knowledge discovery as a human-centered process that requires effective collaboration between the user and the discovery system, our work aims to make model selection in knowledge discovery easier and more effective. For such a collaboration, our solution is to give the user the ability to easily try various alternatives and to compare competing models quantitatively and qualitatively. The basic idea of our solution is to integrate data and knowledge visualization with the knowledge discovery process in order to support the participation of the user. We introduce the knowledge discovery system D2MS, in which several visualization techniques for data and knowledge are developed and integrated into the steps of the knowledge discovery process. The visualizers in D2MS greatly help the user gain better insight into each step of the knowledge discovery process, as well as into the relationship between data and discovered knowledge across the whole process.

Author(s):  
Héctor Oscar Nigro ◽  
Sandra Elizabeth González Císaro

Nowadays, one of the most important and challenging problems in the Knowledge Discovery in Databases (KDD) process, or Data Mining, is the definition of prior knowledge, which can originate either from the process or from the domain. This contextual information may help select the appropriate information, features or techniques, reduce the hypothesis space, represent the output in a more comprehensible way and improve the whole process.


2008 ◽  
pp. 2379-2401 ◽  
Author(s):  
Igor Nai Fovino

Intense work in the area of data mining technology and in its applications to several domains has resulted in the development of a large variety of techniques and tools able to automatically and intelligently transform large amounts of data into knowledge relevant to users. However, as with other kinds of useful technologies, the knowledge discovery process can be misused. It can be used, for example, by malicious subjects in order to reconstruct sensitive information for which they do not have explicit access authorization. This type of “attack” cannot easily be detected because the data used to guess the protected information is usually freely accessible. For this reason, many research efforts have recently been devoted to addressing the problem of privacy preservation in data mining. The mission of this chapter is therefore to introduce the reader to this new research field and to provide the proper instruments (in terms of concepts, techniques and examples) to allow a critical comprehension of the advantages, the limitations and the open issues of Privacy Preserving Data Mining techniques.


Author(s):  
Sangeetha G ◽  
L. Manjunatha Rao

With the massive proliferation of online applications offering citizens abundant resources, the usage of e-governance platforms has risen sharply. Entrepreneurs, players, politicians, students, and many others depend heavily on web-based grievance redressal networking sites, which generate massive volumes of grievance data that are not only challenging but often nearly impossible to understand. The prime reason is that grievance data is massive in size and highly unstructured. The proposed system therefore examines the feasibility of performing knowledge discovery on grievance data using conventional data mining algorithms. Designed in Java and considering a large number of online e-governance frameworks drawn from civilians' grievance discussion forums, the proposed system evaluates the effectiveness of performing data mining on big data.


2011 ◽  
Vol 7 (1) ◽  
pp. 24-45 ◽  
Author(s):  
Roberto Trasarti ◽  
Fosca Giannotti ◽  
Mirco Nanni ◽  
Dino Pedreschi ◽  
Chiara Renso

The technologies of mobile communications and ubiquitous computing pervade society. Wireless networks sense the movement of people and vehicles, generating large volumes of mobility data, such as mobile phone call records and GPS tracks. This data can produce useful knowledge, supporting sustainable mobility and intelligent transportation systems, provided that a suitable knowledge discovery process is enacted for mining it. In this paper, the authors examine a formal framework, and the associated implementation, for a data mining query language for mobility data, created as a result of a Europe-wide research project called GeoPKDD (Geographic Privacy-Aware Knowledge Discovery and Delivery). The authors discuss how the system provides comprehensive support for the Mobility Knowledge Discovery process and illustrate its analytical power in unveiling the complexity of urban mobility in a large metropolitan area, based on a massive real-life GPS dataset.


2008 ◽  
pp. 1759-1783
Author(s):  
Christian Baumgartner ◽  
Armin Graber

This chapter provides an overview of the knowledge discovery process in metabolomics, a young discipline in the life sciences arena. It introduces two emerging bioanalytical concepts for generating biomolecular information, followed by various data mining and information retrieval procedures such as feature selection, classification, clustering and biochemical interpretation of mined data, illustrated by real examples from preclinical and clinical studies. The authors trust that this chapter will provide an acceptable balance between bioanalytics background information, essential to understanding the complexity of data generation, and information on data mining principles, specific methods and processes, and biomedical application. Thus, this chapter is anticipated to appeal to those with a metabolomics background as well as to basic researchers within the data mining community who are interested in novel life science applications.


2001 ◽  
Vol 10 (01n02) ◽  
pp. 107-135 ◽  
Author(s):  
ISTVAN JONYER ◽  
LAWRENCE B. HOLDER ◽  
DIANE J. COOK

Hierarchical conceptual clustering has proven to be a useful, although greatly under-explored, data mining technique. A graph-based representation of structural information combined with a substructure discovery technique has been shown to be successful in knowledge discovery. The SUBDUE substructure discovery system provides the advantages of both approaches. This work presents SUBDUE and the development of its clustering functionalities. Several examples are used to illustrate the validity of the approach in both structured and unstructured domains, as well as to compare SUBDUE to earlier clustering algorithms. Results show that SUBDUE successfully discovers hierarchical clusterings in both structured and unstructured data.

