Domain-Driven Data Mining

2008 ◽  
pp. 831-848 ◽  
Author(s):  
Longbing Cao ◽  
Chengqi Zhang

Extant data mining is based on data-driven methodologies. It either views data mining as an autonomous, data-driven, trial-and-error process or analyzes business issues only in an isolated, case-by-case manner. As a result, the knowledge discovered is often of little interest to real business needs. This article therefore proposes a practical data mining methodology, referred to as domain-driven data mining, which targets actionable knowledge discovery in a constrained environment to satisfy user preferences. Domain-driven data mining consists of a DDID-PD framework that considers key components such as constraint-based context, integration of domain knowledge, human-machine cooperation, in-depth mining, actionability enhancement, and an iterative refinement process. We also illustrate examples of mining actionable correlations in the Australian Stock Exchange, which show that domain-driven data mining has the potential to further improve the actionability of patterns for practical use by industry and business.

Author(s):  
Longbing Cao ◽  
Chengqi Zhang

Traditional data mining, which is based on quantitative intelligence, faces grand challenges from real-world enterprise and cross-organization applications. For instance, the usual demonstration of specific algorithms does not help business users take actions that serve their advantage and needs. We attribute this to a data-driven philosophy focused on quantitative intelligence. That philosophy either views data mining as an autonomous, data-driven, trial-and-error process or analyzes business issues only in an isolated, case-by-case manner. Based on experience and lessons learnt from real-world data mining and complex systems, this article proposes a practical data mining methodology referred to as domain-driven data mining. On top of quantitative intelligence and hidden knowledge in data, domain-driven data mining aims to meta-synthesize quantitative and qualitative intelligence when mining complex applications in which humans are in the loop. It targets actionable knowledge discovery in a constrained environment to satisfy user preferences. The domain-driven methodology consists of key components including understanding the constrained environment, a business-technical questionnaire, representing and involving domain knowledge, human-mining cooperation and interaction, constructing next-generation mining infrastructure, in-depth pattern mining and postprocessing, business interestingness and actionability enhancement, and closed-loop, human-cooperated iterative refinement. Domain-driven data mining complements the data-driven methodology. The meta-synthesis of qualitative and quantitative intelligence has the potential to discover knowledge from complex systems and to enhance knowledge actionability for practical use by industry and business.


Author(s):  
LONGBING CAO ◽  
CHENGQI ZHANG

Traditionally, data mining is an autonomous, data-driven, trial-and-error process. Its typical task is to let data tell a story that discloses hidden information, and domain intelligence may not be necessary when the target is the demonstration of an algorithm. The knowledge discovered is often of little interest to business needs. By contrast, real-world applications rely on knowledge for taking effective actions. Looking back at the evolution of KDD, this paper briefly introduces domain-driven data mining as a complement to traditional KDD. It highlights domain intelligence as a route to actionable knowledge discovery, which involves aspects such as domain knowledge, people, environment, and evaluation. We illustrate it through mining activity patterns in social security data.


Author(s):  
Longbing Cao

Actionable knowledge discovery has been identified as one of the greatest challenges (Ankerst, 2002; Fayyad, Shapiro, & Uthurusamy, 2003) of next-generation knowledge discovery in databases (KDD) studies (Han & Kamber, 2006). In existing data mining, mined patterns are often not actionable with respect to real user needs. To enhance knowledge actionability, domain-related social intelligence is essential (Cao et al., 2006b). Involving domain-related social intelligence in data mining leads to domain-driven data mining (Cao & Zhang, 2006a, 2007a), which complements the traditional data-centered mining methodology. Domain-related social intelligence consists of the intelligence of humans, domain, environment, society, and cyberspace, and it complements data intelligence. Extending KDD toward domain-driven data mining raises many challenging but promising research and development issues. Studies of these issues may promote the paradigm shift of KDD from data-centered interesting pattern mining to domain-driven actionable knowledge discovery, and the deployment shift from simulated data sets to real-life data and business environments, as widely predicted.


2021 ◽  
Vol 4 ◽  
Author(s):  
Shailesh Tripathi ◽  
David Muhr ◽  
Manuel Brunner ◽  
Herbert Jodlbauer ◽  
Matthias Dehmer ◽  
...  

The Cross-Industry Standard Process for Data Mining (CRISP-DM) is a widely accepted framework in production and manufacturing. This data-driven knowledge discovery framework provides an orderly partition of the often complex data mining processes to ensure a practical implementation of data analytics and machine learning models. However, the practical application of robust, industry-specific, data-driven knowledge discovery models faces multiple issues related to data and model development. These issues need to be carefully addressed by allowing a flexible, customized, and industry-specific knowledge discovery framework. For this reason, extensions of CRISP-DM are needed. In this paper, we provide a detailed review of CRISP-DM and summarize extensions of this model into a novel framework we call the Generalized Cross-Industry Standard Process for Data Science (GCRISP-DS). This framework is designed to allow dynamic interactions between different phases so that data- and model-related issues are adequately addressed and robustness is achieved. Furthermore, it also emphasizes the need for a detailed business understanding and the interdependencies with the developed models and data quality for fulfilling higher business objectives. Overall, such a customizable GCRISP-DS framework supports model improvement and reusability by minimizing robustness issues.


Author(s):  
Aaron Ceglar ◽  
John Roddick ◽  
Paul Calder

Knowledge discovery is the process of eliciting interesting knowledge from data repositories. Due to the inability of computers to understand abstract concepts, present mining algorithms do not adequately constrain the generation of rules to those that are of interest to the user. Interactive mining techniques aim to alleviate this problem by involving the user in the mining process, so that the user’s understanding of abstract semantic concepts and domain knowledge can guide the discovery process, resulting in accelerated mining with improved results. This chapter presents a discussion of the current state of interactive data mining research.


2016 ◽  
pp. 1097-1118 ◽  
Author(s):  
Iman Barazandeh ◽  
Mohammad Reza Gholamian

The healthcare industry is one of the most attractive domains in which to realize the objectives of actionable knowledge discovery. This chapter studies recent research on knowledge discovery and data mining applications in the healthcare industry and proposes a new classification of these applications. Studies show that these applications can be classified into three major classes, namely patient view, market view, and system view. Patient view includes papers that performed pure data mining on healthcare industry data. Market view includes papers that treated patients as customers. System view includes papers that developed a decision support system. The goal of this classification is to identify research opportunities and gaps for researchers interested in this context.


2002 ◽  
Vol 01 (04) ◽  
pp. 657-672 ◽  
Author(s):  
BASILIS BOUTSINAS

Data mining is an emerging research area that develops techniques for knowledge discovery in huge volumes of data. Usually, data mining rules can be used to classify data into predefined classes, to partition a set of patterns into disjoint and homogeneous clusters, or to reveal frequent dependencies among data. The discovery of data mining rules would not be very useful unless there are mechanisms to help analysts access them in a meaningful way. Indeed, documenting and reporting the extracted knowledge is of considerable importance for the successful application of data mining in practice. In this paper, we propose a methodology for accessing data mining rules that is based on using an expert system. We present how the different types of data mining rules can be transformed into the domain knowledge of any general-purpose expert system. Then, we present how certain attribute values given by the user as facts and/or goals can determine, through forward and/or backward chaining, the related data mining rules. We also present a case study that demonstrates the applicability of the proposed methodology.
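
To make the idea in the abstract above concrete, the following is a minimal sketch of forward chaining over mined if-then rules. It is not the method from the paper: the `Rule` type, the `forward_chain` function, and the example rules and attribute values are all hypothetical and chosen only for illustration. The sketch assumes each mined rule has already been converted into a set of antecedent facts and a single consequent.

```python
from typing import FrozenSet, List, NamedTuple, Set


class Rule(NamedTuple):
    """A mined rule in if-then form: all antecedent facts imply the consequent."""
    antecedent: FrozenSet[str]
    consequent: str


def forward_chain(rules: List[Rule], facts: Set[str]) -> List[Rule]:
    """Return the rules that fire, in firing order, starting from the given facts."""
    known = set(facts)
    fired: List[Rule] = []
    changed = True
    while changed:  # repeat until a full pass fires no new rule
        changed = False
        for rule in rules:
            if rule not in fired and rule.antecedent <= known:
                fired.append(rule)
                known.add(rule.consequent)  # a derived fact may trigger later rules
                changed = True
    return fired


# Hypothetical rules, as if exported from a classifier or association miner.
rules = [
    Rule(frozenset({"income=high", "age=30-40"}), "segment=premium"),
    Rule(frozenset({"segment=premium"}), "offer=gold_card"),
    Rule(frozenset({"income=low"}), "segment=basic"),
]

# Attribute values supplied by the user act as the initial facts.
related = forward_chain(rules, {"income=high", "age=30-40"})
for rule in related:
    print(sorted(rule.antecedent), "->", rule.consequent)
```

Given the two user-supplied facts, the first rule fires directly and the second fires on the derived fact, so both are reported as related while the third is not. Backward chaining would start from a goal such as `offer=gold_card` and search for rules whose consequent matches it.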

