Medical Domain Knowledge and Associative Classification Rules in Diagnosis

Author(s):  
Sung Ho Ha

Hospital information systems have for decades been hampered by problems that include congestion, long wait times, and delayed patient care. Data mining techniques have long been used in medical research to address such problems and are known to be effective. This study therefore builds a hybrid data mining methodology that combines medical domain knowledge with associative classification rules. Real-world emergency data are collected from a hospital, and the methodology is evaluated by comparing it with other techniques. The methodology is expected to help physicians make rapid and accurate diagnoses of chest diseases.
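As a rough illustration of what an associative classification rule is, the sketch below mines rules of the form "set of findings -> class" by support and confidence and then classifies a new record with the first matching rule. It is not the methodology of the study above: the function names, thresholds, and toy records are all hypothetical, and the domain-knowledge component is not modeled.

```python
from collections import Counter
from itertools import combinations


def mine_class_rules(records, min_support=0.4, min_confidence=0.8, max_len=2):
    """Mine rules (itemset -> label) from (set_of_items, label) records."""
    n = len(records)
    itemset_counts = Counter()  # how often each itemset occurs at all
    rule_counts = Counter()     # how often each itemset occurs with a label
    for items, label in records:
        for k in range(1, max_len + 1):
            for combo in combinations(sorted(items), k):
                itemset_counts[combo] += 1
                rule_counts[(combo, label)] += 1

    rules = []
    for (combo, label), count in rule_counts.items():
        support = count / n                       # P(itemset and label)
        confidence = count / itemset_counts[combo]  # P(label | itemset)
        if support >= min_support and confidence >= min_confidence:
            rules.append((combo, label, support, confidence))

    # Rank by confidence, then support, then shorter antecedents first.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0]), r[0]))
    return rules


def classify(rules, items, default=None):
    """Return the label of the highest-ranked rule whose antecedent matches."""
    for combo, label, _support, _confidence in rules:
        if set(combo) <= items:
            return label
    return default


# Hypothetical toy records: abstract findings and abstract class labels.
records = [
    ({"finding_a", "finding_b"}, "class_X"),
    ({"finding_a", "finding_b"}, "class_X"),
    ({"finding_a", "finding_c"}, "class_Y"),
    ({"finding_c", "finding_d"}, "class_Y"),
    ({"finding_c", "finding_d"}, "class_Y"),
]
rules = mine_class_rules(records)
```

With these toy records, `classify(rules, {"finding_b"})` follows the rule `finding_b -> class_X`, while a record containing only `finding_a` matches no rule and falls back to the default. Real associative classifiers typically add rule pruning and a default class, which this sketch omits.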

Author(s):  
Longbing Cao ◽  
Chengqi Zhang

Traditional data mining based on quantitative intelligence faces grand challenges from real-world enterprise and cross-organization applications. For instance, the usual demonstration of specific algorithms does not help business users take actions that serve their advantage and needs. We attribute this to a data-driven philosophy focused on quantitative intelligence, which either views data mining as an autonomous, data-driven, trial-and-error process or analyzes business issues only in an isolated, case-by-case manner. Based on experience and lessons learned from real-world data mining and complex systems, this article proposes a practical data mining methodology referred to as Domain-Driven Data Mining. Building on quantitative intelligence and the hidden knowledge in data, domain-driven data mining aims to meta-synthesize quantitative and qualitative intelligence when mining complex applications with a human in the loop. It targets actionable knowledge discovery in a constrained environment to satisfy user preferences. The domain-driven methodology consists of key components including understanding the constrained environment, a business-technical questionnaire, representing and involving domain knowledge, human-mining cooperation and interaction, constructing next-generation mining infrastructure, in-depth pattern mining and postprocessing, business interestingness and actionability enhancement, and closed-loop, human-cooperated iterative refinement. Domain-driven data mining complements the data-driven methodology: the meta-synthesis of qualitative and quantitative intelligence has the potential to discover knowledge from complex systems and to enhance knowledge actionability for practical use by industry and business.


2009 ◽  
Vol 18 (01) ◽  
pp. 81-98 ◽  
Author(s):  
MARTIN ATZMUELLER ◽  
FRANK PUPPE ◽  
HANS-PETER BUSCHER

This paper presents a semi-automatic approach for confounding-aware subgroup discovery. Confounding distorts the measured effect of an association between variables through the influence of other parameters that were not considered. The proposed method is embedded into a general subgroup discovery approach and provides the means for detecting potentially confounded subgroup patterns, other unconfounded relations, and/or patterns that are affected by effect modification. Since there is no purely automatic test for confounding, the discovered relations are presented to the user in a semi-automatic approach. Furthermore, we utilize (causal) domain knowledge to improve the results of the algorithm, since confounding is itself a causal concept. The applicability and benefit of the presented technique are illustrated by real-world examples from a case study in the medical domain.


2012 ◽  
Vol 11 (02) ◽  
pp. 389-400 ◽  
Author(s):  
GABOR MELLI ◽  
XINDONG WU ◽  
PAUL BEINAT ◽  
FRANCESCO BONCHI ◽  
LONGBING CAO ◽  
...  

We report on the panel discussion held at the ICDM'10 conference on the top 10 data mining case studies in order to provide a snapshot of where and how data mining techniques have made significant real-world impact. The tasks covered by the 10 case studies range from the detection of anomalies such as cancer, fraud, and system failures to the optimization of organizational operations, and include the automated extraction of information from unstructured sources. From the 10 cases we find that supervised methods prevail while unsupervised techniques play a supporting role. Further, significant domain knowledge is generally required to achieve a completed solution. Finally, we find that successful applications are more commonly associated with continual improvement than with single "aha moments" of knowledge ("nugget") discovery.


Data ◽  
2020 ◽  
Vol 6 (1) ◽  
pp. 1
Author(s):  
Ahmed Elmogy ◽  
Hamada Rizk ◽  
Amany M. Sarhan

In data mining, outlier detection is a major challenge because it plays an important role in many applications, such as medical data, image processing, fraud detection, and intrusion detection. A wide variety of clustering-based approaches have been developed to detect outliers. However, they are by nature time-consuming, which restricts their use in real-time applications. Furthermore, outlier detection requests are handled one at a time, which means that each request is initiated individually with a particular set of parameters. In this paper, the first clustering-based outlier detection framework, On the Fly Clustering Based Outlier Detection (OFCOD), is presented. OFCOD enables analysts to find outliers effectively, on time and on request, even within huge datasets. The proposed framework has been tested and evaluated using two real-world datasets with different features and applications: one with 699 records and another with five million records. The experimental results show that the proposed framework outperforms other existing approaches across several evaluation metrics.
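For readers unfamiliar with clustering-based outlier detection in general, a minimal sketch follows. It is not OFCOD, whose internals the abstract does not describe: it runs a plain k-means with caller-supplied starting centroids and flags points whose distance to the nearest centroid exceeds a multiple of the median distance. The function names, the `factor` threshold rule, and the toy points are assumptions made for illustration.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means starting from the given centroids (assumed supplied)."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid.
            idx = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Move each centroid to the mean of its members (keep it if empty).
        centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


def find_outliers(points, centroids, factor=3.0):
    """Flag points farther than factor * median nearest-centroid distance."""
    dists = [min(math.dist(p, c) for c in centroids) for p in points]
    median = sorted(dists)[len(dists) // 2]
    return [p for p, d in zip(points, dists) if d > factor * median]


# Hypothetical toy data: two tight groups plus one stray point.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
    (5.0, 20.0),
]
centroids = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
outliers = find_outliers(points, centroids)
```

On this toy data only the stray point `(5.0, 20.0)` is flagged. Note that the stray point still pulls its cluster's centroid toward itself, one reason practical methods use more robust scoring than this sketch does.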


2021 ◽  
pp. 111144
Author(s):  
Yuzhou Wang ◽  
Zhengfei Li ◽  
Huanxin Chen ◽  
Jianxin Zhang ◽  
Qian Liu ◽  
...  

2017 ◽  
Vol 27 (1) ◽  
pp. 169-180 ◽  
Author(s):  
Marton Szemenyei ◽  
Ferenc Vajda

Dimension reduction and feature selection are fundamental tools for machine learning and data mining. Most existing methods, however, assume that objects are represented by a single vectorial descriptor. In reality, some description methods assign unordered sets or graphs of vectors to a single object, where each vector is assumed to have the same number of dimensions, but is drawn from a different probability distribution. Moreover, some applications (such as pose estimation) may require the recognition of individual vectors (nodes) of an object. In such cases it is essential that the nodes within a single object remain distinguishable after dimension reduction. In this paper we propose new discriminant analysis methods that are able to satisfy two criteria at the same time: separating between classes and between the nodes of an object instance. We analyze and evaluate our methods on several different synthetic and real-world datasets.

