Swarm intelligence for data mining classification tasks: an experimental study using medical decision problems

Author(s):  
Jose A. Saez ◽  
Emilio Corchado
2006 ◽  
Vol 14 (2) ◽  
pp. 183-221 ◽  
Author(s):  
Jorge Muruzábal

This article presents BYP CS (BaYesian Predictive Classifier System), a new Classifier System framework for classification tasks. The proposed CS approach abandons the focus on high accuracy and addresses a well-posed Data Mining goal, namely, that of uncovering the low-uncertainty patterns of dependence that often manifest in the data. To attain this goal, BYP CS uses a fair amount of probabilistic machinery, which brings its representation language closer to related methods of interest in statistics and machine learning. On the practical side, the new algorithm is seen to yield stable learning of compact populations that still maintain a respectable amount of predictive power. Furthermore, the emerging rules self-organize in interesting ways, sometimes providing unexpected solutions to certain benchmark problems.


Author(s):  
Awder Mohammed Ahmed ◽  
Adnan Mohsin Abdulazeez ◽  

Multi-label classification addresses problems in which more than one class label is assigned to each instance. Many real-world multi-label classification tasks are high-dimensional because of digital technologies, which degrades the performance of traditional multi-label classifiers. Feature selection is a common and successful approach to this problem: it retains relevant features and eliminates redundant ones to reduce dimensionality. Several feature selection methods have been applied successfully in multi-label learning. Most of them are wrapper methods that employ a multi-label classifier in their processes. They run a classifier at each step, which incurs a high computational cost, and thus they suffer from scalability issues. To deal with this issue, filter methods were introduced; they evaluate feature subsets using information-theoretic mechanisms instead of running classifiers. Most existing research and review papers deal with feature selection in single-label data, although multi-label classification now has a wide range of real-world applications such as image classification, emotion analysis, text mining, and bioinformatics. Moreover, researchers have recently focused on applying swarm intelligence methods to select prominent features of multi-label data. To the best of our knowledge, no review paper covers swarm intelligence-based methods for multi-label feature selection. Thus, in this paper, we provide a comprehensive review of swarm intelligence and evolutionary computing methods of feature selection for multi-label classification tasks. To this end, we investigate most of the well-known and state-of-the-art methods and categorize them from different perspectives. We then present the main characteristics of the existing multi-label feature selection techniques and compare them analytically. We also introduce benchmarks, evaluation measures, and standard datasets to facilitate research in this field. Moreover, we report experiments comparing existing works, and at the end of the survey we introduce challenges, issues, and open problems for researchers to consider in the future.
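
The contrast between wrapper and filter methods drawn in this abstract can be illustrated with a minimal sketch of a filter-style ranking. It is not a method from the reviewed paper: it assumes discrete features and binary labels, and it scores each feature by the sum of its empirical mutual information with every label, without training any classifier. The function names `mutual_information` and `rank_features` and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_features(X, Y):
    """Score each feature by its summed MI with all labels (illustrative criterion).

    X: list of rows of discrete feature values.
    Y: list of rows of binary label indicators.
    Returns (feature indices sorted by decreasing score, list of scores).
    """
    n_features = len(X[0])
    n_labels = len(Y[0])
    label_columns = [[row[k] for row in Y] for k in range(n_labels)]
    scores = []
    for j in range(n_features):
        feature_column = [row[j] for row in X]
        scores.append(
            sum(mutual_information(feature_column, lab) for lab in label_columns)
        )
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy data (hypothetical): feature 0 mirrors label 0, feature 1 mirrors
# label 1, and feature 2 is statistically independent of both labels.
X = [
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 1],
    [1, 0, 0],
]
Y = [
    [0, 1],
    [0, 0],
    [1, 1],
    [1, 0],
]

order, scores = rank_features(X, Y)
print(order, scores)
```

Because the score needs only counts over the data, its cost does not depend on any classifier, which is the scalability argument made for filter methods. Summing per-label mutual information ignores label dependence and feature redundancy, so it should be read as a baseline rather than a recommended criterion.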


Author(s):  
Markos G. Tsipouras ◽  
Themis P. Exarchos ◽  
Dimitrios I. Fotiadis ◽  
Aris Bechlioulis ◽  
Katerina K. Naka

This article addresses computer-based decision support for cardiovascular diseases, focusing on the diagnosis of coronary artery disease (CAD) and on the prediction of clinical restenosis in patients undergoing angioplasty. Methods reported in the literature are reviewed with respect to (i) the medical information they employ to reach the diagnosis and (ii) the data analysis techniques used to create the clinical decision support systems (CDSSs). Regarding medical information, data that are obtained easily and noninvasively present several advantages over other types of data, while data analysis techniques whose decisions are transparent are more suitable for medical decision making. A recently developed approach that complies with these requirements is presented. The approach is based on data mining and fuzzy modelling. Using this approach, one CDSS has been developed for each of the two cardiovascular problems mentioned above. These CDSSs are extensively evaluated, and medical experts comment on the discovered knowledge. The latter is of great importance in designing and evaluating CDSSs, since it allows them to be integrated into real clinical environments.

