PCA for heterogeneous data sets in a distributed data mining

Author(s): E. Chandra, P. Ajitha
Author(s): Riyaz Sikora, O'la Al-Laymoun

Distributed data mining and ensemble learning are two approaches to the problem of scaling data mining to the large volumes of data now being collected. Distributed data mining studies how data spread across multiple sites can be mined effectively without first collecting it at one central location. Ensemble learning techniques build a meta-classifier by combining several classifiers trained on the same data, with the aim of improving on their individual performance. In this chapter, the authors draw on both fields to create a modified and improved version of the standard stacking ensemble learning technique that uses a Genetic Algorithm (GA) to create the meta-classifier. They test the GA-based stacking algorithm on ten data sets from the UCI Data Repository and report improved performance over both the individual learning algorithms and the standard stacking algorithm.
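
As a rough illustration of the idea of using a GA to build a stacking meta-classifier, the sketch below evolves the weights of a weighted vote over a few fixed base classifiers. Everything in it is an assumption made for illustration and not taken from the chapter: the synthetic data, the four threshold rules in `BASE`, the chromosome encoding (one weight per base classifier plus a decision threshold), and the GA settings (`pop_size`, `generations`, `mut_rate`). The abstract does not describe the authors' actual encoding or operators.

```python
import random

def make_data(n, rng):
    """Synthetic 2-feature binary data (assumed): label is 1 when x0 + x1 > 1."""
    rows = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        rows.append((x, 1 if x[0] + x[1] > 1.0 else 0))
    return rows

# Hypothetical base classifiers: weak rules that each look at one feature.
BASE = [
    lambda x: 1 if x[0] > 0.5 else 0,
    lambda x: 1 if x[1] > 0.5 else 0,
    lambda x: 1 if x[0] > 0.8 else 0,
    lambda x: 1 if x[1] > 0.2 else 0,
]

def meta_predict(weights, x):
    """Weighted vote over base predictions; weights[-1] is the decision threshold."""
    score = sum(w * clf(x) for w, clf in zip(weights, BASE))
    return 1 if score > weights[-1] else 0

def accuracy(weights, data):
    """Fraction of rows the meta-classifier labels correctly (the GA fitness)."""
    return sum(meta_predict(weights, x) == y for x, y in data) / len(data)

def evolve(train, rng, pop_size=30, generations=40, mut_rate=0.2):
    """Simple elitist GA: keep the best half, refill with crossover + mutation."""
    n_genes = len(BASE) + 1
    pop = [[rng.uniform(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: accuracy(w, train), reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_genes)          # one-point crossover
            child = a[:cut] + b[cut:]
            child = [g + rng.gauss(0, 0.1) if rng.random() < mut_rate else g
                     for g in child]                 # Gaussian mutation per gene
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda w: accuracy(w, train))

rng = random.Random(0)
train, test = make_data(300, rng), make_data(300, rng)
best = evolve(train, rng)
base_train_acc = [sum(clf(x) == y for x, y in train) / len(train) for clf in BASE]
print("base accuracies (train):", base_train_acc)
print("GA meta-classifier accuracy (train):", accuracy(best, train))
print("GA meta-classifier accuracy (test):", accuracy(best, test))
```

Because the best half of the population is carried over unchanged, training fitness never decreases from one generation to the next. Whether the evolved combiner beats the individual rules on held-out data depends on the data and the base classifiers, so the sketch prints the accuracies without claiming a particular outcome.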


2017, Vol 93, pp. 23-30
Author(s): Xavier Limón, Alejandro Guerra-Hernández, Nicandro Cruz-Ramírez, Héctor-Gabriel Acosta-Mesa, Francisco Grimaldo
