A Hybrid Decision Tree-Neural Network (DT-NN) Model for Large-Scale Classification Problems

Author(s): Jarrod Carson, Kane Hollingsworth, Rituparna Datta, George Clark, Aviv Segev
2018, Vol 77, pp. 187-194

Author(s): Emre Cimen, Gurkan Ozturk, Omer Nezih Gerek
2021, Vol 2021, pp. 1-12
Author(s): Bo Sun, Haiyan Chen

The k-nearest neighbor (kNN) classifier is simple and widely used; it can achieve performance comparable to more complex classifiers, including decision trees and artificial neural networks, and has accordingly been listed among the top 10 algorithms in machine learning and data mining. In many classification problems, however, such as medical diagnosis and intrusion detection, the collected training sets are class imbalanced. In such data, positive examples are heavily outnumbered by negative ones, yet they usually carry more meaningful information and are more important than negative examples. Like other classical classifiers, kNN is proposed under the assumption that the training set has an approximately balanced class distribution, which leads to unsatisfactory performance on imbalanced data. In addition, under class imbalance, the global resampling strategies that suit decision trees and artificial neural networks often do not work well for kNN, which is a local information-oriented classifier. To address this problem, researchers have produced many kNN variants over the past decade. This paper presents a comprehensive survey of these works, organized by their different perspectives, and analyzes and compares their characteristics. Finally, several future directions are pointed out.
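One local remedy of the kind the survey covers is to reweight neighbor votes against the class imbalance. A minimal sketch in which each neighbor's vote is scaled by the inverse frequency of its class (an illustrative scheme, not any specific surveyed algorithm):

```python
import math
from collections import Counter

def weighted_knn_predict(X_train, y_train, x, k=5):
    """kNN vote in which each neighbor is scaled by the inverse
    frequency of its class, so minority (positive) neighbors count
    more. Illustrative weighting, not a specific surveyed method."""
    counts = Counter(y_train)
    weight = {c: len(y_train) / n for c, n in counts.items()}
    # indices of the k nearest training points by Euclidean distance
    nn = sorted(range(len(X_train)),
                key=lambda i: math.dist(X_train[i], x))[:k]
    votes = Counter()
    for i in nn:
        votes[y_train[i]] += weight[y_train[i]]
    return votes.most_common(1)[0][0]
```

With 8 negative and 2 positive training points, a query whose 5 nearest neighbors split 3 negative / 2 positive is still labeled positive, because each positive vote carries weight 10/2 = 5 versus 10/8 = 1.25 for a negative vote.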


Author(s): Dejan Gjorgjevikj, Gjorgji Madjarov, Sašo Džeroski

Multi-label learning (MLL) problems abound in many areas, including text categorization, protein function classification, and semantic annotation of multimedia. An issue that severely limits the applicability of many current machine learning approaches to MLL is the large scale of many problems, which has a strong impact on the computational complexity of learning. This is especially pronounced for approaches that transform an MLL problem into a set of binary classification problems solved with Support Vector Machines (SVMs). On the other hand, the most efficient approaches to MLL, based on decision trees, have clearly lower predictive performance. We propose a hybrid decision tree architecture in which the leaves do not give multi-label predictions directly, but instead use local SVM-based classifiers that do. A binary relevance architecture is employed in the leaves: a binary SVM classifier is built for each label relevant to that particular leaf. We evaluate the proposed method against related and state-of-the-art methods on a broad range of multi-label datasets with a variety of evaluation measures, both in terms of predictive performance and time complexity. Our hybrid architecture outperforms the competing approaches in predictive performance on almost every large classification problem, while its computational efficiency is significantly improved as a result of the integrated decision tree.
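The leaf-level binary relevance step described above can be sketched as follows. The one-feature stump routing and all names here are illustrative stand-ins: the paper builds a full decision tree and then trains one binary SVM per label relevant to each leaf.

```python
def route_to_leaves(X, feature=0, threshold=0.5):
    """A one-split stump standing in for the decision tree:
    each example goes to leaf 0 or leaf 1 by one feature threshold."""
    return [0 if row[feature] <= threshold else 1 for row in X]

def binary_relevance_tasks(X, Y):
    """Per leaf, emit one binary training task for each label that
    is relevant (positive at least once) in that leaf; in the full
    method each task would be handed to a binary SVM classifier."""
    leaves = route_to_leaves(X)
    n_labels = len(Y[0])
    tasks = {}
    for leaf in set(leaves):
        idx = [i for i, l in enumerate(leaves) if l == leaf]
        relevant = [j for j in range(n_labels)
                    if any(Y[i][j] for i in idx)]
        tasks[leaf] = {j: ([X[i] for i in idx], [Y[i][j] for i in idx])
                       for j in relevant}
    return tasks
```

Labels absent from a leaf get no classifier there, which is where the claimed efficiency gain over training one global SVM per label comes from.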


Electronics, 2020, Vol 9 (5), pp. 792
Author(s): Dongbao Jia, Yuka Fujishita, Cunhua Li, Yuki Todo, Hongwei Dai

With its simple structure and low cost, the dendritic neuron model (DNM) is used as a neuron model for solving complex problems, such as nonlinear problems, with high precision. Although the DNM achieves higher accuracy and effectiveness than the middle layer of a multilayer perceptron on small-scale classification problems, it has not previously been applied to large-scale classification problems. To achieve better performance on practical problems, this experiment uses an approximate Newton-type method, a neural network with random weights, for comparison, and trains the DNM with three learning algorithms: back-propagation (BP), biogeography-based optimization (BBO), and a competitive swarm optimizer (CSO). Three classification problems are then solved with these learning algorithms to verify their precision and effectiveness on large-scale classification problems. As a consequence, DNM + BP is optimal in execution time; DNM + CSO is best in terms of both accuracy and stability; and considering the stability of comprehensive performance and the convergence rate, DNM + BBO is a wise choice.
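As commonly formulated (sigmoid synapses, multiplicative dendrite branches, a summing membrane, and a sigmoid soma), the DNM's forward pass can be sketched as below; the constants `k`, `ks`, and `theta_s` and all names are illustrative, and the learnable parameters are the synaptic weights `W` and thresholds `Theta`.

```python
import math

def dnm_forward(x, W, Theta, k=5.0, ks=5.0, theta_s=0.5):
    """Forward pass of a dendritic neuron model in its common
    formulation: synapse -> dendritic product -> membrane sum ->
    soma sigmoid. W and Theta are (dendrites x inputs) matrices."""
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    V = 0.0
    for w_row, t_row in zip(W, Theta):       # one dendrite branch per row
        branch = 1.0
        for xi, w, t in zip(x, w_row, t_row):
            branch *= sig(k * (w * xi - t))  # synaptic nonlinearity
        V += branch                          # membrane sums the branches
    return sig(ks * (V - theta_s))           # soma output in (0, 1)
```

BP would train `W` and `Theta` by gradient descent through this pass, while BBO and CSO treat the same parameters as a population of candidate solutions and search without gradients.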


Author(s): Ziad Akram Ali Hammouri, Manuel Fernandez Delgado, Eva Cernadas, Senen Barro
