Mapping Ensembles of Trees to Sparse, Interpretable Multilayer Perceptron Networks

2020 · Vol 1 (5)
Author(s): Dalia Rodríguez-Salas, Nina Mürschberger, Nishant Ravikumar, Mathias Seuret, Andreas Maier

Abstract Tree-based classifiers provide easy-to-understand outputs. Artificial neural networks (ANNs) commonly outperform tree-based classifiers; nevertheless, understanding their outputs requires specialized knowledge in most cases. The highly redundant architecture of an ANN is typically designed through an expensive trial-and-error scheme. We aim to (1) investigate whether using ensembles of decision trees to design the architecture of low-redundancy, sparse ANNs yields better-performing networks, and (2) evaluate whether such trees can provide human-understandable explanations for the networks' outputs. From each branch in an ensemble of trees, we gather information about the hierarchy of the features and how well they separate subsets of samples among the classes. This information is used to design the architecture of a sparse multilayer perceptron network. Networks built with our method are called ForestNets. Tree branches corresponding to highly activated neurons are used to explain the networks' outputs. ForestNets can handle both low- and high-dimensional data, as we show in an evaluation on four datasets. Our networks consistently outperformed their respective ensembles of trees and matched the performance of their fully connected counterparts with a significant reduction in connections. Furthermore, our interpretation method appears to support the ForestNet outputs. Although ForestNet architectures do not yet capture the intrinsic variability of visual data well, they show very promising results, removing more than 98% of connections on such visual tasks. Structural similarities between ForestNets and their respective tree ensembles provide a means to interpret their outputs.
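The core idea of deriving a sparse architecture from tree structure can be illustrated with a minimal sketch. The code below is not the authors' implementation: it maps a single fitted decision tree to one hidden layer, where each internal split node becomes a neuron connected only to the feature it tests; the dataset, depth, and class names are illustrative assumptions.

```python
# Minimal sketch (not the ForestNet code): build a sparse MLP whose
# first-layer connectivity mirrors the split structure of one tree.
import numpy as np
import torch
import torch.nn as nn
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

t = tree.tree_
internal = np.where(t.children_left != -1)[0]   # internal (split) nodes
n_features, n_hidden = X.shape[1], len(internal)

# Connectivity mask: hidden neuron j sees only its node's split feature.
mask = torch.zeros(n_hidden, n_features)
for j, node in enumerate(internal):
    mask[j, t.feature[node]] = 1.0

class SparseLinear(nn.Linear):
    """Linear layer whose weights are gated by a fixed binary mask."""
    def __init__(self, mask):
        super().__init__(mask.shape[1], mask.shape[0])
        self.register_buffer("mask", mask)

    def forward(self, x):
        return nn.functional.linear(x, self.weight * self.mask, self.bias)

net = nn.Sequential(SparseLinear(mask), nn.ReLU(), nn.Linear(n_hidden, 3))
print(f"active input connections: {int(mask.sum())} / {n_hidden * n_features}")
```

In the paper's setting, information is aggregated from every branch of an ensemble rather than a single tree, but the masking mechanism above shows how tree-derived connectivity yields the reported reduction in connections.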

Author(s): Julio Fernández-Ceniceros, Andrés Sanz-García, Fernando Antoñanzas-Torres, F. Javier Martínez-de-Pisón-Ascacibar

2015 · Vol 23 (2) · pp. 1634-1641
Author(s): Hamza Abderrahim, Mohammed Reda Chellali, Ahmed Hamou

Entropy · 2020 · Vol 22 (7) · pp. 727
Author(s): Hlynur Jónsson, Giovanni Cherubini, Evangelos Eleftheriou

Information theory concepts are leveraged with the goal of better understanding and improving Deep Neural Networks (DNNs). The information plane of a neural network describes the behavior, during training, of the mutual information between the input/output and the hidden-layer variables at various depths. Previous analyses revealed that, in networks where the finiteness of the mutual information can be established, most of the training epochs are spent on compressing the input. However, estimating mutual information is nontrivial for high-dimensional continuous random variables. Therefore, computing the mutual information for DNNs and visualizing it on the information plane has mostly been limited to low-complexity fully connected networks; indeed, even the existence of the compression phase in complex DNNs has been questioned and viewed as an open problem. In this paper, we present the convergence of mutual information on the information plane for a high-dimensional VGG-16 Convolutional Neural Network (CNN) by resorting to Mutual Information Neural Estimation (MINE), thus confirming and extending the results obtained with low-dimensional fully connected networks. Furthermore, we demonstrate the benefits of regularizing a network, especially for a large number of training epochs, by adopting mutual information estimates as additional terms in the network's loss function. Experimental results show that the regularization stabilizes the test accuracy and significantly reduces its variance.
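MINE trains a small "statistics network" T to maximize the Donsker-Varadhan lower bound I(X; Z) >= E_{p(x,z)}[T(x,z)] - log E_{p(x)p(z)}[exp(T(x,z))]. The sketch below shows this estimator in a generic form; the network size, learning rate, and toy Gaussian data are assumptions for illustration, not the paper's VGG-16 setup.

```python
# Minimal sketch of a MINE-style estimator (Belghazi et al., 2018).
import math
import torch
import torch.nn as nn

class StatisticsNetwork(nn.Module):
    """Scores (x, z) pairs; higher under the joint than the product."""
    def __init__(self, x_dim, z_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=1)).squeeze(-1)

def mine_lower_bound(T, x, z):
    joint = T(x, z).mean()                   # expectation under p(x, z)
    z_perm = z[torch.randperm(z.size(0))]    # break pairing -> p(x)p(z)
    marginal = torch.logsumexp(T(x, z_perm), dim=0) - math.log(z.size(0))
    return joint - marginal                  # lower bound on I(X; Z)

# Toy usage: estimate I between correlated Gaussian variables.
x = torch.randn(512, 4)
z = x + 0.1 * torch.randn(512, 4)
T = StatisticsNetwork(4, 4)
opt = torch.optim.Adam(T.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = -mine_lower_bound(T, x, z)        # maximize the bound
    loss.backward()
    opt.step()
print(f"estimated I(X; Z) >= {-loss.item():.3f} nats")
```

For the regularization described in the abstract, such an estimate would be added as an extra term to the classification loss; the weighting of that term is a hyperparameter not specified here.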


2019 · Vol 9 (1)
Author(s): Laura Gagliano, Elie Bou Assi, Dang K. Nguyen, Mohamad Sawan

Abstract This work proposes a novel approach for classifying interictal and preictal brain states based on bispectrum analysis and recurrent Long Short-Term Memory (LSTM) neural networks. Two features were first extracted from bilateral intracranial electroencephalography (iEEG) recordings of dogs with naturally occurring focal epilepsy. Single-layer LSTM networks were trained to classify 5-min-long feature vectors as preictal or interictal. Classification performance was compared to previous work on the same dataset that used multilayer perceptron networks with higher-order spectral (HOS) features. The proposed LSTM network proved superior to the multilayer perceptron network, achieving an average classification accuracy of 86.29% on held-out data. These results suggest the feasibility of forecasting epileptic seizures using recurrent neural networks with minimal feature extraction.
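A minimal sketch of this kind of single-layer LSTM classifier is shown below. It is not the study's exact architecture: the hidden size, the 300-step window (assuming one feature vector per second over 5 minutes), and the two-feature input dimension are assumed values chosen for illustration.

```python
# Minimal sketch: single-layer LSTM classifying a sequence of
# bispectrum-derived feature vectors as preictal (1) or interictal (0).
import torch
import torch.nn as nn

class SeizureStateLSTM(nn.Module):
    def __init__(self, n_features=2, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden,
                            num_layers=1, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):              # x: (batch, time, n_features)
        _, (h_n, _) = self.lstm(x)     # h_n: (1, batch, hidden)
        return self.head(h_n[-1])      # one logit per sequence

model = SeizureStateLSTM()
criterion = nn.BCEWithLogitsLoss()

# Toy batch: 8 five-minute windows, 300 steps, 2 HOS-derived features.
x = torch.randn(8, 300, 2)
y = torch.randint(0, 2, (8, 1)).float()
loss = criterion(model(x), y)
loss.backward()
```

The final hidden state summarizes the whole window, so a single linear head suffices for the binary preictal/interictal decision.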

