Understanding Time-Series Networks: A Case Study in Rule Extraction

1997 ◽  
Vol 08 (04) ◽  
pp. 373-384 ◽  
Author(s):  
Mark W. Craven ◽  
Jude W. Shavlik

A significant limitation of neural networks is that the representations they learn are usually incomprehensible to humans. We have developed an algorithm, called TREPAN, for extracting comprehensible, symbolic representations from trained neural networks. Given a trained network, TREPAN produces a decision tree that approximates the concept represented by the network. In this article, we discuss the application of TREPAN to a neural network trained on a noisy time-series task: predicting the Dollar–Mark exchange rate. We present experiments showing that TREPAN is able to extract a decision tree from this network that equals the network in predictive accuracy, yet provides a comprehensible concept representation. Moreover, our experiments indicate that decision trees induced directly from the training data using conventional algorithms match neither the accuracy nor the comprehensibility of the tree extracted by TREPAN.
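The core of this kind of extraction can be sketched as surrogate learning: query the trained network as an oracle and fit a tree to the network's labels rather than the true ones. The sketch below is a minimal, hypothetical illustration, not the published TREPAN algorithm (which grows the tree best-first with m-of-n splits); all names and data are invented.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Toy stand-in for a trained network: fit an MLP on synthetic data.
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0).fit(X, y)

# Surrogate extraction: query the network as an oracle on fresh inputs and
# fit a tree to the *network's* labels, not the ground-truth ones.
X_query = rng.normal(size=(2000, 4))   # extra queries are cheap: the oracle is the net
y_oracle = net.predict(X_query)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_query, y_oracle)

# Fidelity: how often the tree agrees with the network it approximates.
fidelity = (tree.predict(X_query) == y_oracle).mean()
```

Fidelity, not accuracy on the true labels, is the quantity an extraction method optimizes: the tree is judged on how faithfully it mimics the network.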

2021 ◽  
Vol 11 (15) ◽  
pp. 6723
Author(s):  
Ariana Raluca Hategan ◽  
Romulus Puscas ◽  
Gabriela Cristea ◽  
Adriana Dehelean ◽  
Francois Guyon ◽  
...  

The present work aims to test the potential of Artificial Neural Networks (ANNs) for food authentication. For this purpose, honey was chosen as the working matrix. The samples originated from two countries, Romania (50) and France (53), and covered the floral origins acacia, linden, honeydew, colza, Galium verum, coriander, sunflower, thyme, raspberry, lavender and chestnut. The ANNs were built on the isotope and elemental content of the investigated honey samples. This approach led to a prediction model for geographical recognition with an accuracy of 96%. Alongside this work, distinct models were developed and tested with the aim of identifying the most suitable configurations for this application. In this regard, improvements were made continuously; the most important of them consisted in overcoming the unwanted phenomenon of over-fitting observed on the training data set. This was achieved by identifying appropriate values for the number of iterations over the training data and for the size and number of the hidden layers, and by introducing a dropout layer into the configuration of the neural structure. In conclusion, ANNs can be successfully applied in food authenticity control, but with a degree of caution with respect to “over-optimization” of the correct classification percentage for the training sample set, which can lead to an over-fitted model.
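The dropout mechanism credited here with curbing over-fitting can be shown in a few lines of plain NumPy. This is a generic "inverted dropout" sketch, not the authors' specific network; the rate and shapes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)

def dropout(activations, rate, training):
    """Inverted dropout: zero a fraction `rate` of units during training
    and rescale the survivors so the expected activation is unchanged."""
    if not training or rate == 0.0:
        return activations                      # identity at inference time
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)

h = np.ones((4, 10))                            # a batch of hidden activations
h_train = dropout(h, rate=0.5, training=True)   # some units zeroed, rest scaled to 2.0
h_eval = dropout(h, rate=0.5, training=False)   # unchanged at evaluation time
```

Because dropped units change every batch, no single hidden unit can be relied on exclusively, which is why the technique discourages memorization of the training set.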


Author(s):  
MARK LAST ◽  
ODED MAIMON ◽  
EINAT MINKOV

Decision-tree algorithms are known to be unstable: small variations in the training set can result in different trees and different predictions for the same validation examples. Both accuracy and stability can be improved by learning multiple models from bootstrap samples of the training data, but this "meta-learner" approach makes the extracted knowledge hardly interpretable. In this paper, we present the Info-Fuzzy Network (IFN), a novel information-theoretic method for building stable and comprehensible decision-tree models. The stability of the IFN algorithm is ensured by restricting the tree structure to use the same feature for all nodes of the same tree level, and by built-in statistical significance tests. The IFN method is shown empirically to produce more compact and stable models than the "meta-learner" techniques, while preserving a reasonable level of predictive accuracy.
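The one-feature-per-level restriction can be illustrated by selecting, for an entire level at once, the unused feature with the largest information gain. This is a simplified sketch under invented data, not the full IFN procedure (which also applies significance tests before accepting a split).

```python
import numpy as np

def entropy(y):
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def level_feature(X, y, used):
    """Pick ONE feature for an entire tree level (the IFN restriction):
    the unused feature with the largest information gain on the data."""
    base = entropy(y)
    gains = {}
    for j in range(X.shape[1]):
        if j in used:
            continue
        gain = base
        for v in np.unique(X[:, j]):
            mask = X[:, j] == v
            gain -= mask.mean() * entropy(y[mask])   # weighted child entropy
        gains[j] = gain
    return max(gains, key=gains.get)

rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(200, 3))
y = X[:, 2] ^ (rng.random(200) < 0.1)   # target driven by feature 2, plus noise
first = level_feature(X, y, used=set())
```

Because every node on a level tests the same feature, two runs on slightly perturbed data can only disagree about a whole level at a time, which is the source of the claimed stability.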


Energies ◽  
2021 ◽  
Vol 14 (14) ◽  
pp. 4107
Author(s):  
Akylas Stratigakos ◽  
Athanasios Bachoumis ◽  
Vasiliki Vita ◽  
Elias Zafiropoulos

Short-term electricity load forecasting is key to the safe, reliable, and economical operation of power systems. An important challenge that arises with high-frequency load series, e.g., hourly load, is how to deal with the complex seasonal patterns that are present. Standard approaches suggest either removing seasonality prior to modeling or applying time series decomposition. This work proposes a hybrid approach that combines Singular Spectrum Analysis (SSA)-based decomposition and Artificial Neural Networks (ANNs) for day-ahead hourly load forecasting. First, the trajectory matrix of the time series is constructed and decomposed into trend, oscillating, and noise components. Next, the extracted components are employed as exogenous regressors in a global forecasting model, comprising either a Multilayer Perceptron (MLP) or a Long Short-Term Memory (LSTM) predictive layer. The model is further extended to include exogenous features, e.g., weather forecasts, transformed via parallel dense layers. The predictive performance is evaluated on two real-world datasets, controlling for the effect of exogenous features on predictive accuracy. The results showcase that the decomposition step improves the relative performance of the ANN models, with the combination of LSTM and SSA providing the best overall performance.
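The SSA decomposition step described above follows a standard recipe: embed the series in a trajectory (Hankel) matrix, take its SVD, and map each rank-1 piece back to a series by diagonal averaging. The sketch below is a basic textbook SSA on a synthetic series, not the authors' pipeline; `ssa_components` and the window length are invented for illustration.

```python
import numpy as np

def ssa_components(series, window):
    """Basic SSA: Hankel embedding, SVD, and diagonal averaging
    (Hankelization) to recover one elementary series per singular value."""
    n = len(series)
    k = n - window + 1
    traj = np.column_stack([series[i:i + window] for i in range(k)])
    U, s, Vt = np.linalg.svd(traj, full_matrices=False)
    comps = []
    for r in range(len(s)):
        elem = s[r] * np.outer(U[:, r], Vt[r])       # rank-1 piece of traj
        # Average each anti-diagonal to map the matrix back to a series.
        rec = np.array([np.mean(elem[::-1].diagonal(i - window + 1))
                        for i in range(n)])
        comps.append(rec)
    return np.array(comps)

t = np.arange(200)
series = 0.05 * t + np.sin(2 * np.pi * t / 24)       # trend + daily cycle
comps = ssa_components(series, window=48)
reconstruction = comps.sum(axis=0)                    # components sum back to the series
```

The leading components typically carry the trend and dominant oscillations while the tail carries noise, which is what lets them serve as separate regressors downstream.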


2018 ◽  
Author(s):  
Jesse G. Meyer ◽  
Shengchao Liu ◽  
Ian J. Miller ◽  
Joshua J. Coon ◽  
Anthony Gitter

Empirical testing of chemicals for drug efficacy costs many billions of dollars every year. The ability to predict the action of molecules in silico would greatly increase the speed and decrease the cost of prioritizing drug leads. Here, we asked whether drug function, defined as MeSH “Therapeutic Use” classes, can be predicted from only chemical structure. We evaluated two chemical structure-derived drug classification methods, chemical images with convolutional neural networks and molecular fingerprints with random forests, both of which outperformed previous predictions that used drug-induced transcriptomic changes as chemical representations. This suggests that a chemical’s structure contains at least as much information about its therapeutic use as the transcriptional cellular response to that chemical. Further, because training data based on chemical structure is not limited to the small set of molecules for which transcriptomic measurements are available, our strategy can leverage more training data to significantly improve predictive accuracy to 83-88%. Finally, we explore the use of these models for prediction of side effects and drug repurposing opportunities, and demonstrate the effectiveness of this modeling strategy for multi-label classification.
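The fingerprints-plus-random-forest route lends itself to a compact sketch: a molecular fingerprint is just a fixed-length bit vector, and scikit-learn's random forest accepts a 2-D label matrix for multi-label prediction directly. The data below is synthetic and the label rules are invented; this is not the paper's dataset or model configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(7)

# Synthetic stand-in for molecular fingerprints: 128-bit vectors per compound.
n, bits = 300, 128
X = rng.integers(0, 2, size=(n, bits))

# Two hypothetical "therapeutic use" labels, each tied to a few substructure bits.
y = np.column_stack([
    X[:, :3].sum(axis=1) >= 2,       # label 0 depends on bits 0-2
    X[:, 3:6].sum(axis=1) >= 2,      # label 1 depends on bits 3-5
]).astype(int)

# A 2-D label matrix makes this a multi-label problem out of the box.
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X[:250], y[:250])
pred = clf.predict(X[250:])
subset_acc = (pred == y[250:]).all(axis=1).mean()   # all labels right per compound
```

Subset accuracy (every label correct for a compound) is one of the stricter ways to score multi-label output; per-label metrics are a common, more forgiving alternative.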


2002 ◽  
Vol 11 (02) ◽  
pp. 189-202 ◽  
Author(s):  
RUDY SETIONO ◽  
ARNULFO AZCARRAGA

Neural networks with a single hidden layer are known to be universal function approximators. However, due to the complexity of the network topology and the nonlinear transfer function used in computing the hidden unit activations, the predictions of a trained network are difficult to comprehend. On the other hand, predictions from a multiple linear regression equation are easy to understand but are not accurate when the underlying relationship between the input variables and the output variable is nonlinear. We have thus developed a method for multivariate function approximation which combines neural network learning, clustering and multiple regression. This method generates a set of multiple linear regression equations using neural networks, where the number of regression equations is determined by clustering the weighted input variables. The predictions for samples of the same cluster are computed by the same regression equation. Experimental results on a number of real-world data sets demonstrate that this new method generates relatively few regression equations from the training data samples. Yet, drawing from the universal function approximation capacity of neural networks, the predictive accuracy is high. The prediction errors are comparable to or lower than those achieved by existing function approximation methods.
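The cluster-then-regress idea can be sketched with off-the-shelf components: partition the inputs, fit one linear equation per partition, and route each new sample to its cluster's equation. The sketch below clusters raw inputs; the cited method instead clusters inputs scaled by the trained network's weights, and all names and data here are invented.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)

# Nonlinear target; the goal is to cover it with a few *local* linear models.
X = rng.uniform(-3, 3, size=(600, 1))
y = np.sin(X[:, 0])

km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
models = {c: LinearRegression().fit(X[km.labels_ == c], y[km.labels_ == c])
          for c in range(4)}

def predict(X_new):
    # Route each sample to its cluster's regression equation.
    labels = km.predict(X_new)
    return np.array([models[c].predict(x[None])[0] for c, x in zip(labels, X_new)])

err = np.abs(predict(X) - y).mean()   # four local lines track the sine closely
```

Each regression equation stays individually readable, which is the interpretability argument: a handful of linear equations is far easier to inspect than one trained network.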


Author(s):  
Kaushal Paneri ◽  
Vishnu TV ◽  
Pankaj Malhotra ◽  
Lovekesh Vig ◽  
Gautam Shroff

Deep neural networks are prone to overfitting, especially in small training data regimes. Often, these networks are overparameterized and the resulting learned weights tend to have strong correlations. However, convolutional networks in general, and fully convolutional networks (FCNs) in particular, have been shown to be relatively parameter efficient, and have recently been successfully applied to time series classification tasks. In this paper, we investigate the effect of different regularizers on the correlation between the learned convolutional filters in FCNs that use Batch Normalization (BN), for time series classification (TSC) tasks. Results demonstrate that despite orthogonal initialization of the filters, the average correlation across filters (especially for filters in higher layers) tends to increase as training proceeds, indicating redundancy of filters. To mitigate this redundancy, we propose a strong regularizer, using simple yet effective filter decorrelation. Our proposed method yields significant gains in classification accuracy for 44 diverse time series datasets from the UCR TSC benchmark repository.
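A decorrelation penalty of this flavor can be written directly from the filter weights: center and normalize each flattened filter, form the pairwise correlation matrix, and penalize the off-diagonal mass. This is a generic sketch of the idea under invented shapes, not the paper's exact loss term.

```python
import numpy as np

def filter_decorrelation_penalty(W):
    """Penalize pairwise correlation between convolutional filters.
    W: (num_filters, filter_len) -- each row is one flattened filter.
    Returns the mean squared off-diagonal correlation, which is 0 only
    when all filters are mutually uncorrelated."""
    Wc = W - W.mean(axis=1, keepdims=True)              # center each filter
    Wn = Wc / np.linalg.norm(Wc, axis=1, keepdims=True) # unit-normalize rows
    C = Wn @ Wn.T                                       # correlation matrix
    off = C - np.eye(len(W))
    return (off ** 2).sum() / (len(W) * (len(W) - 1))

rng = np.random.default_rng(0)
redundant = np.tile(rng.normal(size=(1, 8)), (4, 1))    # four identical filters
diverse = rng.normal(size=(4, 8))                       # independent random filters

p_redundant = filter_decorrelation_penalty(redundant)   # maximal penalty: 1.0
p_diverse = filter_decorrelation_penalty(diverse)       # strictly smaller
```

Added to the training loss with a weight, such a term pushes filters apart exactly in the regime the abstract describes, where correlation would otherwise grow during training.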

