Extracting Rules from Neural Networks by Pruning and Hidden-Unit Splitting

An algorithm for extracting rules from a standard three-layer feedforward neural network is proposed. The trained network is first pruned not only to remove redundant connections in the network but, more importantly, to detect the relevant inputs. The algorithm generates rules from the pruned network by considering only a small number of activation values at the hidden units. If the number of inputs connected to a hidden unit is sufficiently small, then rules that describe how each of its activation values is obtained can be readily generated. Otherwise, the hidden unit is split and treated as a set of output units, with each output unit corresponding to one activation value. A hidden layer is inserted and a new subnetwork is formed, trained, and pruned. This process is repeated until every hidden unit in the network has a relatively small number of input units connected to it. Examples of how the proposed algorithm works are shown using real-world data arising from molecular biology and signal processing. Our results show that for these complex problems, the algorithm can extract reasonably compact rule sets that have high predictive accuracy rates.
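
One way to picture "a small number of activation values at the hidden units" is to group the activations a hidden unit produces over the training set into a few representative values. The sketch below is only an illustration under that assumption: the greedy one-dimensional clustering, the function name `cluster_activations`, and the tolerance `eps` are hypothetical and are not taken from the paper.

```python
def cluster_activations(values, eps):
    """Greedy 1-D clustering (illustrative assumption, not the paper's procedure).

    Sorted activations are grouped so that every member of a cluster lies
    within eps of the cluster's smallest member. Returns the cluster means.
    """
    clusters = []
    for v in sorted(values):
        if clusters and v - clusters[-1][0] <= eps:
            clusters[-1].append(v)  # close enough to the cluster's first value
        else:
            clusters.append([v])    # start a new cluster
    return [sum(c) / len(c) for c in clusters]


# Hypothetical activations of one tanh hidden unit over six training samples.
activations = [-0.98, 0.05, 0.97, -0.95, 0.02, 0.99]
representatives = cluster_activations(activations, eps=0.2)
print(representatives)
```

With only a handful of representative values per hidden unit, each value can then be described by conditions on the few inputs still connected to that unit.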

RULE EXTRACTION FROM MINIMAL NEURAL NETWORKS FOR CREDIT CARD SCREENING

International Journal of Neural Systems ◽

10.1142/s0129065711002821 ◽

2011 ◽

Vol 21 (04) ◽

pp. 265-276 ◽

Cited By ~ 24

Author(s):

RUDY SETIONO ◽

BART BAESENS ◽

CHRISTOPHE MUES

Keyword(s):

Neural Network ◽

Neural Networks ◽

Network Architecture ◽

Credit Card ◽

Predictive Accuracy ◽

Real Life ◽

Feedforward Neural Networks ◽

Predictive Performance ◽

Classification Problems ◽

Rule Sets

While feedforward neural networks have been widely accepted as effective tools for solving classification problems, the issue of finding the best network architecture remains unresolved, particularly so in real-world problem settings. We address this issue in the context of credit card screening, where it is important to not only find a neural network with good predictive performance but also one that facilitates a clear explanation of how it produces its predictions. We show that minimal neural networks with as few as one hidden unit provide good predictive accuracy, while having the added advantage of making it easier to generate concise and comprehensible classification rules for the user. To further reduce model size, a novel approach is suggested in which network connections from the input units to this hidden unit are removed by a very straightforward pruning procedure. In terms of predictive accuracy, both the minimized neural networks and the rule sets generated from them are shown to compare favorably with other neural network based classifiers. The rules generated from the minimized neural networks are concise and thus easier to validate in a real-life setting.
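
The abstract does not spell out the pruning procedure, so the following is only a minimal sketch of one plausible reading: a network with a single hidden unit whose input connections are dropped when their weight magnitude falls below a threshold. The magnitude criterion, the `tanh` activation, the threshold, and all weights are assumptions made for illustration.

```python
import math


def hidden_activation(weights, bias, x):
    """Activation of the single hidden unit (tanh is an assumption)."""
    return math.tanh(sum(w * xi for w, xi in zip(weights, x)) + bias)


def prune_by_magnitude(weights, threshold):
    """Zero out input-to-hidden weights whose magnitude is below threshold.

    This magnitude rule is a stand-in for the paper's unspecified procedure.
    """
    return [w if abs(w) >= threshold else 0.0 for w in weights]


# Hypothetical weights for four inputs feeding one hidden unit.
weights = [1.5, -0.01, 0.3, 0.002]
pruned = prune_by_magnitude(weights, threshold=0.05)
x = [1.0, 1.0, 0.0, 1.0]

print(pruned)
print(hidden_activation(weights, 0.0, x), hidden_activation(pruned, 0.0, x))
```

In this toy case the hidden activation barely changes after pruning, while only two inputs remain connected, which is what makes rules over the remaining inputs short.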

FEEDFORWARD NEURAL NETWORK MODELS FOR HANDLING CLASS OVERLAP AND CLASS IMBALANCE

International Journal of Neural Systems ◽

10.1142/s012906570500030x ◽

2005 ◽

Vol 15 (05) ◽

pp. 323-338 ◽

Cited By ~ 3

Author(s):

RALF KRETZSCHMAR ◽

NICOLAOS B. KARAYIANNIS ◽

FRITZ EGGIMANN

Keyword(s):

Neural Network ◽

Neural Networks ◽

Error Function ◽

Class Imbalance ◽

Network Models ◽

Feedforward Neural Networks ◽

Feedforward Neural Network ◽

Neural Network Models ◽

Real World Data ◽

Special Case

This paper proposes a framework for training feedforward neural network models capable of handling class overlap and imbalance by minimizing an error function that compensates for such imperfections of the training set. A special case of the proposed error function can be used for training variance-controlled neural networks (VCNNs), which are developed to handle class overlap by minimizing an error function involving the class-specific variance (CSV) computed at their outputs. Another special case of the proposed error function can be used for training class-balancing neural networks (CBNNs), which are developed to handle class imbalance by relying on class-specific correction (CSC). VCNNs and CBNNs are compared with conventional feedforward neural networks (FFNNs), quantum neural networks (QNNs), and resampling techniques. The properties of VCNNs and CBNNs are illustrated by experiments on artificial data. Various experiments involving real-world data reveal the advantages offered by VCNNs and CBNNs in the presence of class overlap and class imbalance.
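
To make the idea of an error function that compensates for class imbalance concrete, here is a small sketch of a class-weighted squared error. It is not the paper's class-specific correction (CSC), whose exact form is not given in the abstract; the inverse-frequency weighting and the name `class_weighted_mse` are assumptions chosen only to show the general mechanism.

```python
def class_weighted_mse(targets, outputs):
    """Squared error in which each class contributes equally overall.

    Each sample is weighted by 1 / (n_classes * size_of_its_class), an
    inverse-frequency scheme assumed here for illustration only.
    """
    counts = {}
    for t in targets:
        counts[t] = counts.get(t, 0) + 1
    n_classes = len(counts)
    total = 0.0
    for t, y in zip(targets, outputs):
        weight = 1.0 / (n_classes * counts[t])
        total += weight * (y - t) ** 2
    return total


# Three majority-class samples (target 0) and one minority sample (target 1).
targets = [0, 0, 0, 1]
outputs = [0.0, 0.0, 0.0, 0.0]  # a model that always predicts the majority class

plain_mse = sum((y - t) ** 2 for t, y in zip(targets, outputs)) / len(targets)
print(plain_mse, class_weighted_mse(targets, outputs))
```

Under the plain mean squared error, ignoring the minority class costs little; under the weighted version the same mistake counts for half of the total error, so training is pushed to fit the minority class as well.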

Operational Determination of the Activated Sludge Process Using Neural Networks

Water Science & Technology ◽

10.2166/wst.1992.0762 ◽

1992 ◽

Vol 26 (9-11) ◽

pp. 2461-2464 ◽

Cited By ~ 2

Author(s):

R. D. Tyagi ◽

Y. G. Du

Keyword(s):

Neural Network ◽

Neural Networks ◽

Steady State ◽

Activated Sludge ◽

Feedforward Neural Network ◽

Training Data ◽

Activated Sludge Process

A steady-state mathematical model of an activated sludge process with a secondary settler was developed. With a limited number of training data samples obtained from the simulation at steady state, a feedforward neural network was established which exhibits an excellent capability for the operational prediction and determination.

Optimization Artificial Neural Network Using Artificial Bee Colony in Letter Recognition Classification

JELIKU (Jurnal Elektronik Ilmu Komputer Udayana) ◽

10.24843/jlk.2020.v08.i04.p13 ◽

2020 ◽

Vol 8 (4) ◽

pp. 469

Author(s):

I Gusti Ngurah Alit Indrawan ◽

I Made Widiartha

Keyword(s):

Neural Network ◽

Neural Networks ◽

Artificial Neural Networks ◽

Classification Accuracy ◽

Artificial Bee Colony Algorithm ◽

Artificial Bee Colony ◽

Letter Recognition ◽

Bee Colony ◽

Artificial Neural ◽

Hidden Layer

Artificial Neural Networks, commonly abbreviated as ANN, are a branch of artificial intelligence often used to solve problems that involve grouping and pattern recognition. This research aims to classify the Letter Recognition dataset using an Artificial Neural Network whose weights are optimized with the Artificial Bee Colony algorithm. The best classification accuracy obtained in this study was 92.85%, using 4 hidden layers with 10 neurons in each hidden layer.

Fourier Neural Networks and Generalized Single Hidden Layer Networks in Aircraft Engine Fault Diagnostics

Journal of Engineering for Gas Turbines and Power ◽

10.1115/1.2179465 ◽

2005 ◽

Vol 128 (4) ◽

pp. 773-782 ◽

Cited By ~ 13

Author(s):

H. S. Tan

Keyword(s):

Neural Network ◽

Neural Networks ◽

Network Architecture ◽

Fourier Transforms ◽

Aircraft Engine ◽

Fault Classification ◽

Noise Robustness ◽

Fault Diagnostics ◽

Multiple Faults ◽

Hidden Layer

The conventional approach to neural network-based aircraft engine fault diagnostics has been mainly via multilayer feed-forward systems with sigmoidal hidden neurons trained by back propagation as well as radial basis function networks. In this paper, we explore two novel approaches to the fault-classification problem using (i) Fourier neural networks, which synthesize the approximation capability of multidimensional Fourier transforms and gradient-descent learning, and (ii) a class of generalized single hidden layer networks (GSLN), which self-structure via Gram-Schmidt orthonormalization. Using a simulation program for the F404 engine, we generate steady-state engine parameters corresponding to a set of combined two-module deficiencies and require various neural networks to classify the multiple faults. We show that, compared to the conventional network architecture, the Fourier neural network exhibits stronger noise robustness and the GSLNs converge much faster.

Prediction of Hydrodynamic Forces and Moments on Submarines Using Neural Networks

21st International Conference on Offshore Mechanics and Arctic Engineering, Volume 4 ◽

10.1115/omae2002-28592 ◽

2002 ◽

Cited By ~ 1

Author(s):

Ibrahim Mohamed ◽

Mahmoud Haddara ◽

Christopher D. Williams ◽

Michael Mackay

Keyword(s):

Neural Network ◽

Neural Networks ◽

Hydrodynamic Model ◽

Parametric Identification ◽

Hydrodynamic Forces ◽

Degree Of Freedom ◽

Experimental Time ◽

Time Histories ◽

Trained Network ◽

Identification Tool

This paper describes a parametric identification tool for predicting the hydrodynamic forces acting on a submarine model using its motion history. The tool uses a neural network to identify the hydrodynamic forces and moments; the network was trained with data obtained from multi-degree-of-freedom captive maneuvering tests. The characteristics of the trained network are demonstrated through reconstruction of the force and moment time histories. This technique has the potential to reduce experimental time and cost by enabling a full hydrodynamic model of the vehicle to be obtained from a relatively limited number of test maneuvers.

Prediction of Emergency Department Hospital Admission Based on Natural Language Processing and Neural Networks

Methods of Information in Medicine ◽

10.3414/me17-01-0024 ◽

2017 ◽

Vol 56 (05) ◽

pp. 377-389 ◽

Cited By ~ 21

Author(s):

Xingyu Zhang ◽

Joyce Kim ◽

Rachel E. Patzer ◽

Stephen R. Pitts ◽

Aaron Patzer ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Emergency Department ◽

Logistic Regression ◽

Natural Language Processing ◽

Natural Language ◽

Hospital Admission ◽

Language Processing ◽

Predictive Accuracy ◽

Free Text

Summary. Objective: To describe and compare logistic regression and neural network modeling strategies to predict hospital admission or transfer following initial presentation to Emergency Department (ED) triage with and without the addition of natural language processing elements. Methods: Using data from the National Hospital Ambulatory Medical Care Survey (NHAMCS), a cross-sectional probability sample of United States EDs from 2012 and 2013 survey years, we developed several predictive models with the outcome being admission to the hospital or transfer vs. discharge home. We included patient characteristics immediately available after the patient has presented to the ED and undergone a triage process. We used this information to construct logistic regression (LR) and multilayer neural network models (MLNN) which included natural language processing (NLP) and principal component analysis from the patient’s reason for visit. Ten-fold cross validation was used to test the predictive capacity of each model, and the area under the receiver operating characteristic curve (AUC) was then calculated for each model. Results: Of the 47,200 ED visits from 642 hospitals, 6,335 (13.42%) resulted in hospital admission (or transfer). A total of 48 principal components were extracted by NLP from the reason for visit fields, which explained 75% of the overall variance for hospitalization. In the model including only structured variables, the AUC was 0.824 (95% CI 0.818-0.830) for logistic regression and 0.823 (95% CI 0.817-0.829) for MLNN. Models including only free-text information generated AUC of 0.742 (95% CI 0.731-0.753) for logistic regression and 0.753 (95% CI 0.742-0.764) for MLNN. When both structured variables and free text variables were included, the AUC reached 0.846 (95% CI 0.839-0.853) for logistic regression and 0.844 (95% CI 0.836-0.852) for MLNN. Conclusions: The predictive accuracy of hospital admission or transfer for patients who presented to ED triage overall was good, and was improved with the inclusion of free text data from a patient’s reason for visit regardless of modeling approach. Natural language processing and neural networks that incorporate patient-reported outcome free text may increase predictive accuracy for hospital admission.
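
For readers unfamiliar with the AUC figures quoted above, the sketch below computes the area under the ROC curve as the fraction of (admitted, discharged) pairs in which the admitted case receives the higher score, counting ties as one half. The labels and scores are made-up toy data, not values from the study, and the function name `auc` is only illustrative; the last line simply rechecks the admission rate reported in the abstract.

```python
def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    labels: 1 for admission/transfer, 0 for discharge (toy encoding).
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy predictions for four visits (hypothetical data).
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))

# Admission rate reported in the abstract: 6,335 of 47,200 visits.
print(round(100 * 6335 / 47200, 2))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of roughly 0.74 to 0.85 should be read.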

Applying of machine learning in the construction of a voice-controlled interface on the example of a music player

Journal of Computer Sciences Institute ◽

10.35784/jcsi.1324 ◽

2019 ◽

Vol 13 ◽

pp. 302-309

Author(s):

Jakub Basiakowski

Keyword(s):

Neural Network ◽

Machine Learning ◽

Convolutional Neural Network ◽

Feedforward Neural Network ◽

Hidden Layer ◽

Music Player ◽

The Impact

The following paper presents the results of research on the impact of machine learning in the construction of a voice-controlled interface. Two different models were used for the analysis: a feedforward neural network containing one hidden layer and a more complicated convolutional neural network. The two models are compared in terms of quality and the course of training.

Solving Mixed Volterra - Fredholm Integral Equation (MVFIE) by Designing Neural Network

Baghdad Science Journal ◽

10.21123/bsj.2019.16.1.0116 ◽

2019 ◽

Vol 16 (1) ◽

pp. 0116

Author(s):

Al-Saif Et al.

Keyword(s):

Neural Network ◽

Integral Equation ◽

Analytic Solution ◽

Fredholm Integral Equations ◽

Training Algorithm ◽

Output Unit ◽

Feed Forward Neural Network ◽

Levenberg Marquardt ◽

Hidden Layer ◽

And Training

In this paper, we focus on designing a feed-forward neural network (FFNN) for solving Mixed Volterra-Fredholm Integral Equations (MVFIEs) of the second kind in two dimensions. In our method, we present a multilayer model consisting of a hidden layer with five hidden units (neurons) and one linear output unit. The log-sigmoid transfer function is used as the activation of the units, and the Levenberg-Marquardt algorithm is used for training. The results of numerical experiments are compared with the analytic solutions of several examples in order to demonstrate the efficiency and accuracy of our method.
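
The architecture described above (two inputs for the two dimensions, five log-sigmoid hidden units, one linear output unit) can be sketched as a forward pass. All weights and biases below are hypothetical placeholders; training with Levenberg-Marquardt is not shown, and the argument names `x` and `t` for the two input variables are an assumption.

```python
import math


def logsig(z):
    """Log-sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-z))


def ffnn(x, t, hidden_w, hidden_b, out_w, out_b):
    """Forward pass: 2 inputs -> 5 log-sigmoid hidden units -> 1 linear output."""
    hidden = [logsig(w[0] * x + w[1] * t + b) for w, b in zip(hidden_w, hidden_b)]
    return sum(v * h for v, h in zip(out_w, hidden)) + out_b


# Hypothetical parameters for the five hidden units and the output unit.
hidden_w = [[0.5, -0.2], [1.0, 0.3], [-0.7, 0.8], [0.1, 0.1], [-1.2, 0.4]]
hidden_b = [0.0, 0.1, -0.1, 0.2, 0.0]
out_w = [0.3, -0.5, 0.8, 0.1, 0.4]
out_b = 0.05

print(ffnn(0.25, 0.75, hidden_w, hidden_b, out_w, out_b))
```

The network output at a point is a trial value of the unknown function; a training algorithm would adjust the parameters so that the integral equation is satisfied at the chosen training points.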

Pruning Multilayered ELM Using Cholesky Factorization and Givens Rotation Transformation

Mathematical Problems in Engineering ◽

10.1155/2021/5588426 ◽

2021 ◽

Vol 2021 ◽

pp. 1-11

Author(s):

Jingyi Liu ◽

Xinxin Liu ◽

Chongmin Liu ◽

Ba Tuan Le ◽

Dong Xiao

Keyword(s):

Neural Network ◽

Coal Industry ◽

Feedforward Neural Network ◽

Weight Matrix ◽

Cholesky Factorization ◽

Connection Weight ◽

Givens Rotation ◽

Rotation Transformation ◽

Hidden Layer ◽

Hidden Nodes

The extreme learning machine (ELM) was originally proposed for training single-hidden-layer feedforward neural networks, to overcome the challenges faced by the backpropagation (BP) learning algorithm and its variants. Recent studies show that ELM can be extended to multilayered feedforward neural networks in which a hidden node can be a subnetwork of nodes or a combination of other hidden nodes. Although ELM with multiple hidden layers shows stronger nonlinear expression ability and stability than single-hidden-layer ELM in both theoretical and experimental results, deepening the network structure makes parameter optimization harder: model selection usually takes more time and the computational complexity increases. This paper uses a Cholesky factorization strategy and the Givens rotation transformation to choose the hidden nodes of the multilayered ELM (MELM) and to obtain a number of nodes better suited to the network. The initial network starts with a large number of hidden nodes, which are then pruned using the idea of ridge regression, finally yielding a complete neural network. The algorithm thus eliminates the need to set the number of nodes manually and achieves complete automation. By using information from the previous generation’s connection weight matrix, recalculating the weight matrix during network simplification can be avoided. As in matrix factorization methods, the Cholesky factor is calculated by the Givens rotation transformation to achieve a fast decreasing update of the current connection weight matrix, thus ensuring the numerical stability and high efficiency of the pruning process. Empirical studies on several commonly used classification benchmark problems and on real datasets collected from the coal industry show that, compared with the traditional ELM algorithm, the pruning multilayered ELM algorithm proposed in this paper can find the optimal number of hidden nodes automatically and has better generalization performance.
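
As background for the role of Cholesky factorization here, the sketch below solves the ridge-regression normal equations for the output weights, (HᵀH + λI)β = Hᵀt, by factorizing the left-hand matrix and doing two triangular solves. It assumes a single output and a given hidden-layer output matrix `h`; it does not implement the paper's Givens-rotation update or its pruning criterion, and all function names are illustrative.

```python
import math


def cholesky(a):
    """Lower-triangular L with L * L^T = a (a symmetric positive definite)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def solve_cholesky(l, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x


def ridge_output_weights(h, t, lam):
    """Output weights beta solving (H^T H + lam * I) beta = H^T t."""
    n = len(h[0])
    a = [[sum(row[i] * row[j] for row in h) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(row[i] * ti for row, ti in zip(h, t)) for i in range(n)]
    return solve_cholesky(cholesky(a), b)


# Toy hidden-layer outputs for 3 samples and 2 hidden nodes (hypothetical).
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
t = [1.0, 2.0, 3.0]
print(ridge_output_weights(h, t, lam=0.0))
print(ridge_output_weights(h, t, lam=0.1))
```

Removing a hidden node deletes a column of H, which changes HᵀH only slightly; that is why updating an existing Cholesky factor, rather than refactorizing from scratch, can make pruning cheap.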
