IMPROVING GENERALIZATION OF NEURAL NETWORKS THROUGH PRUNING

1991 ◽  
Vol 01 (04) ◽  
pp. 317-326 ◽  
Author(s):  
Hans Henrik Thodberg

A technique for constructing neural network architectures with better ability to generalize is presented under the name Ockham's Razor: several networks are trained and then pruned by removing connections one by one and retraining. The networks which achieve fewest connections generalize best. The method is tested on a classification of bit strings (the contiguity problem): the optimal architecture emerges, resulting in perfect generalization. The internal representation of the network changes substantially during the retraining, and this distinguishes the method from previous pruning studies.
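The prune-then-retrain loop described in this abstract can be sketched on a toy problem. This is a minimal illustration, not the paper's implementation: it uses a single linear unit instead of a multilayer network, and both the pruning criterion (remove the active connection with the smallest weight magnitude) and the stopping rule (stop when retraining can no longer bring the training error under a tolerance) are assumptions the abstract does not specify. All function names are hypothetical.

```python
from itertools import product


def predict(w, mask, x):
    # Only connections that are still active (mask is True) contribute.
    return sum(wi * xi for wi, mi, xi in zip(w, mask, x) if mi)


def mse(w, mask, data):
    return sum((predict(w, mask, x) - y) ** 2 for x, y in data) / len(data)


def train(w, mask, data, lr=0.1, epochs=200):
    # Full-batch gradient descent on the mean squared error.
    # Pruned connections are held at exactly zero.
    w = list(w)
    n = len(data)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = predict(w, mask, x) - y
            for j, xj in enumerate(x):
                if mask[j]:
                    grad[j] += 2 * err * xj / n
        w = [wj - lr * gj if mj else 0.0 for wj, gj, mj in zip(w, grad, mask)]
    return w


def prune_and_retrain(w, mask, data, tol=1e-6):
    w = train(w, mask, data)
    while any(mask):
        # Assumed criterion: try removing the smallest-magnitude active weight.
        j = min((k for k, m in enumerate(mask) if m), key=lambda k: abs(w[k]))
        trial_mask = mask[:j] + [False] + mask[j + 1:]
        trial_w = train(w, trial_mask, data)  # retrain from the current weights
        if mse(trial_w, trial_mask, data) > tol:
            break  # assumed stopping rule: keep the last network that still fits
        w, mask = trial_w, trial_mask
    return w, mask


# Toy data: the target depends only on inputs 0 and 2; inputs 1 and 3 are irrelevant.
inputs = [list(x) for x in product([-1.0, 1.0], repeat=4)]
data = [(x, 2.0 * x[0] - 3.0 * x[2]) for x in inputs]

weights, mask = prune_and_retrain([0.1, -0.1, 0.1, -0.1], [True] * 4, data)
print(mask, weights)
```

On this toy data the loop removes the two irrelevant connections and keeps the two that carry the target. It mirrors the abstract's idea that the sparsest network that still fits is the preferred one, but it does not reproduce the contiguity experiment.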

2022 ◽  
Vol 4 (4) ◽  
pp. 1-22
Author(s):  
Valentina Candiani ◽  
Matteo Santacesaria ◽  

We consider the problem of the detection of brain hemorrhages from three-dimensional (3D) electrical impedance tomography (EIT) measurements. This is a condition requiring urgent treatment for which EIT might provide a portable and quick diagnosis. We employ two neural network architectures, a fully connected one and a convolutional one, for the classification of hemorrhagic and ischemic strokes. The networks are trained on a dataset with 40,000 samples of synthetic electrode measurements generated with the complete electrode model on realistic heads with a 3-layer structure. We consider changes in head anatomy and layers, electrode position, measurement noise and conductivity values. We then test the networks on several datasets of unseen EIT data, with more complex stroke modeling (different shapes and volumes), higher levels of noise and different amounts of electrode misplacement. On most test datasets we achieve ≥ 90% average accuracy with fully connected neural networks, while the convolutional ones display an average accuracy ≥ 80%. Despite the use of simple neural network architectures, the results obtained are very promising and motivate the applications of EIT-based classification methods on real phantoms and ultimately on human patients.


2021 ◽  
Vol 507 (3) ◽  
pp. 4061-4073
Author(s):  
Thorben Finke ◽  
Michael Krämer ◽  
Silvia Manconi

Despite the growing number of gamma-ray sources detected by the Fermi-Large Area Telescope (LAT), about one-third of the sources in each survey remains of uncertain type. We present a new deep neural network approach for the classification of unidentified or unassociated gamma-ray sources in the last release of the Fermi-LAT catalogue (4FGL-DR2) obtained with 10 yr of data. In contrast to previous work, our method directly uses the measurements of the photon energy spectrum and time series as input for the classification, instead of specific, human-crafted features. Dense neural networks, and for the first time in the context of gamma-ray source classification recurrent neural networks, are studied in depth. We focus on the separation between extragalactic sources, i.e. active galactic nuclei, and Galactic pulsars, and on the further classification of pulsars into young and millisecond pulsars. Our neural network architectures provide powerful classifiers, with a performance that is comparable to previous analyses based on human-crafted features. Our benchmark neural network predicts that of the sources of uncertain type in the 4FGL-DR2 catalogue, 1050 are active galactic nuclei and 78 are Galactic pulsars, with both classes following the expected sky distribution and the clustering in the variability–curvature plane. We investigate the problem of sample selection bias by testing our architectures against a cross-match test data set using an older catalogue, and propose a feature selection algorithm using autoencoders. Our list of high-confidence candidate sources labelled by the neural networks provides a set of targets for further multiwavelength observations addressed to identify their nature. The deep neural network architectures we develop can be easily extended to include specific features, as well as multiwavelength data on the source photon energy and time spectra coming from different instruments.


Author(s):  
Swathi Jamjala Narayanan ◽  
Boominathan Perumal ◽  
Jayant G. Rohra

Nature-inspired algorithms have been productively applied to train neural network architectures. Other mechanisms, such as gradient descent, second-order methods and Levenberg-Marquardt methods, also exist for optimizing the parameters of neural networks. Compared to gradient-based methods, nature-inspired algorithms are found to be less sensitive to the initial weights and less likely to become trapped in local optima. Despite these benefits, some nature-inspired algorithms also suffer from stagnation when applied to neural networks. Another challenge in applying nature-inspired techniques to neural networks is handling a high-dimensional, correlated weight space. Hence, there is a need for scalable nature-inspired algorithms for high-dimensional neural network optimization. This chapter studies the characteristics of nature-inspired techniques for optimizing neural network architectures, along with their applicability, advantages and limitations.

