Ps and Qs: Quantization-Aware Pruning for Efficient Low Latency Neural Network Inference

2021, Vol 4
Author(s): Benjamin Hawks, Javier Duarte, Nicholas J. Fraser, Alessandro Pappalardo, Nhan Tran, ...

Efficient machine learning implementations optimized for inference in hardware have wide-ranging benefits, depending on the application, from lower inference latency to higher data throughput and reduced energy consumption. Two popular techniques for reducing computation in neural networks are pruning, removing insignificant synapses, and quantization, reducing the precision of the calculations. In this work, we explore the interplay between pruning and quantization during the training of neural networks for ultra-low-latency applications targeting high energy physics use cases. Techniques developed for this study have potential applications across many other domains. We study various configurations of pruning during quantization-aware training, which we term quantization-aware pruning, and the effect of techniques like regularization, batch normalization, and different pruning schemes on performance, computational complexity, and information content metrics. We find that quantization-aware pruning yields more computationally efficient models than either pruning or quantization alone for our task. Further, quantization-aware pruning typically performs similarly to, or better than, other neural architecture search techniques such as Bayesian optimization in terms of computational efficiency. Surprisingly, while networks with different training configurations can have similar performance for the benchmark application, the information content in the network can vary significantly, affecting its generalizability.
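
The abstract describes the technique at a high level; as a rough illustration, below is a minimal PyTorch sketch of quantization-aware pruning: weights are fake-quantized with a straight-through estimator during training, and iterative magnitude pruning is applied on top. The bit width, layer sizes, and pruning schedule are illustrative assumptions, not the paper's actual configuration or tooling.

```python
# Minimal quantization-aware pruning (QAP) sketch -- an assumption-laden
# illustration, not the paper's actual workflow.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.nn.utils.prune as prune

def fake_quant(w, bits=6):
    # Uniform fake quantization with a straight-through estimator:
    # the forward pass sees quantized weights, gradients pass through unchanged.
    scale = w.abs().max().clamp(min=1e-8) / (2 ** (bits - 1) - 1)
    w_q = torch.round(w / scale) * scale
    return w + (w_q - w).detach()

class QATLinear(nn.Linear):
    def forward(self, x):
        return F.linear(x, fake_quant(self.weight), self.bias)

model = nn.Sequential(QATLinear(16, 64), nn.ReLU(), QATLinear(64, 5))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(100):
    x, y = torch.randn(128, 16), torch.randint(0, 5, (128,))  # stand-in batch
    loss = F.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 20 == 19:  # prune iteratively *during* quantization-aware training
        for m in model:
            if isinstance(m, QATLinear):
                prune.l1_unstructured(m, name="weight", amount=0.2)
```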

2016, Vol 93 (9)
Author(s): Pierre Baldi, Kevin Bauer, Clara Eng, Peter Sadowski, Daniel Whiteson

Author(s): Tadeusz Wibig

Standard experimental data analysis relies mainly on conventional, deterministic inference. The complexity of modern physics problems has grown so large that new ideas in the field are highly welcome. In this paper, the author analyzes a problem of contemporary high-energy physics: estimating parameters of an observed complex phenomenon. The article compares the performance of natural and artificial neural networks against the standard statistical method of data analysis and minimization. The general concept of the relations between computational intelligence (CI) and standard (external) classical and modern informatics was realized and studied by utilizing natural neural networks (NNN), artificial neural networks (ANN), and the MINUIT minimization package from CERN. The idea of autonomic computing was pursued by using the brains of high school students involved in the Roland Maze Project. Some preliminary results of the comparison are given and discussed.
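
As a point of reference for the "standard statistical method" mentioned above, here is a minimal sketch of a deterministic chi-square fit, with scipy.optimize standing in for CERN's MINUIT package; the model and data are synthetic assumptions.

```python
# Chi-square minimization of a parametric model -- the conventional,
# deterministic inference the paper compares the networks against.
# scipy.optimize stands in for MINUIT/MIGRAD here; data are synthetic.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y_obs = 2.5 * np.exp(-x / 4.0) + rng.normal(0, 0.1, x.size)  # toy "phenomenon"
sigma = 0.1  # assumed measurement uncertainty

def chi2(params):
    a, tau = params
    return np.sum(((y_obs - a * np.exp(-x / tau)) / sigma) ** 2)

result = minimize(chi2, x0=[1.0, 1.0], method="Nelder-Mead")
print("fitted (a, tau):", result.x)
```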


2019, Vol 214, pp. 06027
Author(s): Adrian Bevan, Thomas Charman, Jonathan Hays

HIPSTER (Heavily Ionising Particle Standard Toolkit for Event Recognition) is an open-source Python package designed to facilitate the use of TensorFlow in a high energy physics analysis context. The core functionality of the software is presented, with images from the MoEDAL experiment Nuclear Track Detectors (NTDs) serving as an example dataset. Convolutional neural networks are selected as the classification algorithm for this dataset, and the process of training a variety of models with different hyper-parameters is detailed. Results are then shown for the MoEDAL problem, demonstrating the rich information output by HIPSTER that enables the user to probe the performance of their model in detail.
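
HIPSTER's own API is not reproduced here, but the training procedure it wraps can be sketched with plain tf.keras: a small CNN classifier built and compiled over a grid of hyper-parameters. Image shape, layer sizes, and class count are illustrative assumptions.

```python
# Generic tf.keras sketch of a CNN hyper-parameter scan -- not HIPSTER's API.
import tensorflow as tf

def build_cnn(filters=32, dropout=0.25, n_classes=2, input_shape=(64, 64, 1)):
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=input_shape),
        tf.keras.layers.Conv2D(filters, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(2 * filters, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dropout(dropout),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])

# Train a variety of models with different hyper-parameters, as described above.
for filters in (16, 32):
    for dropout in (0.25, 0.5):
        model = build_cnn(filters=filters, dropout=dropout)
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        # model.fit(train_images, train_labels, epochs=10, validation_split=0.1)
```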


2019, Vol 207, pp. 05005
Author(s): Mirco Huennefeld

Reliable and accurate reconstruction methods are vital to the success of high-energy physics experiments such as IceCube. Machine learning based techniques, in particular deep neural networks, can provide a viable alternative to maximum-likelihood methods. However, most common neural network architectures were developed for other domains such as image recognition. While these methods can enhance the reconstruction performance in IceCube, there is much potential for tailored techniques. In the typical physics use case, many symmetries, invariances, and prior knowledge exist in the data that are not fully exploited by current network architectures. Novel and specialized deep-learning-based reconstruction techniques are desired which can leverage the physics potential of experiments like IceCube. A reconstruction method using convolutional neural networks is presented which can significantly increase the reconstruction accuracy while greatly reducing the runtime in comparison to standard reconstruction methods in IceCube. In addition, first results are discussed for future developments based on generative neural networks.
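
The tailored IceCube architecture is not reproduced here; as a rough sketch of the idea, a small convolutional network can regress reconstruction targets directly from an image-like binning of the detector response. The layer sizes, input shape, and three-component output are assumptions.

```python
# Minimal CNN-regression sketch for event reconstruction -- illustrative only.
import torch
import torch.nn as nn

class RecoNet(nn.Module):
    def __init__(self, n_targets=3):  # e.g. energy plus a direction estimate
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # pool over the sensor grid
        )
        self.head = nn.Linear(32, n_targets)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

net = RecoNet()
dummy = torch.randn(8, 1, 10, 20)  # batch of binned sensor maps (assumed shape)
print(net(dummy).shape)            # -> torch.Size([8, 3])
```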


1993, Vol 5 (4), pp. 505-549
Author(s): Bruce Denby

In the past few years a wide variety of applications of neural networks to pattern recognition in experimental high-energy physics has appeared. The neural network solutions are in general of high quality, and, in a number of cases, are superior to those obtained using "traditional" methods. But neural networks are of particular interest in high-energy physics for another reason as well: much of the pattern recognition must be performed online, that is, in a few microseconds or less. The inherent parallelism of neural network algorithms, and the ability to implement them as very fast hardware devices, may make them an ideal technology for this application.


1999, Vol 11 (6), pp. 1281-1296
Author(s): Marco Budinich, Renato Frison

We present two methods for nonuniformity correction of imaging array detectors based on neural networks; both exploit image properties to compensate for the lack of calibration and to maximize the entropy of the output. The first method uses a self-organizing net that produces a linear correction of the raw data with coefficients that adapt continuously. The second method employs a kind of contrast equalization curve to match pixel distributions. Our work originates from silicon detectors, but the treatment is general enough to be applicable to many kinds of array detectors, like those used in infrared imaging or in high-energy physics.
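
As a rough illustration of the second method, the sketch below builds each pixel's equalization curve from its empirical CDF over a stack of frames, so that all pixel distributions match and the output entropy is maximized; the gain model and array sizes are synthetic assumptions, and the paper's adaptive neural implementation is not reproduced.

```python
# NumPy sketch of distribution matching via per-pixel contrast equalization.
import numpy as np

rng = np.random.default_rng(1)
frames = rng.gamma(2.0, 1.0, size=(500, 32, 32))  # raw frames: (time, row, col)
frames *= rng.uniform(0.8, 1.2, size=(32, 32))    # simulated per-pixel gains

# Rank each pixel's readings over time; the normalized rank is the empirical
# CDF, i.e. an equalization curve that maps every pixel to the same uniform
# (maximum-entropy) output distribution, removing the gain nonuniformity.
ranks = frames.argsort(axis=0).argsort(axis=0)
equalized = (ranks + 0.5) / frames.shape[0]
```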

