Uniform Quantization: Latest Research Papers

Whether the Support Region of Three-Bit Uniform Quantizer Has a Strong Impact on Post-Training Quantization for MNIST Dataset?

Driven by the need to compress the weights of neural networks (NNs), which is especially beneficial for edge devices with constrained resources, and by the need to use the simplest possible quantization model, in this paper we study the performance of three-bit post-training uniform quantization. The goal is to gather various choices of the key parameter of the quantizer (the support region threshold) in one place and to provide a detailed overview of how this choice affects the performance of post-training quantization for the MNIST dataset. Specifically, we analyze whether the accuracy of two NN models (MLP and CNN) can be largely preserved with the very simple three-bit uniform quantizer, regardless of the choice of the key parameter. Moreover, our goal is to answer the question of whether it is of the utmost importance in post-training three-bit uniform quantization, as it is in quantization, to determine the optimal support region threshold value of the quantizer in order to achieve some predefined accuracy of the quantized neural network (QNN). The results show that the choice of the support region threshold value of the three-bit uniform quantizer does not have a strong impact on the accuracy of the QNNs, which is not the case with two-bit uniform post-training quantization when applied to an MLP for the same classification task. Accordingly, one can anticipate that, owing to this special property, the post-training quantization model in question can be widely exploited.
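The abstract above does not spell out the quantizer itself, so the following is only a minimal sketch of what a three-bit uniform quantizer with a support region threshold might look like. It assumes a symmetric mid-rise layout with eight equal cells on [-x_max, x_max]; the names `uniform_quantize`, `x_max`, and `weights` are hypothetical and not taken from the paper.

```python
def uniform_quantize(x, x_max, bits=3):
    """Symmetric mid-rise uniform quantizer with 2**bits cells on [-x_max, x_max].

    x_max plays the role of the support region threshold (an assumed layout).
    """
    levels = 2 ** bits
    step = 2.0 * x_max / levels
    # Clip to the support region, then pick the cell index 0..levels-1.
    clipped = max(-x_max, min(x_max, x))
    index = min(int((clipped + x_max) / step), levels - 1)
    # Reconstruct at the midpoint of the selected cell.
    return -x_max + (index + 0.5) * step


# Illustrative weight values, not data from the paper.
weights = [-1.7, -0.42, -0.05, 0.0, 0.13, 0.9, 2.5]
quantized = [uniform_quantize(w, x_max=2.0) for w in weights]
```

With three bits, every weight maps to one of eight reconstruction values, and any weight outside the support region is clipped to the outermost cell. Changing `x_max` trades clipping error against granular error, which is the parameter choice the abstract examines.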

Quantization noise in low bit quantization and iterative adaptation to quantization noise in quantizable neural networks

Journal of Physics Conference Series ◽ 10.1088/1742-6596/2134/1/012004 ◽ 2021 ◽ Vol 2134 (1) ◽ pp. 012004

Author(s): D Chudakov ◽ A Goncharenko ◽ S Alyamkin ◽ A Densidov

Keyword(s): Neural Network ◽ Neural Networks ◽ Quantization Noise ◽ Uniform Quantization ◽ Resource Requirements ◽ The Moment ◽ Iterative Adaptation

Quantization is one of the most popular and widely used methods of speeding up a neural network. At the moment, the standard is 8-bit uniform quantization. Nevertheless, the use of uniform low-bit quantization (4- and 6-bit quantization) has significant advantages in speed and resource requirements for inference. We present our quantization algorithm that offers advantages when using uniform low-bit quantization. It is faster than quantization-aware training from scratch and more accurate than methods aimed only at selecting thresholds and reducing noise from quantization. We also investigated quantization noise in neural networks for low-bit quantization and concluded that quantization noise is not always a good metric for quantization quality.

Iterative Algorithm for Parameterization of Two-Region Piecewise Uniform Quantizer for the Laplacian Source

Mathematics ◽ 10.3390/math9233091 ◽ 2021 ◽ Vol 9 (23) ◽ pp. 3091

Author(s): Jelena Nikolić ◽ Danijela Aleksić ◽ Zoran Perić ◽ Milan Dinčić

Keyword(s): Neural Networks ◽ Performance Assessment ◽ Probability Density ◽ Probability Density Functions ◽ Density Functions ◽ Uniform Quantization ◽ Memory Footprint ◽ High Bit Rates ◽ And Performance ◽ Laplacian Distribution

Motivated by the fact that uniform quantization is not suitable for signals with non-uniform probability density functions (pdfs), such as the Laplacian pdf, in this paper we divide the support region of the quantizer into two disjoint regions and apply the simplest uniform quantization with equal bit-rates within both regions. In particular, we assume a narrow central granular region (CGR) covering the peak of the Laplacian pdf and a wider peripheral granular region (PGR) covering its tails. We optimize the widths of the CGR and PGR by minimizing distortion with respect to the border–clipping threshold scaling ratio, which yields an iterative formula for parametrizing our piecewise uniform quantizer (PWUQ). For medium and high bit-rates, we demonstrate the advantage of our PWUQ over the uniform quantizer, paying special attention to the case where 99.99% of the signal amplitudes belong to the support region or clipping region. We believe that the resulting formulas for PWUQ design and performance assessment are greatly beneficial in neural networks, where weights and activations are typically modelled by the Laplacian distribution and where uniform quantization is commonly used to decrease memory footprint.
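To make the two-region idea concrete, here is a rough sketch of a piecewise uniform quantizer with a fine central region and a coarser peripheral region. It is not the paper's design procedure: the border `x_border` and clipping threshold `x_clip` are taken as given inputs rather than computed by the iterative formula, and the symmetric split of cells around zero, the function name `pwuq_quantize`, and the parameter `levels_per_region` are all assumptions made for illustration.

```python
def pwuq_quantize(x, x_border, x_clip, levels_per_region=8):
    """Two-region piecewise uniform quantizer (illustrative layout only).

    Each region gets the same number of cells, split evenly between the
    negative and positive sides; x_border and x_clip are assumed inputs.
    """
    half = levels_per_region // 2      # cells per region on each side of zero
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), x_clip)          # clip the magnitude to the support region
    if mag < x_border:
        # Central granular region: fine uniform cells on [0, x_border).
        step = x_border / half
        index = min(int(mag / step), half - 1)
        return sign * (index + 0.5) * step
    # Peripheral granular region: coarser uniform cells on [x_border, x_clip].
    step = (x_clip - x_border) / half
    index = min(int((mag - x_border) / step), half - 1)
    return sign * (x_border + (index + 0.5) * step)


# Illustrative inputs, not data from the paper.
inputs = [0.1, -0.6, 1.0, 3.2, -9.0]
outputs = [pwuq_quantize(v, x_border=1.0, x_clip=5.0) for v in inputs]
```

Because both regions hold the same number of cells, the narrow central region ends up with a smaller step than the wide peripheral one, so small amplitudes near the pdf peak are represented more finely.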

Multi-level Memristors based on Two-dimensional Electron Gases in Oxide Heterostructures for High Precision Neuromorphic Computing

10.21203/rs.3.rs-1019162/v1 ◽ 2021

Author(s): Sunwoo Lee ◽ Jaeyoung Jeon ◽ Kitae Eom ◽ Chaehwa Jeong ◽ Yongsoo Yang ◽ ...

Keyword(s): High Precision ◽ Essential Elements ◽ Valuable Insight ◽ Tunneling Conductance ◽ Two Dimensional ◽ Dimensional Electron ◽ Retention Performance ◽ Uniform Quantization ◽ Multi Level ◽ Conductance States

Memristors are essential elements for hardware implementation of artificial neural networks. The key functionality of the memristors is to realize multiple non-volatile conductance states with high precision. However, the variation of device conductance limits the number of allowed states. Since actual data for neural network training inherently have a non-uniform distribution, the insufficient number of conductance states and the resultant inaccurate weight quantization may generate significant errors in the memristor-based computation. Herein, we demonstrate a multi-level memristor based on two-dimensional electron gas in a Pt/LaAlO3/SrTiO3 heterostructure. By redistributing oxygen vacancies, we precisely controlled the tunneling conductance of the device, achieving multiple conductance states (more than 27). The multi-level switching capability and the high retention performance allow us to implement a variance-aware weight quantization (VAQ), designed for improved computing accuracy. We verify that the VAQ provides greater accuracy in image classification process, as compared to conventional uniform quantization. These results provide valuable insight into developing high-precision multi-bit memristors for practical neuromorphic processors.

A General Rate-Distortion Optimization Method for Block Compressed Sensing of Images

Entropy ◽ 10.3390/e23101354 ◽ 2021 ◽ Vol 23 (10) ◽ pp. 1354

Author(s): Qunlin Chen ◽ Derong Chen ◽ Jiulu Gong

Keyword(s): Compressed Sensing ◽ Sampling Rate ◽ Rate Distortion ◽ Optimization Method ◽ Rate Model ◽ Bit Rate ◽ Unified Framework ◽ Uniform Quantization ◽ Block Compressed Sensing ◽ Predictive Quantization

Block compressed sensing (BCS) is a promising technology for image sampling and compression for resource-constrained applications, but it needs to balance the sampling rate and quantization bit-depth for a bit-rate constraint. In this paper, we summarize the commonly used CS quantization frameworks into a unified framework, and a new bit-rate model and a model of the optimal bit-depth are proposed for the unified CS framework. The proposed bit-rate model reveals the relationship between the bit-rate, sampling rate, and bit-depth based on the information entropy of generalized Gaussian distribution. The optimal bit-depth model can predict the optimal bit-depth of CS measurements at a given bit-rate. Then, we propose a general algorithm for choosing sampling rate and bit-depth based on the proposed models. Experimental results show that the proposed algorithm achieves near-optimal rate-distortion performance for the uniform quantization framework and predictive quantization framework in BCS.

Distributed Event-Triggered Circle Formation Control for Multiagent Systems with Nonuniform Quantization

Complexity ◽ 10.1155/2021/6684849 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-13

Author(s): Jiayan Wen ◽ Haijiang Zhang ◽ Guangxing Tan ◽ Ning Cai ◽ Guangming Xie

Keyword(s): Multiagent Systems ◽ Communication Technology ◽ Formation Control ◽ Nearest Neighbor ◽ Sufficient Conditions ◽ Angular Distance ◽ Communication Framework ◽ Uniform Quantization ◽ Circle Formation ◽ Event Triggered

This article focuses on the circle formation control problem of multiagent systems based on an event-triggered strategy under limited communication bandwidth. In such a system, each agent can only perceive the angular distance of its nearest neighbor in the counterclockwise direction, while the angular distance of its nearest neighbor in the clockwise direction must be obtained through communication between agents. To address this problem, a novel distributed algorithm combining nonuniform quantized communication and event-triggered control is proposed. Sufficient conditions for circle formation control are derived under which the states of all agents are guaranteed to converge to some desired equilibrium point. In contrast to the traditional uniform quantization communication framework, nonuniform quantization can be beneficial for handling small signals and improving the performance of the multiagent systems concerned. Furthermore, under the proposed policy, none of the designed quantizers saturates. Numerical simulation results are provided to verify the effectiveness of the proposed algorithm.

Design of a 2-Bit Neural Network Quantizer for Laplacian Source

Entropy ◽ 10.3390/e23080933 ◽ 2021 ◽ Vol 23 (8) ◽ pp. 933

Author(s): Zoran Perić ◽ Milan Savić ◽ Nikola Simić ◽ Bojan Denić ◽ Vladimir Despotović

Keyword(s): Neural Network ◽ Classification Accuracy ◽ Processing Time ◽ Computing Power ◽ Popular Approach ◽ Network Applications ◽ High Classification Accuracy ◽ Proposed Model ◽ Uniform Quantization ◽ Neural Network Applications

Achieving real-time inference is one of the major issues in contemporary neural network applications, as complex algorithms are frequently being deployed to mobile devices that have constrained storage and computing power. Moving from a full-precision neural network model to a lower-precision representation by applying quantization techniques is a popular approach to address this issue. Here, we analyze in detail and design a 2-bit uniform quantization model for the Laplacian source, chosen for its implementation simplicity, which in turn leads to a shorter processing time and faster inference. The results show that it is possible to achieve high classification accuracy (more than 96% in the case of MLP and more than 98% in the case of CNN) by implementing the proposed model, which is competitive with the performance of other quantization solutions with almost optimal precision.
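As a rough illustration of how a 2-bit uniform quantizer behaves on a Laplacian source, the sketch below draws unit-variance Laplacian samples and measures the empirical signal-to-quantization-noise ratio (SQNR) for a few candidate support region thresholds. It is not the paper's design method: the mid-rise layout, the candidate thresholds, the sample size, and the names `quantize_2bit`, `sqnr_db`, and `sweep` are assumptions chosen for the example.

```python
import math
import random


def quantize_2bit(x, x_max):
    """Symmetric 2-bit (four-cell) mid-rise uniform quantizer on [-x_max, x_max]."""
    step = x_max / 2.0                       # 2 * x_max spread over four cells
    clipped = max(-x_max, min(x_max, x))
    index = min(int((clipped + x_max) / step), 3)
    return -x_max + (index + 0.5) * step


def sqnr_db(samples, x_max):
    """Empirical signal-to-quantization-noise ratio in dB."""
    signal = sum(s * s for s in samples)
    noise = sum((s - quantize_2bit(s, x_max)) ** 2 for s in samples)
    return 10.0 * math.log10(signal / noise)


rng = random.Random(0)
# The difference of two Exp(rate) draws is Laplacian; rate = sqrt(2) gives unit variance.
rate = math.sqrt(2.0)
samples = [rng.expovariate(rate) - rng.expovariate(rate) for _ in range(20000)]

# Hypothetical candidate thresholds; the sweep records the SQNR for each one.
sweep = {x_max: sqnr_db(samples, x_max) for x_max in (0.5, 1.0, 2.0, 4.0, 8.0)}
best = max(sweep, key=sweep.get)
```

With only four cells, a threshold that is too small clips most of the Laplacian tails and one that is too large wastes resolution near the peak, so the SQNR in `sweep` varies noticeably across candidates.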

Hybrid and non-uniform quantization methods using retro synthesis data for efficient inference

10.1109/ijcnn52387.2021.9533724 ◽ 2021

Author(s): GVSL Tej Pratap ◽ Raja Kumar ◽ NS Pradeep

Keyword(s): Uniform Quantization

Full-Duplex Cell-Free mMIMO Systems: Analysis and Decentralized Optimization

10.36227/techrxiv.14805216.v1 ◽ 2021

Author(s): Soumyadeep Datta

Keyword(s): Multiple Input Multiple Output ◽ High Capacity ◽ Full Duplex ◽ Decentralized Optimization ◽ Free System ◽ Maximum Ratio Combining ◽ Alternating Direction ◽ Uniform Quantization ◽ Input Multiple Output ◽ The Individual

Cell-free (CF) massive multiple-input-multiple-output (mMIMO) deployments are usually investigated with half-duplex nodes and high-capacity fronthaul links. To leverage the possible gains in throughput and energy efficiency (EE) of full-duplex (FD) communications, we consider a FD CF mMIMO system with practical limited-capacity fronthaul links. We derive closed-form spectral efficiency (SE) lower bounds for this system with maximum-ratio combining/maximum-ratio transmission processing and optimal uniform quantization. We then optimize the weighted sum EE (WSEE) via downlink and uplink power control by using a two-layered approach: the first layer formulates the optimization as a generalized convex program, while the second layer solves the optimization decentrally using alternating direction method of multipliers. We analytically show that the proposed two-layered formulation yields a Karush-Kuhn-Tucker point of the original WSEE optimization. We numerically show the influence of weights on the individual EE of the users, which demonstrates the utility of WSEE metric to incorporate heterogeneous EE requirements of users. We show that the low fronthaul capacity reduces the number of users each AP can support, and the cell-free system, consequently, becomes user-centric.

Full-Duplex Cell-Free mMIMO Systems: Analysis and Decentralized Optimization

10.36227/techrxiv.14805216 ◽ 2021

Author(s): Soumyadeep Datta
