HLHLp: Quantized Neural Networks Training for Reaching Flat Minima in Loss Surface
Quantization of deep neural networks is essential for efficient implementations. Low-precision networks are typically designed to represent their floating-point counterparts with high fidelity, and several elaborate quantization algorithms have been developed for this purpose. We propose a novel training scheme for quantized neural networks that uses quantization noise to reach flat minima in the loss surface. The proposed scheme alternates the training precision in a high-low-high-low sequence. The learning rate is also changed abruptly at each stage, for either coarse- or fine-tuning. With the proposed training technique, we show performance improvements for convolutional neural networks over the previous fine-tuning-based quantization scheme. We also achieve state-of-the-art results for recurrent neural network based language modeling with 2-bit weights and activations.
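
The alternating schedule described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Stage`, `hlhl_schedule`, and `quantize` are hypothetical, the symmetric uniform quantizer is an assumed form, 32 bits stands in for floating-point precision, and the learning rates and step size are placeholder values chosen only to show the abrupt change between stages.

```python
from dataclasses import dataclass
from typing import List, Sequence

FULL_PRECISION_BITS = 32  # assumption: stand-in for "high precision" (floating point)


@dataclass
class Stage:
    precision: str  # "high" or "low"
    bits: int       # bit width used for weights and activations in this stage
    lr: float       # learning rate, switched abruptly when the stage starts


def hlhl_schedule(low_bits: int, lrs: Sequence[float]) -> List[Stage]:
    """Pair the high-low-high-low precision order with one learning rate per stage."""
    if len(lrs) != 4:
        raise ValueError("expected one learning rate per stage (4 in total)")
    order = [("high", FULL_PRECISION_BITS), ("low", low_bits)] * 2
    return [Stage(p, b, lr) for (p, b), lr in zip(order, lrs)]


def quantize(w: float, bits: int, step: float) -> float:
    """Symmetric uniform quantizer (an assumed form); identity at full precision."""
    if bits >= FULL_PRECISION_BITS:
        return w
    max_level = 2 ** (bits - 1) - 1  # 2 bits -> integer levels {-1, 0, +1}
    level = max(-max_level, min(max_level, round(w / step)))
    return level * step


# Hypothetical learning rates, chosen only to show the abrupt per-stage change.
schedule = hlhl_schedule(low_bits=2, lrs=[0.1, 0.01, 0.1, 0.001])
for stage in schedule:
    # In real training, each stage would run many epochs at stage.lr, applying
    # quantize() to weights and activations in the forward pass of "low" stages.
    print(stage.precision, stage.bits, stage.lr, quantize(0.37, stage.bits, step=0.25))
```

In this sketch the low-precision stages are where quantization noise enters training, and the high-precision stages train without it. How many epochs each stage runs and how the learning rate is chosen per stage are left open here, since the abstract does not specify them.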