Channel-Level Variable Quantization Network for Deep Image Compression

Author(s):  
Zhisheng Zhong ◽  
Hiroaki Akutsu ◽  
Kiyoharu Aizawa

Deep image compression systems mainly contain four components: an encoder, a quantizer, an entropy model, and a decoder. To optimize these four components, a joint rate-distortion framework was proposed, and many deep neural network-based methods have achieved great success in image compression. However, almost all convolutional neural network-based methods treat channel-wise feature maps equally, which reduces their flexibility in handling different types of information. In this paper, we propose a channel-level variable quantization network to dynamically allocate more bitrate to significant channels and withdraw bitrate from negligible channels. Specifically, we propose a variable quantization controller. It consists of two key components: the channel importance module, which can dynamically learn the importance of channels during training, and the splitting-merging module, which can allocate different bitrates to different channels. We also formulate the quantizer as a Gaussian mixture model. Quantitative and qualitative experiments verify the effectiveness of the proposed model and demonstrate that our method achieves superior performance and can produce much better visual reconstructions.
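
The splitting-merging idea described in the abstract above can be illustrated with a minimal sketch. It is not the authors' controller: the importance scores are hand-set here (the paper learns them during training), plain uniform rounding stands in for the Gaussian mixture quantizer, and the names `quantize`, `variable_quantize`, and `level_options` are hypothetical.

```python
def quantize(value, levels, lo=-1.0, hi=1.0):
    """Uniformly quantize a value to one of `levels` evenly spaced centers in [lo, hi]."""
    value = min(max(value, lo), hi)  # clip to the quantizer range
    step = (hi - lo) / (levels - 1)
    return lo + round((value - lo) / step) * step


def variable_quantize(channels, importance, level_options=(3, 5, 9)):
    """Split channels into groups by importance, quantize each group with its
    own number of levels, then merge the results back in the original order.

    Assumption: more quantization levels means more bitrate for that channel.
    """
    # Rank channels from least to most important.
    order = sorted(range(len(channels)), key=lambda c: importance[c])
    group_size = -(-len(channels) // len(level_options))  # ceiling division
    merged = [None] * len(channels)
    levels_used = [None] * len(channels)
    for rank, c in enumerate(order):
        levels = level_options[rank // group_size]
        merged[c] = [quantize(v, levels) for v in channels[c]]
        levels_used[c] = levels
    return merged, levels_used


# Three toy channels with hand-set importance scores.
channels = [[0.1, -0.7], [0.33, 0.9], [-0.2, 0.6]]
importance = [0.2, 0.9, 0.5]
merged, levels_used = variable_quantize(channels, importance)
```

In this toy run the most important channel (index 1) is quantized with 9 levels and the least important (index 0) with only 3, so the coarse channel costs fewer bits per symbol.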

2020 ◽  
Vol 10 (3) ◽  
pp. 809 ◽  
Author(s):  
Yunfan Chen ◽  
Hyunchul Shin

Pedestrian-related accidents are much more likely to occur during nighttime, when visible (VI) cameras are much less effective. Unlike VI cameras, infrared (IR) cameras can work in total darkness. However, IR images have several drawbacks, such as low resolution, noise, and thermal energy characteristics that can differ depending on the weather. To overcome these drawbacks, we propose an IR camera system to identify pedestrians at night that uses a novel attention-guided encoder-decoder convolutional neural network (AED-CNN). In AED-CNN, encoder-decoder modules are introduced to generate multi-scale features, in which new skip connection blocks are incorporated into the decoder to combine the feature maps from the encoder and decoder modules. This new architecture increases context information, which is helpful for extracting discriminative features from low-resolution and noisy IR images. Furthermore, we propose an attention module to re-weight the multi-scale features generated by the encoder-decoder module. The attention mechanism effectively highlights pedestrians while eliminating background interference, which helps to detect pedestrians under various weather conditions. Empirical experiments on two challenging datasets demonstrate that our method shows superior performance. Our approach significantly improves the precision of the state-of-the-art method by 5.1% and 23.78% on the Keimyung University (KMU) and Computer Vision Center (CVC)-09 pedestrian datasets, respectively.
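
The abstract above does not specify how its attention module is built, so the following is only a generic sketch of re-weighting a feature map with a sigmoid attention gate. The function names and the toy logits are assumptions, not part of AED-CNN.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real logit to a weight in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def reweight(features, attention_logits):
    """Scale each feature by a sigmoid attention weight at the same location.

    Large positive logits keep a feature almost unchanged; large negative
    logits suppress it, which is how background responses can be damped.
    """
    return [
        [f * sigmoid(a) for f, a in zip(feature_row, logit_row)]
        for feature_row, logit_row in zip(features, attention_logits)
    ]


# Toy 1x3 feature map: the last location is treated as background.
features = [[2.0, 4.0, 4.0]]
logits = [[0.0, 10.0, -10.0]]
weighted = reweight(features, logits)
```

Here the first feature is halved (logit 0 gives weight 0.5), the second passes nearly unchanged, and the third is driven close to zero.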


2021 ◽  
Vol 2 ◽  
pp. 633-647
Author(s):  
Michael Schafer ◽  
Sophie Pientka ◽  
Jonathan Pfaff ◽  
Heiko Schwarz ◽  
Detlev Marpe ◽  
...  

2018 ◽  
Vol 10 (12) ◽  
pp. 1893 ◽  
Author(s):  
Wenjia Xu ◽  
Guangluan Xu ◽  
Yang Wang ◽  
Xian Sun ◽  
Daoyu Lin ◽  
...  

The spatial resolution and clarity of remote sensing images are crucial for many applications such as target detection and image classification. In the last several decades, numerous image restoration methods have achieved great success on ordinary images. However, since remote sensing images are more complex and more blurry than ordinary images, most of the existing methods are not good enough for remote sensing image restoration. To address this problem, we propose a novel method named deep memory connected network (DMCN), based on the convolutional neural network, to reconstruct high-quality images. We build local and global memory connections to combine image detail with global information. To further reduce the number of parameters and the computation time, we propose Downsampling Units, which shrink the spatial size of feature maps. We verify its capability on two representative applications, Gaussian image denoising and single image super-resolution (SR). DMCN is tested on three remote sensing datasets with various spatial resolutions. Experimental results indicate that our method yields promising improvements and better visual performance over the current state of the art. The PSNR and SSIM improvements over the second best method are up to 0.3 dB.
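
For reference, the PSNR metric mentioned above follows the standard definition PSNR = 10 log10(MAX^2 / MSE). The sketch below computes it on flat lists of pixel values; the 8-bit peak of 255 and the function name `psnr` are assumptions, since the abstract does not state the evaluation setup.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Two pixels, each off by 10 gray levels: MSE = 100.
score = psnr([100, 100], [110, 90])
```

Because the scale is logarithmic, a gain of 0.3 dB corresponds to roughly a 7% reduction in mean squared error.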


2019 ◽  
Vol 9 (17) ◽  
pp. 3580 ◽  
Author(s):  
Cheng Wang ◽  
Yifei Han ◽  
Weidong Wang

Lossy image compression can reduce the bandwidth required for image transmission in a network and the storage space of a device, which is of great value in improving network efficiency. With the rapid development of deep learning theory, neural networks have achieved great success in image processing. In this paper, inspired by the varying degree of attention that the human eye pays to each region of an image, we propose an image compression framework based on semantic analysis, which combines deep learning techniques from image classification and image compression. We first use a convolutional neural network (CNN) to semantically analyze the image and obtain a semantic importance map, and then propose a compression bit allocation algorithm that allows the recurrent neural network (RNN)-based compression network to hierarchically compress the image according to that map. Experimental results validate that the proposed compression framework has better visual quality compared with other methods at the same compression ratio.
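
One simple way to turn an importance map into a bit allocation can be sketched as follows. It assumes that the RNN codec refines each block over a number of iterations and that more iterations cost more bits; the linear mapping, the iteration bounds, and the name `allocate_iterations` are illustrative assumptions rather than the paper's algorithm.

```python
def allocate_iterations(importance_map, min_iters=2, max_iters=8):
    """Map each block's importance to a number of codec iterations.

    Importance is normalized by the map's peak value, so the most important
    block receives `max_iters` and a block with zero importance receives
    `min_iters`.
    """
    peak = max(max(row) for row in importance_map)
    span = max_iters - min_iters
    return [
        [min_iters + round(span * (v / peak if peak > 0 else 0.0)) for v in row]
        for row in importance_map
    ]


# 2x2 grid of blocks with importance scores in [0, 1].
importance_map = [[0.0, 0.5], [1.0, 0.2]]
iterations = allocate_iterations(importance_map)
```

Blocks covering semantically important content are refined for more iterations, while background blocks stop early and consume fewer bits.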


2020 ◽  
Vol 34 (07) ◽  
pp. 11013-11020 ◽  
Author(s):  
Yueyu Hu ◽  
Wenhan Yang ◽  
Jiaying Liu

Approaches to image compression with machine learning now achieve superior performance on the compression rate compared to existing hybrid codecs. The conventional learning-based methods for image compression exploit hyper-prior and spatial context models to facilitate probability estimation. Such models have limitations in modeling long-term dependencies and do not fully squeeze out the spatial redundancy in images. In this paper, we propose a coarse-to-fine framework with hierarchical layers of hyper-priors to conduct comprehensive analysis of the image and more effectively reduce spatial redundancy, which improves the rate-distortion performance of image compression significantly. Signal Preserving Hyper Transforms are designed to achieve an in-depth analysis of the latent representation, and the Information Aggregation Reconstruction sub-network is proposed to maximally utilize side-information for reconstruction. Experimental results show the effectiveness of the proposed network in efficiently reducing the redundancies in images and improving the rate-distortion performance, especially for high-resolution images. Our project is publicly available at https://huzi96.github.io/coarse-to-fine-compression.html.


2021 ◽  
Vol 38 (3) ◽  
pp. 895-906
Author(s):  
Ruiyang Qi ◽  
Zhiqiang Liu

Fire image monitoring systems are being applied to more and more fields, owing to their large monitoring area. However, the existing image processing-based fire detection technology cannot effectively issue real-time fire warnings in actual scenes, and the relevant fire recognition algorithms are not robust enough. To solve these problems, this paper tries to extract and classify image features for fire recognition based on a convolutional neural network (CNN). Specifically, the authors set up the framework of a fire recognition system based on fire video images (FVIFRS), and extracted both static and dynamic features of flame. To improve the efficiency of image analysis, a Gaussian mixture model was established to extract the features from the fire smoke movement areas. Finally, the CNN was improved to process and classify the fire feature maps. The proposed algorithm and model were shown to be feasible and effective through experiments.


2021 ◽  
Author(s):  
Michael Schafer ◽  
Sophie Pientka ◽  
Jonathan Pfaff ◽  
Heiko Schwarz ◽  
Detlev Marpe ◽  
...  

Author(s):  
Ranganathan G ◽  
Bindhu V

Many compression standards have been developed during the past few decades, and technological advances have introduced many methodologies with promising results. As far as the PSNR metric is concerned, there is a performance gap between reigning compression standards and learned compression algorithms. Based on research, we experimented with an accurate entropy model on the learned compression algorithms to determine the rate-distortion performance. In this paper, a discretized Gaussian mixture likelihood is proposed to determine the latent code parameters in order to attain a more flexible and accurate entropy model. Moreover, we have also enhanced the performance of the work by introducing recent attention modules in the network architecture. Simulation results indicate that, compared with previously existing techniques on high-resolution and Kodak datasets, the proposed work achieves higher performance. When MS-SSIM is used for optimization, our work generates a more visually pleasant image.
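
A discretized Gaussian mixture likelihood is commonly written as the mixture probability mass falling in a unit-width bin around each integer latent value. The sketch below follows that common formulation, which the abstract does not spell out, so the bin width of 1 and all names and example parameters are assumptions.

```python
import math


def normal_cdf(x, mean, scale):
    """Cumulative distribution function of a Gaussian with the given mean and scale."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def discretized_gmm_likelihood(y, weights, means, scales):
    """Probability that a Gaussian mixture sample rounds to the integer y.

    Each component contributes the mass in the bin [y - 0.5, y + 0.5].
    """
    return sum(
        w * (normal_cdf(y + 0.5, m, s) - normal_cdf(y - 0.5, m, s))
        for w, m, s in zip(weights, means, scales)
    )


def rate_bits(y, weights, means, scales):
    """Ideal code length in bits for symbol y under the mixture model."""
    return -math.log2(discretized_gmm_likelihood(y, weights, means, scales))


# Hypothetical two-component mixture for one latent element.
weights, means, scales = [0.6, 0.4], [0.0, 3.0], [1.0, 2.0]
total = sum(
    discretized_gmm_likelihood(y, weights, means, scales) for y in range(-60, 61)
)
```

The bin probabilities sum to 1 over the integers (up to negligible tail mass), so they can be used directly as a probability table for an entropy coder, and `rate_bits` gives the corresponding rate term.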


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Narjes Rohani ◽  
Changiz Eslahchi

Drug-Drug Interaction (DDI) prediction is one of the most critical issues in drug development and health. Proposing appropriate computational methods for predicting unknown DDIs with high precision is challenging. We proposed "NDD: Neural network-based method for drug-drug interaction prediction" for predicting unknown DDIs using various information about drugs. Multiple drug similarities based on drug substructure, target, side effect, off-label side effect, pathway, transporter, and indication data are calculated. At first, NDD uses a heuristic similarity selection process and then integrates the selected similarities with a nonlinear similarity fusion method to achieve high-level features. Afterward, it uses a neural network for interaction prediction. The similarity selection and similarity integration parts of NDD have been proposed in previous studies of other problems. Our novelty is to combine these parts with a new neural network architecture and apply these approaches in the context of DDI prediction. We compared NDD with six machine learning classifiers and six state-of-the-art graph-based methods on three benchmark datasets. NDD achieved superior performance in cross-validation with AUPR ranging from 0.830 to 0.947, AUC from 0.954 to 0.994 and F-measure from 0.772 to 0.902. Moreover, cumulative evidence in case studies on numerous drug pairs further confirms the ability of NDD to predict unknown DDIs. The evaluations corroborate that NDD is an efficient method for predicting unknown DDIs. The data and implementation of NDD are available at https://github.com/nrohani/NDD.

