An End-to-End Deep Learning Image Compression Framework Based on Semantic Analysis

2019 ◽  
Vol 9 (17) ◽  
pp. 3580 ◽  
Author(s):  
Cheng Wang ◽  
Yifei Han ◽  
Weidong Wang

Lossy image compression can reduce the bandwidth required for image transmission in a network and the storage space of a device, which is of great value in improving network efficiency. With the rapid development of deep learning theory, neural networks have achieved great success in image processing. In this paper, inspired by the varying degrees of attention that human eyes pay to different regions of an image, we propose an image compression framework based on semantic analysis, which combines deep learning techniques from image classification and image compression. We first use a convolutional neural network (CNN) to semantically analyze the image and obtain a semantic importance map, and then propose a compression bit allocation algorithm that allows a recurrent neural network (RNN)-based compression network to compress the image hierarchically according to the semantic importance map. Experimental results validate that the proposed compression framework achieves better visual quality than other methods at the same compression ratio.
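The bit allocation idea can be illustrated with a toy sketch (this is not the paper's exact algorithm): a bit budget is distributed across image blocks in proportion to a CNN-derived semantic importance map, so salient regions receive more bits.

```python
import numpy as np

def allocate_bits(importance, total_bits, min_bits=1):
    """Allocate a bit budget across blocks in proportion to importance.

    importance: per-block semantic importance scores (any positive scale).
    """
    imp = np.asarray(importance, dtype=float)
    weights = imp / imp.sum()                       # normalize to a distribution
    bits = np.maximum(min_bits, np.round(weights * total_bits)).astype(int)
    # Fix rounding drift so the budget is met exactly: adjust the most
    # important block by the residual.
    bits[np.argmax(imp)] += total_bits - bits.sum()
    return bits

# A hypothetical 4-block importance map: one salient block, three background.
importance_map = np.array([0.7, 0.1, 0.1, 0.1])
bits = allocate_bits(importance_map, total_bits=100)
```

In a real codec the per-block budget would then steer how many RNN compression iterations each region receives.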

Author(s):  
Y. Lin ◽  
K. Suzuki ◽  
H. Takeda ◽  
K. Nakamura

Abstract. Digitizing roadside objects such as traffic signs is a necessary step in generating High Definition Maps (HD Maps) and remains an open challenge. The rapid development of deep learning based on Convolutional Neural Networks (CNNs) has achieved great success in computer vision in recent years. However, the performance of most deep learning algorithms depends heavily on the quality of the training data, and collecting the desired training dataset is difficult, especially for roadside objects, whose numbers along the roadside are imbalanced. Training neural networks on synthetic data has been proposed, but the distribution gap between synthetic and real data persists and can degrade performance. We propose to transfer style between synthetic and real data using a Multi-Task Generative Adversarial Network (SYN-MTGAN) before training the neural network that detects roadside objects. Experiments focusing on traffic signs show that our proposed method reaches an mAP of 0.77 and improves detection performance for objects whose training samples are difficult to collect.
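The full GAN is beyond a short sketch, but the underlying goal of closing the synthetic-to-real distribution gap can be shown with a much simpler stand-in: aligning the first- and second-order statistics of synthetic samples to the real data (an AdaIN-style normalization, not the paper's SYN-MTGAN).

```python
import numpy as np

def align_statistics(synthetic, real):
    """Shift/scale synthetic samples so their mean and std match the real data."""
    syn = np.asarray(synthetic, dtype=float)
    real = np.asarray(real, dtype=float)
    syn_norm = (syn - syn.mean(axis=0)) / (syn.std(axis=0) + 1e-8)
    return syn_norm * real.std(axis=0) + real.mean(axis=0)

rng = np.random.default_rng(0)
synthetic = rng.normal(loc=5.0, scale=2.0, size=(1000, 3))  # e.g. rendered signs
real = rng.normal(loc=0.0, scale=1.0, size=(1000, 3))       # e.g. street imagery
aligned = align_statistics(synthetic, real)
```

A GAN learns a far richer mapping than this moment matching, but the objective is the same: the detector should not be able to tell the transferred synthetic samples from real ones.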


Entropy ◽  
2021 ◽  
Vol 23 (7) ◽  
pp. 816
Author(s):  
Pingping Liu ◽  
Xiaokang Yang ◽  
Baixin Jin ◽  
Qiuzhan Zhou

Diabetic retinopathy (DR) is a common complication of diabetes mellitus (DM), and diagnosing DR in its early stages is essential for treatment. With the rapid development of convolutional neural networks in image processing, deep learning methods have achieved great success in medical image processing, and various medical lesion detection systems have been proposed to detect fundus lesions. At present, image classification of diabetic retinopathy ignores the fine-grained properties of diseased images, and most retinopathy image datasets suffer from severely uneven class distributions, which greatly limits the network's ability to predict lesion classes. We propose a new non-homologous bilinear pooling convolutional neural network model and combine it with an attention mechanism to further improve the network's ability to extract specific image features. The experimental results show that, compared with the most popular fundus image classification models, our proposed model greatly improves prediction accuracy while maintaining computational efficiency.
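Bilinear pooling with two non-homologous extractors can be sketched as follows (standard bilinear-pooling practice; the paper's specific model and attention mechanism are not reproduced here): features from two different backbones are combined by an outer product over spatial locations, then signed-sqrt and L2 normalized.

```python
import numpy as np

def bilinear_pool(feat_a, feat_b):
    """feat_a: (c1, h*w), feat_b: (c2, h*w) feature maps from two networks."""
    n_loc = feat_a.shape[1]
    pooled = feat_a @ feat_b.T / n_loc             # (c1, c2) bilinear matrix
    vec = pooled.ravel()
    vec = np.sign(vec) * np.sqrt(np.abs(vec))      # signed square root
    return vec / (np.linalg.norm(vec) + 1e-8)      # L2 normalization

rng = np.random.default_rng(1)
feat_cnn1 = rng.normal(size=(8, 49))   # hypothetical 8-channel, 7x7 feature map
feat_cnn2 = rng.normal(size=(16, 49))  # hypothetical 16-channel, 7x7 feature map
descriptor = bilinear_pool(feat_cnn1, feat_cnn2)
```

The outer product captures pairwise feature interactions, which is what makes bilinear pooling well suited to fine-grained distinctions such as subtle fundus lesions.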


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Jingkai Weng ◽  
Yujiang Ding ◽  
Chengbo Hu ◽  
Xue-Feng Zhu ◽  
Bin Liang ◽  
...  

Abstract Analyzing scattered waves to recognize objects is of fundamental significance in wave physics. Recently emerged deep learning techniques have achieved great success in interpreting wave fields, such as in ultrasound non-destructive testing and disease diagnosis, but they conventionally require time-consuming computer post-processing or bulky diffractive elements. Here we theoretically propose and experimentally demonstrate a purely passive, small-footprint meta-neural-network that recognizes complicated objects in real time by analyzing acoustic scattering. We prove that the meta-neural-network mimics a standard neural network despite its compactness, thanks to the unique capability of its metamaterial unit cells (dubbed meta-neurons) to produce deep-subwavelength phase shifts that serve as training parameters. The resulting device exhibits the “intelligence” to perform desired tasks with the potential to overcome current limitations, showcased by two distinctive examples: handwritten digit recognition and discerning misaligned orbital-angular-momentum vortices. Our mechanism opens the route to new metamaterial-based deep-learning paradigms and enables conceptual devices that automatically analyze signals, with far-reaching implications for acoustics and related fields.
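A toy numerical analogue conveys the principle (this is not a simulation of the physical device): each meta-neuron imposes a trainable phase shift on the incident field, and a detector reads the interference intensity, so a passive phase mask can act like a weighted layer.

```python
import numpy as np

def meta_layer(incident_field, phase_shifts):
    """Apply per-unit-cell phase shifts and read the intensity at one detector."""
    shifted = incident_field * np.exp(1j * phase_shifts)
    return np.abs(shifted.sum()) ** 2   # detected interference intensity

field = np.ones(4, dtype=complex)            # uniform incident wave
in_phase = np.zeros(4)                       # all unit cells aligned
out_of_phase = np.array([0.0, np.pi, 0.0, np.pi])

bright = meta_layer(field, in_phase)         # constructive interference
dark = meta_layer(field, out_of_phase)       # destructive interference
```

Training such a device amounts to choosing the phase shifts so that each object class steers energy to its designated detector, which is what the paper realizes with sub-wavelength metamaterial unit cells.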


2019 ◽  
Vol 11 (9) ◽  
pp. 1006 ◽  
Author(s):  
Quanlong Feng ◽  
Jianyu Yang ◽  
Dehai Zhu ◽  
Jiantao Liu ◽  
Hao Guo ◽  
...  

Coastal land cover classification is a significant yet challenging task in remote sensing because of the complex and fragmented nature of coastal landscapes. However, the availability of multitemporal and multisensor remote sensing data provides opportunities to improve classification accuracy. Meanwhile, the rapid development of deep learning has achieved astonishing results in computer vision tasks and has also become a popular topic in remote sensing. Nevertheless, designing an effective and concise deep learning model for coastal land cover classification remains problematic. To tackle this issue, we propose a multibranch convolutional neural network (MBCNN) for the fusion of multitemporal and multisensor Sentinel data to improve coastal land cover classification accuracy. The proposed model leverages a series of deformable convolutional neural networks to extract representative features from each single-source dataset. Extracted features are aggregated through an adaptive feature fusion module to predict final land cover categories. Experimental results indicate that the proposed MBCNN performs well, with an overall accuracy of 93.78% and a Kappa coefficient of 0.9297. Inclusion of multitemporal data improves accuracy by an average of 6.85%, while multisensor data contributes a further 3.24% increase. Additionally, the feature fusion module in this study increases accuracy by about 2% compared with the feature-stacking method. These results demonstrate that the proposed method can effectively mine and fuse multitemporal and multisource Sentinel data to improve coastal land cover classification accuracy.
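Adaptive feature fusion, as opposed to simple feature stacking, can be sketched as a learned weighted sum over branches (a common adaptive-fusion pattern; the MBCNN's exact module is not reproduced here): learnable scores are softmax-normalized into fusion weights.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def fuse_branches(branch_features, scores):
    """branch_features: (n_branches, dim); scores: learnable (n_branches,)."""
    weights = softmax(scores)
    return weights @ branch_features     # weighted sum over branches

# Hypothetical per-branch feature vectors from three Sentinel inputs.
feats = np.array([[1.0, 0.0],    # e.g. Sentinel-1 branch
                  [0.0, 1.0],    # e.g. Sentinel-2 branch, date 1
                  [1.0, 1.0]])   # e.g. Sentinel-2 branch, date 2
scores = np.array([0.0, 0.0, 0.0])       # equal weights before any training
fused = fuse_branches(feats, scores)
```

During training the scores are updated by backpropagation, so uninformative branches are automatically down-weighted instead of diluting the stacked representation.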


Author(s):  
Zhisheng Zhong ◽  
Hiroaki Akutsu ◽  
Kiyoharu Aizawa

Deep image compression systems mainly contain four components: encoder, quantizer, entropy model, and decoder. To optimize these four components jointly, a rate-distortion framework was proposed, and many deep neural network-based methods have achieved great success in image compression. However, almost all convolutional neural network-based methods treat channel-wise feature maps equally, reducing flexibility in handling different types of information. In this paper, we propose a channel-level variable quantization network that dynamically allocates more bitrate to significant channels and withdraws bitrate from negligible ones. Specifically, we propose a variable quantization controller consisting of two key components: a channel importance module, which dynamically learns the importance of channels during training, and a splitting-merging module, which allocates different bitrates to different channels. We also formulate the quantizer in the manner of a Gaussian mixture model. Quantitative and qualitative experiments verify the effectiveness of the proposed model and demonstrate that our method achieves superior performance and produces much better visual reconstructions.
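The channel-level allocation idea can be illustrated with a simplified sketch (the paper's learned controller and Gaussian-mixture quantizer are not reproduced): a channel-importance vector decides how many quantization levels each feature channel receives, so important channels are quantized more finely.

```python
import numpy as np

def quantize_channel(x, n_levels):
    """Uniformly quantize one channel's values in [0, 1] to n_levels bins."""
    levels = np.linspace(0.0, 1.0, n_levels)
    idx = np.argmin(np.abs(x[:, None] - levels[None, :]), axis=1)
    return levels[idx]

def variable_quantize(channels, importance, bit_choices=(2, 4, 8)):
    """Assign more levels (2**bits) to more important channels."""
    ranks = np.argsort(np.argsort(importance))        # rank of each channel
    tiers = ranks * len(bit_choices) // len(importance)
    return [quantize_channel(c, 2 ** bit_choices[t])
            for c, t in zip(channels, tiers)]

rng = np.random.default_rng(2)
channels = [rng.random(100) for _ in range(3)]        # toy feature channels
importance = np.array([0.1, 0.9, 0.5])                # e.g. learned importance
quantized = variable_quantize(channels, importance)
```

Here the most important channel is coded with 8 bits (256 levels) and the least important with 2 bits (4 levels), mirroring the split-and-merge allocation at a very coarse grain.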


Author(s):  
Diyar Waysi Naaman

Image compression research has increased dramatically as a result of the growing demands for image transmission in computer and mobile environments. Compression is needed especially for reduced storage and efficient image transmission; it reduces the bits necessary to represent a picture digitally while preserving its original quality. Fractal encoding is an advanced image compression technique based on the self-similar forms within an image and the generation of repetitive blocks via mathematical transformations. Because of the resources needed to compress large data volumes, enormous processing time is required, so the main disadvantage of Fractal Image Compression (FIC) is a very high encoding time, although decoding is extremely fast. An artificial intelligence technique is used to reduce the search space and encoding time by employing a neural network trained with the back-propagation algorithm. Initially, the image is divided into fixed-size range blocks and domain blocks. For each range block, its best-matched domain is selected and the pair of range and domain indices is recorded; these pairs form the training input, which narrows the set of candidate domain blocks. This data is used to train the neural network, and the trained network is then used to compress other images with far shorter encoding times. During the decoding phase, an arbitrary initial image is iteratively transformed with the stored parameters and converges to the fractal approximation of the original. The quality of this FIC is demonstrated by the simulation findings. This paper explores a unique neural network FIC that is capable of increasing encoding speed and image quality simultaneously.
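The classical range-domain search that the neural network is meant to accelerate can be sketched by brute force (a simplified illustration without downsampling or isometries): for each range block, find the domain block whose affine transform (scale s, offset o) best reproduces it in the least-squares sense.

```python
import numpy as np

def best_domain(range_block, domain_blocks):
    """Exhaustively find the best affine match r ≈ s*d + o among domains."""
    best = None
    r = range_block.ravel().astype(float)
    for i, dom in enumerate(domain_blocks):
        d = dom.ravel().astype(float)
        A = np.column_stack([d, np.ones_like(d)])
        (s, o), _, _, _ = np.linalg.lstsq(A, r, rcond=None)  # fit s, o
        err = np.sum((s * d + o - r) ** 2)
        if best is None or err < best[1]:
            best = (i, err, s, o)
    return best  # (domain index, squared error, scale, offset)

rng = np.random.default_rng(3)
domains = [rng.random((4, 4)) for _ in range(8)]
# A range block that is an exact affine copy of domain 5.
target = 0.5 * domains[5] + 0.2
idx, err, s, o = best_domain(target, domains)
```

This exhaustive comparison is exactly what makes encoding slow; the trained network replaces it by predicting a small candidate set of domains for each range block.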


Author(s):  
Devon Livingstone ◽  
Aron S. Talai ◽  
Justin Chau ◽  
Nils D. Forkert

Abstract Background Otologic diseases are often difficult for primary care providers to diagnose accurately. Deep learning methods have been applied with great success in many areas of medicine, often outperforming well-trained human observers. The aim of this work was to develop and evaluate an automatic software prototype to identify otologic abnormalities using a deep convolutional neural network. Material and methods A database of 734 unique otoscopic images of various ear pathologies, including 63 cerumen impactions, 120 tympanostomy tubes, and 346 normal tympanic membranes, was acquired. 80% of the images were used for training a convolutional neural network and the remaining 20% were used for algorithm validation. Image augmentation was employed on the training dataset to increase the number of training images. The general network architecture consisted of three convolutional layers plus batch normalization and dropout layers to avoid overfitting. Results The validation based on 45 datasets not used for model training revealed that the proposed deep convolutional neural network is capable of identifying and differentiating between normal tympanic membranes, tympanostomy tubes, and cerumen impactions with an overall accuracy of 84.4%. Conclusion Our study shows that deep convolutional neural networks hold immense potential as a diagnostic adjunct for otologic disease management.


Author(s):  
Huixin Yang ◽  
Xiang Li ◽  
Wei Zhang

Abstract Despite the rapid development of deep learning-based intelligent fault diagnosis methods for rotating machinery, the data-driven approach generally remains a "black box" to researchers, and its internal mechanism has not been sufficiently understood. This weak interpretability significantly impedes further development and application of effective deep neural network-based methods. This paper contributes to understanding the mechanical signal processing performed by deep learning in fault diagnosis problems. The diagnostic knowledge learned by the deep neural network is visualized using the neuron activation maximization and saliency map methods, and the discriminative features of different machine health conditions are intuitively observed. The relationship between the data-driven methods and well-established conventional fault diagnosis knowledge is confirmed by experimental investigations on two datasets. The results of this study can help researchers understand complex neural networks and increase the reliability of data-driven fault diagnosis models in real engineering cases.
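The saliency-map idea can be shown on a stand-in scoring function (not the paper's trained network): the saliency of each input point is the sensitivity of the class score to that point, here estimated by finite differences.

```python
import numpy as np

def score(signal, template):
    """Toy 'fault score': correlation with a hypothetical fault template."""
    return float(signal @ template)

def saliency(signal, template, eps=1e-6):
    """Finite-difference estimate of |d(score)/d(input)| per sample."""
    grads = np.zeros_like(signal)
    base = score(signal, template)
    for i in range(signal.size):
        bumped = signal.copy()
        bumped[i] += eps
        grads[i] = (score(bumped, template) - base) / eps
    return np.abs(grads)

template = np.array([0.0, 0.0, 3.0, 0.0])  # score driven by the 3rd sample
signal = np.array([1.0, 1.0, 1.0, 1.0])
sal = saliency(signal, template)
```

For a trained network the same gradient is computed by backpropagation rather than finite differences, and high-saliency regions of the vibration signal can then be compared with known fault signatures.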


2021 ◽  
pp. 1-11
Author(s):  
Feifei Sun

With the rapid development of deep learning and parallel computing, deep neural networks trained on big data have been applied to the field of facial nerve recognition, an innovative application that has attracted extensive attention from scholars. Neural networks owe their applicability to deep learning, in which back-propagation and error optimization adjust the weights to reduce the error, allowing more key points and features to be extracted. Even so, data collection and key-point extraction remain very complex problems. This paper addresses these problems, studies deep learning and information extraction together with their internal structure, and optimizes their application to classroom learning, providing effective help for the realization of distance education.
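The weight-update mechanism described above can be sketched in miniature (a single linear neuron with squared loss, not the paper's model): the error is propagated backwards as a gradient, and the weights move to reduce it.

```python
import numpy as np

def train_step(w, x, y, lr=0.1):
    """One gradient-descent step on mean squared error."""
    pred = x @ w
    err = pred - y
    grad = x.T @ err / len(y)        # back-propagated gradient of the MSE
    return w - lr * grad, float((err ** 2).mean())

rng = np.random.default_rng(4)
x = rng.normal(size=(100, 2))
true_w = np.array([2.0, -1.0])       # hypothetical ground-truth weights
y = x @ true_w
w = np.zeros(2)
losses = []
for _ in range(200):
    w, loss = train_step(w, x, y)
    losses.append(loss)
```

A deep network chains this same rule through many layers via the chain rule, which is what allows it to learn richer key points and features.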


Entropy ◽  
2020 ◽  
Vol 22 (1) ◽  
pp. 96 ◽  
Author(s):  
Xingliang Tang ◽  
Xianrui Zhang

Decoding motor imagery (MI) electroencephalogram (EEG) signals for brain-computer interfaces (BCIs) is a challenging task because of the severe non-stationarity of perceptual decision processes. Recently, deep learning techniques have had great success in EEG decoding because of their prominent ability to learn features from raw EEG signals automatically. However, the challenge the deep learning method faces is that labeled EEG signals are scarce, and EEGs sampled from other subjects cannot be used directly to train a convolutional neural network (ConvNet) for a target subject. To solve this problem, in this paper, we present a novel conditional domain adaptation neural network (CDAN) framework for MI EEG signal decoding. Specifically, in the CDAN, a densely connected ConvNet is first applied to obtain high-level discriminative features from raw EEG time series. Then, a novel conditional domain discriminator is introduced to work adversarially against the label classifier to learn EEG features commonly shared across subjects. As a result, the CDAN model, trained with sufficient EEG signals from other subjects, can be used to classify the signals from the target subject efficiently. Competitive experimental results on a public EEG dataset (High Gamma Dataset) against state-of-the-art methods demonstrate the efficacy of the proposed framework in recognizing MI EEG signals, indicating its effectiveness in automatic perceptual decision decoding.
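The adversarial coupling between the feature extractor and the domain discriminator is commonly implemented with a gradient-reversal operator (as in DANN-style models; whether CDAN uses exactly this operator is an assumption). Its behavior is simple: the forward pass is the identity, while the backward pass flips the gradient's sign, so the features are pushed to confuse the domain discriminator.

```python
import numpy as np

def grad_reverse_forward(features):
    """Identity in the forward pass."""
    return features

def grad_reverse_backward(upstream_grad, lam=1.0):
    """Sign-flipped, lambda-scaled gradient in the backward pass."""
    return -lam * upstream_grad

feats = np.array([0.5, -1.2, 3.0])            # hypothetical EEG features
out = grad_reverse_forward(feats)
g = grad_reverse_backward(np.array([1.0, 1.0, 1.0]), lam=0.5)
```

Because the discriminator's gradient arrives negated at the feature extractor, minimizing the discriminator's loss for the discriminator simultaneously maximizes it for the extractor, driving the features toward subject invariance.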

