Lightweight image classifier using dilated and depthwise separable convolutions

Author(s): Wei Sun, Xiaorui Zhang, Xiaozheng He

Abstract: Cloud-based image classification becomes difficult to deploy as network depth and data volume grow. Because deep models and the convolution operations in each layer generate a great amount of computation, they place heavy demands on a device's GPU and storage, and the hardware in embedded and mobile terminals cannot support large models. The model therefore needs to be compressed before it can be deployed on such devices. However, traditional compression-based methods often discard many global features during compression, resulting in low classification accuracy. To solve this problem, this paper proposes a twenty-nine-layer lightweight neural network model for image classification based on dilated convolution and depthwise separable convolution. The proposed model employs dilated convolution to expand the receptive field while keeping the number of convolution parameters unchanged, so it can extract more high-level global semantic features and improve classification accuracy. Depthwise separable convolution is applied to reduce the network parameters and the computational complexity of the convolution operations, which shrinks the network. The model also introduces three hyperparameters (width multiplier, image resolution, and dilation rate) to compress the network while preserving accuracy. Experimental results show that, compared with GoogLeNet, the proposed network improves classification accuracy by nearly 1% while using 3.7 million fewer parameters.
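
To make the mechanism concrete, the following is a minimal PyTorch sketch (our illustration, not the authors' released code) of a building block that combines a dilated depthwise convolution with a 1x1 pointwise convolution; the channel sizes and dilation rate are illustrative.

```python
import torch
import torch.nn as nn

class DilatedDWSeparableBlock(nn.Module):
    """Depthwise conv with a dilation rate, followed by a 1x1 pointwise conv."""
    def __init__(self, in_ch, out_ch, dilation=2):
        super().__init__()
        # Depthwise: one filter per input channel (groups=in_ch) keeps the
        # parameter count low; dilation widens the receptive field at no
        # extra parameter cost.
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3,
                                   padding=dilation, dilation=dilation,
                                   groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

x = torch.randn(1, 32, 56, 56)
print(DilatedDWSeparableBlock(32, 64, dilation=2)(x).shape)  # (1, 64, 56, 56)
```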

2020, Vol. 2020, pp. 1-10
Author(s): Wei Wang, Yiyang Hu, Ting Zou, Hongmei Liu, Jin Wang, ...

Because deep neural networks (DNNs) are both memory-intensive and computation-intensive, they are difficult to deploy on embedded systems with limited hardware resources; DNN models therefore need to be compressed and accelerated. By applying depthwise separable convolutions, MobileNet decreases the number of parameters and the computational complexity with little loss of classification precision. Based on MobileNet, three improved MobileNet models with local receptive field expansion in their shallow layers, called Dilated-MobileNet (Dilated Convolution MobileNet) models, are proposed, in which dilated convolutions are introduced into a specific convolutional layer of the MobileNet model. Without increasing the number of parameters, the dilated convolutions enlarge the receptive field of the convolution filters to obtain better classification accuracy. Experiments were performed on the Caltech-101, Caltech-256, and Tubingen Animals with Attributes datasets. The results show that Dilated-MobileNets can obtain up to 2% higher classification accuracy than MobileNet.
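
The central claim, that dilation widens the receptive field without adding parameters, can be checked directly; a short PyTorch verification (our illustration, not from the paper):

```python
import torch.nn as nn

standard = nn.Conv2d(32, 32, kernel_size=3, padding=1)
dilated = nn.Conv2d(32, 32, kernel_size=3, padding=2, dilation=2)

count = lambda m: sum(p.numel() for p in m.parameters())
# Both layers hold 9,248 parameters; only the sampling grid of the
# 3x3 kernel differs, so the dilated layer "sees" a 5x5 neighborhood.
assert count(standard) == count(dilated)
```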


Symmetry, 2020, Vol. 12 (9), pp. 1511
Author(s): Fenglei Wang, Hao Zhou, Shuohao Li, Jun Lei, Jun Zhang

Fine-grained image classification has seen great improvement thanks to deep learning techniques. Most fine-grained image classification methods focus on extracting discriminative features and combining global features with local ones. However, accuracy is limited by inter-class similarity and intra-class divergence, as well as by the lack of enough labelled images to train a deep network that can generalize to fine-grained classes. To deal with these problems, we develop an algorithm that combines Maximizing the Mutual Information (MMI) with Learning Attention (LA). We use MMI to distill knowledge from image pairs that contain the same object, and we exploit the LA mechanism to find the salient region of the image and enhance the information distillation. Our model can thus extract more discriminative semantic features and improve performance on fine-grained image classification. The model has a symmetric structure, in which the paired images are fed into the same network to extract the local and global features for the subsequent MMI and LA modules. We train the model by alternately maximizing the mutual information and minimizing the cross-entropy, stage by stage. Experiments show that our model effectively improves fine-grained image classification performance.
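
A heavily hedged sketch of the symmetric structure described above: both images of a pair pass through one shared network, and training alternates between a mutual-information objective on the paired features and a cross-entropy objective. The toy backbone and the InfoNCE-style lower bound below are stand-ins for the paper's actual estimator, not its method.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Sequential(                        # toy feature extractor
    nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten())       # -> (B, 64)
classifier = nn.Linear(64, 200)                  # e.g., 200 fine-grained classes

def info_nce(za, zb, temperature=0.1):
    """InfoNCE lower bound on MI between paired embeddings (a stand-in)."""
    za, zb = F.normalize(za, dim=1), F.normalize(zb, dim=1)
    logits = za @ zb.t() / temperature           # (B, B) similarity matrix
    targets = torch.arange(za.size(0))           # matching pairs on the diagonal
    return F.cross_entropy(logits, targets)

img_a, img_b = torch.randn(8, 3, 64, 64), torch.randn(8, 3, 64, 64)
labels = torch.randint(0, 200, (8,))
za, zb = backbone(img_a), backbone(img_b)        # same weights for both inputs
loss_mi = info_nce(za, zb)                       # stage 1: maximize MI
loss_ce = F.cross_entropy(classifier(za), labels)  # stage 2: minimize CE
```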


2019, Vol. 11 (11), pp. 1325
Author(s): Chen Chen, Yi Ma, Guangbo Ren

Deep learning models, especially convolutional neural networks (CNNs), are very active in hyperspectral remote sensing image classification. To better apply the CNN model to hyperspectral classification, we propose a CNN based on the Fletcher–Reeves algorithm (F–R CNN), which uses the Fletcher–Reeves (F–R) conjugate gradient algorithm for gradient updating to improve the convergence of the model during classification. Given that few training samples are available in practical applications, we further propose a method of enlarging the training set by adding perturbed samples, which also tests the anti-interference ability of classification methods. Furthermore, we analyze the anti-interference and convergence performance of the proposed model for different training sample sets, different numbers of batch training samples, and different iteration times. We describe the experimental process in detail and comprehensively evaluate the proposed model on the classification of CHRIS hyperspectral imagery covering coastal wetlands, and further evaluate it on a commonly used hyperspectral image benchmark dataset. The experimental results show that the accuracy of both models improves after the training samples are increased and the number of batch training samples is adjusted. When the number of batch training samples is increased to 350, the classification accuracy of the proposed method remains above 80.7%, which is 2.9% higher than that of the traditional CNN, and its time consumption is lower while classification accuracy is maintained. We conclude that the proposed method has anti-interference ability and outperforms the traditional CNN in batch computing adaptability and convergence speed.
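
The Fletcher–Reeves update itself is standard conjugate gradient machinery; below is a minimal sketch of one F–R step for a single parameter tensor. The step size and the way the authors embed the update in CNN training are assumptions here, not taken from the paper.

```python
import torch

def fletcher_reeves_step(param, grad, prev_grad, prev_dir, alpha=0.01):
    """One F-R update: d = -g + beta * d_prev, with
    beta = ||g||^2 / ||g_prev||^2, then theta <- theta + alpha * d."""
    beta = grad.pow(2).sum() / prev_grad.pow(2).sum().clamp_min(1e-12)
    direction = -grad + beta * prev_dir
    param.data.add_(alpha * direction)
    return direction  # becomes d_prev in the next iteration
```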


2020, Vol. 10 (13), pp. 4652
Author(s): Fangxiong Chen, Guoheng Huang, Jiaying Lan, Yanhui Wu, Chi-Man Pun, ...

The fine-grained image classification task is to differentiate between highly similar object classes; its difficulties are large intra-class variance and small inter-class variance. For this reason, improving a model's accuracy on the task has relied heavily on annotations of discriminative and regional parts, and this dependence on delicate annotations restricts the practicality of such models. To tackle this issue, this article proposes a weakly supervised fine-grained image classification model built around a saliency module. Through its salient region localization module, the proposed model can localize essential regional parts using saliency maps while only image-level class annotations are provided. In addition, a bilinear attention module improves feature extraction by using higher- and lower-level layers of the network to fuse regional features with global features. Building on this bilinear attention architecture, we propose a different-layer feature fusion module to improve the expressiveness of the model's features. We tested and verified our model on public datasets released specifically for fine-grained image classification. The results show that our proposed model achieves close to state-of-the-art classification performance on various datasets while requiring the least training annotation. This indicates that the practicality of our model is greatly improved, since fine-grained image annotations are expensive to obtain.
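
A hedged sketch of bilinear fusion between a lower-level and a higher-level feature map, in the spirit of the bilinear attention module described above; the layer choices, dimensions, and normalisation follow common bilinear-CNN practice rather than the paper's exact design.

```python
import torch
import torch.nn.functional as F

def bilinear_pool(feat_low, feat_high):
    """Outer product of two feature maps, pooled over spatial positions."""
    b, c1, h, w = feat_low.shape
    c2 = feat_high.shape[1]
    # Match spatial sizes, then flatten the spatial positions.
    feat_high = F.interpolate(feat_high, size=(h, w), mode='bilinear',
                              align_corners=False)
    x = feat_low.reshape(b, c1, h * w)
    y = feat_high.reshape(b, c2, h * w)
    bilinear = torch.bmm(x, y.transpose(1, 2)) / (h * w)    # (b, c1, c2)
    bilinear = bilinear.flatten(1)                          # (b, c1 * c2)
    # Signed square root and L2 normalisation, as is common for bilinear CNNs.
    bilinear = torch.sign(bilinear) * torch.sqrt(bilinear.abs() + 1e-8)
    return F.normalize(bilinear, dim=1)

fused = bilinear_pool(torch.randn(2, 64, 28, 28), torch.randn(2, 128, 14, 14))
print(fused.shape)  # torch.Size([2, 8192])
```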


Information, 2018, Vol. 9 (10), pp. 252
Author(s): Andrea Apicella, Anna Corazza, Francesco Isgrò, Giuseppe Vettigli

The use of ontological knowledge to improve classification results is a promising line of research. The availability of a probabilistic ontology raises the possibility of combining the probabilities coming from the ontology with those produced by a multi-class classifier that detects particular objects in an image. This combination not only provides the relations existing between the different segments but can also improve the classification accuracy, since contextual information often suggests the correct class. This paper proposes a model that implements this integration, and the experimental assessment shows the effectiveness of the integration, especially when the classifier's accuracy is relatively low. To assess the performance of the proposed model, we designed and implemented a simulated classifier whose performance can be set a priori with sufficient precision.
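
One simple way such a combination can be realized is a weighted product of the two class distributions with renormalisation; the sketch below is our illustration, and the paper's actual integration model may differ.

```python
import numpy as np

def combine(p_classifier, p_ontology, weight=0.5):
    """Geometric mixture of two class distributions over the same label set."""
    fused = p_classifier ** (1 - weight) * p_ontology ** weight
    return fused / fused.sum()

p_clf = np.array([0.50, 0.30, 0.20])   # classifier output for one segment
p_ont = np.array([0.10, 0.70, 0.20])   # contextual prior from the ontology
print(combine(p_clf, p_ont))           # context shifts the mass to class 2
```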


Information, 2021, Vol. 12 (6), pp. 249
Author(s): Xin Jin, Yuanwen Zou, Zhongbing Huang

The cell cycle is an important process in cellular life. In recent years, several image processing methods have been developed to determine the cell cycle stage of individual cells. However, most of these methods require cells to be segmented and their features to be extracted, and some important information may be lost during feature extraction, lowering classification accuracy. We therefore used a deep learning method that retains all cell features. To address the insufficient number and imbalanced distribution of original images, we used the Wasserstein generative adversarial network with gradient penalty (WGAN-GP) for data augmentation, together with a residual network (ResNet), one of the most widely used deep learning classification networks, for image classification. Our method classified cell cycle images more effectively, reaching an accuracy of 83.88%, an increase of 4.48% over the 79.40% obtained in previous experiments. On another dataset used to verify the model, our accuracy increased by 12.52% over previous results. These results show that our cell cycle image classification system based on WGAN-GP and ResNet is useful for classifying imbalanced images, and that our method could help overcome the low classification accuracy in biomedical images caused by an insufficient number and imbalanced distribution of original images.
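
The gradient penalty that distinguishes WGAN-GP from the original WGAN is standard; a minimal PyTorch sketch (our illustration; `critic` is any network mapping images to a scalar score):

```python
import torch

def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    """Penalize the critic's gradient norm at points between real and fake."""
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(interp)
    grads, = torch.autograd.grad(outputs=scores, inputs=interp,
                                 grad_outputs=torch.ones_like(scores),
                                 create_graph=True)
    grad_norm = grads.flatten(1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1) ** 2).mean()

# Toy usage with a linear critic on 64x64 RGB images.
critic = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 64 * 64, 1))
real, fake = torch.randn(4, 3, 64, 64), torch.randn(4, 3, 64, 64)
print(gradient_penalty(critic, real, fake))
```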


Author(s): Jianfang Cao, Minmin Yan, Yiming Jia, Xiaodong Tian, Zibang Zhang

Abstract: It is difficult to identify the historical period in which some ancient murals were created because of damage from artificial and/or natural factors; similarities in content, style, and color among murals; low image resolution; and other reasons. This study proposed a transfer-learning-fused Inception-v3 model for dynasty-based classification. First, the model adopted Inception-v3, pretrained on ImageNet, with its fully connected and softmax layers frozen. Second, the model fused Inception-v3 with transfer learning to readjust the parameters on small datasets. Third, the corresponding bottleneck files of the mural images were generated, and the deep-level features of the images were extracted. Fourth, the cross-entropy loss function was employed to calculate the loss at each training step, and an adaptive learning rate algorithm based on stochastic gradient descent was applied to unify the learning rate. Finally, the updated softmax classifier was used for the dynasty-based classification of the images. On the constructed small datasets, the accuracy, recall, and F1 value of the proposed model were 88.4%, 88.36%, and 88.32%, respectively, noticeably higher than those of typical deep learning models and modified convolutional neural networks. Comparing the classification outcomes on the mural dataset with those on other painting datasets and natural image datasets showed that the proposed model achieves stable results with a strong generalization capacity. The training time of the proposed model was only 0.7 s, and overfitting seldom occurred.
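
A hedged sketch of one common way to realize this kind of pretrain-and-readjust setup in PyTorch/torchvision: the pretrained base is frozen and only a new classification head is trained. The number of dynasty classes (5) and the optimizer settings are illustrative assumptions, not the authors' exact recipe.

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.inception_v3(weights=models.Inception_V3_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False                     # freeze the pretrained layers
model.fc = nn.Linear(model.fc.in_features, 5)   # new head; 5 dynasty classes
model.aux_logits = False                        # skip the auxiliary branch
model.AuxLogits = None

# Only the new classifier is updated; Inception-v3 expects 299x299 inputs.
optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.01)
```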


2021, Vol. 13 (3), pp. 335
Author(s): Yuhao Qing, Wenyi Liu

In recent years, image classification of hyperspectral imagery with deep learning algorithms has attained good results. Spurred by this, and to further improve deep learning classification accuracy, we propose a multi-scale residual convolutional neural network fused with an efficient channel attention network (MRA-NET) for hyperspectral image classification. The proposed technique comprises a multi-stage architecture: first, the spectral information of the hyperspectral image is reduced to a two-dimensional tensor using principal component analysis (PCA); then, the resulting low-dimensional image is input to the proposed MRA-NET deep network, which exploits its core components, a multi-scale residual structure and channel attention mechanisms. We evaluate the performance of MRA-NET on three publicly available hyperspectral datasets, where its overall classification accuracy is 99.82%, 99.81%, and 99.37%, respectively, higher than that of current networks such as the 3D convolutional neural network (CNN), the three-dimensional residual convolution structure (RES-3D-CNN), and the space–spectrum joint deep network (SSRN).
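
A hedged sketch of the two named preprocessing/attention ingredients, PCA spectral reduction and an ECA-style channel attention block; the component count, cube size, and kernel size are illustrative, not the paper's configuration.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.decomposition import PCA

class ECA(nn.Module):
    """Efficient Channel Attention: per-channel weights from a small 1D conv."""
    def __init__(self, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2,
                              bias=False)

    def forward(self, x):                          # x: (B, C, H, W)
        w = x.mean(dim=(2, 3))                     # global average pool -> (B, C)
        w = self.conv(w.unsqueeze(1)).squeeze(1)   # local cross-channel mixing
        return x * torch.sigmoid(w)[:, :, None, None]

cube = np.random.rand(145, 145, 200)               # toy hyperspectral cube
flat = cube.reshape(-1, cube.shape[-1])            # (pixels, bands)
reduced = PCA(n_components=30).fit_transform(flat).reshape(145, 145, 30)
x = torch.from_numpy(reduced).float().permute(2, 0, 1).unsqueeze(0)
print(ECA()(x).shape)                              # torch.Size([1, 30, 145, 145])
```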


2018, Vol. 28 (4), pp. 735-744
Author(s): Michał Koziarski, Bogusław Cyganek

Abstract: Due to the advances made in recent years, methods based on deep neural networks have achieved state-of-the-art performance in various computer vision problems. In some tasks, such as image recognition, neural approaches have even surpassed human performance. However, the benchmarks on which neural networks achieve these impressive results usually consist of fairly high-quality data. In practical applications, on the other hand, we are often faced with images of low quality, affected by factors such as low resolution, the presence of noise, or a small dynamic range, and it is unclear how resilient deep neural networks are to such factors. In this paper we experimentally evaluate the impact of low resolution on the classification accuracy of several notable neural architectures of recent years. Furthermore, we examine the possibility of improving performance on low-resolution image recognition by applying super-resolution prior to classification. The results of our experiments indicate that contemporary neural architectures remain significantly affected by low image resolution. Applying super-resolution prior to classification alleviated this issue to a large extent, as long as the resolution of the images did not decrease too severely; for very low-resolution images, however, classification accuracy remained considerably affected.
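
A minimal sketch of the evaluation pipeline the abstract describes: degrade an image to low resolution, restore it, then classify. Here plain bicubic upsampling stands in for a learned super-resolution model, and the classifier choice and scale factor are assumptions.

```python
import torch
import torch.nn.functional as F
from torchvision import models

classifier = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1).eval()

def classify_low_res(img, scale=4, sr_model=None):
    """img: (B, 3, 224, 224). Downsample by `scale`, restore, classify."""
    low = F.interpolate(img, scale_factor=1 / scale, mode='bicubic',
                        align_corners=False)
    if sr_model is not None:
        restored = sr_model(low)               # learned super-resolution
    else:
        restored = F.interpolate(low, size=img.shape[-2:], mode='bicubic',
                                 align_corners=False)  # naive upsampling
    with torch.no_grad():
        return classifier(restored).argmax(dim=1)

print(classify_low_res(torch.randn(2, 3, 224, 224)))
```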

