scholarly journals An Efficient Technique for Size Reduction of Convolutional Neural Networks after Transfer Learning for Scene Recognition Tasks

2018 ◽  
Vol 23 (2) ◽  
pp. 141-149 ◽  
Author(s):  
Vadim Romanuke

Abstract A complex classification task as scene recognition is considered in the present research. Scene recognition tasks are successfully solved by the paradigm of transfer learning from pretrained convolutional neural networks, but a problem is that the eventual size of the network is huge despite a common scene recognition task has up to a few tens of scene categories. Thus, the goal is to ascertain possibility of a size reduction. The modelling recognition task is a small dataset of 4485 grayscale images broken into 15 image categories. The pretrained network is AlexNet dealing with much simpler image categories whose number is 1000, though. This network has two fully connected layers, which can be potentially reduced or deleted. A regular transfer learning network occupies about 202.6 MB performing at up to 92 % accuracy rate for the scene recognition. It is revealed that deleting the layers is not reasonable. The network size is reduced by setting a fewer number of filters in the 17th and 20th layers of the AlexNet-based networks using a dichotomy principle or similar. The best truncated network with 384 and 192 filters in those layers performs at 93.3 % accuracy rate, and its size is 21.63 MB.

Author(s):  
Chantana Chantrapornchai ◽  
Samrid Duangkaew

Several kinds of pretrained convolutional neural networks (CNN) exist nowadays. Utilizing these networks with the new classification task requires the retraining with new data sets. With the small embedded device, large network cannot be implemented. The authors study the use of pretrained models and customizing them towards accuracy and size against face recognition tasks. The results show 1) the performance of existing pretrained networks (e.g., AlexNet, GoogLeNet, CaffeNet, SqueezeNet), as well as size, and 2) demonstrate the layers customization towards the model size and accuracy. The studied results show that among the various networks with different data sets, SqueezeNet can achieve the same accuracy (0.99) as others with small size (up to 25 times smaller). Secondly, the two customizations with layer skipping are presented. The experiments show the example of SqueezeNet layer customizing, reducing the network size while keeping the accuracy (i.e., reducing the size by 7% with the slower convergence time). The experiments are measured based on Caffe 0.15.14.


2020 ◽  
Vol 12 (7) ◽  
pp. 1092
Author(s):  
David Browne ◽  
Michael Giering ◽  
Steven Prestwich

Scene classification is an important aspect of image/video understanding and segmentation. However, remote-sensing scene classification is a challenging image recognition task, partly due to the limited training data, which causes deep-learning Convolutional Neural Networks (CNNs) to overfit. Another difficulty is that images often have very different scales and orientation (viewing angle). Yet another is that the resulting networks may be very large, again making them prone to overfitting and unsuitable for deployment on memory- and energy-limited devices. We propose an efficient deep-learning approach to tackle these problems. We use transfer learning to compensate for the lack of data, and data augmentation to tackle varying scale and orientation. To reduce network size, we use a novel unsupervised learning approach based on k-means clustering, applied to all parts of the network: most network reduction methods use computationally expensive supervised learning methods, and apply only to the convolutional or fully connected layers, but not both. In experiments, we set new standards in classification accuracy on four remote-sensing and two scene-recognition image datasets.


2021 ◽  
Vol 2 (3) ◽  
Author(s):  
Gustaf Halvardsson ◽  
Johanna Peterson ◽  
César Soto-Valero ◽  
Benoit Baudry

AbstractThe automatic interpretation of sign languages is a challenging task, as it requires the usage of high-level vision and high-level motion processing systems for providing accurate image perception. In this paper, we use Convolutional Neural Networks (CNNs) and transfer learning to make computers able to interpret signs of the Swedish Sign Language (SSL) hand alphabet. Our model consists of the implementation of a pre-trained InceptionV3 network, and the usage of the mini-batch gradient descent optimization algorithm. We rely on transfer learning during the pre-training of the model and its data. The final accuracy of the model, based on 8 study subjects and 9400 images, is 85%. Our results indicate that the usage of CNNs is a promising approach to interpret sign languages, and transfer learning can be used to achieve high testing accuracy despite using a small training dataset. Furthermore, we describe the implementation details of our model to interpret signs as a user-friendly web application.


Author(s):  
Naoki Matsumura ◽  
Yasuaki Ito ◽  
Koji Nakano ◽  
Akihiko Kasagi ◽  
Tsuguchika Tabaru

Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2005
Author(s):  
Veronika Scholz ◽  
Peter Winkler ◽  
Andreas Hornig ◽  
Maik Gude ◽  
Angelos Filippatos

Damage identification of composite structures is a major ongoing challenge for a secure operational life-cycle due to the complex, gradual damage behaviour of composite materials. Especially for composite rotors in aero-engines and wind-turbines, a cost-intensive maintenance service has to be performed in order to avoid critical failure. A major advantage of composite structures is that they are able to safely operate after damage initiation and under ongoing damage propagation. Therefore, a robust, efficient diagnostic damage identification method would allow monitoring the damage process with intervention occurring only when necessary. This study investigates the structural vibration response of composite rotors by applying machine learning methods and the ability to identify, localise and quantify the present damage. To this end, multiple fully connected neural networks and convolutional neural networks were trained on vibration response spectra from damaged composite rotors with barely visible damage, mostly matrix cracks and local delaminations using dimensionality reduction and data augmentation. A databank containing 720 simulated test cases with different damage states is used as a basis for the generation of multiple data sets. The trained models are tested using k-fold cross validation and they are evaluated based on the sensitivity, specificity and accuracy. Convolutional neural networks perform slightly better providing a performance accuracy of up to 99.3% for the damage localisation and quantification.


Animals ◽  
2021 ◽  
Vol 11 (5) ◽  
pp. 1263
Author(s):  
Zhaojun Wang ◽  
Jiangning Wang ◽  
Congtian Lin ◽  
Yan Han ◽  
Zhaosheng Wang ◽  
...  

With the rapid development of digital technology, bird images have become an important part of ornithology research data. However, due to the rapid growth of bird image data, it has become a major challenge to effectively process such a large amount of data. In recent years, deep convolutional neural networks (DCNNs) have shown great potential and effectiveness in a variety of tasks regarding the automatic processing of bird images. However, no research has been conducted on the recognition of habitat elements in bird images, which is of great help when extracting habitat information from bird images. Here, we demonstrate the recognition of habitat elements using four DCNN models trained end-to-end directly based on images. To carry out this research, an image database called Habitat Elements of Bird Images (HEOBs-10) and composed of 10 categories of habitat elements was built, making future benchmarks and evaluations possible. Experiments showed that good results can be obtained by all the tested models. ResNet-152-based models yielded the best test accuracy rate (95.52%); the AlexNet-based model yielded the lowest test accuracy rate (89.48%). We conclude that DCNNs could be efficient and useful for automatically identifying habitat elements from bird images, and we believe that the practical application of this technology will be helpful for studying the relationships between birds and habitat elements.


Sign in / Sign up

Export Citation Format

Share Document