Efficiency Optimization of Capsule Network Model Based on Vector Element

Author(s):  
Lijuan Zhou ◽  
Kai Feng ◽  
Hui Li ◽  
Shudong Zhang ◽  
Xiangyang Huang

Currently, deep learning and convolutional neural networks (CNNs) are widely used in many fields and have generated great value, especially in image recognition. However, CNNs still show deficiencies on certain image recognition problems: they do not recognize objects well when the objects are seen from different angles or overlap one another, and they can be very sensitive to slight perturbations, so that modifying a single pixel of an image may cause a recognition error. The capsule network (CapsNet) proposed by Geoffrey Hinton and his colleagues can address these problems of traditional convolutional networks. Because CapsNet was proposed only recently, its model structure is relatively simple and many aspects remain open for improvement. This paper optimizes CapsNet in two respects, optimizing the routing mechanism and adding a Dropout operation, and carries out experiments and analysis of the results for these optimizations.
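
As background for the routing optimization mentioned above, the following is a minimal NumPy sketch of the baseline dynamic routing-by-agreement from Sabour, Frosst, and Hinton (2017), which is the mechanism such optimizations modify; the shapes, variable names, and iteration count are illustrative assumptions, not the authors' implementation.

```python
# Baseline routing-by-agreement sketch (Sabour et al., 2017); illustrative only.
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Non-linearity that keeps a vector's orientation but squashes its length into [0, 1)."""
    norm_sq = np.sum(s ** 2, axis=axis, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * s / np.sqrt(norm_sq + eps)

def dynamic_routing(u_hat, n_iters=3):
    """u_hat: prediction vectors of shape (num_in_caps, num_out_caps, out_dim)."""
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))                              # routing logits
    for _ in range(n_iters):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)     # coupling coefficients
        s = (c[..., None] * u_hat).sum(axis=0)                   # weighted sum per output capsule
        v = squash(s)                                            # output capsule vectors
        b += (u_hat * v[None, ...]).sum(axis=-1)                 # agreement update
    return v

# toy usage: 1152 primary capsules routed to 10 output capsules of dimension 16
v = dynamic_routing(np.random.randn(1152, 10, 16) * 0.01)
print(v.shape)  # (10, 16)
```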

2020 ◽  
Vol 31 (1) ◽  
pp. 9-17

Recently, deep learning has been widely applied to speech and image recognition. The convolutional neural network (CNN) is one of the main approaches for image classification and achieves very high accuracy. In the field of Android malware classification, many works have tried to convert Android malware samples into "images" so that they match the CNN input and can take advantage of the CNN model. The performance, however, is not significantly improved, because simply converting malware into images may discard several important features of the malware. This paper proposes a method for improving the feature set for Android malware classification based on a co-occurrence matrix (co-matrix). The co-matrix is built from a list of raw features extracted from .apk files. The proposed feature can take advantage of the CNN while retaining important features of the Android malware. Experimental results of a CNN model on a very popular Android malware dataset, Drebin, demonstrate the feasibility of the proposed co-matrix feature.
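
The sketch below illustrates the general idea of a co-occurrence matrix built from per-app raw features and used as a grayscale "image" for a CNN; it is an assumption for illustration, not the authors' implementation, and the feature vocabulary shown is made up.

```python
# Illustrative co-matrix construction: raw features extracted from an .apk
# (permissions, API calls, ...) become a normalized co-occurrence "image".
import numpy as np

def build_co_matrix(sample_features, vocabulary):
    """sample_features: list of raw feature strings for one app.
    vocabulary: fixed, ordered list of all features considered (assumed given)."""
    index = {f: i for i, f in enumerate(vocabulary)}
    n = len(vocabulary)
    m = np.zeros((n, n), dtype=np.float32)
    present = [index[f] for f in set(sample_features) if f in index]
    # every pair of features present in the same app co-occurs once
    for i in present:
        for j in present:
            m[i, j] += 1.0
    return m / max(m.max(), 1.0)          # scale to [0, 1] like pixel intensities

vocab = ["SEND_SMS", "READ_CONTACTS", "INTERNET", "getDeviceId", "sendTextMessage"]
img = build_co_matrix(["INTERNET", "SEND_SMS", "sendTextMessage"], vocab)
print(img.shape)                           # (5, 5) "image" ready for a CNN input
```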


Electronics ◽  
2021 ◽  
Vol 10 (1) ◽  
pp. 81
Author(s):  
Jianbin Xiong ◽  
Dezheng Yu ◽  
Shuangyin Liu ◽  
Lei Shu ◽  
Xiaochan Wang ◽  
...  

Plant phenotypic image recognition (PPIR) is an important branch of smart agriculture. In recent years, deep learning has achieved significant breakthroughs in image recognition. Consequently, PPIR technology that is based on deep learning is becoming increasingly popular. First, this paper introduces the development and application of PPIR technology, followed by its classification and analysis. Second, it presents the theory of four types of deep learning methods and their applications in PPIR. These methods include the convolutional neural network, deep belief network, recurrent neural network, and stacked autoencoder, and they are applied to identify plant species, diagnose plant diseases, etc. Finally, the difficulties and challenges of deep learning in PPIR are discussed.


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2852
Author(s):  
Parvathaneni Naga Srinivasu ◽  
Jalluri Gnana SivaSai ◽  
Muhammad Fazal Ijaz ◽  
Akash Kumar Bhoi ◽  
Wonjoon Kim ◽  
...  

Deep learning models are efficient at learning the features needed to understand complex patterns precisely. This study proposes a computerized process for classifying skin disease using a deep learning pipeline based on MobileNet V2 and Long Short-Term Memory (LSTM). The MobileNet V2 model proves to be efficient, achieving good accuracy while remaining able to run on lightweight computational devices, and the proposed model maintains stateful information for precise predictions. A grey-level co-occurrence matrix is used to assess the progress of diseased growth. The performance is compared against other state-of-the-art models, including fine-tuned neural networks (FTNN), a convolutional neural network (CNN), the Very Deep Convolutional Networks for Large-Scale Image Recognition developed by the Visual Geometry Group (VGG), and a convolutional neural network architecture extended with a few changes. On the HAM10000 dataset, the proposed method outperforms the other methods with more than 85% accuracy. It is also robust in recognizing the affected region much faster, requiring almost 2× fewer computations than the conventional MobileNet model and therefore minimal computational effort. Furthermore, a mobile application is designed for instant and proper action: it helps patients and dermatologists identify the type of disease from an image of the affected region at the initial stage of the skin disease. These findings suggest that the proposed system can help general practitioners diagnose skin conditions efficiently and effectively, thereby reducing further complications and morbidity.
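
The following PyTorch sketch shows one plausible way to couple a MobileNet V2 feature extractor with an LSTM head for lesion classification; the authors' exact architecture, input size, and hyper-parameters are not given in the abstract, so everything here should be read as an assumption.

```python
# Minimal MobileNet V2 + LSTM classification sketch (assumed architecture).
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2

class MobileNetV2LSTM(nn.Module):
    def __init__(self, num_classes=7, hidden=256):           # HAM10000 has 7 lesion classes
        super().__init__()
        self.backbone = mobilenet_v2(weights=None).features   # (N, 1280, 7, 7) for 224x224 input
        self.lstm = nn.LSTM(input_size=1280, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x):
        f = self.backbone(x)                                   # (N, 1280, 7, 7)
        seq = f.flatten(2).permute(0, 2, 1)                    # (N, 49, 1280): spatial positions as a sequence
        _, (h, _) = self.lstm(seq)                             # keep the final hidden state
        return self.head(h[-1])                                # (N, num_classes)

model = MobileNetV2LSTM()
logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)                                            # torch.Size([2, 7])
```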


2019 ◽  
Vol 11 (9) ◽  
pp. 1051 ◽  
Author(s):  
Guangming Wu ◽  
Yimin Guo ◽  
Xiaoya Song ◽  
Zhiling Guo ◽  
Haoran Zhang ◽  
...  

Applying deep-learning methods, especially fully convolutional networks (FCNs), has become a popular option for land-cover classification and segmentation in remote sensing. Compared with traditional solutions, these approaches have shown promising generalization capabilities and precision levels on various datasets of different scales, resolutions, and imaging conditions. To achieve superior performance, much research has focused on constructing more complex or deeper networks, but using an ensemble of different fully convolutional models to achieve better generalization and prevent overfitting has long been ignored. In this research, we design four stacked fully convolutional networks (SFCNs) and a feature alignment framework for multi-label land-cover segmentation. The proposed feature alignment framework introduces an alignment loss on the features extracted by the basic models to balance their similarity and variety. Experiments on a very high resolution (VHR) image dataset with six land-cover categories indicate that the proposed SFCNs achieve better performance than existing deep learning methods. In the second variant of the SFCN, the optimal feature alignment yields gains of 4.2% (0.772 vs. 0.741), 6.8% (0.629 vs. 0.589), and 5.5% (0.727 vs. 0.689) in F1-score, Jaccard index, and kappa coefficient, respectively.
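
As a rough illustration of what an alignment term over ensemble features could look like, the sketch below penalizes pairwise cosine dissimilarity between the basic models' feature maps, with a coefficient that can pull features together (similarity) or push them apart (variety). This is an assumed formulation for illustration only; the paper's actual alignment loss is not reproduced here.

```python
# Hedged sketch of a feature-alignment term for an ensemble of FCNs.
import torch
import torch.nn.functional as F

def alignment_loss(feature_maps, weight=1.0):
    """feature_maps: list of tensors of shape (N, C, H, W), one per basic FCN.
    weight > 0 encourages similar features, weight < 0 encourages variety."""
    flat = [F.normalize(f.flatten(1), dim=1) for f in feature_maps]
    loss, pairs = 0.0, 0
    for i in range(len(flat)):
        for j in range(i + 1, len(flat)):
            loss = loss + (1.0 - (flat[i] * flat[j]).sum(dim=1)).mean()  # 1 - cosine similarity
            pairs += 1
    return weight * loss / max(pairs, 1)

# toy usage with three "basic model" feature maps
feats = [torch.randn(2, 64, 32, 32) for _ in range(3)]
print(alignment_loss(feats).item())
```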


2019 ◽  
Vol 11 (6) ◽  
pp. 684 ◽  
Author(s):  
Maria Papadomanolaki ◽  
Maria Vakalopoulou ◽  
Konstantinos Karantzalos

Deep learning architectures have received much attention in recent years, demonstrating state-of-the-art performance in several segmentation, classification, and other computer vision tasks. Most of these deep networks are based on either convolutional or fully convolutional architectures. In this paper, we propose a novel object-based deep-learning framework for semantic segmentation in very high-resolution satellite data. In particular, we exploit object-based priors integrated into a fully convolutional neural network by incorporating an anisotropic diffusion data preprocessing step and an additional loss term during the training process. Under this constrained framework, the goal is to enforce that pixels belonging to the same object are classified into the same semantic category. We thoroughly compared the novel object-based framework with the currently dominating convolutional and fully convolutional deep networks. In particular, numerous experiments were conducted on the publicly available ISPRS WGII/4 benchmark datasets, namely Vaihingen and Potsdam, for validation and inter-comparison based on a variety of metrics. Quantitatively, experimental results indicate that, overall, the proposed object-based framework slightly outperformed the current state-of-the-art fully convolutional networks, by more than 1% in terms of overall accuracy, while intersection-over-union results improved for all semantic categories. Qualitatively, man-made classes with stricter geometry, such as buildings, benefited most from our method, especially along object boundaries, highlighting the great potential of the developed approach.
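
To make the object-based constraint concrete, here is a hedged sketch of one possible consistency term: pixels sharing the same object id in a precomputed segment map are pushed toward the mean class distribution of that object. The formulation, names, and shapes are assumptions for illustration, not the paper's exact loss.

```python
# Assumed object-consistency loss sketch for object-based semantic segmentation.
import torch
import torch.nn.functional as F

def object_consistency_loss(logits, object_ids):
    """logits: (N, C, H, W) network outputs; object_ids: (N, H, W) integer segment labels."""
    probs = F.softmax(logits, dim=1)
    loss, count = 0.0, 0
    for n in range(probs.shape[0]):
        p = probs[n].flatten(1).t()              # (H*W, C) per-pixel class distributions
        ids = object_ids[n].flatten()            # (H*W,) object id of each pixel
        for obj in ids.unique():
            member = p[ids == obj]               # distributions of pixels in this object
            loss = loss + (member - member.mean(dim=0, keepdim=True)).pow(2).mean()
            count += 1
    return loss / max(count, 1)

# toy usage: 2 images, 6 classes (as in the ISPRS benchmarks), 4 objects per image
logits = torch.randn(2, 6, 16, 16)
objects = torch.randint(0, 4, (2, 16, 16))
print(object_consistency_loss(logits, objects).item())
```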


2020 ◽  
Vol 37 (9) ◽  
pp. 1661-1668
Author(s):  
Min Wang ◽  
Shudao Zhou ◽  
Zhong Yang ◽  
Zhanhua Liu

Conventional classification methods rely on features extracted by hand on the basis of human experience, and each step is carried out independently, a kind of "shallow learning." As a result, the range of cloud categories such methods can handle is limited. In this paper, we propose a new convolutional neural network (CNN) with deep learning ability, called CloudA, for ground-based cloud image recognition. We use the Singapore Whole-Sky Imaging Categories (SWIMCAT) sample library and a total-sky sample library to train and test CloudA. In particular, we visualize the cloud features captured by CloudA using TensorBoard, and these features help us understand the process of ground-based cloud classification. We compare this method with other commonly used methods to explore the feasibility of using CloudA to classify ground-based cloud images, and evaluation over a large number of experiments shows that its average accuracy for ground-based cloud classification is nearly 98.63%.
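
The sketch below shows how intermediate CNN feature maps can be written to TensorBoard for inspection, in the spirit of the visualization described above; the small network, input size, and class count are assumptions for illustration, not the CloudA architecture.

```python
# Logging CNN feature maps to TensorBoard (assumed toy network, not CloudA).
import torch
import torch.nn as nn
import torchvision
from torch.utils.tensorboard import SummaryWriter

class SmallCloudCNN(nn.Module):
    def __init__(self, num_classes=5):                      # SWIMCAT covers five sky/cloud categories
        super().__init__()
        self.conv1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.conv2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.fc = nn.Linear(32 * 32 * 32, num_classes)       # assumes 128x128 input images

    def forward(self, x):
        f1 = self.conv1(x)
        f2 = self.conv2(f1)
        return self.fc(f2.flatten(1)), f1                    # return first-layer features for logging

writer = SummaryWriter("runs/clouda_demo")
model = SmallCloudCNN()
logits, f1 = model(torch.randn(1, 3, 128, 128))
# write the first-layer feature maps as an image grid, one channel per tile
grid = torchvision.utils.make_grid(f1[0].unsqueeze(1), normalize=True)
writer.add_image("conv1_features", grid, global_step=0)
writer.close()
```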


2022 ◽  
Vol 12 ◽  
Author(s):  
Wei Lu ◽  
Rongting Du ◽  
Pengshuai Niu ◽  
Guangnan Xing ◽  
Hui Luo ◽  
...  

Soybean yield is a highly complex trait determined by multiple factors such as genotype, environment, and their interactions, and the earlier in the growing season it can be predicted, the better. Accurate soybean yield prediction is important for germplasm innovation and for improving planting environment factors. Until now, however, soybean yield has been determined by manually weighing the plants after harvest, which is time-consuming, costly, and imprecise. This paper proposes an in-field soybean yield prediction method based on image recognition of bean pods and leaves, using a deep learning algorithm combined with a generalized regression neural network (GRNN). A Faster Region-based Convolutional Neural Network (Faster R-CNN), a Feature Pyramid Network (FPN), a Single Shot MultiBox Detector (SSD), and You Only Look Once (YOLOv3) were evaluated for bean pod recognition, achieving recognition precisions of 86.2%, 89.8%, 80.1%, and 87.4% at speeds of 13, 7, 24, and 39 frames per second (FPS), respectively; YOLOv3 was selected as the best trade-off between recognition precision and speed. To further enhance detection performance, YOLOv3 was improved by changing the IoU loss function, applying an anchor-box clustering algorithm, and using a partial network structure, which raised the recognition precision to 90.3%. To improve yield prediction precision, leaves were identified and counted, and pods were further classified by the improved YOLOv3 into one-, two-, three-, four-, and five-seed types, since seed weight differs among pod types. Soybean seed number prediction models for each planter were then built using PLSR, BP, and GRNN, with the counts of the different pod types and the leaf counts as inputs, giving prediction accuracies of 96.24%, 96.97%, and 97.5%, respectively. Finally, the soybean yield of each planter was obtained by accumulating the weight of all soybean pod types, with an average accuracy of up to 97.43%. The results show that it is feasible to predict the yield of soybean plants in situ with high precision by fusing the numbers of leaves and of the different pod types recognized by a deep neural network combined with a GRNN, which can speed up germplasm innovation and the optimization of planting environment factors.
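
As a sketch of the final regression stage only, the classic GRNN below (Specht-style kernel-weighted averaging) maps per-plant counts of the five pod types plus the leaf count to a seed number. The detection stage (improved YOLOv3) is not shown, and all data values are made-up placeholders rather than the paper's measurements.

```python
# Generalized regression neural network (GRNN) sketch for seed-number prediction.
import numpy as np

class GRNN:
    """Classic GRNN: a Gaussian-kernel-weighted average of training targets."""
    def __init__(self, sigma=1.0):
        self.sigma = sigma

    def fit(self, X, y):
        self.X, self.y = np.asarray(X, float), np.asarray(y, float)
        return self

    def predict(self, X):
        X = np.asarray(X, float)
        d2 = ((X[:, None, :] - self.X[None, :, :]) ** 2).sum(-1)  # squared distances to training samples
        w = np.exp(-d2 / (2 * self.sigma ** 2))
        return (w @ self.y) / w.sum(axis=1)

# features per plant: [one-, two-, three-, four-, five-seed pod counts, leaf count]
X_train = [[12, 30, 25, 5, 1, 80], [8, 22, 18, 3, 0, 60], [15, 35, 30, 7, 2, 95]]
y_train = [210, 150, 265]                                          # seed numbers (placeholders)
model = GRNN(sigma=10.0).fit(X_train, y_train)
print(model.predict([[10, 28, 22, 4, 1, 75]]))
```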


2021 ◽  
Vol 290 ◽  
pp. 02020
Author(s):  
Boyu Zhang ◽  
Xiao Wang ◽  
Shudong Li ◽  
Jinghua Yang

Current underwater shipwreck side scan sonar samples are few and difficult to label, and with such small sample sizes the image recognition accuracy of a convolutional neural network model is low. In this study, we propose an image recognition method for shipwreck side scan sonar that combines transfer learning with deep learning. In the non-transfer setting, the shipwreck sonar samples were used to train the network directly, and the results were saved as the control group. Then weakly correlated data were used to train the network, the network parameters were transferred to a new network, and the shipwreck sonar data were used for further training; these steps were repeated with strongly correlated data. Experiments were carried out on LeNet-5, AlexNet, GoogLeNet, ResNet, and VGG networks. Without transfer learning, the highest accuracy was obtained with the ResNet network (86.27%). Using weakly correlated data for transfer training, the highest accuracy was obtained with the VGG network (92.16%), and using strongly correlated data, the highest accuracy was again obtained with VGG (98.04%). For all network architectures, transfer learning improved the correct recognition rate of the convolutional neural network models. The experiments show that transfer learning combined with deep learning improves the accuracy and generalization of convolutional neural networks when sample sizes are small.
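
The sketch below illustrates the transfer step only: parameters learned on a correlated dataset are copied into a fresh VGG-style network, the early convolutional layers are frozen, and the classifier is re-trained on the small sonar set. The file path, class count, and freezing policy are assumptions, not the authors' settings.

```python
# Transfer-learning sketch with a VGG16 backbone (assumed setup, not the paper's code).
import torch
import torch.nn as nn
from torchvision.models import vgg16

# 1) network previously trained on correlated data (weights loaded from disk; path assumed)
source = vgg16(weights=None)
# source.load_state_dict(torch.load("vgg16_correlated_data.pth"))  # hypothetical checkpoint

# 2) new network for shipwreck sonar recognition, initialised with the transferred weights
target = vgg16(weights=None)
target.load_state_dict(source.state_dict())

# 3) freeze the generic early feature layers, replace and train the task-specific head
for p in target.features.parameters():
    p.requires_grad = False
num_sonar_classes = 2                                     # e.g. shipwreck vs. background (assumed)
target.classifier[6] = nn.Linear(4096, num_sonar_classes)

optimizer = torch.optim.SGD(
    filter(lambda p: p.requires_grad, target.parameters()), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()
logits = target(torch.randn(2, 3, 224, 224))              # stand-in for a sonar image batch
loss = criterion(logits, torch.tensor([0, 1]))
loss.backward()
optimizer.step()
print(loss.item())
```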


2021 ◽  
Vol 2066 (1) ◽  
pp. 012009
Author(s):  
Shuqiang Du

Selecting and extracting image features by hand requires considerable and complicated work, which hinders the recognition and extraction of important features. Deep learning and neural networks represent an iterative advance in intelligent computing and have brought significant results to image recognition. On this basis, this paper first introduces the concept and model of the neural network, then studies the use of deep learning neural networks in image recognition, and finally analyses an image recognition system based on deep learning neural networks.


A convolutional neural network (CNN) is a deep neural network that plays an important role in image recognition; it recognizes images in a way loosely analogous to the human visual cortex. In this work, an accelerator is used for highly efficient convolutional computation. The main aim of the accelerator is to avoid ineffectual computations and to improve performance and energy efficiency during image recognition without any loss of accuracy, while its throughput is improved by adding only a max-pooling function. Because the CNN involves many inputs and intermediate weights in its convolutional computation, its computational complexity grows enormously; hence, to reduce this complexity, a CNN accelerator is proposed in this paper. The accelerator design is simulated and synthesized with the Cadence RTL Compiler tool using a 90 nm technology library.
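
The following software sketch illustrates the general idea behind avoiding ineffectual computations, namely spending multiply-accumulate work only on non-zero activations (which are common after ReLU); it is a Python analogue for illustration under that assumption, not the authors' RTL design.

```python
# Zero-skipping convolution sketch: only non-zero activations generate MAC work.
import numpy as np

def conv2d_skip_zeros(activations, kernel):
    """Naive 'valid' convolution that skips multiplications with zero activations."""
    H, W = activations.shape
    k, _ = kernel.shape
    out = np.zeros((H - k + 1, W - k + 1))
    nz = np.argwhere(activations != 0)                      # only iterate over effectual operands
    for (r, c) in nz:
        a = activations[r, c]
        for i in range(max(0, r - H + k), min(k, r + 1)):
            for j in range(max(0, c - W + k), min(k, c + 1)):
                out[r - i, c - j] += a * kernel[i, j]       # scatter the contribution
    return out

acts = np.maximum(np.random.randn(8, 8), 0)                 # ReLU output: roughly half the entries are zero
kern = np.random.randn(3, 3)
ref = np.array([[(acts[i:i + 3, j:j + 3] * kern).sum() for j in range(6)] for i in range(6)])
print(np.allclose(conv2d_skip_zeros(acts, kern), ref))      # True: same result, fewer multiplications
```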

