scholarly journals A Novel Progressive Image Classification Method Based on Hierarchical Convolutional Neural Networks

Electronics ◽  
2021 ◽  
Vol 10 (24) ◽  
pp. 3183
Author(s):  
Cheng Li ◽  
Fei Miao ◽  
Gang Gao

Deep Neural Networks (DNNs) are commonly used methods in computational intelligence. Most prevalent DNN-based image classification methods are dedicated to promoting the performance by designing complicated network architectures and requiring large amounts of model parameters. These large-scale DNN-based models are performed on all images consistently. However, since there are meaningful differences between images, it is difficult to accurately classify all images by a consistent network architecture. For example, a deeper network is fit for the images that are difficult to be distinguished, but may lead to model overfitting for simple images. Therefore, we should selectively use different models to deal with different images, which is similar to the human cognition mechanism, in which different levels of neurons are activated according to the difficulty of object recognition. To this end, we propose a Hierarchical Convolutional Neural Network (HCNN) for image classification in this paper. HCNNs comprise multiple sub-networks, which can be viewed as different levels of neurons in humans, and these sub-networks are used to classify the images progressively. Specifically, we first initialize the weight of each image and each image category, and these images and initial weights are used for training the first sub-network. Then, according to the predicted results of the first sub-network, the weights of misclassified images are increased, while the weights of correctly classified images are decreased. Furthermore, the images with the updated weights are used for training the next sub-networks. Similar operations are performed on all sub-networks. In the test stage, each image passes through the sub-networks in turn. If the prediction confidences in a sub-network are higher than a given threshold, then the results are output directly. Otherwise, deeper visual features need to be learned successively by the subsequent sub-networks until a reliable image classification result is obtained or the last sub-network is reached. Experimental results show that HCNNs can obtain better results than classical CNNs and the existing models based on ensemble learning. HCNNs have 2.68% higher accuracy than Residual Network 50 (Resnet50) on the ultrasonic image dataset, 1.19% than Resnet50 on the chimpanzee facial image dataset, and 10.86% than Adaboost-CNN on the CIFAR-10 dataset. Furthermore, the HCNN is extensible, since the types of sub-networks and their combinations can be dynamically adjusted.


2021 ◽  
Vol 40 (3) ◽  
pp. 1-13
Author(s):  
Lumin Yang ◽  
Jiajie Zhuang ◽  
Hongbo Fu ◽  
Xiangzhi Wei ◽  
Kun Zhou ◽  
...  

We introduce SketchGNN , a convolutional graph neural network for semantic segmentation and labeling of freehand vector sketches. We treat an input stroke-based sketch as a graph with nodes representing the sampled points along input strokes and edges encoding the stroke structure information. To predict the per-node labels, our SketchGNN uses graph convolution and a static-dynamic branching network architecture to extract the features at three levels, i.e., point-level, stroke-level, and sketch-level. SketchGNN significantly improves the accuracy of the state-of-the-art methods for semantic sketch segmentation (by 11.2% in the pixel-based metric and 18.2% in the component-based metric over a large-scale challenging SPG dataset) and has magnitudes fewer parameters than both image-based and sequence-based methods.



Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2852
Author(s):  
Parvathaneni Naga Srinivasu ◽  
Jalluri Gnana SivaSai ◽  
Muhammad Fazal Ijaz ◽  
Akash Kumar Bhoi ◽  
Wonjoon Kim ◽  
...  

Deep learning models are efficient in learning the features that assist in understanding complex patterns precisely. This study proposed a computerized process of classifying skin disease through deep learning based MobileNet V2 and Long Short Term Memory (LSTM). The MobileNet V2 model proved to be efficient with a better accuracy that can work on lightweight computational devices. The proposed model is efficient in maintaining stateful information for precise predictions. A grey-level co-occurrence matrix is used for assessing the progress of diseased growth. The performance has been compared against other state-of-the-art models such as Fine-Tuned Neural Networks (FTNN), Convolutional Neural Network (CNN), Very Deep Convolutional Networks for Large-Scale Image Recognition developed by Visual Geometry Group (VGG), and convolutional neural network architecture that expanded with few changes. The HAM10000 dataset is used and the proposed method has outperformed other methods with more than 85% accuracy. Its robustness in recognizing the affected region much faster with almost 2× lesser computations than the conventional MobileNet model results in minimal computational efforts. Furthermore, a mobile application is designed for instant and proper action. It helps the patient and dermatologists identify the type of disease from the affected region’s image at the initial stage of the skin disease. These findings suggest that the proposed system can help general practitioners efficiently and effectively diagnose skin conditions, thereby reducing further complications and morbidity.



2020 ◽  
Author(s):  
B Wang ◽  
Y Sun ◽  
Bing Xue ◽  
Mengjie Zhang

© 2019, Springer Nature Switzerland AG. Image classification is a difficult machine learning task, where Convolutional Neural Networks (CNNs) have been applied for over 20 years in order to solve the problem. In recent years, instead of the traditional way of only connecting the current layer with its next layer, shortcut connections have been proposed to connect the current layer with its forward layers apart from its next layer, which has been proved to be able to facilitate the training process of deep CNNs. However, there are various ways to build the shortcut connections, it is hard to manually design the best shortcut connections when solving a particular problem, especially given the design of the network architecture is already very challenging. In this paper, a hybrid evolutionary computation (EC) method is proposed to automatically evolve both the architecture of deep CNNs and the shortcut connections. Three major contributions of this work are: Firstly, a new encoding strategy is proposed to encode a CNN, where the architecture and the shortcut connections are encoded separately; Secondly, a hybrid two-level EC method, which combines particle swarm optimisation and genetic algorithms, is developed to search for the optimal CNNs; Lastly, an adjustable learning rate is introduced for the fitness evaluations, which provides a better learning rate for the training process given a fixed number of epochs. The proposed algorithm is evaluated on three widely used benchmark datasets of image classification and compared with 12 peer Non-EC based competitors and one EC based competitor. The experimental results demonstrate that the proposed method outperforms all of the peer competitors in terms of classification accuracy.





Author(s):  
Yunpeng Chen ◽  
Xiaojie Jin ◽  
Bingyi Kang ◽  
Jiashi Feng ◽  
Shuicheng Yan

The residual unit and its variations are wildly used in building very deep neural networks for alleviating optimization difficulty. In this work, we revisit the standard residual function as well as its several successful variants and propose a unified framework based on tensor Block Term Decomposition (BTD) to explain these apparently different residual functions from the tensor decomposition view. With the BTD framework, we further propose a novel basic network architecture, named the Collective Residual Unit (CRU). CRU further enhances parameter efficiency of deep residual neural networks by sharing core factors derived from collective tensor factorization over the involved residual units. It enables efficient knowledge sharing across multiple residual units, reduces the number of model parameters, lowers the risk of over-fitting, and provides better generalization ability. Extensive experimental results show that our proposed CRU network brings outstanding parameter efficiency -- it achieves comparable classification performance with ResNet-200 while using a model size as small as ResNet-50 on the ImageNet-1k and Places365-Standard benchmark datasets.



Author(s):  
Shiva Prasad Kasiviswanathan ◽  
Nina Narodytska ◽  
Hongxia Jin

Deep neural networks are powerful learning models that achieve state-of-the-art performance on many computer vision, speech, and language processing tasks. In this paper, we study a fundamental question that arises when designing deep network architectures: Given a target network architecture can we design a `smaller' network architecture that 'approximates' the operation of the target network? The question is, in part, motivated by the challenge of parameter reduction (compression) in modern deep neural networks, as the ever increasing storage and memory requirements of these networks pose a problem in resource constrained environments.In this work, we focus on deep convolutional neural network architectures, and propose a novel randomized tensor sketching technique that we utilize to develop a unified framework for approximating the operation of both the convolutional and fully connected layers. By applying the sketching technique along different tensor dimensions, we design changes to the convolutional and fully connected layers that substantially reduce the number of effective parameters in a network. We show that the resulting smaller network can be trained directly, and has a classification accuracy that is comparable to the original network.



2020 ◽  
Vol 10 (14) ◽  
pp. 4913
Author(s):  
Tin Kramberger ◽  
Božidar Potočnik

Currently there is no publicly available adequate dataset that could be used for training Generative Adversarial Networks (GANs) on car images. All available car datasets differ in noise, pose, and zoom levels. Thus, the objective of this work was to create an improved car image dataset that would be better suited for GAN training. To improve the performance of the GAN, we coupled the LSUN and Stanford car datasets. A new merged dataset was then pruned in order to adjust zoom levels and reduce the noise of images. This process resulted in fewer images that could be used for training, with increased quality though. This pruned dataset was evaluated by training the StyleGAN with original settings. Pruning the combined LSUN and Stanford datasets resulted in 2,067,710 images of cars with less noise and more adjusted zoom levels. The training of the StyleGAN on the LSUN-Stanford car dataset proved to be superior to the training with just the LSUN dataset by 3.7% using the Fréchet Inception Distance (FID) as a metric. Results pointed out that the proposed LSUN-Stanford car dataset is more consistent and better suited for training GAN neural networks than other currently available large car datasets.



2021 ◽  
pp. 1-36
Author(s):  
Vagan Terziyan ◽  
Olena Kaikova

Abstract Machine learning is a good tool to simulate human cognitive skills as it is about mapping perceived information to various labels or action choices, aiming at optimal behavior policies for a human or an artificial agent operating in the environment. Regarding autonomous systems, objects and situations are perceived by some receptors as divided between sensors. Reactions to the input (e.g., actions) are distributed among the particular capability providers or actuators. Cognitive models can be trained as, for example, neural networks. We suggest training such models for cases of potential disabilities. Disability can be either the absence of one or more cognitive sensors or actuators at different levels of cognitive model. We adapt several neural network architectures to simulate various cognitive disabilities. The idea has been triggered by the “coolability” (enhanced capability) paradox, according to which a person with some disability can be more efficient in using other capabilities. Therefore, an autonomous system (human or artificial) pretrained with simulated disabilities will be more efficient when acting in adversarial conditions. We consider these coolabilities as complementary artificial intelligence and argue on the usefulness if this concept for various applications.



2017 ◽  
Vol 115 (2) ◽  
pp. 254-259 ◽  
Author(s):  
Daniël M. Pelt ◽  
James A. Sethian

Deep convolutional neural networks have been successfully applied to many image-processing problems in recent works. Popular network architectures often add additional operations and connections to the standard architecture to enable training deeper networks. To achieve accurate results in practice, a large number of trainable parameters are often required. Here, we introduce a network architecture based on using dilated convolutions to capture features at different image scales and densely connecting all feature maps with each other. The resulting architecture is able to achieve accurate results with relatively few parameters and consists of a single set of operations, making it easier to implement, train, and apply in practice, and automatically adapts to different problems. We compare results of the proposed network architecture with popular existing architectures for several segmentation problems, showing that the proposed architecture is able to achieve accurate results with fewer parameters, with a reduced risk of overfitting the training data.





Sign in / Sign up

Export Citation Format

Share Document