A Novel Progressive Image Classification Method Based on Hierarchical Convolutional Neural Networks

Cheng Li; Fei Miao; Gang Gao

doi:10.3390/electronics10243183

Electronics ◽

10.3390/electronics10243183 ◽

2021 ◽

Vol 10 (24) ◽

pp. 3183

Author(s):

Cheng Li ◽

Fei Miao ◽

Gang Gao

Keyword(s):

Neural Networks ◽

Image Classification ◽

Network Architecture ◽

Large Scale ◽

Human Cognition ◽

Visual Features ◽

Model Parameters ◽

Network Architectures ◽

Image Dataset ◽

Different Levels

Deep Neural Networks (DNNs) are commonly used in computational intelligence. Most prevalent DNN-based image classification methods seek better performance by designing complicated network architectures that require large numbers of model parameters. These large-scale DNN-based models are applied to all images uniformly. However, since images differ meaningfully from one another, it is difficult to classify all of them accurately with a single network architecture. For example, a deeper network suits images that are hard to distinguish, but may overfit on simple images. Therefore, different models should be used selectively for different images, which is similar to the human cognition mechanism, in which different levels of neurons are activated according to the difficulty of object recognition. To this end, we propose a Hierarchical Convolutional Neural Network (HCNN) for image classification in this paper. HCNNs comprise multiple sub-networks, which can be viewed as different levels of neurons in humans, and these sub-networks are used to classify the images progressively. Specifically, we first initialize the weight of each image and each image category, and these images and initial weights are used for training the first sub-network. Then, according to the predicted results of the first sub-network, the weights of misclassified images are increased, while the weights of correctly classified images are decreased. The images with the updated weights are then used for training the next sub-network. Similar operations are performed on all sub-networks. In the test stage, each image passes through the sub-networks in turn. If the prediction confidence in a sub-network is higher than a given threshold, the result is output directly. Otherwise, deeper visual features are learned successively by the subsequent sub-networks until a reliable image classification result is obtained or the last sub-network is reached. Experimental results show that HCNNs obtain better results than classical CNNs and existing models based on ensemble learning. HCNNs achieve 2.68% higher accuracy than Residual Network 50 (ResNet50) on the ultrasonic image dataset, 1.19% higher than ResNet50 on the chimpanzee facial image dataset, and 10.86% higher than AdaBoost-CNN on the CIFAR-10 dataset. Furthermore, the HCNN is extensible, since the types of sub-networks and their combinations can be dynamically adjusted.
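The progressive test stage and the weight update described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `cascade_predict` and `reweight`, the threshold value, the multiplicative `factor`, and the toy sub-networks are all assumptions made for illustration, and the abstract does not state the actual update rule.

```python
import numpy as np

def cascade_predict(image, sub_networks, threshold=0.9):
    """Pass an image through the sub-networks in turn (list must be non-empty).

    Returns (label, confidence, index of the sub-network that answered).
    """
    probs = None
    for level, net in enumerate(sub_networks):
        probs = np.asarray(net(image), dtype=float)
        # Stop early once this sub-network is confident enough.
        if probs.max() > threshold:
            break
    # If no sub-network was confident, the last one's output is used.
    return int(probs.argmax()), float(probs.max()), level

def reweight(weights, correct, factor=2.0):
    """Raise weights of misclassified images, lower the others, renormalise.

    The multiplicative rule is an assumption, not the paper's formula.
    """
    weights = np.asarray(weights, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    updated = np.where(correct, weights / factor, weights * factor)
    return updated / updated.sum()

# Hypothetical stand-ins for trained sub-networks with fixed outputs.
shallow = lambda image: [0.55, 0.45]
deep = lambda image: [0.05, 0.95]

label, confidence, level = cascade_predict(None, [shallow, deep], threshold=0.9)
new_weights = reweight([0.25, 0.25, 0.25, 0.25], [True, True, True, False])
```

In this toy run the shallow sub-network is not confident enough, so the image falls through to the deeper one; the single misclassified training image ends up with the largest weight.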

SketchGNN: Semantic Sketch Segmentation with Graph Neural Networks

ACM Transactions on Graphics ◽

10.1145/3450284 ◽

2021 ◽

Vol 40 (3) ◽

pp. 1-13

Author(s):

Lumin Yang ◽

Jiajie Zhuang ◽

Hongbo Fu ◽

Xiangzhi Wei ◽

Kun Zhou ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Network Architecture ◽

Large Scale ◽

State Of The Art ◽

Semantic Segmentation ◽

Structure Information ◽

Graph Neural Networks ◽

Node Labels ◽

Point Level

We introduce SketchGNN, a convolutional graph neural network for semantic segmentation and labeling of freehand vector sketches. We treat an input stroke-based sketch as a graph, with nodes representing the sampled points along input strokes and edges encoding the stroke structure information. To predict the per-node labels, SketchGNN uses graph convolution and a static-dynamic branching network architecture to extract features at three levels, i.e., point-level, stroke-level, and sketch-level. SketchGNN significantly improves on the accuracy of state-of-the-art methods for semantic sketch segmentation (by 11.2% in the pixel-based metric and 18.2% in the component-based metric on the large-scale, challenging SPG dataset) and has orders of magnitude fewer parameters than both image-based and sequence-based methods.
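As a rough illustration of the graph construction described above, the sketch below turns a list of strokes into nodes and edges. It assumes that edges simply link consecutive sampled points within a stroke; the abstract does not specify the exact edge definition, and the function name `sketch_to_graph` is hypothetical.

```python
def sketch_to_graph(strokes):
    """Build a graph from strokes, each a list of (x, y) sampled points.

    Returns (nodes, edges, stroke_ids), where stroke_ids[i] is the stroke
    that node i belongs to.
    """
    nodes, edges, stroke_ids = [], [], []
    for stroke_id, stroke in enumerate(strokes):
        start = len(nodes)
        for point in stroke:
            nodes.append(point)
            stroke_ids.append(stroke_id)
        # Connect consecutive points of the same stroke only.
        for i in range(start, len(nodes) - 1):
            edges.append((i, i + 1))
    return nodes, edges, stroke_ids

strokes = [[(0, 0), (1, 0), (2, 0)], [(0, 1), (1, 1)]]
nodes, edges, stroke_ids = sketch_to_graph(strokes)
```

The per-node `stroke_ids` list is kept so that point-level features could later be pooled into stroke-level and sketch-level features.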

Classification of Skin Disease Using Deep Learning Neural Networks with MobileNet V2 and LSTM

Sensors ◽

10.3390/s21082852 ◽

2021 ◽

Vol 21 (8) ◽

pp. 2852

Author(s):

Parvathaneni Naga Srinivasu ◽

Jalluri Gnana SivaSai ◽

Muhammad Fazal Ijaz ◽

Akash Kumar Bhoi ◽

Wonjoon Kim ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Network ◽

Skin Disease ◽

Network Architecture ◽

Large Scale ◽

Short Term Memory ◽

Convolutional Networks ◽

Occurrence Matrix

Deep learning models are efficient at learning features that help in understanding complex patterns precisely. This study proposes a computerized process for classifying skin disease using deep learning based on MobileNet V2 and Long Short-Term Memory (LSTM). The MobileNet V2 model proved to be efficient, with better accuracy, and can run on lightweight computational devices. The proposed model is efficient in maintaining stateful information for precise predictions. A grey-level co-occurrence matrix is used for assessing the progress of diseased growth. The performance has been compared against other state-of-the-art models such as Fine-Tuned Neural Networks (FTNN), Convolutional Neural Network (CNN), Very Deep Convolutional Networks for Large-Scale Image Recognition developed by the Visual Geometry Group (VGG), and a convolutional neural network architecture expanded with a few changes. On the HAM10000 dataset, the proposed method outperformed the other methods with more than 85% accuracy. It recognizes the affected region much faster, with almost 2× fewer computations than the conventional MobileNet model, which results in minimal computational effort. Furthermore, a mobile application is designed for instant and proper action. It helps patients and dermatologists identify the type of disease from an image of the affected region at the initial stage of the skin disease. These findings suggest that the proposed system can help general practitioners diagnose skin conditions efficiently and effectively, thereby reducing further complications and morbidity.
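For readers unfamiliar with the grey-level co-occurrence matrix mentioned above, the following minimal sketch counts how often pairs of grey levels occur at a fixed pixel offset. It shows only the generic technique: the function name `glcm`, the offset, and the toy image are assumptions, and the abstract does not say how the study derives disease-progress measures from the matrix.

```python
import numpy as np

def glcm(image, levels, dx=1, dy=0):
    """Count co-occurrences of grey levels at offset (dx, dy), with dx, dy >= 0.

    Entry [i, j] is the number of pixel pairs where the reference pixel has
    level i and the pixel at the offset has level j.
    """
    image = np.asarray(image)
    rows, cols = image.shape
    matrix = np.zeros((levels, levels), dtype=int)
    for r in range(rows - dy):
        for c in range(cols - dx):
            matrix[image[r, c], image[r + dy, c + dx]] += 1
    return matrix

# Toy 3x3 image with grey levels 0..2; horizontal neighbour offset.
image = [[0, 0, 1],
         [1, 2, 2],
         [0, 1, 2]]
counts = glcm(image, levels=3)
```

Texture statistics such as contrast or homogeneity are typically computed from a normalised version of this matrix.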

A Hybrid GA-PSO Method for Evolving Architecture and Short Connections of Deep Convolutional Neural Networks

10.26686/wgtn.13158299.v1 ◽

2020 ◽

Author(s):

B Wang ◽

Y Sun ◽

Bing Xue ◽

Mengjie Zhang

Keyword(s):

Neural Networks ◽

Image Classification ◽

Convolutional Neural Networks ◽

Network Architecture ◽

Learning Task ◽

Fixed Number ◽

Learning Rate ◽

Current Layer ◽

Training Process ◽

Deep Convolutional Neural Networks

Image classification is a difficult machine learning task, to which Convolutional Neural Networks (CNNs) have been applied for over 20 years. In recent years, instead of the traditional way of only connecting the current layer with its next layer, shortcut connections have been proposed to connect the current layer with its forward layers apart from its next layer, which has been shown to facilitate the training process of deep CNNs. However, because there are various ways to build the shortcut connections, it is hard to manually design the best shortcut connections for a particular problem, especially given that designing the network architecture is already very challenging. In this paper, a hybrid evolutionary computation (EC) method is proposed to automatically evolve both the architecture of deep CNNs and the shortcut connections. The three major contributions of this work are as follows. Firstly, a new encoding strategy is proposed to encode a CNN, where the architecture and the shortcut connections are encoded separately. Secondly, a hybrid two-level EC method, which combines particle swarm optimisation and genetic algorithms, is developed to search for the optimal CNNs. Lastly, an adjustable learning rate is introduced for the fitness evaluations, which provides a better learning rate for the training process given a fixed number of epochs. The proposed algorithm is evaluated on three widely used benchmark datasets of image classification and compared with 12 peer non-EC-based competitors and one EC-based competitor. The experimental results demonstrate that the proposed method outperforms all of the peer competitors in terms of classification accuracy.

A QoS network architecture to interconnect large-scale VLSI neural networks

2009 International Joint Conference on Neural Networks ◽

10.1109/ijcnn.2009.5178983 ◽

2009 ◽

Cited By ~ 7

Author(s):

Stefan Philipp ◽

Johannes Schemmel ◽

Karlheinz Meier

Keyword(s):

Neural Networks ◽

Network Architecture ◽

Large Scale

Sharing Residual Units Through Collective Tensor Factorization To Improve Deep Neural Networks

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/88 ◽

2018 ◽

Cited By ~ 6

Author(s):

Yunpeng Chen ◽

Xiaojie Jin ◽

Bingyi Kang ◽

Jiashi Feng ◽

Shuicheng Yan

Keyword(s):

Neural Networks ◽

Network Architecture ◽

Deep Neural Networks ◽

Tensor Decomposition ◽

Classification Performance ◽

Model Parameters ◽

Tensor Factorization ◽

Unified Framework ◽

Benchmark Datasets ◽

Basic Network

The residual unit and its variations are widely used in building very deep neural networks for alleviating optimization difficulty. In this work, we revisit the standard residual function as well as several of its successful variants and propose a unified framework based on tensor Block Term Decomposition (BTD) to explain these apparently different residual functions from the tensor decomposition view. With the BTD framework, we further propose a novel basic network architecture, named the Collective Residual Unit (CRU). CRU further enhances the parameter efficiency of deep residual neural networks by sharing core factors derived from collective tensor factorization over the involved residual units. It enables efficient knowledge sharing across multiple residual units, reduces the number of model parameters, lowers the risk of over-fitting, and provides better generalization ability. Extensive experimental results show that our proposed CRU network brings outstanding parameter efficiency: it achieves classification performance comparable to ResNet-200 while using a model size as small as ResNet-50 on the ImageNet-1k and Places365-Standard benchmark datasets.

Network Approximation using Tensor Sketching

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/321 ◽

2018 ◽

Cited By ~ 1

Author(s):

Shiva Prasad Kasiviswanathan ◽

Nina Narodytska ◽

Hongxia Jin

Keyword(s):

Neural Networks ◽

Language Processing ◽

Network Architecture ◽

Deep Neural Networks ◽

Network Architectures ◽

Effective Parameters ◽

Unified Framework ◽

Design Changes ◽

Target Network ◽

Fully Connected

Deep neural networks are powerful learning models that achieve state-of-the-art performance on many computer vision, speech, and language processing tasks. In this paper, we study a fundamental question that arises when designing deep network architectures: given a target network architecture, can we design a "smaller" network architecture that "approximates" the operation of the target network? The question is, in part, motivated by the challenge of parameter reduction (compression) in modern deep neural networks, as the ever-increasing storage and memory requirements of these networks pose a problem in resource-constrained environments. In this work, we focus on deep convolutional neural network architectures, and propose a novel randomized tensor sketching technique that we utilize to develop a unified framework for approximating the operation of both the convolutional and fully connected layers. By applying the sketching technique along different tensor dimensions, we design changes to the convolutional and fully connected layers that substantially reduce the number of effective parameters in a network. We show that the resulting smaller network can be trained directly, and has a classification accuracy that is comparable to the original network.

LSUN-Stanford Car Dataset: Enhancing Large-Scale Car Image Datasets Using Deep Learning for Usage in GAN Training

Applied Sciences ◽

10.3390/app10144913 ◽

2020 ◽

Vol 10 (14) ◽

pp. 4913

Author(s):

Tin Kramberger ◽

Božidar Potočnik

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Large Scale ◽

Generative Adversarial Networks ◽

Adversarial Networks ◽

Image Dataset ◽

Image Datasets

Currently there is no adequate publicly available dataset that could be used for training Generative Adversarial Networks (GANs) on car images. All available car datasets differ in noise, pose, and zoom levels. Thus, the objective of this work was to create an improved car image dataset that would be better suited for GAN training. To improve the performance of the GAN, we coupled the LSUN and Stanford car datasets. The new merged dataset was then pruned in order to adjust zoom levels and reduce the noise of images. This process left fewer images available for training, though of higher quality. The pruned dataset was evaluated by training StyleGAN with its original settings. Pruning the combined LSUN and Stanford datasets resulted in 2,067,710 images of cars with less noise and better-adjusted zoom levels. Training StyleGAN on the LSUN-Stanford car dataset proved superior to training with just the LSUN dataset by 3.7%, using the Fréchet Inception Distance (FID) as a metric. The results indicate that the proposed LSUN-Stanford car dataset is more consistent and better suited for training GANs than other currently available large car datasets.
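For reference, the Fréchet Inception Distance used as the metric above is commonly defined as follows. Here $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ denote the mean and covariance of Inception features for real and generated images; these symbols are introduced here and do not appear in the abstract. Lower values indicate that the two feature distributions are closer. The abstract does not state how the 3.7% difference was derived from the FID values.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```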

Neural Networks with Disabilities: An Introduction to Complementary Artificial Intelligence

Neural Computation ◽

10.1162/neco_a_01449 ◽

2021 ◽

pp. 1-36

Author(s):

Vagan Terziyan ◽

Olena Kaikova

Keyword(s):

Artificial Intelligence ◽

Neural Networks ◽

Cognitive Skills ◽

Autonomous Systems ◽

Cognitive Model ◽

Network Architectures ◽

Artificial Agent ◽

Good Tool ◽

Optimal Behavior ◽

Different Levels

Machine learning is a good tool to simulate human cognitive skills, as it is about mapping perceived information to various labels or action choices, aiming at optimal behavior policies for a human or an artificial agent operating in the environment. Regarding autonomous systems, objects and situations are perceived by some receptors as divided between sensors. Reactions to the input (e.g., actions) are distributed among the particular capability providers or actuators. Cognitive models can be trained as, for example, neural networks. We suggest training such models for cases of potential disabilities. Disability can be the absence of one or more cognitive sensors or actuators at different levels of the cognitive model. We adapt several neural network architectures to simulate various cognitive disabilities. The idea has been triggered by the “coolability” (enhanced capability) paradox, according to which a person with some disability can be more efficient in using other capabilities. Therefore, an autonomous system (human or artificial) pretrained with simulated disabilities will be more efficient when acting in adversarial conditions. We consider these coolabilities as complementary artificial intelligence and argue for the usefulness of this concept for various applications.

A mixed-scale dense convolutional neural network for image analysis

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1715832114 ◽

2017 ◽

Vol 115 (2) ◽

pp. 254-259 ◽

Cited By ~ 60

Author(s):

Daniël M. Pelt ◽

James A. Sethian

Keyword(s):

Neural Network ◽

Neural Networks ◽

Image Processing ◽

Network Architecture ◽

Training Data ◽

Network Architectures ◽

Feature Maps ◽

Deep Convolutional Neural Networks ◽

Single Set ◽

Reduced Risk

Deep convolutional neural networks have been successfully applied to many image-processing problems in recent works. Popular network architectures often add additional operations and connections to the standard architecture to enable training deeper networks. To achieve accurate results in practice, a large number of trainable parameters are often required. Here, we introduce a network architecture based on using dilated convolutions to capture features at different image scales and densely connecting all feature maps with each other. The resulting architecture is able to achieve accurate results with relatively few parameters and consists of a single set of operations, making it easier to implement, train, and apply in practice, and automatically adapts to different problems. We compare results of the proposed network architecture with popular existing architectures for several segmentation problems, showing that the proposed architecture is able to achieve accurate results with fewer parameters, with a reduced risk of overfitting the training data.
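The two ingredients named in this abstract, dilated convolutions and dense connections between all feature maps, can be sketched in one dimension as below. This is an illustrative toy under stated assumptions, not the published architecture: the function names, the odd-length kernels, the zero padding, the ReLU activation, and the example weights are all chosen here for illustration.

```python
import numpy as np

def dilated_conv1d(signal, kernel, dilation):
    """Same-size 1D dilated cross-correlation with zero padding.

    Assumes an odd-length kernel; taps are spaced `dilation` samples apart.
    """
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    half = (len(kernel) // 2) * dilation
    padded = np.pad(signal, half)
    out = np.zeros_like(signal)
    for k, weight in enumerate(kernel):
        out += weight * padded[k * dilation : k * dilation + len(signal)]
    return out

def mixed_scale_dense(signal, kernels, dilations):
    """Each layer sees ALL earlier feature maps (dense connections).

    kernels[i] must hold i + 1 kernels, one per earlier feature map.
    """
    features = [np.asarray(signal, dtype=float)]
    for layer_kernels, dilation in zip(kernels, dilations):
        response = sum(dilated_conv1d(f, k, dilation)
                       for f, k in zip(features, layer_kernels))
        features.append(np.maximum(response, 0.0))  # ReLU
    return features

ones = [1.0, 1.0, 1.0]
features = mixed_scale_dense([0, 0, 1, 0, 0],
                             kernels=[[ones], [ones, ones]],
                             dilations=[1, 2])
```

Because every layer keeps the input resolution, different dilations let layers capture features at different scales without any downsampling or upsampling steps.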

Convolutional Neural Networks for Large-Scale Remote-Sensing Image Classification

IEEE Transactions on Geoscience and Remote Sensing ◽

10.1109/tgrs.2016.2612821 ◽

2017 ◽

Vol 55 (2) ◽

pp. 645-657 ◽

Cited By ~ 387

Author(s):

Emmanuel Maggiori ◽

Yuliya Tarabalka ◽

Guillaume Charpiat ◽

Pierre Alliez

Keyword(s):

Remote Sensing ◽

Neural Networks ◽

Image Classification ◽

Convolutional Neural Networks ◽

Large Scale ◽

Remote Sensing Image ◽

Remote Sensing Image Classification