NU-LiteNet: Mobile Landmark Recognition using Convolutional Neural Networks

Author(s):  
Chakkrit Termritthikun ◽  
Paisarn Muneesawang

The growth of high-performance mobile devices has spurred research into on-device image recognition. The latency and accuracy of automatic recognition remain the main obstacles to its real-world use. Although recently developed deep neural networks can achieve accuracy comparable to that of a human, some of them are still too slow for mobile deployment. This paper describes the architecture of a new convolutional neural network model, NU-LiteNet, which takes SqueezeNet as its starting point and further reduces the model size to a degree suitable for smartphones. The resulting model is 2.6 times smaller than SqueezeNet. It outperformed other convolutional neural network (CNN) models for mobile devices (e.g., SqueezeNet and MobileNet), with accuracies of 81.15% and 69.58% on the Singapore and Paris landmark datasets, respectively. The shortest execution time recorded with NU-LiteNet on mobile phones was 0.7 seconds per image.
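
A minimal PyTorch sketch of the SqueezeNet-style Fire module that NU-LiteNet takes as its starting point (the channel counts below are the standard SqueezeNet example, not NU-LiteNet's published configuration):

```python
import torch
import torch.nn as nn

class Fire(nn.Module):
    """SqueezeNet-style Fire module: squeeze to few channels, then expand."""
    def __init__(self, in_ch, squeeze_ch, expand_ch):
        super().__init__()
        self.squeeze = nn.Conv2d(in_ch, squeeze_ch, kernel_size=1)
        self.expand1x1 = nn.Conv2d(squeeze_ch, expand_ch, kernel_size=1)
        self.expand3x3 = nn.Conv2d(squeeze_ch, expand_ch, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.squeeze(x))  # 1x1 bottleneck cuts parameters
        return torch.cat([self.relu(self.expand1x1(x)),
                          self.relu(self.expand3x3(x))], dim=1)

# e.g. a 96-channel feature map squeezed to 16 channels, expanded to 64+64
fire = Fire(96, 16, 64)
out = fire(torch.randn(1, 96, 56, 56))  # -> torch.Size([1, 128, 56, 56])
```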

Author(s):  
E. Yu. Shchetinin

The recognition of human emotions is one of the most relevant and dynamically developing areas of modern speech technology, and recognition of emotions in speech (RER) is its most demanded part. In this paper, we propose a computer model of emotion recognition based on an ensemble of a bidirectional recurrent neural network with LSTM memory cells and the deep convolutional neural network ResNet18. The study is carried out on the RAVDESS database of emotional human speech. RAVDESS is a dataset containing 7,356 files; its recordings cover the following emotions: 0 – neutral, 1 – calm, 2 – happiness, 3 – sadness, 4 – anger, 5 – fear, 6 – disgust, 7 – surprise. In total, the speech-only portion used here contains 16 classes (8 emotions, divided into male and female speakers) for a total of 1,440 samples. To train machine learning algorithms and deep neural networks to recognize emotions, the audio recordings must be pre-processed so as to extract the characteristic features of the emotions. This was done using Mel-frequency cepstral coefficients, chroma coefficients, and characteristics of the frequency spectrum of the recordings. Various neural network models for emotion recognition were then studied on this data, with classical machine learning algorithms used for comparative analysis. The following models were trained in the experiments: logistic regression (LR), a support vector machine classifier (SVM), a decision tree (DT), a random forest (RF), gradient boosting over trees (XGBoost), a convolutional neural network (CNN; ResNet18), a recurrent neural network (RNN; BLSTM), and an ensemble of the convolutional and recurrent networks (stacked CNN-RNN). The results show that the neural networks achieved much higher accuracy in recognizing and classifying emotions than the machine learning algorithms. Of the three neural network models presented, the CNN + BLSTM ensemble showed the highest accuracy.
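
The feature pipeline described above (MFCCs plus chroma and spectral features) is standard; a minimal sketch with librosa, where the file name and coefficient counts are illustrative assumptions rather than the paper's exact settings:

```python
import numpy as np
import librosa

def extract_features(path, n_mfcc=40):
    """MFCC + chroma + mel-spectrum features for one utterance, averaged over time."""
    y, sr = librosa.load(path, sr=None)  # keep the native sampling rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    chroma = librosa.feature.chroma_stft(y=y, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr)
    # one fixed-length vector per file: mean over frames of each coefficient
    return np.concatenate([mfcc.mean(axis=1), chroma.mean(axis=1), mel.mean(axis=1)])

features = extract_features("Actor_01/03-01-05-01-01-01-01.wav")  # hypothetical RAVDESS file
```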


2020 ◽  
Vol 10 (6) ◽  
pp. 2104
Author(s):  
Michał Tomaszewski ◽  
Paweł Michalski ◽  
Jakub Osuchowski

This article presents an analysis of the effectiveness of object detection in digital images with a limited quantity of input data. The possibility of using a limited learning set was achieved by developing a detailed scenario of the task, which strictly defined the operating conditions of the detector, in this case a convolutional neural network. The described solution utilizes known deep neural network architectures in the process of learning and object detection. The article compares detection results from the most popular deep neural networks trained on a limited set composed of a specific number of images selected from diagnostic video. The analyzed input material was recorded during an inspection flight along high-voltage lines, and the object detector was built for a power insulator. The main contribution of the presented paper is the evidence that a limited training set (in our case, just 60 training frames) can be used for object detection, assuming an outdoor scenario with low variability of environmental conditions. Deciding which network will generate the best result for such a limited training set is not a trivial task. The conducted research suggests that deep neural networks achieve different levels of effectiveness depending on the amount of training data. The best results were obtained for two convolutional neural networks: the faster region-based convolutional neural network (Faster R-CNN) and the region-based fully convolutional network (R-FCN). Faster R-CNN reached the highest AP (average precision), at a level of 0.8 for 60 frames. The R-FCN model gained a worse AP result; however, the number of input samples influenced its results significantly less than it did for the other CNN models, which, in the authors' assessment, is a desired feature in the case of a limited training set.
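
Fine-tuning a pre-trained detector is the usual way to make 60 annotated frames suffice; a minimal torchvision sketch for a two-class (background vs. power insulator) Faster R-CNN, shown as an illustration of the approach rather than the authors' exact setup:

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# COCO-pretrained backbone and detection head
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

# swap the box predictor for our classes: background + power insulator
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)

# training then proceeds on the 60 annotated frames only;
# the pre-trained backbone supplies the general visual features.
```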


Author(s):  
Shweta Dabetwar ◽  
Stephen Ekwaro-Osire ◽  
João Paulo Dias

Abstract Composite materials have enormous applications in various fields, so it is important to have an efficient damage detection method to avoid catastrophic failures. Because multiple damage modes exist and data are available in different formats, efficient techniques are needed that consider all damage types. Deep neural networks have shown the ability to address similarly complex problems. The research question in this work is: 'Can data fusion improve damage classification using the convolutional neural network?' The specific aims were to 1) assess the performance of image encoding algorithms, 2) classify the damage using data from separate experimental coupons, and 3) classify the damage using mixed data from multiple experimental coupons. Two different experimental measurements were taken from the NASA Ames Prognostics Data Repository for Carbon Fiber Reinforced Polymer. To use data fusion, the piezoelectric signals were converted into images using the Gramian Angular Field (GAF) and the Markov Transition Field (MTF). Using data fusion techniques, the input dataset was created for a convolutional neural network with three hidden layers to determine the damage states. The accuracies of all the image encoding algorithms were compared. The analysis showed that data fusion provided better results, as the fused input carries more information on the damage modes that occur in composite materials; among the encodings, GAF performed best. Thus, the combination of data fusion and deep neural network techniques provides an efficient method for damage detection of composite materials.
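
The Gramian Angular Field encoding has a compact closed form: rescale the series to [-1, 1], map each value to an angle, and take pairwise cosines of angle sums. A minimal NumPy sketch of the summation variant (GASF), as an illustration rather than the authors' implementation:

```python
import numpy as np

def gramian_angular_field(series):
    """Encode a 1-D signal as a GASF image (Gramian Angular Summation Field)."""
    x = np.asarray(series, dtype=float)
    # rescale to [-1, 1] so arccos is defined
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    phi = np.arccos(np.clip(x, -1.0, 1.0))  # polar-coordinate angle
    # GASF[i, j] = cos(phi_i + phi_j), computed via the outer sum
    return np.cos(phi[:, None] + phi[None, :])

signal = np.sin(np.linspace(0, 4 * np.pi, 128))  # stand-in for a piezoelectric trace
image = gramian_angular_field(signal)            # shape (128, 128), values in [-1, 1]
```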


Author(s):  
Amira Ahmad Al-Sharkawy ◽  
Gehan A. Bahgat ◽  
Elsayed E. Hemayed ◽  
Samia Abdel-Razik Mashali

The object classification problem is essential in many applications nowadays. Humans can easily classify objects in unconstrained environments, whereas classical classification techniques fell far short of human performance. Researchers therefore tried to mimic the human visual system, until they arrived at deep neural networks. This chapter reviews and analyzes the use of deep convolutional neural networks for object classification under constrained and unconstrained environments. It gives a brief review of classical object classification techniques and of the development of bio-inspired computational models, from neuroscience to the creation of deep neural networks. Constrained-environment issues are reviewed: hardware computing resources and memory, object appearance and background, and training and processing time. Datasets used to test performance are analyzed according to the environmental conditions of their images, and dataset bias is discussed.


2020 ◽  
Author(s):  
Albahli Saleh ◽  
Ali Alkhalifah

BACKGROUND To diagnose cardiothoracic diseases, a chest x-ray (CXR) is examined by a radiologist. As more people are affected, doctors are becoming scarce, especially in developing countries. However, with the advent of image processing tools, the task of diagnosing these cardiothoracic diseases has seen great progress. Many researchers have worked on mitigating the problems associated with medical images by using neural networks. OBJECTIVE Previous works used state-of-the-art techniques and obtained effective results for one or two cardiothoracic diseases, but could lead to misclassification. In our work, we adopted GANs to synthesize chest radiographs (CXRs), augmenting the training set over multiple cardiothoracic diseases so as to efficiently diagnose chest diseases in different classes, as shown in Figure 1. Our major contributions are: classifying various cardiothoracic diseases to detect a specific chest disease based on the CXR; using GANs to overcome the limitations of small training datasets; addressing the problem of imbalanced data; and implementing an optimal deep neural network architecture with different hyper-parameters to obtain the best accuracy. METHODS For this research, we did not build a model from scratch, owing to computational constraints (doing so requires very high-end computers). Rather, we used a Convolutional Neural Network (CNN), a class of deep neural networks, and propose a generative adversarial network (GAN)-based model to generate synthetic data for training, since the amount of available data is limited. We used pre-trained models, i.e., models trained on a large benchmark dataset to solve a problem similar to the one we want to solve. For example, the ResNet-152 model we used was initially trained on the ImageNet dataset. RESULTS After successful training and validation of the models we developed, ResNet-152 with image augmentation proved to be the best model for the automatic detection of cardiothoracic disease. However, one of the main problems in radiographic deep learning projects and research is the scarcity of sufficiently large datasets, a key requirement of all deep learning models, which need a lot of data for training. This is why some of our models used image augmentation to increase the number of images without duplication. As more data are collected in the field of chest radiology, the models can be retrained to improve their accuracy, since deep learning models improve with more data. CONCLUSIONS This research employs the advantages of computer vision and medical image analysis to develop an automated model with clinical potential for early detection of the disease. Using deep learning models, the research evaluates the effectiveness and accuracy of different convolutional neural network models in the automatic diagnosis of cardiothoracic diseases from x-ray images, compared with diagnosis by experts in the medical community.
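
A minimal PyTorch sketch of the transfer-learning step described in METHODS, loading an ImageNet-pretrained ResNet-152 and replacing its classification head (the class count and freezing policy are illustrative assumptions, not the paper's settings):

```python
import torch.nn as nn
from torchvision import models

NUM_DISEASES = 14  # illustrative; set to the number of CXR disease classes used

model = models.resnet152(weights="IMAGENET1K_V1")  # ImageNet-pretrained backbone
for p in model.parameters():                        # optionally freeze early features
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, NUM_DISEASES)  # new trainable head
```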


2016 ◽  
Vol 2016 ◽  
pp. 1-15 ◽  
Author(s):  
Benjamin Chandler ◽  
Ennio Mingolla

Heavily occluded objects are more difficult for classification algorithms to identify correctly than unoccluded objects. However, this effect is rare in datasets like ImageNet and PASCAL VOC, owing to biases in human-generated image pose selection, and is thus hard to measure. We introduce a dataset that emphasizes occlusion, together with additions to a standard convolutional neural network aimed at increasing invariance to occlusion. An unmodified convolutional neural network trained and tested on the new dataset rapidly degrades to chance-level accuracy as occlusion increases. Training with occluded data slows this decline but still yields poor performance under high occlusion. Integrating novel preprocessing stages that segment the input and inpaint occlusions is an effective mitigation: a convolutional network so modified is nearly as effective with more than 81% of pixels occluded as it is with no occlusion. Such a network is also more accurate on unoccluded images than an otherwise identical network trained only on unoccluded images. These results depend on successful segmentation; the occlusions in our dataset are deliberately easy to separate from figure and background. Achieving similar results on a more challenging dataset would require a method to split figure, background, and occluding pixels in the input.
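
The segment-then-inpaint preprocessing can be illustrated with classical tools; a sketch using OpenCV's Telea inpainting, assuming the occluder mask is already available (in the paper it comes from a dedicated segmentation stage, and the file names here are hypothetical):

```python
import cv2

img = cv2.imread("occluded.png")                               # hypothetical input frame
mask = cv2.imread("occluder_mask.png", cv2.IMREAD_GRAYSCALE)   # 255 where occluded

# fill occluded pixels from their surroundings before classification
restored = cv2.inpaint(img, mask, 3, cv2.INPAINT_TELEA)  # radius 3, Telea method
cv2.imwrite("restored.png", restored)
```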


IoT ◽  
2021 ◽  
Vol 2 (2) ◽  
pp. 222-235
Author(s):  
Guillaume Coiffier ◽  
Ghouthi Boukli Hacene ◽  
Vincent Gripon

Deep neural networks are state-of-the-art in a large number of machine learning challenges. However, to reach the best performance they require a huge pool of parameters. Typical deep convolutional architectures present an increasing number of feature maps as we go deeper in the network, while the spatial resolution of inputs is decreased through downsampling operations. This means that most of the parameters lie in the final layers, while a large portion of the computations is performed by a small fraction of the total parameters in the first layers. In an effort to use every parameter of a network to its maximum, we propose a new convolutional neural network architecture, called ThriftyNet. In ThriftyNet, only one convolutional layer is defined and used recursively, leading to a maximal parameter factorization. In complement, normalization, non-linearities, downsampling, and shortcut connections ensure sufficient expressivity of the model. ThriftyNet achieves competitive performance on a tiny parameter budget, exceeding 91% accuracy on CIFAR-10 with less than 40k parameters in total, 74.3% on CIFAR-100 with less than 600k parameters, and 67.1% on ImageNet ILSVRC 2012 with no more than 4.15M parameters. However, the proposed method typically requires more computations than existing counterparts.
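
A minimal PyTorch sketch of the recycled-layer idea: one shared convolution applied repeatedly with per-step normalization, a residual shortcut, and occasional downsampling. Widths and iteration counts are illustrative, not ThriftyNet's published configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyRecursiveNet(nn.Module):
    """One shared convolution applied T times (ThriftyNet-style factorization)."""
    def __init__(self, channels=128, num_classes=10, iterations=12, downsample_at=(4, 8)):
        super().__init__()
        self.embed = nn.Conv2d(3, channels, 3, padding=1)        # lift RGB to working width
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)  # the single recycled layer
        self.norms = nn.ModuleList(nn.BatchNorm2d(channels) for _ in range(iterations))
        self.downsample_at = set(downsample_at)
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x):
        h = self.embed(x)
        for t, bn in enumerate(self.norms):
            h = h + F.relu(bn(self.conv(h)))  # residual shortcut keeps recursion stable
            if t in self.downsample_at:
                h = F.max_pool2d(h, 2)
        return self.head(h.mean(dim=(2, 3)))  # global average pooling, then classify

logits = TinyRecursiveNet()(torch.randn(1, 3, 32, 32))  # e.g. a CIFAR-sized input
```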


2022 ◽  
Vol 15 (3) ◽  
pp. 1-31
Author(s):  
Shulin Zeng ◽  
Guohao Dai ◽  
Hanbo Sun ◽  
Jun Liu ◽  
Shiyao Li ◽  
...  

INFerence-as-a-Service (INFaaS) has become a primary workload in the cloud. However, existing FPGA-based Deep Neural Network (DNN) accelerators are mainly optimized for the fastest speed of a single task, and the multi-tenancy of INFaaS has not yet been explored. As the demand for INFaaS keeps growing, simply increasing the number of FPGA-based DNN accelerators is not cost-effective, while merely sharing these single-task-optimized DNN accelerators in a time-division multiplexing way leads to poor isolation and high performance loss for INFaaS. On the other hand, current cloud-based DNN accelerators have excessive compilation overhead, especially when scaling out to multi-FPGA systems for multi-tenant sharing, leading to unacceptable compilation costs for both offline deployment and online reconfiguration. Current approaches are therefore far from providing efficient and flexible FPGA virtualization for public and private cloud scenarios. To solve these problems, we propose a unified virtualization framework for general-purpose deep neural networks in the cloud, enabling multi-tenant sharing for both Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) accelerators on a single FPGA. Isolation is enabled by introducing a two-level instruction dispatch module and a multi-core-based hardware resource pool. These designs provide isolated and runtime-programmable hardware resources, which further leads to performance isolation for multi-tenant sharing. To overcome the heavy re-compilation overheads, a tiling-based instruction frame package design and a two-stage static-dynamic compilation are proposed. Only the lightweight runtime information is re-compiled, with ∼1 ms overhead, thus guaranteeing performance in the private cloud. Finally, extensive experimental results show that the proposed virtualized solutions achieve up to 3.12× and 6.18× higher throughput in the private cloud compared with the static CNN and RNN baseline designs, respectively.
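
The isolation idea can be sketched conceptually: each tenant gets its own slice of a core pool, and instruction frames are dispatched only to that tenant's cores. A toy Python illustration of spatial multi-tenant isolation (the paper's dispatch module and resource pool are hardware designs; nothing below is their implementation):

```python
from dataclasses import dataclass, field

@dataclass
class CorePool:
    """Toy stand-in for a multi-core hardware resource pool."""
    n_cores: int
    allocation: dict = field(default_factory=dict)  # tenant -> list of core ids

    def allocate(self, tenant, n):
        taken = {c for cores in self.allocation.values() for c in cores}
        free = [c for c in range(self.n_cores) if c not in taken]
        assert len(free) >= n, "not enough isolated cores"
        self.allocation[tenant] = free[:n]
        return self.allocation[tenant]

def dispatch(pool, tenant, frames):
    """Route a tenant's instruction frames only to its own cores (isolation)."""
    cores = pool.allocation[tenant]
    return {c: frames[i::len(cores)] for i, c in enumerate(cores)}

pool = CorePool(n_cores=8)
pool.allocate("tenant_A", 2)
pool.allocate("tenant_B", 4)
work = dispatch(pool, "tenant_B", [f"frame{i}" for i in range(8)])
```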


2021 ◽  
Vol 11 (16) ◽  
pp. 7424
Author(s):  
Peng-Wei Lin ◽  
Chih-Ming Hsu

A convolutional neural network (CNN) trained on datasets for multiple scenarios was proposed to facilitate real-time road semantic segmentation across the various scenarios encountered in autonomous driving. However, such a CNN exhibits a mutual suppression effect between weights and thus does not perform as well as a network trained on a single scenario. To address this limitation, we used a model-switching architecture in the network and maintained the optimal weights of each individual model, which required considerable space and computation. We subsequently incorporated a lightweight process into the model to reduce the model size and computational load. The experimental results indicated that the proposed lightweight CNN with a model-switching architecture outperformed, and was faster than, conventional methods across multiple road semantic segmentation scenarios.
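
The model-switching idea can be sketched simply: keep one optimal weight set per scenario and swap it into a shared network at runtime. A hedged PyTorch illustration, where the scenario names and checkpoint files are assumptions for the example:

```python
import torch

# one checkpoint of optimal weights per driving scenario (hypothetical files)
SCENARIOS = ("urban", "highway", "night", "rain")
checkpoints = {s: torch.load(f"weights_{s}.pt") for s in SCENARIOS}

def switch_scenario(model, scenario):
    """Swap in the weight set tuned for the detected scenario."""
    model.load_state_dict(checkpoints[scenario])
    model.eval()
    return model

# e.g. a scene classifier picks the current scenario once per frame batch:
# model = switch_scenario(model, detect_scenario(frame))
```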


Author(s):  
Jaehun Kim ◽  
Stjepan Picek ◽  
Annelie Heuser ◽  
Shivam Bhasin ◽  
Alan Hanjalic

Profiled side-channel analysis based on deep learning, and more precisely on Convolutional Neural Networks, is a paradigm showing significant potential. The results, although scarce for now, suggest that such techniques can even break cryptographic implementations protected with countermeasures. In this paper, we start by proposing a new Convolutional Neural Network instance that reaches high performance on a number of considered datasets. We compare our neural network with one designed for a particular dataset with a masking countermeasure, and we show that both are good designs, but neither can be considered superior to the other. Next, we show how the addition of artificial noise to the input signal can actually benefit the performance of the neural network. Such noise addition is equivalent to a regularization term in the objective function. By using this technique, we are able to reduce the number of measurements needed to reveal the secret key by orders of magnitude for both neural networks. Our new convolutional neural network instance with added noise is able to break an implementation protected with the random delay countermeasure using only 3 traces in the attack phase. To further strengthen our experimental results, we investigate performance with a varying number of training samples, noise levels, and epochs. Our findings show that adding noise is beneficial across all training set sizes and epochs.
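
The noise-addition trick amounts to a one-line data augmentation; a minimal PyTorch sketch of applying Gaussian noise to side-channel traces during training, with an illustrative (not the paper's) noise level:

```python
import torch

def add_gaussian_noise(traces, sigma=0.05):
    """Augment a batch of side-channel traces with zero-mean Gaussian noise.

    Acts as a regularizer: the network cannot latch onto sample-exact
    artifacts of the training traces."""
    return traces + sigma * torch.randn_like(traces)

# inside the training loop (model, loss_fn, batch tensors assumed defined):
# logits = model(add_gaussian_noise(batch_traces))
# loss = loss_fn(logits, batch_labels)
```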

