Robust deep convolutional neural network against image distortions

APSIPA Transactions on Signal and Information Processing ◽

10.1017/atsip.2021.14 ◽

2021 ◽

Vol 10 ◽

Author(s):

Liang-Yao Wang ◽

Sau-Gee Chen ◽

Feng-Tsun Chien

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

High Frequency ◽

Input Image ◽

Processing Unit ◽

Different Types ◽

Hybrid Module ◽

Frequency Components ◽

Discrete Wavelets ◽

Value Decomposition

Many approaches have been proposed in the literature to enhance the robustness of Convolutional Neural Network (CNN)-based architectures against image distortions. Various types of distortions can be combated by combining multiple expert networks, each trained on one type of distorted image, but this leads to a large model with high complexity. In this paper, we propose a CNN-based architecture with a pre-processing unit in which only undistorted data are used for training. The pre-processing unit employs the discrete cosine transform (DCT) and the discrete wavelet transform (DWT) to remove high-frequency components while capturing prominent high-frequency features in the undistorted data by means of random selection. We further utilize the singular value decomposition (SVD) to extract features before feeding the preprocessed data into the CNN for training. During testing, distorted images directly enter the CNN for classification without having to go through the hybrid module. Five different types of distortions are produced in the SVHN dataset and the CIFAR-10/100 datasets. Experimental results show that the proposed DCT-DWT-SVD module built upon the CNN architecture provides a classifier robust to input image distortions, outperforming the state-of-the-art approaches in terms of accuracy under different types of distortions.
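
To make the pre-processing idea concrete, the following is a minimal NumPy sketch of two of the ingredients named in the abstract: low-pass filtering in the DCT domain and low-rank feature extraction with the SVD. It is not the authors' module. The DWT stage and the random selection of high-frequency features are omitted, and the function names and the `keep` and `rank` parameters are assumptions made for illustration.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an (n, n) matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def dct_lowpass(img, keep):
    """Zero all 2-D DCT coefficients outside the top-left keep x keep block."""
    n, m = img.shape
    dn, dm = dct_matrix(n), dct_matrix(m)
    coeffs = dn @ img @ dm.T              # forward 2-D DCT
    mask = np.zeros_like(coeffs)
    mask[:keep, :keep] = 1.0              # low frequencies live in this corner
    return dn.T @ (coeffs * mask) @ dm    # inverse 2-D DCT

def svd_truncate(img, rank):
    """Best rank-`rank` approximation of a 2-D array (hypothetical helper)."""
    u, s, vt = np.linalg.svd(img, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]

def preprocess(img, keep, rank):
    """Illustrative pipeline: DCT low-pass followed by SVD truncation."""
    return svd_truncate(dct_lowpass(img, keep), rank)
```

For a single-channel 32×32 training image, `preprocess(img, keep=16, rank=8)` would discard the upper three quarters of the DCT spectrum and keep eight singular components; both values are arbitrary here.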

Research on Convolutional Neural Network Model for Sonar IMAGE Segmentation

MATEC Web of Conferences ◽

10.1051/matecconf/201822010004 ◽

2018 ◽

Vol 220 ◽

pp. 10004 ◽

Cited By ~ 1

Author(s):

Shengxi Jiao ◽

Chunyu Zhao ◽

Ye Xin

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Loss Function ◽

Spatial Information ◽

Original Learning ◽

Speckle Noise ◽

Input Image ◽

Image Features ◽

Processing Unit ◽

Sonar Image

The speckle noise of sonar images seriously affects both human interpretation and automatic recognition of the images. Precise segmentation of sonar images with speckle noise is an important and difficult problem in image processing. A fully convolutional neural network (FCN) has the advantage of accepting images of arbitrary size and preserving the spatial information of the original input image. In this paper, the image features are learned automatically by a convolutional neural network, and the original learning rule based on the mean square error loss function is improved. Taking the pixel as the processing unit, a segmentation method based on an FCN model with a relative loss function (FCN-RLF) is proposed for small submarine sonar images, and pixel-level segmentation of sonar images is achieved. Experimental results show that the improved algorithm improves segmentation accuracy and better preserves the edges and details of sonar images. The proposed model is also better at rejecting sonar image speckle noise.

Sparse-FCM and Deep Convolutional Neural Network for the segmentation and classification of acute lymphoblastic leukaemia

Biomedical Engineering / Biomedizinische Technik ◽

10.1515/bmt-2018-0213 ◽

2020 ◽

Vol 65 (6) ◽

pp. 759-773

Author(s):

Segu Praveena ◽

Sohan Pal Singh

Keyword(s):

Neural Network ◽

Acute Lymphoblastic Leukaemia ◽

Convolutional Neural Network ◽

Optimization Algorithm ◽

Lymphoblastic Leukaemia ◽

Input Image ◽

Deep Convolutional Neural Network ◽

Grey Wolf ◽

Deep Cnn

Early detection and diagnosis of leukaemia is a trending topic in medical applications for reducing the death toll of patients with acute lymphoblastic leukaemia (ALL). For the detection of ALL, it is essential to analyse the white blood cells (WBCs), for which blood smear images are employed. This paper proposes a new technique for the segmentation and classification of acute lymphoblastic leukaemia. The proposed method of automatic leukaemia detection is based on a Deep Convolutional Neural Network (Deep CNN) that is trained using an optimization algorithm named the Grey wolf-based Jaya Optimization Algorithm (GreyJOA), which is developed from the Grey Wolf Optimizer (GWO) and the Jaya Optimization Algorithm (JOA) and improves global convergence. Initially, the input image is pre-processed and segmentation is performed using the Sparse Fuzzy C-Means (Sparse FCM) clustering algorithm. Then, features such as Local Directional Patterns (LDP) and colour histogram-based features are extracted from the segments of the pre-processed input image. Finally, the extracted features are fed to the Deep CNN for classification. Experimental evaluation of the method using images from the ALL IDB2 database shows that the proposed method achieved a maximal accuracy, sensitivity, and specificity of 0.9350, 0.9528, and 0.9389, respectively.
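
For readers comparing the reported figures, accuracy, sensitivity, and specificity can be computed from binary confusion counts as in the short sketch below. The function name and the example counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "specificity": tn / (tn + fp),   # true negative rate
    }

# Hypothetical counts, for illustration only.
example = binary_metrics(tp=90, fp=20, tn=80, fn=10)
```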

Methods for Preventing Visual Attacks in Convolutional Neural Networks Based on Data Discard and Dimensionality Reduction

Applied Sciences ◽

10.3390/app11115235 ◽

2021 ◽

Vol 11 (11) ◽

pp. 5235

Author(s):

Nikita Andriyanov

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Network Inference ◽

Recognition Accuracy ◽

Network Architectures ◽

Reduced Dimensions ◽

Different Types ◽

Rectangular Area ◽

Noise Impulse ◽

Selection Of

The article is devoted to the study of convolutional neural network inference in the task of image processing under the influence of visual attacks. Attacks of four different types were considered: simple, involving the addition of white Gaussian noise, impulse action on one pixel of an image, and attacks that change brightness values within a rectangular area. MNIST and Kaggle dogs vs. cats datasets were chosen. Recognition characteristics were obtained for the accuracy, depending on the number of images subjected to attacks and the types of attacks used in the training. The study was based on well-known convolutional neural network architectures used in pattern recognition tasks, such as VGG-16 and Inception_v3. The dependencies of the recognition accuracy on the parameters of visual attacks were obtained. Original methods were proposed to prevent visual attacks. Such methods are based on the selection of “incomprehensible” classes for the recognizer, and their subsequent correction based on neural network inference with reduced image sizes. As a result of applying these methods, gains in the accuracy metric by a factor of 1.3 were obtained after iteration by discarding incomprehensible images, and reducing the amount of uncertainty by 4–5% after iteration by applying the integration of the results of image analyses in reduced dimensions.
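
The perturbations described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the article's code: images are taken to be 2-D float arrays scaled to [0, 1], and the function names, noise level, and rectangle parameters are hypothetical.

```python
import numpy as np

def add_gaussian_noise(img, sigma, rng):
    """Add white Gaussian noise and clip back to the valid range [0, 1]."""
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def impulse_pixel(img, row, col, value=1.0):
    """Overwrite a single pixel with an impulse value."""
    out = img.copy()
    out[row, col] = value
    return out

def shift_brightness(img, top, left, height, width, delta):
    """Change brightness by `delta` inside a rectangular area."""
    out = img.copy()
    region = out[top:top + height, left:left + width]
    out[top:top + height, left:left + width] = np.clip(region + delta, 0.0, 1.0)
    return out
```

Each function returns a new array and leaves its input untouched, so the same clean image can be reused to build several attacked variants.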

Automatic ECG Classification Using Continuous Wavelet Transform and Convolutional Neural Network

Entropy ◽

10.3390/e23010119 ◽

2021 ◽

Vol 23 (1) ◽

pp. 119

Author(s):

Tao Wang ◽

Changhua Lu ◽

Yining Sun ◽

Mei Yang ◽

Chun Liu ◽

...

Keyword(s):

Neural Network ◽

Wavelet Transform ◽

Convolutional Neural Network ◽

Continuous Wavelet Transform ◽

Continuous Wavelet ◽

Time Frequency ◽

Ecg Signals ◽

Rr Interval ◽

Frequency Components ◽

Fully Connected

Early detection of arrhythmia and effective treatment can prevent deaths caused by cardiovascular disease (CVD). In clinical practice, the diagnosis is made by checking the electrocardiogram (ECG) beat-by-beat, but this is usually time-consuming and laborious. In this paper, we propose an automatic ECG classification method based on Continuous Wavelet Transform (CWT) and Convolutional Neural Network (CNN). CWT is used to decompose ECG signals to obtain different time-frequency components, and CNN is used to extract features from the 2D-scalogram composed of the above time-frequency components. Considering that the surrounding R peak interval (also called RR interval) is also useful for the diagnosis of arrhythmia, four RR interval features are extracted, combined with the CNN features, and fed into a fully connected layer for ECG classification. Tested on the MIT-BIH arrhythmia database, our method achieves an overall performance of 70.75%, 67.47%, 68.76%, and 98.74% for positive predictive value, sensitivity, F1-score, and accuracy, respectively. Compared with existing methods, the overall F1-score of our method is increased by 4.75–16.85%. Because our method is simple and highly accurate, it can potentially be used as a clinical auxiliary diagnostic tool.
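
The step from a 1-D signal to a 2-D scalogram can be sketched with plain NumPy as follows. The abstract does not name the mother wavelet, so the Ricker (Mexican hat) wavelet used here is an assumption, as are the function names and the choice of scales (each scale is assumed to be at least 1).

```python
import numpy as np

def ricker(points, a):
    """Ricker (Mexican hat) wavelet sampled at `points` positions, width `a`."""
    t = np.arange(points) - (points - 1) / 2.0
    norm = 2.0 / (np.sqrt(3.0 * a) * np.pi ** 0.25)
    return norm * (1.0 - (t / a) ** 2) * np.exp(-(t ** 2) / (2.0 * a ** 2))

def cwt_scalogram(signal, widths):
    """Return |CWT| as a (len(widths), len(signal)) array.

    Each row is the signal convolved with the wavelet at one scale;
    scales in `widths` are assumed to be >= 1.
    """
    signal = np.asarray(signal, dtype=float)
    out = np.empty((len(widths), len(signal)))
    for i, a in enumerate(widths):
        n = min(int(10 * a), len(signal))   # wavelet support, capped at signal length
        out[i] = np.convolve(signal, ricker(n, a), mode="same")
    return np.abs(out)
```

The resulting array can be treated as a single-channel image and passed to a CNN, which is the role the scalogram plays in the method described above.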

Image Registration Algorithm Based on Convolutional Neural Network and Local Homography Transformation

Applied Sciences ◽

10.3390/app10030732 ◽

2020 ◽

Vol 10 (3) ◽

pp. 732 ◽

Cited By ~ 1

Author(s):

Yuanwei Wang ◽

Mei Yu ◽

Gangyi Jiang ◽

Zhiyong Pan ◽

Jiqiang Lin

Keyword(s):

Neural Network ◽

Image Registration ◽

Convolutional Neural Network ◽

Estimation Algorithm ◽

Matrix Estimation ◽

Registration Algorithm ◽

Image Registration Algorithm ◽

Different Types ◽

Homography Transformation ◽

Traditional Image

To overcome the poor robustness of traditional image registration algorithms under illumination changes and the low accuracy of learning-based image homography matrix estimation algorithms, an image registration algorithm based on a convolutional neural network (CNN) and local homography transformation is proposed. Firstly, to ensure the diversity of samples, a sample and label generation method based on moving direct linear transformation (MDLT) is designed. The generated samples and labels effectively reflect the local characteristics of images and are suitable for training the CNN model, with which multiple pairs of local matching points between the two images to be registered can be calculated. Then, the local homography matrices between the two images are estimated using the MDLT, and finally the image registration is carried out. The experimental results show that the proposed image registration algorithm achieves higher accuracy than other commonly used algorithms such as the SIFT, ORB, ECC, and APAP algorithms, as well as two other learning-based algorithms, and that it is robust to different types of illumination.
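
As background for the homography estimation step, here is a minimal sketch of the plain direct linear transformation (DLT), which fits a single global homography to matched points. The moving DLT named in the abstract additionally weights correspondences by location to obtain local homographies; that weighting is not shown. Function names are hypothetical, and the sketch assumes at least four non-degenerate correspondences and a nonzero bottom-right entry of the result.

```python
import numpy as np

def estimate_homography(src, dst):
    """Plain DLT: fit H (3x3) so that dst ~ H @ src in homogeneous coordinates."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    a = np.asarray(rows, dtype=float)
    _, _, vt = np.linalg.svd(a)       # null vector = last right singular vector
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]                # fix the scale (assumes h[2, 2] != 0)

def apply_homography(h, pts):
    """Map an (N, 2) array of points through the homography h."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ h.T
    return homog[:, :2] / homog[:, 2:3]
```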

Research on the Extraction of Image Edge Information in Convolutional Neural Networks

Journal of Physics Conference Series ◽

10.1088/1742-6596/2083/3/032015 ◽

2021 ◽

Vol 2083 (3) ◽

pp. 032015

Author(s):

Guanru Zou ◽

Yulin Luo ◽

Zefeng Feng

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Input Image ◽

Edge Weight ◽

Edge Information ◽

Image Dimension ◽

Original Dataset ◽

Image Edge ◽

Accuracy Rates ◽

Object Move

The convolutional neural network is an important neural network model in deep learning and a common algorithm in computer vision problems. From the perspective of practical application scenarios, this paper studies whether padding in the convolution layers of a convolutional neural network weakens the image edge information. To eliminate the background factor, this paper selects the MNIST dataset as the research object, moves each 0-9 digit image to the specified image edge by clearing the white area pixels in the specified direction, and uses OpenCV bilinear interpolation to scale the image so that the image dimension is 28×28. A convolutional neural network is built and trained on the original dataset and the processed dataset, and the accuracy rates are 0.9892 and 0.1082, respectively. In the comparative experiment, padding cannot adequately solve the problem of weakened image edge weight. In real digit recognition scenarios, it is necessary to consider whether the core recognition area in the input image is at the edge of the image.
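
The dataset manipulation described above, pushing the digit against one image border, can be approximated as in the following sketch. It only translates the non-empty region and skips the OpenCV rescaling step mentioned in the abstract. The function name is hypothetical, and the background is assumed to be exactly zero.

```python
import numpy as np

def shift_to_edge(img, direction):
    """Translate the non-zero content of a 2-D image to the given border.

    `direction` is one of "left", "right", "up", "down". Rolling is safe
    because only empty (all-zero) rows or columns wrap around.
    """
    rows = np.flatnonzero(img.any(axis=1))
    cols = np.flatnonzero(img.any(axis=0))
    if rows.size == 0:                # blank image: nothing to move
        return img.copy()
    h, w = img.shape
    shifts = {
        "left": (0, -cols[0]),
        "right": (0, w - 1 - cols[-1]),
        "up": (-rows[0], 0),
        "down": (h - 1 - rows[-1], 0),
    }
    dr, dc = shifts[direction]
    return np.roll(img, (int(dr), int(dc)), axis=(0, 1))
```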

Sign Language Recognition Using Two-Stream Convolutional Neural Networks with Wi-Fi Signals

Applied Sciences ◽

10.3390/app10249005 ◽

2020 ◽

Vol 10 (24) ◽

pp. 9005

Author(s):

Chien-Cheng Lee ◽

Zhongjian Gao

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Sign Language ◽

Recognition Accuracy ◽

Signal Interference ◽

Stream Networks ◽

Language Recognition ◽

Sign Language Recognition ◽

Stream Network ◽

Value Decomposition

Sign language is an important way for deaf people to understand and communicate with others. Many researchers use Wi-Fi signals to recognize hand and finger gestures in a non-invasive manner. However, Wi-Fi signals usually contain signal interference, background noise, and mixed multipath noise. In this study, Wi-Fi Channel State Information (CSI) is preprocessed by singular value decomposition (SVD) to obtain the essential signals. Sign language includes the positional relationship of gestures in space and the changes of actions over time. We propose a novel dual-output two-stream convolutional neural network. It not only combines the spatial-stream network and the motion-stream network, but also effectively alleviates the backpropagation problem of the two-stream convolutional neural network (CNN) and improves its recognition accuracy. After the two stream networks are fused, an attention mechanism is applied to select the important features learned by the two-stream networks. Our method was validated on the public SignFi dataset using five-fold cross-validation. Experimental results show that SVD preprocessing can improve the performance of our dual-output two-stream network. For the home, lab, and lab + home environments, the average recognition accuracy rates are 99.13%, 96.79%, and 97.08%, respectively. Compared with other methods, our method has good performance and better generalization capability.
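
One common way to use the SVD for this kind of denoising is to keep only the leading singular components that carry most of the signal energy. The sketch below illustrates that idea; the abstract does not say how many components the authors retain, so the energy threshold and the function name are assumptions.

```python
import numpy as np

def svd_denoise(x, energy=0.95):
    """Keep the fewest leading singular components reaching `energy`.

    Returns the low-rank reconstruction and the number of components kept.
    Works for real or complex 2-D arrays (CSI is typically complex).
    """
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    total = np.sum(s ** 2)
    if total == 0:                          # all-zero input: nothing to keep
        return np.zeros_like(x), 0
    cum = np.cumsum(s ** 2) / total         # cumulative energy fraction
    k = min(int(np.searchsorted(cum, energy)) + 1, len(s))
    return (u[:, :k] * s[:k]) @ vt[:k], k
```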

Inferring Emotion Tags from Object Images Using Convolutional Neural Network

Applied Sciences ◽

10.3390/app10155333 ◽

2020 ◽

Vol 10 (15) ◽

pp. 5333

Author(s):

Anam Manzoor ◽

Waqar Ahmad ◽

Muhammad Ehatisham-ul-Haq ◽

Abdul Hannan ◽

Muhammad Asif Khan ◽

...

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Human Behavior ◽

Real Life ◽

Vital Role ◽

Object Categories ◽

Different Types ◽

Gender Based ◽

High Level ◽

Human Behavior Analysis

Emotions are a fundamental part of human behavior and can be stimulated in numerous ways. In daily life, we come across different types of objects, such as cake, crab, television, and trees, which may excite certain emotions. Likewise, object images that we see and share on different platforms are also capable of expressing or inducing human emotions. Inferring emotion tags from these object images has great significance as it can play a vital role in recommendation systems, image retrieval, human behavior analysis, and advertisement applications. The existing schemes for emotion tag perception are based on visual features, like the color and texture of an image, which are adversely affected by lighting conditions. The main objective of our proposed study is to address this problem by introducing a novel idea of inferring emotion tags from the images based on object-related features. To this end, we first created an emotion-tagged dataset from the publicly available object detection dataset (i.e., “Caltech-256”) using subjective evaluation from 212 users. Next, we used a convolutional neural network-based model to automatically extract the high-level features from object images for recognizing nine emotion categories: amusement, awe, anger, boredom, contentment, disgust, excitement, fear, and sadness. Experimental results on our emotion-tagged dataset endorse the success of our proposed idea in terms of accuracy, precision, recall, specificity, and F1-score. Overall, the proposed scheme achieved an accuracy rate of approximately 85% and 79% using top-level and bottom-level emotion tagging, respectively. We also performed a gender-based analysis for inferring emotion tags and observed that male and female subjects perceive emotions differently across object categories.

Visual Interpretation of Convolutional Neural Network Predictions in Classifying Medical Image Modalities

Diagnostics ◽

10.3390/diagnostics9020038 ◽

2019 ◽

Vol 9 (2) ◽

pp. 38 ◽

Cited By ~ 10

Author(s):

Incheol Kim ◽

Sivaramakrishnan Rajaraman ◽

Sameer Antani

Keyword(s):

Neural Network ◽

Information Retrieval ◽

Convolutional Neural Network ◽

Visual Information ◽

Medical Image ◽

Input Image ◽

Visual Interpretation ◽

Feature Maps ◽

Novel Method ◽

Series Of Experiments

Deep learning (DL) methods are increasingly being applied for developing reliable computer-aided detection (CADe), diagnosis (CADx), and information retrieval algorithms. However, challenges in interpreting and explaining the learned behavior of the DL models hinder their adoption and use in real-world systems. In this study, we propose a novel method called “Class-selective Relevance Mapping” (CRM) for localizing and visualizing discriminative regions of interest (ROI) within a medical image. Such visualizations offer improved explanation of the convolutional neural network (CNN)-based DL model predictions. We demonstrate the effectiveness of CRM in classifying medical imaging modalities toward automatically labeling them for visual information retrieval applications. The CRM is based on a linear sum of incremental mean squared errors (MSE) calculated at the output layer of the CNN model. It measures both positive and negative contributions of each spatial element in the feature maps produced from the last convolution layer leading to correct classification of an input image. A series of experiments on a “multi-modality” CNN model designed for classifying seven different types of image modalities shows that the proposed method is significantly better at detecting and localizing the discriminative ROIs than other state-of-the-art class-activation methods. Further, to visualize its effectiveness, we generate “class-specific” ROI maps by averaging the CRM scores of images in each modality class, and characterize the visual explanation through their different size, shape, and location for our multi-modality CNN model, which achieved over 98% performance on a dataset constructed from publicly available images.

Convolutional Neural Network Classification of Telematics Car Driving Data

Risks ◽

10.3390/risks7010006 ◽

2019 ◽

Vol 7 (1) ◽

pp. 6 ◽

Cited By ~ 9

Author(s):

Guangyuan Gao ◽

Mario Wüthrich

Keyword(s):

Neural Network ◽

Neural Networks ◽

Time Series ◽

Convolutional Neural Network ◽

High Frequency ◽

Location Data ◽

Neural Network Classification ◽

Feature Information ◽

Car Driving

The aim of this project is to analyze high-frequency GPS location data (second by second) of individual car drivers (and trips). We extract feature information about speeds, acceleration, deceleration, and changes of direction from this high-frequency GPS location data. Time series of this feature information allow us to appropriately allocate individual car driving trips to selected drivers using convolutional neural networks.
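
The feature extraction step can be sketched as below. This is an illustration, not the authors' code: it assumes the GPS positions have already been projected to planar coordinates in metres and are sampled once per second, and the function name is hypothetical.

```python
import numpy as np

def trip_features(x, y, dt=1.0):
    """Speed, acceleration and change of direction from planar positions.

    x, y: 1-D arrays of positions in metres, sampled every `dt` seconds.
    Returns speed (m/s, length N-1), acceleration (m/s^2, length N-2;
    negative values are decelerations) and heading change (radians,
    wrapped to [-pi, pi), length N-2).
    """
    vx = np.diff(x) / dt
    vy = np.diff(y) / dt
    speed = np.hypot(vx, vy)
    accel = np.diff(speed) / dt
    heading = np.arctan2(vy, vx)
    turn = (np.diff(heading) + np.pi) % (2 * np.pi) - np.pi
    return speed, accel, turn
```

Stacking the three series (after trimming them to a common length) gives a multi-channel time series of the kind a 1-D convolutional network can take as input.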
