Attention-based deep learning networks for identification of human gait using radar micro-Doppler spectrograms

Author(s):  
Hannah Garcia Doherty ◽  
Roberto Arnaiz Burgueño ◽  
Roeland P. Trommel ◽  
Vasileios Papanastasiou ◽  
Ronny I. A. Harmanny

Identification of human individuals within a group of 39 persons using micro-Doppler (μ-D) features has been investigated. Deep convolutional neural networks with two different training procedures have been used to perform classification. Visualization of the inner network layers revealed the sections of the input image most relevant when determining the class label of the target. A convolutional block attention module is added to provide a weighted feature vector in the channel and feature dimension, highlighting the relevant μ-D feature-filled areas in the image and improving classification performance.

Author(s):  
Mehmet Sarigul ◽  
Levent Karacan

Since the invention of cameras, video shooting has become a passion for humans. However, the quality of videos recorded with devices such as handheld cameras, head cameras, and vehicle cameras may be low due to shaking, jittering and unwanted periodic movements. Although the issue of video stabilization has been studied for decades, there is no consensus on how to measure the performance of a video stabilization method. In many studies in the literature, different metrics have been used to compare different methods. In this study, deep convolutional neural networks are used as a decision maker for video stabilization. VGG networks with different numbers of layers are used to determine the stability status of the videos. It was observed that VGG networks showed a classification performance of up to 96.537% using only two consecutive scenes. These results show that deep learning networks can be utilized as a metric for video stabilization.


Author(s):  
Bo Wang ◽  
Xiaoting Yu ◽  
Chengeng Huang ◽  
Qinghong Sheng ◽  
Yuanyuan Wang ◽  
...  

The excellent feature extraction ability of deep convolutional neural networks (DCNNs) has been demonstrated in many image processing tasks, in which image classification can achieve high accuracy with only raw input images. However, the specific image features that influence the classification results are not readily determinable, and what lies behind the predictions is unclear. This study proposes a method combining the Sobel and Canny operators and an Inception module for ship classification. The Sobel and Canny operators obtain enhanced edge features from the input images. A convolutional layer is replaced with the Inception module, which can automatically select the proper convolution kernel for ship objects in different image regions. The principle is that the high-level features abstracted by the DCNN and the features obtained by multi-convolution concatenation of the Inception module must ultimately derive from the edge information of the preprocessed input images. This indicates that the classification results are based on the input edge features, which indirectly interprets the classification results to some extent. Experimental results show that the combination of the edge features and the Inception module improves DCNN ship classification performance. The original model with the raw dataset has an average accuracy of 88.72%, while when using enhanced edge features as input, it achieves the best performance of 90.54% among all models. The model that replaces the fifth convolutional layer with the Inception module has the best performance of 89.50%. It performs close to VGG-16 on the raw dataset and is significantly better than other deep neural networks. The results validate the functionality and feasibility of the idea posited.
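The edge-enhancement step mentioned in this abstract can be illustrated with the Sobel operator alone. The following is a minimal sketch in plain Python, not the authors' preprocessing pipeline: the function name `sobel_magnitude`, the toy image, and the choice to leave border pixels at zero are all assumptions made for illustration, and the Canny stage is omitted.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical intensity gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the Sobel gradient magnitude of a 2D list of pixel intensities.

    Border pixels are left at 0.0 (an assumption; other padding schemes exist).
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = gy = 0.0
            for dr in range(3):
                for dc in range(3):
                    pixel = image[r + dr - 1][c + dc - 1]
                    gx += SOBEL_X[dr][dc] * pixel
                    gy += SOBEL_Y[dr][dc] * pixel
            out[r][c] = math.hypot(gx, gy)
    return out


# Hypothetical toy image: dark left half, bright right half (one vertical edge).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edges = sobel_magnitude(image)
```

In this toy case the interior pixels next to the vertical edge receive a large magnitude while the untouched border stays at zero, which is the kind of edge map that could be supplied to a classifier as an enhanced input.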


Symmetry ◽  
2020 ◽  
Vol 12 (3) ◽  
pp. 427 ◽  
Author(s):  
Sanxing Zhang ◽  
Zhenhuan Ma ◽  
Gang Zhang ◽  
Tao Lei ◽  
Rui Zhang ◽  
...  

Semantic image segmentation, as one of the most popular tasks in computer vision, has been widely used in autonomous driving, robotics and other fields. Currently, deep convolutional neural networks (DCNNs) are driving major advances in semantic segmentation due to their powerful feature representation. However, DCNNs extract high-level feature representations by strided convolution, which makes it impossible to segment foreground objects precisely, especially when locating object boundaries. This paper presents a novel semantic segmentation algorithm that combines DeepLab v3+ with the quick shift superpixel segmentation algorithm. DeepLab v3+ is employed to generate a class-indexed score map for the input image. Quick shift is applied to segment the input image into superpixels. The outputs of both are then fed into a class voting module to refine the semantic segmentation results. Extensive experiments on the proposed semantic image segmentation method are performed on the PASCAL VOC 2012 dataset, and the results show that the proposed method can provide a more efficient solution.
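One plausible reading of the class voting step is a per-superpixel majority vote over the predicted class indices. The sketch below illustrates that idea in plain Python. It is an assumption about how such a module could work, not the authors' implementation, and the function name `refine_by_superpixel_vote` and both toy maps are hypothetical.

```python
from collections import Counter


def refine_by_superpixel_vote(class_map, superpixel_map):
    """Relabel every pixel with the most frequent class inside its superpixel.

    class_map:      2D list of predicted class indices (e.g. from a DCNN).
    superpixel_map: 2D list of superpixel ids with the same shape.
    """
    votes = {}
    for class_row, sp_row in zip(class_map, superpixel_map):
        for cls, sp in zip(class_row, sp_row):
            votes.setdefault(sp, Counter())[cls] += 1
    # Majority class per superpixel; ties resolve to the class seen first.
    winner = {sp: counts.most_common(1)[0][0] for sp, counts in votes.items()}
    return [[winner[sp] for sp in sp_row] for sp_row in superpixel_map]


# Hypothetical network output with a few noisy pixels near the boundary.
class_map = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 2, 0],
]
# Hypothetical superpixels: id 0 covers the left half, id 1 the right half.
superpixel_map = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
refined = refine_by_superpixel_vote(class_map, superpixel_map)
```

Because superpixels tend to follow image edges, voting within each one snaps the coarse network prediction to those edges, which is the boundary refinement the abstract describes.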


2015 ◽  
Vol 2015 ◽  
pp. 1-12 ◽  
Author(s):  
Wei Hu ◽  
Yangyu Huang ◽  
Li Wei ◽  
Fan Zhang ◽  
Hengchao Li

Recently, convolutional neural networks have demonstrated excellent performance on various visual tasks, including the classification of common two-dimensional images. In this paper, deep convolutional neural networks are employed to classify hyperspectral images directly in the spectral domain. More specifically, the architecture of the proposed classifier contains five layers with weights, which are the input layer, the convolutional layer, the max pooling layer, the full connection layer, and the output layer. These five layers are implemented on each spectral signature to discriminate against others. Experimental results based on several hyperspectral image data sets demonstrate that the proposed method can achieve better classification performance than some traditional methods, such as support vector machines and the conventional deep learning-based methods.


Author(s):  
Jingzhao Hu ◽  
Hao Zhang ◽  
Yang Liu ◽  
Richard Sutcliffe ◽  
Jun Feng

In recent years, Deep Neural Networks (DNNs) have achieved excellent performance on many tasks, but it is very difficult to train good models from imbalanced datasets. Creating balanced batches either by majority data down-sampling or by minority data up-sampling can solve the problem in certain cases. However, it may lead to learning process instability and overfitting. In this paper, we propose the Batch Balance Wrapper (BBW), a novel framework which can adapt a general DNN to be well trained from extremely imbalanced datasets with few minority samples. In BBW, two extra network layers are added to the start of a DNN. The layers prevent overfitting of minority samples and improve the expressiveness of the sample distribution of minority samples. Furthermore, Batch Balance (BB), a class-based sampling algorithm, is proposed to make sure the samples in each batch are always balanced during the learning process. We test BBW on three well-known extremely imbalanced datasets with few minority samples. The maximum imbalance ratio reaches 1167:1 with only 16 positive samples. Compared with existing approaches, BBW achieves better classification performance. In addition, BBW-wrapped DNNs are 16.39 times faster, relative to unwrapped DNNs. Moreover, BBW does not require data preprocessing or additional hyper-parameter tuning, operations that may require additional processing time. The experiments prove that BBW can be applied to common applications of extremely imbalanced data with few minority samples, such as the classification of EEG signals, medical images and so on.
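The idea of class-based sampling that keeps every batch balanced can be sketched as follows. This is a generic illustration and should not be read as the authors' Batch Balance algorithm: the function name `balanced_batches`, its parameters, and the decision to sample with replacement are assumptions, and the two extra network layers of BBW are not modelled.

```python
import random
from collections import defaultdict


def balanced_batches(labels, per_class, num_batches, seed=0):
    """Yield batches of sample indices with `per_class` items from each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    for _ in range(num_batches):
        batch = []
        for label in sorted(by_class):
            # Sampling with replacement lets a very small minority class
            # still fill its share of every batch (an assumption of this sketch).
            batch.extend(rng.choices(by_class[label], k=per_class))
        rng.shuffle(batch)
        yield batch


# Hypothetical dataset: 50 majority samples (label 0), 2 minority samples (label 1).
labels = [0] * 50 + [1] * 2
batches = list(balanced_batches(labels, per_class=4, num_batches=3))
```

With only two minority samples, each of them is necessarily repeated within a batch. Repetition of this kind is one route to the overfitting that the abstract says the extra layers are meant to counter.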


2021 ◽  
Author(s):  
Johannes Janek Daniel Singer ◽  
Katja Seeliger ◽  
Tim Christian Kietzmann ◽  
Martin N Hebart

Line drawings convey meaning with just a few strokes. Despite strong simplifications, humans can recognize objects depicted in such abstracted images without effort. To what degree do deep convolutional neural networks (CNNs) mirror this human ability to generalize to abstracted object images? While CNNs trained on natural images have been shown to exhibit poor classification performance on drawings, other work has demonstrated highly similar latent representations in the networks for abstracted and natural images. Here, we address these seemingly conflicting findings by analyzing the activation patterns of a CNN trained on natural images across a set of photos, drawings and sketches of the same objects and comparing them to human behavior. We find a highly similar representational structure across levels of visual abstraction in early and intermediate layers of the network. This similarity, however, does not translate to later stages in the network, resulting in low classification performance for drawings and sketches. We identified that texture bias in CNNs contributes to the dissimilar representational structure in late layers and the poor performance on drawings. Finally, by fine-tuning late network layers with object drawings, we show that performance can be largely restored, demonstrating the general utility of features learned on natural images in early and intermediate layers for the recognition of drawings. In conclusion, generalization to abstracted images such as drawings seems to be an emergent property of CNNs trained on natural images, which is, however, suppressed by domain-related biases that arise during later processing stages in the network.


Author(s):  
Jing Wang ◽  
Xiaobin Cheng ◽  
Xun Wang ◽  
Yan Gao ◽  
Bin Liu ◽  
...  

t-distributed stochastic neighbour embedding (t-SNE) is of considerable interest in machining condition monitoring for feature selection. In this paper, neural networks are introduced to solidify the manifold of the t-SNE prior to classification. This leads to an improved feature selection method, namely the Net-SNE. Conventional statistical features are first extracted from vibration signals to form a high dimensional feature vector. The redundancies in the feature vector are subsequently removed by the t-SNE. Then the neural networks build a mapping model between the high dimensional feature vector and the selected features. The new data is calculated directly using the mapping model. The experiments were conducted on a lathe and a milling machine to collect vibration signals under common working conditions. The K-nearest neighbour classifier is applied to a small sample case and a class-imbalance case to compare the classification performance with and without the Net-SNE. The results demonstrate that the Net-SNE has an advantage over the t-SNE, since it can mine the discriminative features and solidify the manifold in the calculation of the new data. Moreover, the Net-SNE significantly improves the classification accuracy and gives better classification performance in data-limited situations.


Biomolecules ◽  
2020 ◽  
Vol 10 (7) ◽  
pp. 984 ◽  
Author(s):  
Shintaro Sukegawa ◽  
Kazumasa Yoshii ◽  
Takeshi Hara ◽  
Katsusuke Yamashita ◽  
Keisuke Nakano ◽  
...  

In this study, we used panoramic X-ray images to classify and clarify the accuracy of different dental implant brands via deep convolutional neural networks (CNNs) with transfer-learning strategies. For objective labeling, 8859 implant images of 11 implant systems were used from digital panoramic radiographs obtained from patients who underwent dental implant treatment at Kagawa Prefectural Central Hospital, Japan, between 2005 and 2019. Five deep CNN models (specifically, a basic CNN with three convolutional layers, VGG16 and VGG19 transfer-learning models, and finely tuned VGG16 and VGG19) were evaluated for implant classification. Among the five models, the finely tuned VGG16 model exhibited the highest implant classification performance. The finely tuned VGG19 was second best, followed by the normal transfer-learning VGG16. We confirmed that the finely tuned VGG16 and VGG19 CNNs could accurately classify dental implant systems from 11 types of panoramic X-ray images.


2021 ◽  
Author(s):  
Soumyya Kanti Datta ◽  
Mohammad Abuzar Shaikh ◽  
Sargur N. Srihari ◽  
Mingchen Gao

In clinical applications, neural networks must focus on and highlight the most important parts of an input image. The Soft-Attention mechanism enables a neural network to achieve this goal. This paper investigates the effectiveness of Soft-Attention in deep neural architectures. The central aim of Soft-Attention is to boost the value of important features and suppress the noise-inducing features. We compare the performance of VGG, ResNet, InceptionResNetv2 and DenseNet architectures with and without the Soft-Attention mechanism, while classifying skin lesions. The original network when coupled with Soft-Attention outperforms the baseline [14] by 4.7% while achieving a precision of 93.7% on the HAM10000 dataset. Additionally, Soft-Attention coupling improves the sensitivity score by 3.8% compared to the baseline [28] and achieves 91.6% on the ISIC-2017 dataset. The code is publicly available on GitHub.


2021 ◽  
Vol 10 ◽  
Author(s):  
Jiarong Zhou ◽  
Wenzhe Wang ◽  
Biwen Lei ◽  
Wenhao Ge ◽  
Yu Huang ◽  
...  

With the increasing daily workload of physicians, computer-aided diagnosis (CAD) systems based on deep learning play an increasingly important role in pattern recognition of diagnostic medical images. In this paper, we propose a framework based on hierarchical convolutional neural networks (CNNs) for automatic detection and classification of focal liver lesions (FLLs) in multi-phasic computed tomography (CT). A total of 616 nodules, composed of three types of malignant lesions (hepatocellular carcinoma, intrahepatic cholangiocarcinoma, and metastasis) and benign lesions (hemangioma, focal nodular hyperplasia, and cyst), were randomly divided into training and test sets at an approximate ratio of 3:1. To evaluate the performance of our model, other commonly adopted CNN models and two physicians were included for comparison. Our model achieved the best results in detecting FLLs, with an average test precision of 82.8%, recall of 93.4%, and F1-score of 87.8%. Our model initially classified FLLs into malignant and benign and then classified them into more detailed classes. For the binary and six-class classification, our model achieved average accuracy results of 82.5% and 73.4%, respectively, which were better than the other three classification neural networks. Interestingly, the classification performance of the model was placed between a junior physician and a senior physician. Overall, this preliminary study demonstrates that our proposed multi-modality and multi-scale CNN structure can locate and classify FLLs accurately in a limited dataset, and would help inexperienced physicians to reach a diagnosis in clinical practice.
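The detection figures quoted above can be sanity-checked with the standard definition of the F1-score as the harmonic mean of precision and recall. The snippet below is a small worked check rather than anything from the paper. The reported values are averages, so recomputing F1 from the averaged precision and recall is only an approximation, and the helper name `f1_score` is chosen here for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Average test precision and recall as reported in the abstract.
f1 = f1_score(0.828, 0.934)
print(f"F1 = {f1:.3f}")
```

The result rounds to 0.878, which agrees with the reported F1-score of 87.8%.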

