Spatial-Spectral Network for Hyperspectral Image Classification: A 3-D CNN and Bi-LSTM Framework

2021 ◽  
Vol 13 (12) ◽  
pp. 2353
Author(s):  
Junru Yin ◽  
Changsheng Qi ◽  
Qiqiang Chen ◽  
Jiantao Qu

Recently, deep learning methods based on the combination of spatial and spectral features have been successfully applied in hyperspectral image (HSI) classification. To improve the utilization of the spatial and spectral information from the HSI, this paper proposes a unified network framework using a three-dimensional convolutional neural network (3-D CNN) and a band grouping-based bidirectional long short-term memory (Bi-LSTM) network for HSI classification. In the framework, extracting spectral features is regarded as a procedure of processing sequence data, and the Bi-LSTM network acts as the spectral feature extractor of the unified network to fully exploit the close relationships between spectral bands. The 3-D CNN has a unique advantage in processing 3-D data; therefore, it is used as the spatial-spectral feature extractor in this unified network. Finally, in order to optimize the parameters of both feature extractors simultaneously, the Bi-LSTM and 3-D CNN share a loss function to form a unified network. The proposed framework was evaluated on three HSI classification datasets. The results demonstrate that the proposed method outperforms current state-of-the-art HSI classification methods.
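The abstract treats each pixel's spectrum as sequence data for the Bi-LSTM after band grouping. A minimal sketch of one possible grouping step follows. The function name `group_bands` and the contiguous, near-equal-width grouping are assumptions made for illustration; the abstract does not state the paper's exact grouping strategy.

```python
from typing import List, Sequence


def group_bands(spectrum: Sequence[float], n_groups: int) -> List[List[float]]:
    """Split one pixel's spectrum into n_groups contiguous band groups.

    Each group would become one time step of the sequence fed to a
    recurrent network. When the band count is not divisible by n_groups,
    the first groups receive one extra band each.
    """
    n_bands = len(spectrum)
    if not 1 <= n_groups <= n_bands:
        raise ValueError("n_groups must be between 1 and the number of bands")
    base, extra = divmod(n_bands, n_groups)
    groups: List[List[float]] = []
    start = 0
    for g in range(n_groups):
        size = base + (1 if g < extra else 0)  # spread the remainder over the first groups
        groups.append(list(spectrum[start:start + size]))
        start += size
    return groups


if __name__ == "__main__":
    pixel = [0.1 * i for i in range(10)]  # toy 10-band spectrum
    for step, group in enumerate(group_bands(pixel, 3)):
        print(step, group)
```

Every band appears in exactly one group and the original band order is preserved, so adjacent-band relationships stay visible to a sequence model.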

Forests ◽  
2021 ◽  
Vol 13 (1) ◽  
pp. 33
Author(s):  
Xueliang Wang ◽  
Honge Ren

Multi-source remote sensing data provide innovative technical support for tree species recognition. Despite noteworthy advancements in image fusion methods, recognition accuracy remains relatively poor because the features from multi-source data for each pixel in the same region cannot be deeply exploited. In the present paper, a novel deep learning approach for hyperspectral imagery is proposed to improve the accuracy of tree species classification. The proposed method, named the double branch multi-source fusion (DBMF) method, can more deeply determine the relationship between multi-source data and provide more effective information. It does this by fusing spectral features extracted from a hyperspectral image (HSI) captured by the HJ-1A satellite with spatial features extracted from a multispectral image (MSI) captured by the Sentinel-2 satellite. The network has two branches. In the spatial branch, to avoid the risk of information loss, sandglass blocks are embedded into a convolutional neural network (CNN) to extract the corresponding spatial neighborhood features from the MSI. Simultaneously, to make the transfer of useful spectral features more effective in the spectral branch, we employ bidirectional long short-term memory (Bi-LSTM) with a triple attention mechanism to extract the spectral features of each pixel in the low-resolution HSI. The feature information is fused to classify the tree species after the addition of a fusion activation function, which allows the network to obtain more interactive information. Finally, the fusion strategy allows for the prediction of the full classification map of three study areas. Experimental results on a multi-source dataset show that DBMF has a significant advantage over other state-of-the-art frameworks.


2019 ◽  
Vol 11 (16) ◽  
pp. 1954 ◽  
Author(s):  
Yangjie Sun ◽  
Zhongliang Fu ◽  
Liang Fan

Today, more and more deep learning frameworks are being applied to hyperspectral image classification tasks and have achieved great results. However, such approaches are still hampered by long training times. Traditional spectral–spatial hyperspectral image classification only utilizes spectral features at the pixel level, without considering the correlation between local spectral signatures. This article tests a novel hyperspectral image classification scheme that uses random-patches convolution and local covariance (RPCC). RPCC is a two-branch method that, on the one hand, obtains a specified number of convolution kernels from the image space through a random strategy and, on the other hand, constructs a covariance matrix between different spectral bands by clustering local neighboring pixels. In our method, the spatial features come from multi-scale and multi-level convolutional layers, and the spectral features represent the correlations between different bands. We use a support vector machine together with spectral and spatial fusion matrices to obtain classification results. In experiments, RPCC is compared with five competitive methods on three public datasets. Quantitative and qualitative evaluation indicates that the accuracy of RPCC can match or exceed that of current state-of-the-art methods.
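The spectral branch described above builds a covariance matrix between spectral bands from local neighboring pixels. The sketch below shows only that covariance computation for a neighborhood that is already given. The function name `local_band_covariance` and the unbiased (n - 1) estimator are assumptions for illustration, and the clustering step that selects the neighbors is not shown because the abstract does not specify it.

```python
from typing import List, Sequence


def local_band_covariance(pixels: Sequence[Sequence[float]]) -> List[List[float]]:
    """Covariance matrix between spectral bands over a set of neighboring pixels.

    `pixels` holds one spectrum per pixel (rows = pixels, columns = bands).
    Uses the unbiased estimator, i.e. divides by (number of pixels - 1).
    """
    n = len(pixels)
    if n < 2:
        raise ValueError("need at least two pixels to estimate a covariance")
    n_bands = len(pixels[0])
    means = [sum(p[b] for p in pixels) / n for b in range(n_bands)]
    cov = [[0.0] * n_bands for _ in range(n_bands)]
    for i in range(n_bands):
        for j in range(i, n_bands):
            s = sum((p[i] - means[i]) * (p[j] - means[j]) for p in pixels)
            cov[i][j] = cov[j][i] = s / (n - 1)  # the matrix is symmetric
    return cov


if __name__ == "__main__":
    neighborhood = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]  # 3 pixels, 2 bands
    print(local_band_covariance(neighborhood))
```

The result is a bands-by-bands matrix whose off-diagonal entries capture how pairs of bands vary together inside the neighborhood.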


Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1652 ◽  
Author(s):  
Peida Wu ◽  
Ziguan Cui ◽  
Zongliang Gan ◽  
Feng Liu

In recent years, deep learning methods have been widely used in hyperspectral image (HSI) classification tasks. Among them, spectral-spatial combined methods based on three-dimensional (3-D) convolution have shown good performance. However, because of the three-dimensional convolution, increasing network depth results in a dramatic rise in the number of parameters. In addition, previous methods do not make full use of spectral information. They mostly use the data after dimensionality reduction directly as the network input, which results in poor classification ability for some categories with small numbers of samples. To address these two issues, we design an end-to-end 3D-ResNeXt network that additionally adopts feature fusion and a label smoothing strategy. On the one hand, the residual connections and split-transform-merge strategy can alleviate the declining-accuracy phenomenon and decrease the number of parameters. We can adjust the cardinality hyperparameter instead of the network depth to extract more discriminative features of HSIs and improve the classification accuracy. On the other hand, in order to improve the classification accuracy of classes with small numbers of samples, we enrich the input of the 3D-ResNeXt spectral-spatial feature learning network with additional spectral feature learning, and finally use a loss function modified by the label smoothing strategy to address class imbalance. The experimental results on three popular HSI datasets demonstrate the superiority of our proposed network and an effective improvement in accuracy, especially for classes with small numbers of training samples.
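The abstract mentions a loss function modified by label smoothing. A minimal sketch of the standard uniform label smoothing formulation follows, in which the one-hot target is mixed with a uniform distribution over classes. The function names and the default `eps = 0.1` are assumptions for illustration, and the paper's exact loss may differ.

```python
import math
from typing import List, Sequence


def smooth_labels(target: int, n_classes: int, eps: float = 0.1) -> List[float]:
    """Mix the one-hot target with a uniform distribution over all classes."""
    return [(1.0 - eps) * (1.0 if k == target else 0.0) + eps / n_classes
            for k in range(n_classes)]


def smoothed_cross_entropy(logits: Sequence[float], target: int, eps: float = 0.1) -> float:
    """Cross-entropy between softmax(logits) and the smoothed target distribution."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]
    q = smooth_labels(target, len(logits), eps)
    return -sum(qk * lp for qk, lp in zip(q, log_probs))


if __name__ == "__main__":
    print(smooth_labels(0, 4))
    print(smoothed_cross_entropy([2.0, 0.5, -1.0, 0.0], target=0))
```

With `eps = 0` this reduces to ordinary cross-entropy; a positive `eps` keeps a small amount of target mass on every class, which discourages overconfident predictions.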


2020 ◽  
Vol 12 (12) ◽  
pp. 1964 ◽  
Author(s):  
Mengbin Rao ◽  
Ping Tang ◽  
Zheng Zhang

Hyperspectral images (HSIs) captured by different sensors often contain different numbers of bands, whereas most convolutional neural networks (CNNs) require a fixed-size input; the ability of deep CNNs to generalize across such heterogeneous inputs and achieve better classification performance has therefore become a research focus. For classification tasks with limited labeled samples, the training strategy of feeding CNNs with sample pairs instead of single samples has proven to be an efficient approach. Following this strategy, we propose a Siamese CNN with a three-dimensional (3D) adaptive spatial-spectral pyramid pooling (ASSP) layer, called ASSP-SCNN, that takes 3D sample pairs of varying size as input and can easily be transferred to another HSI dataset regardless of the number of spectral bands. The 3D ASSP layer can also extract different levels of 3D information to improve the classification performance of the equipped CNN. To evaluate the classification and generalization performance of ASSP-SCNN, our experiments consist of two parts: experiments with ASSP-SCNN without pre-training and experiments with an ASSP-SCNN-based transfer learning framework. Experimental results on three HSI datasets demonstrate that both ASSP-SCNN without pre-training and transfer learning based on ASSP-SCNN achieve higher classification accuracies than several state-of-the-art CNN-based methods. Moreover, we compare the performance of ASSP-SCNN on different transfer learning tasks, which further verifies that ASSP-SCNN has a strong generalization capability.
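The key property claimed for the ASSP layer is a fixed-size output regardless of the number of spectral bands. The sketch below illustrates that idea in a simplified one-dimensional, spectral-only form. The function names, the use of max pooling, and the pyramid levels `(1, 2, 4)` are assumptions for illustration; the layer in the paper is three-dimensional and its exact configuration is not given in the abstract.

```python
from typing import List, Sequence


def adaptive_max_pool_1d(values: Sequence[float], n_bins: int) -> List[float]:
    """Max-pool a sequence of any length >= 1 into exactly n_bins outputs."""
    n = len(values)
    if n == 0 or n_bins < 1:
        raise ValueError("need a non-empty sequence and at least one bin")
    out = []
    for i in range(n_bins):
        start = (i * n) // n_bins            # floor(i * n / n_bins)
        end = -((-(i + 1) * n) // n_bins)    # ceil((i + 1) * n / n_bins)
        out.append(max(values[start:end]))   # bins are never empty
    return out


def pyramid_pool_1d(values: Sequence[float], levels: Sequence[int] = (1, 2, 4)) -> List[float]:
    """Concatenate adaptive pooling results at several bin counts.

    The output length is sum(levels), independent of len(values).
    """
    pooled: List[float] = []
    for n_bins in levels:
        pooled.extend(adaptive_max_pool_1d(values, n_bins))
    return pooled


if __name__ == "__main__":
    for n_bands in (103, 200):  # two inputs with different band counts
        spectrum = [float(b % 7) for b in range(n_bands)]
        print(n_bands, len(pyramid_pool_1d(spectrum)))
```

Because the output length depends only on the pyramid levels, layers placed after the pooling step can keep the same shape when the network is moved to a dataset with a different band count.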


Sensors ◽  
2020 ◽  
Vol 20 (18) ◽  
pp. 5191
Author(s):  
Jin Zhang ◽  
Fengyuan Wei ◽  
Fan Feng ◽  
Chunyang Wang

Convolutional neural networks provide an ideal solution for hyperspectral image (HSI) classification. However, the classification results are not satisfactory when only limited training samples are available. Focusing on “small sample” hyperspectral classification, we propose a novel 3D-2D convolutional neural network (CNN) model named AD-HybridSN (Attention-Dense-HybridSN). In the proposed model, a dense block is used to reuse shallow features, with the aim of better exploiting hierarchical spatial–spectral features. Subsequent depthwise separable convolutional layers are used to discriminate the spatial information. Further refinement of spatial–spectral features is realized by a channel attention method and a spatial attention method, which are applied after every 3D convolutional layer and every 2D convolutional layer, respectively. Experimental results indicate that the proposed model can learn more discriminative spatial–spectral features using very little training data. On Indian Pines, Salinas, and the University of Pavia, AD-HybridSN obtains 97.02%, 99.59%, and 98.32% overall accuracy using only 5%, 1%, and 1% labeled data for training, respectively, which is far better than all the comparison models.


2021 ◽  
Vol 13 (18) ◽  
pp. 3590
Author(s):  
Tianyu Zhang ◽  
Cuiping Shi ◽  
Diling Liao ◽  
Liguo Wang

Convolutional neural networks (CNNs) have exhibited excellent performance in hyperspectral image classification. However, due to the lack of labeled hyperspectral data, it is difficult to achieve high classification accuracy of hyperspectral images with fewer training samples. In addition, although some deep learning techniques have been used in hyperspectral image classification, due to the abundant information of hyperspectral images, the problem of insufficient spatial spectral feature extraction still exists. To address the aforementioned issues, a spectral–spatial attention fusion with a deformable convolution residual network (SSAF-DCR) is proposed for hyperspectral image classification. The proposed network is composed of three parts, and each part is connected sequentially to extract features. In the first part, a dense spectral block is utilized to reuse spectral features as much as possible, and a spectral attention block that can refine and optimize the spectral features follows. In the second part, spatial features are extracted and selected by a dense spatial block and attention block, respectively. Then, the results of the first two parts are fused and sent to the third part, and deep spatial features are extracted by the DCR block. The above three parts realize the effective extraction of spectral–spatial features, and the experimental results for four commonly used hyperspectral datasets demonstrate that the proposed SSAF-DCR method is superior to some state-of-the-art methods with very few training samples.


Cybersecurity ◽  
2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Lulu Yang ◽  
Chen Li ◽  
Ruibang You ◽  
Bibo Tu ◽  
Linghui Li

Keystroke-based behavioral biometrics have been proven effective for continuous user authentication. Current state-of-the-art algorithms have achieved outstanding results on long text or on short text collected while users perform set tasks. It remains a considerable challenge to authenticate users continuously and accurately with short keystroke inputs collected in uncontrolled settings. In this work, we propose a Timely Keystroke-based method for Continuous user Authentication, named TKCA. It integrates the key name and two kinds of timing features through an embedding mechanism, and it captures the relationship between context keystrokes with a Bidirectional Long Short-Term Memory (Bi-LSTM) network. We conduct a series of experiments to validate it on a public dataset, the Clarkson II dataset, which was collected in a completely uncontrolled and natural setting. Experimental results show that the proposed TKCA achieves state-of-the-art performance, with an equal error rate (EER) of 8.28% when using only 30 keystrokes and 2.78% when using 190 keystrokes.
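The results above are reported as equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. A minimal sketch of how an EER can be approximated from two lists of match scores follows. The function name, the convention that a higher score means "more likely the genuine user", and the choice of candidate thresholds are assumptions for illustration, not the paper's evaluation protocol.

```python
from typing import Sequence, Tuple


def equal_error_rate(genuine: Sequence[float],
                     impostor: Sequence[float]) -> Tuple[float, float]:
    """Return (eer, threshold), where a score >= threshold means 'accept'.

    FAR is the fraction of impostor scores accepted and FRR is the fraction
    of genuine scores rejected. The EER is approximated at the observed score
    where |FAR - FRR| is smallest and reported as the mean of the two rates.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)
        frr = sum(s < t for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]


if __name__ == "__main__":
    genuine_scores = [0.9, 0.8, 0.7, 0.3]
    impostor_scores = [0.1, 0.2, 0.4, 0.6]
    print(equal_error_rate(genuine_scores, impostor_scores))
```

With finitely many scores the two error curves rarely cross exactly, which is why the sketch reports the midpoint at the closest threshold instead of interpolating.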


2019 ◽  
Vol 07 (01) ◽  
pp. 19-40
Author(s):  
Shakil Ahmed Sumon ◽  
Raihan Goni ◽  
Niyaz Bin Hashem ◽  
Tanzil Shahria ◽  
Rashedur M. Rahman

In this paper, we explore different strategies to determine the saliency of features from different pretrained models for detecting violence in videos. A dataset has been created that consists of violent and non-violent videos from different settings. Three ImageNet-pretrained models (VGG16, VGG19, and ResNet50) are used to extract features from the frames of the videos. In one experiment, the extracted features are fed into a fully connected network that detects violence at the frame level. In another experiment, the extracted features of 30 frames at a time are fed to a long short-term memory (LSTM) network. Furthermore, we apply attention to the features extracted from the frames through a spatial transformer network, which also enables transformations such as rotation, translation, and scaling. Along with these models, we design a custom convolutional neural network (CNN) as a feature extractor and a pretrained model that is initially trained on a movie violence dataset. In the end, the features extracted from the pretrained ResNet50 model proved to be the most salient for detecting violence. These ResNet50 features, in combination with the LSTM, provide an accuracy of 97.06%, which is better than the other models we experimented with.


2021 ◽  
Vol 13 (20) ◽  
pp. 4060
Author(s):  
Zhe Meng ◽  
Feng Zhao ◽  
Miaomiao Liang

Convolutional neural networks (CNNs) are the go-to model for hyperspectral image (HSI) classification because of their excellent local contextual modeling ability, which benefits spatial and spectral feature extraction. However, CNNs with a limited receptive field pose challenges for modeling long-range dependencies. To solve this issue, we introduce a novel classification framework that regards the input HSI as sequence data and is constructed exclusively with multilayer perceptrons (MLPs). Specifically, we propose a spectral-spatial MLP (SS-MLP) architecture, which uses matrix transposition and MLPs to achieve both spectral and spatial perception in a global receptive field, capturing long-range dependencies and extracting more discriminative spectral-spatial features. Four benchmark HSI datasets are used to evaluate the classification performance of the proposed SS-MLP. Experimental results show that our pure MLP-based architecture outperforms other state-of-the-art convolution-based models in terms of both classification performance and computational time. Compared with the SSSERN model, our approach improves average accuracy by as much as 3.03%. We believe that our experimental results will foster additional research on simple yet effective MLP-based architectures for HSI classification.
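The abstract states that SS-MLP combines matrix transposition with MLPs so that mixing can happen along both the spatial and the spectral axis. The sketch below shows only that transposition idea with a single linear map per axis. The function names, the pixels-by-bands layout, and the omission of hidden layers, nonlinearities, normalization, and skip connections are all simplifying assumptions; this is not the SS-MLP architecture itself.

```python
from typing import List

Matrix = List[List[float]]


def transpose(m: Matrix) -> Matrix:
    """Swap rows and columns."""
    return [list(row) for row in zip(*m)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product a @ b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def mix_spatial_then_spectral(x: Matrix, w_spatial: Matrix, w_spectral: Matrix) -> Matrix:
    """Apply one linear map across pixels, then one across bands.

    x has shape (n_pixels, n_bands). Transposing x puts the pixel axis last,
    so the same row-wise product that mixes bands can also mix pixels.
    """
    xt = transpose(x)             # (n_bands, n_pixels)
    xt = matmul(xt, w_spatial)    # mix information across all pixels
    x = transpose(xt)             # back to (n_pixels, n_bands)
    return matmul(x, w_spectral)  # mix information across all bands


if __name__ == "__main__":
    patch = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 pixels, 2 bands
    average_pixels = [[1.0 / 3.0] * 3 for _ in range(3)]
    keep_bands = [[1.0, 0.0], [0.0, 1.0]]
    print(mix_spatial_then_spectral(patch, average_pixels, keep_bands))
```

Each output value can depend on every pixel and every band of the input patch, which is the global receptive field the abstract contrasts with the local receptive field of convolutions.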


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Qian Haizhong

Hyperspectral image data are widely used in practice because they contain rich spectral and spatial information. Hyperspectral image classification distinguishes different categories based on their different features: the computer performs quantitative analysis on the captured image and classifies each pixel in the image. However, traditional deep learning-based hyperspectral image classification suffers from insufficient spatial-spectral feature extraction, too many network layers, and complex calculations, which lead to a large number of parameters and a complex network. For this reason, the I3D-CNN model is proposed. This method uses hyperspectral image cubes to directly extract coupled spectral-spatial features, adds depthwise separable convolution to the 3D convolution to re-extract spatial features, and reduces the parameter count and calculation time at the same time. In addition, the model removes the pooling layer to achieve fewer parameters, a smaller model, and easier training. In comparisons on the test datasets, the I3D-CNN model performs better than other deep learning-based methods. The results show that the model still exhibits strong classification performance while greatly reducing the number of learnable parameters and the complexity. The overall accuracy, average classification accuracy, and kappa coefficient are all stable above 95%.
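The parameter savings attributed to depthwise separable convolution can be illustrated with a simple weight count. The sketch below compares a standard convolution with a depthwise-plus-pointwise factorization for an arbitrary kernel shape. The function names and the example layer sizes are assumptions for illustration, and the abstract does not state the actual layer configuration of I3D-CNN.

```python
import math
from typing import Sequence


def conv_params(c_in: int, c_out: int, kernel: Sequence[int], bias: bool = True) -> int:
    """Weights in a standard convolution: every output channel sees every input channel."""
    return c_in * c_out * math.prod(kernel) + (c_out if bias else 0)


def separable_conv_params(c_in: int, c_out: int, kernel: Sequence[int], bias: bool = True) -> int:
    """Weights in a depthwise convolution followed by a 1x1 (pointwise) convolution."""
    depthwise = c_in * math.prod(kernel) + (c_in if bias else 0)  # one filter per input channel
    pointwise = c_in * c_out + (c_out if bias else 0)             # mixes channels only
    return depthwise + pointwise


if __name__ == "__main__":
    kernel = (3, 3, 3)  # hypothetical 3-D kernel
    standard = conv_params(32, 64, kernel, bias=False)
    separable = separable_conv_params(32, 64, kernel, bias=False)
    print(standard, separable, round(standard / separable, 1))
```

For this hypothetical layer the factorized form needs roughly one nineteenth of the weights, which shows why the substitution reduces model size; the gain grows with the kernel volume and the number of output channels.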

