UAV Image Multi-Labeling with Data-Efficient Transformers

2021 ◽  
Vol 11 (9) ◽  
pp. 3974
Author(s):  
Laila Bashmal ◽  
Yakoub Bazi ◽  
Mohamad Mahmoud Al Rahhal ◽  
Haikel Alhichri ◽  
Naif Al Ajlan

In this paper, we present an approach for the multi-label classification of remote sensing images based on data-efficient transformers. During the training phase, we generated a second view for each image from the training set using data augmentation. Then, both the image and its augmented version were reshaped into a sequence of flattened patches and fed to the transformer encoder. The latter extracts a compact feature representation from each image with the help of a self-attention mechanism, which can handle the global dependencies between different regions of the high-resolution aerial image. On top of the encoder, we mounted two classifiers, a token classifier and a distiller classifier. During training, we minimized a global loss consisting of two terms, each corresponding to one of the two classifiers. In the test phase, we took the average of the two classifiers' outputs as the final class labels. Experiments on two datasets acquired over the cities of Trento and Civezzano with a ground resolution of two centimeters demonstrated the effectiveness of the proposed model.
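The two-head training objective and test-time averaging described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the multi-label heads are assumed to use a sigmoid/binary-cross-entropy formulation, and the function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(logits, targets):
    # binary cross-entropy over multi-hot label vectors
    p = sigmoid(logits)
    return -np.mean(targets * np.log(p + 1e-9) + (1 - targets) * np.log(1 - p + 1e-9))

def global_loss(token_logits, distill_logits, targets):
    # the global loss is the sum of one term per classification head
    return bce(token_logits, targets) + bce(distill_logits, targets)

def predict(token_logits, distill_logits, threshold=0.5):
    # test phase: average the two heads' scores, then threshold per label
    scores = (sigmoid(token_logits) + sigmoid(distill_logits)) / 2.0
    return (scores >= threshold).astype(int)
```

In a real model the two logit vectors would come from the transformer's class token and distillation token, respectively.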

2019 ◽  
Vol 9 (5) ◽  
pp. 1020 ◽  
Author(s):  
Lilun Zhang ◽  
Dezhi Wang ◽  
Changchun Bao ◽  
Yongxian Wang ◽  
Kele Xu

Whale vocal calls contain valuable information and abundant characteristics that are important for the classification of whale sub-populations and related biological research. In this study, an effective data-driven approach based on pre-trained Convolutional Neural Networks (CNN) using multi-scale waveforms and time-frequency feature representations is developed to classify whale calls from a large open-source dataset recorded by sensors carried by whales. Specifically, the classification is carried out through a transfer learning approach using pre-trained state-of-the-art CNN models from the field of computer vision. 1D raw waveforms and 2D log-mel features of the whale-call data are respectively used as the input of the CNN models. For raw waveform input, windows are applied to capture multiple snapshots of a whale-call clip at different time scales, and the features from the different snapshots are stacked for classification. When using the log-mel features, the delta and delta-delta features are also calculated to produce a 3-channel feature representation for analysis. In training, a 4-fold cross-validation technique is employed to reduce overfitting, while the Mix-up technique is applied for data augmentation to further improve system performance. The results show that the proposed method improves accuracy by more than 20 percentage points for classification into 16 whale pods compared with the baseline method, which uses groups of 2D shape descriptors of spectrograms and Fisher discriminant scores on the same dataset. Moreover, classifications based on log-mel features achieve higher accuracies than those based directly on raw waveforms. A phylogeny graph is also produced to illustrate the relationships among the whale sub-populations.
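The Mix-up augmentation mentioned above blends pairs of training examples and their labels with a weight drawn from a Beta distribution. A minimal sketch, assuming one-hot label vectors and a hypothetical `mixup` helper:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    # blend two training examples and their labels with a Beta(alpha, alpha) weight
    rng = rng or np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)
    x_mix = lam * x1 + (1.0 - lam) * x2
    y_mix = lam * y1 + (1.0 - lam) * y2
    return x_mix, y_mix
```

The blended pairs are fed to the network in place of (or alongside) the original samples, which acts as a strong regularizer on small datasets.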


2021 ◽  
Vol 14 (1) ◽  
pp. 171
Author(s):  
Qingyan Wang ◽  
Meng Chen ◽  
Junping Zhang ◽  
Shouqiang Kang ◽  
Yujing Wang

Hyperspectral image (HSI) data classification often faces the problem of scarce labeled samples, which is considered one of the major challenges in the field of remote sensing. Although active deep networks have been successfully applied in semi-supervised classification tasks to address this problem, their performance inevitably hits a bottleneck due to the limitation of labeling cost. To address this issue, this paper proposes a semi-supervised classification method for hyperspectral images that builds on active deep learning. Specifically, the proposed model introduces the random multi-graph algorithm and replaces the expert annotation step in active learning with the anchor graph algorithm, which can label a considerable amount of unlabeled data precisely and automatically. In this way, a large number of pseudo-labeled samples are added to the training subsets, so the model can be fine-tuned and its generalization performance improved without extra manual labeling effort. Experiments on three standard HSIs demonstrate that the proposed model achieves better performance than other conventional methods, and it also outperforms the other studied algorithms in the case of a small training set.
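The core idea of replacing expert annotation with automatic labeling can be illustrated with a simplified confidence-based pseudo-labeling step. This is a stand-in for the paper's anchor graph algorithm, not a reproduction of it; the `pseudo_label` helper and the 0.95 threshold are hypothetical.

```python
import numpy as np

def pseudo_label(probs, threshold=0.95):
    # probs: (n_samples, n_classes) predicted class probabilities for
    # unlabeled samples. Keep only samples the model is confident about
    # and return their indices with the argmax class as a pseudo-label.
    confidence = probs.max(axis=1)
    keep = confidence >= threshold
    return np.flatnonzero(keep), probs[keep].argmax(axis=1)
```

The selected pseudo-labeled samples would then be appended to the training subset before the next fine-tuning round.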


2019 ◽  
Vol 11 (7) ◽  
pp. 830 ◽  
Author(s):  
Penghua Liu ◽  
Xiaoping Liu ◽  
Mengxi Liu ◽  
Qian Shi ◽  
Jinxing Yang ◽  
...  

The rapid development of deep learning and computer vision has introduced new opportunities and paradigms for building extraction from remote sensing images. In this paper, we propose a novel fully convolutional network (FCN) in which a spatial residual inception (SRI) module captures and aggregates multi-scale contexts for semantic understanding by successively fusing multi-level features. The proposed SRI-Net is capable of accurately detecting large buildings that might otherwise be missed, while retaining global morphological characteristics and local details. In addition, to improve computational efficiency, depthwise separable convolutions and convolution factorization are introduced to significantly decrease the number of model parameters. The proposed model is evaluated on the Inria Aerial Image Labeling Dataset and the Wuhan University (WHU) Aerial Building Dataset. The experimental results show that the proposed methods exhibit significant improvements over several state-of-the-art FCNs, including SegNet, U-Net, RefineNet, and DeepLab v3+. The proposed model shows promising potential for large-scale building detection from remote sensing images.
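The parameter saving from depthwise separable convolutions mentioned above is easy to quantify. A small sketch (bias terms ignored; function names are illustrative): a standard k×k convolution needs k·k·C_in·C_out weights, while its depthwise separable counterpart needs only k·k·C_in (depthwise) plus C_in·C_out (1×1 pointwise).

```python
def conv_params(k, c_in, c_out):
    # weights of a standard k x k convolution (bias ignored)
    return k * k * c_in * c_out

def separable_params(k, c_in, c_out):
    # depthwise k x k filter per input channel, then 1 x 1 pointwise mixing
    return k * k * c_in + c_in * c_out

# For a 3x3 layer with 256 input and 256 output channels:
print(conv_params(3, 256, 256))       # 589824
print(separable_params(3, 256, 256))  # 67840
```

For this layer the separable variant uses roughly 8.7× fewer parameters, which is where most of the efficiency gain comes from.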


2021 ◽  
Vol 13 (14) ◽  
pp. 2728
Author(s):  
Qingjie Zeng ◽  
Jie Geng ◽  
Kai Huang ◽  
Wen Jiang ◽  
Jun Guo

Few-shot classification of remote sensing images has attracted attention due to its important applications in various fields. The major challenge in few-shot remote sensing image scene classification is that only limited labeled samples can be utilized for training. This may cause the prototype feature expression to deviate, impacting classification performance. To address these issues, a prototype calibration with a feature-generating model is proposed for few-shot remote sensing image scene classification. In the proposed framework, a feature encoder with self-attention is developed to reduce the influence of irrelevant information. Then, the feature-generating module is utilized to expand the support set of the testing set based on prototypes of the training set, and prototype calibration is proposed to optimize the features of support images, which enhances the representativeness of each category's features. Experiments on the NWPU-RESISC45 and WHU-RS19 datasets demonstrate that the proposed method yields superior classification accuracies for few-shot remote sensing image scene classification.
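The prototype idea underlying this approach can be sketched in a few lines: a class prototype is the mean embedding of that class's support samples, and queries are assigned to the nearest prototype. This is the generic prototypical-network baseline, not the paper's calibrated variant; the helper names are illustrative.

```python
import numpy as np

def prototypes(support, labels, n_classes):
    # class prototype = mean embedding of that class's support samples
    return np.stack([support[labels == c].mean(axis=0) for c in range(n_classes)])

def classify(query, protos):
    # assign each query embedding to its nearest prototype (Euclidean distance)
    dists = np.linalg.norm(query[:, None, :] - protos[None, :, :], axis=-1)
    return dists.argmin(axis=1)
```

With very few support samples per class, these mean prototypes deviate from the true class centers, which is exactly the deviation the paper's feature generation and calibration steps aim to correct.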


2020 ◽  
Vol 224 (1) ◽  
pp. 191-198
Author(s):  
Xinliang Liu ◽  
Tao Ren ◽  
Hongfeng Chen ◽  
Yufeng Chen

In this paper, convolutional neural networks (CNNs) were used to distinguish between tectonic and non-tectonic seismicity. The proposed CNNs consisted of seven convolutional layers with small kernels and one fully connected layer, relying only on the waveform without manual feature extraction. For a single station, the accuracy of the model was 0.90, and the event accuracy reached 0.93. The proposed model was tested using data from January 2019 to August 2019 in China, where the event accuracy reached 0.92, showing that the proposed model can distinguish between tectonic and non-tectonic seismicity.
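The basic operation in such a waveform-based CNN is a small-kernel 1-D convolution slid over the raw signal. A minimal NumPy illustration of a single valid-mode convolution (not the authors' seven-layer architecture; the function name is illustrative):

```python
import numpy as np

def conv1d(signal, kernel):
    # valid-mode 1-D convolution: one small-kernel filter slid over the waveform
    k = len(kernel)
    return np.array([signal[i:i + k] @ kernel
                     for i in range(len(signal) - k + 1)])
```

Stacking several such layers with nonlinearities lets the network learn waveform features directly, which is what removes the need for hand-crafted descriptors.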


2020 ◽  
Vol 12 (17) ◽  
pp. 2789 ◽  
Author(s):  
Xue Shan ◽  
Pingping Liu ◽  
Guixia Gou ◽  
Qiuzhan Zhou ◽  
Zhen Wang

As satellite observation technology improves, the number of remote sensing images is increasing rapidly. Therefore, a growing number of studies are focusing on remote sensing image retrieval. However, a large number of remote sensing images considerably slows retrieval and takes up a great deal of memory. The hash method is increasingly used for rapid image retrieval because of its remarkably fast performance. At the same time, selecting samples that contain more information and greater stability to train the network has gradually become key to improving retrieval performance. Given the above considerations, we propose a deep hash remote sensing image retrieval method, called the hard probability sampling hash retrieval method (HPSH), which combines hash code learning with hard probability sampling in a deep network. Specifically, we used a probability sampling method to select training samples, and we designed a novel hash loss function to better train the network parameters and reduce the hashing accuracy loss due to quantization. Our experimental results demonstrate that HPSH yields an excellent representation compared with other state-of-the-art hash approaches. For the University of California, Merced (UCMD) dataset, HPSH+S achieved a mean average precision (mAP) of up to 90.9% with 16 hash bits, 92.2% with 24 hash bits, and 92.8% with 32 hash bits. For the Aerial Image Dataset (AID), HPSH+S achieved a mAP of up to 89.8% with 16 hash bits, 93.6% with 24 hash bits, and 95.5% with 32 hash bits. For the UCMD dataset, with the use of data augmentation, our proposed approach achieved a mAP of up to 99.6% with 32 hash bits and 99.7% with 64 hash bits.
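The speed advantage of hash-based retrieval comes from quantizing real-valued network features to binary codes and ranking by Hamming distance. A minimal sketch of that pipeline (generic, not HPSH itself; helper names are illustrative):

```python
import numpy as np

def binarize(features):
    # quantize real-valued network outputs to binary hash codes
    return (features > 0).astype(np.uint8)

def hamming(a, b):
    # number of differing bits between two codes
    return int(np.count_nonzero(a != b))

def retrieve(query_code, db_codes):
    # rank database images by Hamming distance to the query code
    dists = [hamming(query_code, code) for code in db_codes]
    return np.argsort(dists)
```

The quantization step is lossy, which is why the paper designs its hash loss to keep the binary codes close to the real-valued features they are thresholded from.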


Sensors ◽  
2021 ◽  
Vol 21 (22) ◽  
pp. 7545
Author(s):  
Md Mahibul Hasan ◽  
Zhijie Wang ◽  
Muhammad Ather Iqbal Hussain ◽  
Kaniz Fatima

Vehicle type classification plays an essential role in developing an intelligent transportation system (ITS). Building on the modern accomplishments of deep learning (DL) in image classification, we propose a transfer-learning-based model, incorporating data augmentation, for the recognition and classification of Bangladeshi native vehicle types. An extensive dataset of Bangladeshi native vehicles, encompassing 10,440 images, was developed; the images are categorized into 13 common vehicle classes in Bangladesh. The method is a residual network (ResNet-50)-based model with extra classification blocks added to improve performance, in which vehicle type features are automatically extracted and categorized. A variety of metrics was used for evaluation, including accuracy, precision, recall, and F1-score. Despite the varying physical properties of the vehicles, the proposed model achieved high accuracy. Our proposed method surpasses the existing baseline method as well as two pre-trained DL approaches, AlexNet and VGG-16. Based on these comparisons, our suggested ResNet-50 pre-trained model achieves an accuracy of 98.00% in the classification of Bangladeshi native vehicle types.
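The evaluation metrics listed above are all derived from per-class confusion counts. A small sketch of how precision, recall, and F1-score follow from true positives (tp), false positives (fp), and false negatives (fn); the helper name is illustrative:

```python
def metrics(tp, fp, fn):
    # precision = tp / predicted positives, recall = tp / actual positives,
    # F1 = harmonic mean of precision and recall
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For a 13-class problem these are typically computed per class and then macro-averaged.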


2022 ◽  
Vol 2022 ◽  
pp. 1-16
Author(s):  
Nesrine Wagaa ◽  
Hichem Kallel ◽  
Nédra Mellouli

Handwritten character recognition is a challenging research topic. Many works have been presented for recognizing the letters of different languages, but the availability of Arabic handwritten character databases is limited. Motivated by this, we propose a convolutional neural network for the classification of Arabic handwritten letters. Seven optimization algorithms are evaluated, and the best one is reported. Given the scarcity of available Arabic handwriting datasets, various data augmentation techniques are implemented to improve the robustness of the convolutional neural network model. The proposed model is further improved by using the dropout regularization method to avoid overfitting. Moreover, the optimization algorithm and data augmentation approach are carefully chosen to achieve good performance. The model has been trained on two Arabic handwritten character datasets, AHCD and Hijja. The proposed algorithm achieved high recognition accuracies of 98.48% and 91.24% on AHCD and Hijja, respectively, outperforming other state-of-the-art models.
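The dropout regularization mentioned above randomly zeroes a fraction of activations during training so the network cannot rely on any single unit. A minimal NumPy sketch of the standard "inverted" formulation (illustrative, not the paper's framework code):

```python
import numpy as np

def dropout(activations, rate=0.5, training=True, rng=None):
    # inverted dropout: zero a random fraction `rate` of units during
    # training and rescale the survivors by 1/(1-rate), so the expected
    # activation is unchanged and no rescaling is needed at test time
    if not training or rate == 0.0:
        return activations
    rng = rng or np.random.default_rng(0)
    mask = rng.random(activations.shape) >= rate
    return activations * mask / (1.0 - rate)
```

At test time the layer is simply the identity, which is what makes the inverted form convenient.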

