Fine-grained classification based on multi-scale pyramid convolution networks

Gaihua Wang; Lei Cheng; Jinheng Lin; Yingying Dai; Tianlun Zhang

doi:10.1371/journal.pone.0254054

Fine-grained classification based on multi-scale pyramid convolution networks

PLoS ONE ◽

10.1371/journal.pone.0254054 ◽

2021 ◽

Vol 16 (7) ◽

pp. e0254054

Author(s):

Gaihua Wang ◽

Lei Cheng ◽

Jinheng Lin ◽

Yingying Dai ◽

Tianlun Zhang

Keyword(s):

Data Augmentation ◽

Convolution Kernel ◽

Residual Network ◽

Complementary Information ◽

Fine Grained ◽

Multi Scale ◽

Key Factor ◽

Object Part ◽

Weakly Supervised ◽

Subtle Feature

Large intra-class variance and small inter-class variance are the key factors affecting fine-grained image classification. Recent algorithms have become more accurate and efficient. However, these methods ignore the multi-scale information of the network, which limits their ability to capture subtle changes. To solve this problem, a weakly supervised fine-grained classification network based on a multi-scale pyramid is proposed in this paper. It replaces the ordinary convolution kernels in a residual network with pyramid convolution kernels, which expands the receptive field and exploits complementary information at different scales. Meanwhile, the weakly supervised data augmentation network (WS-DAN) is used to prevent overfitting and improve the performance of the model. In addition, a new attention module, which includes spatial attention and channel attention, is introduced to focus on the object parts in the image. Comprehensive experiments are carried out on three public benchmarks. The results show that the proposed method can extract subtle features and classify effectively.
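The idea of replacing a single convolution kernel with a pyramid of kernels can be illustrated with a minimal sketch. It is not the authors' implementation: the kernel sizes (3, 5, 7), the box-filter weights standing in for learned weights, and the function names `conv2d_same`, `box_kernel`, and `pyramid_conv` are all assumptions made for illustration.

```python
def conv2d_same(image, kernel):
    """Zero-padded 'same' 2-D cross-correlation of a single-channel image."""
    h, w = len(image), len(image[0])
    k = len(kernel)  # square kernel with odd side length
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def box_kernel(size):
    """Hypothetical stand-in for learned weights: a normalized box filter."""
    value = 1.0 / (size * size)
    return [[value] * size for _ in range(size)]


def pyramid_conv(image, kernel_sizes=(3, 5, 7)):
    """Apply one kernel per scale and stack the results as output channels."""
    return [conv2d_same(image, box_kernel(k)) for k in kernel_sizes]


# A 7x7 image with a single bright pixel in the centre.
image = [[0.0] * 7 for _ in range(7)]
image[3][3] = 1.0
features = pyramid_conv(image)
```

Each output channel sees the same input through a different receptive field: the 3x3 branch responds only next to the bright pixel, while the 7x7 branch still responds at the image corner. Stacking the branches is one simple way to expose complementary information at several scales.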

Weakly Supervised Ternary Stream Data Augmentation Fine-Grained Classification Network for Identifying Acute Lymphoblastic Leukemia

Diagnostics ◽

10.3390/diagnostics12010016 ◽

2021 ◽

Vol 12 (1) ◽

pp. 16

Author(s):

Yunfei Liu ◽

Pu Chen ◽

Junran Zhang ◽

Nian Liu ◽

Yan Liu

Keyword(s):

Acute Lymphoblastic Leukemia ◽

Peripheral Blood ◽

Data Augmentation ◽

Lymphoblastic Leukemia ◽

The Other ◽

Actual Distribution ◽

Fine Grained ◽

Blood Smears ◽

Weakly Supervised ◽

Peripheral Blood Smears

Due to the high incidence of acute lymphoblastic leukemia (ALL) worldwide as well as its rapid and fatal progression, timely microscopy screening of peripheral blood smears is essential for the rapid diagnosis of ALL. However, manual screening is time-consuming and tedious and may lead to missed diagnoses or misdiagnoses due to subjective bias; on the other hand, artificial intelligence diagnostic algorithms are constrained by the limited sample size of the data and are prone to overfitting, resulting in limited applications. Conventional data augmentation is commonly adopted to expand the amount of training data, avoid overfitting, and improve the performance of deep models. However, in practical applications, random data augmentation, such as random image cropping or erasing, rarely corresponds to realistic variation in a specific task and may instead introduce substantial background noise that alters the actual distribution of the data, thereby degrading model performance. In this paper, to assist in the early and accurate diagnosis of acute lymphoblastic leukemia, we present a ternary stream-driven weakly supervised data augmentation classification network (WT-DFN) to identify lymphoblasts at a fine-grained scale using microscopic images of peripheral blood smears. Concretely, for each training image, we first generate attention maps to represent the distinguishable part of the target by weakly supervised learning. Then, guided by these attention maps, we produce the other two streams via attention cropping and attention erasing to obtain the fine-grained distinctive features. The proposed WT-DFN improves the classification accuracy of the model in two ways: (1) more detail can be seen in the images, since cropping the attention regions provides the accurate location of the object, which lets the model look at the object more closely and discover detailed features; (2) more of the image can be seen, since the attention-erasing mechanism forces the model to extract features from additional discriminative parts. Validation suggests that the proposed method is capable of addressing the high intraclass variance within lymphocyte classes, as well as the low interclass variance between lymphoblasts and other normal or reactive lymphocytes. The proposed method yields the best performance among competitive methods on both the public dataset and the real clinical dataset.
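Attention cropping and attention erasing can be sketched as follows. This is an illustrative sketch only, not the WT-DFN code: it assumes the attention map has the same resolution as the image, that a fixed fraction of the map's peak (`threshold_ratio`, a hypothetical parameter) separates high from low attention, and that erased pixels are set to zero.

```python
def attention_mask(attention, threshold_ratio=0.5):
    """Mark cells whose attention exceeds a fraction of the map's maximum."""
    peak = max(max(row) for row in attention)
    cutoff = peak * threshold_ratio
    return [[value > cutoff for value in row] for row in attention]


def attention_crop(image, attention, threshold_ratio=0.5):
    """Crop the image to the bounding box of the high-attention region."""
    mask = attention_mask(attention, threshold_ratio)
    rows = [i for i, row in enumerate(mask) if any(row)]
    cols = [j for j in range(len(mask[0])) if any(row[j] for row in mask)]
    if not rows:  # no cell passed the threshold: keep the whole image
        return [row[:] for row in image]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in image[top:bottom + 1]]


def attention_erase(image, attention, threshold_ratio=0.5):
    """Zero out the high-attention region so other parts must be used."""
    mask = attention_mask(attention, threshold_ratio)
    return [[0 if hit else pixel for pixel, hit in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
attention = [[0.0, 0.0, 0.0, 0.0],
             [0.0, 0.9, 0.8, 0.0],
             [0.0, 0.7, 1.0, 0.0],
             [0.0, 0.0, 0.0, 0.1]]
cropped = attention_crop(image, attention)
erased = attention_erase(image, attention)
```

The cropped stream keeps only the region the model already attends to, which would then be resized and inspected in more detail; the erased stream hides that region so that training has to rely on the remaining parts of the image.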

Weakly Supervised Learning of Object-Part Attention Model for Fine-Grained Image Classification

2018 IEEE 18th International Conference on Communication Technology (ICCT) ◽

10.1109/icct.2018.8600125 ◽

2018 ◽

Cited By ~ 1

Author(s):

Chenxi Lei ◽

Linfeng Jiang ◽

Jingshen Ji ◽

Weilin Zhong ◽

Huilin Xiong

Keyword(s):

Image Classification ◽

Supervised Learning ◽

Weakly Supervised Learning ◽

Attention Model ◽

Fine Grained ◽

Object Part ◽

Weakly Supervised

GPR B-Scan Image Denoising via Multi-Scale Convolutional Autoencoder with Data Augmentation

Electronics ◽

10.3390/electronics10111269 ◽

2021 ◽

Vol 10 (11) ◽

pp. 1269

Author(s):

Jiabin Luo ◽

Wentai Lei ◽

Feifei Hou ◽

Chenghao Wang ◽

Qiang Ren ◽

...

Keyword(s):

Image Denoising ◽

Data Augmentation ◽

Noise Suppression ◽

Random Noise ◽

Similarity Index ◽

Structural Similarity ◽

Training Dataset ◽

Generative Adversarial Network ◽

Multi Scale ◽

Convolutional Autoencoder

Ground-penetrating radar (GPR), as a non-invasive instrument, has been widely used in civil engineering. GPR B-scan images may contain random noise caused by the environment and the equipment hardware, which makes the useful information harder to interpret. Many methods have been proposed to eliminate or suppress the random noise. However, the existing methods denoise poorly when the image is severely contaminated by random noise. This paper proposes a multi-scale convolutional autoencoder (MCAE) to denoise GPR data. At the same time, to address the shortage of training data, we designed a data augmentation strategy based on a Wasserstein generative adversarial network (WGAN) to enlarge the training dataset of the MCAE. Experiments conducted on simulated, generated, and field datasets demonstrated that the proposed scheme has promising performance for image denoising. In terms of three indexes, namely the peak signal-to-noise ratio (PSNR), the time cost, and the structural similarity index (SSIM), the proposed scheme suppresses random noise better than state-of-the-art competing methods (e.g., CAE, BM3D, WNNM).
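Of the three indexes mentioned, PSNR has a compact standard definition, PSNR = 10 log10(MAX^2 / MSE), which can be sketched in a few lines. The function name `psnr`, the 8-bit peak value of 255, and the toy 2x2 images are assumptions for illustration and are not taken from the paper.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    total = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for r, e in zip(ref_row, est_row):
            total += (r - e) ** 2
            count += 1
    mse = total / count  # mean squared error over all pixels
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


clean = [[100, 100], [100, 100]]
noisy = [[110, 90], [100, 100]]
score = psnr(clean, noisy)
```

A higher PSNR means the denoised image is closer to the reference. The metric requires a clean reference, which is available for simulated data but usually not for field data.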

Multi-scale Information Extraction Residual Network for 3D Point Clouds Classification

2020 Chinese Automation Congress (CAC) ◽

10.1109/cac51589.2020.9327126 ◽

2020 ◽

Author(s):

Yumei Li ◽

Shan Meng ◽

Daoyuan Liang

Keyword(s):

Information Extraction ◽

Point Clouds ◽

Residual Network ◽

Multi Scale ◽

3D Point Clouds

Weakly-Supervised Recommended Traversable Area Segmentation Using Automatically Labeled Images for Autonomous Driving in Pedestrian Environment with No Edges

Sensors ◽

10.3390/s21020437 ◽

2021 ◽

Vol 21 (2) ◽

pp. 437

Author(s):

Yuya Onozuka ◽

Ryosuke Matsumi ◽

Motoki Shino

Keyword(s):

Visual Information ◽

Data Augmentation ◽

Semantic Segmentation ◽

Autonomous Driving ◽

Weighting Method ◽

Personal Mobility ◽

Human Understanding ◽

Autonomous Mobility ◽

Weakly Supervised ◽

Traffic Rules

Detection of traversable areas is essential to navigation of autonomous personal mobility systems in unknown pedestrian environments. However, traffic rules may recommend or require driving in specified areas, such as sidewalks, in environments where roadways and sidewalks coexist. Therefore, it is necessary for such autonomous mobility systems to estimate the areas that are mechanically traversable and recommended by traffic rules and to navigate based on this estimation. In this paper, we propose a method for weakly-supervised recommended traversable area segmentation in environments with no edges using automatically labeled images based on paths selected by humans. This approach is based on the idea that a human-selected driving path more accurately reflects both mechanical traversability and human understanding of traffic rules and visual information. In addition, we propose a data augmentation method and a loss weighting method for detecting the appropriate recommended traversable area from a single human-selected path. Evaluation of the results showed that the proposed learning methods are effective for recommended traversable area detection and found that weakly-supervised semantic segmentation using human-selected path information is useful for recommended area detection in environments with no edges.

Non-Local and Multi-Scale Mechanisms for Image Inpainting

Sensors ◽

10.3390/s21093281 ◽

2021 ◽

Vol 21 (9) ◽

pp. 3281

Author(s):

Xu He ◽

Yong Yin

Keyword(s):

Markov Random Fields ◽

Receptive Fields ◽

Image Inpainting ◽

Long Distance ◽

Visual Appearance ◽

Fine Grained ◽

Multi Scale ◽

Non Local ◽

Relationship Of

Recently, deep learning-based techniques have shown great power in image inpainting, especially when dealing with square holes. However, they fail to generate plausible results inside the missing regions for irregular and large holes, because they do not model the relationship between the missing regions and the existing content. To overcome this limitation, we combine two non-local mechanisms, a contextual attention module (CAM) and an implicit diversified Markov random fields (ID-MRF) loss, with a multi-scale architecture that uses several dense fusion blocks (DFB), based on the dense combination of dilated convolutions, to guide the generative network to restore discontinuous and continuous large masked areas. To prevent color discrepancies and grid-like artifacts, we apply the ID-MRF loss to improve the visual appearance by comparing similarities of long-distance feature patches. To further capture the long-term relationship of different regions in large missing regions, we introduce the CAM. Although the CAM can create plausible results by reconstructing refined features, it depends on the initial predicted results. Hence, we employ the DFB to obtain larger and more effective receptive fields, which helps predict more precise and fine-grained information for the CAM. Extensive experiments on two widely used datasets demonstrate that our proposed framework significantly outperforms the state-of-the-art approaches both quantitatively and qualitatively.
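The claim that dilated convolutions enlarge the receptive field can be checked with simple arithmetic. The sketch below assumes stride-1 layers applied one after another and uses hypothetical dilation rates (1, 2, 4); it does not reproduce the dense connectivity of the paper's DFB.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: d * (k - 1) + 1."""
    return dilation * (kernel_size - 1) + 1


def stacked_receptive_field(layers):
    """Receptive field of stride-1 convolutions applied in sequence.

    `layers` is a list of (kernel_size, dilation) pairs.
    """
    field = 1
    for kernel_size, dilation in layers:
        field += effective_kernel_size(kernel_size, dilation) - 1
    return field


plain = stacked_receptive_field([(3, 1), (3, 1), (3, 1)])
dilated = stacked_receptive_field([(3, 1), (3, 2), (3, 4)])
```

With the same number of 3x3 layers and the same number of weights, the dilated stack covers a 15-pixel span where the plain stack covers 7.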

SE-ECGNet: A Multi-scale Deep Residual Network with Squeeze-and-Excitation Module for ECG Signal Classification

2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) ◽

10.1109/bibm49941.2020.9313548 ◽

2020 ◽

Author(s):

Haozhen Zhang ◽

Wei Zhao ◽

Shuang Liu

Keyword(s):

Signal Classification ◽

ECG Signal ◽

Residual Network ◽

Multi Scale

Non-Intrusive Load Disaggregation Based on a Multi-Scale Attention Residual Network

Applied Sciences ◽

10.3390/app10249132 ◽

2020 ◽

Vol 10 (24) ◽

pp. 9132

Author(s):

Liguo Weng ◽

Xiaodong Zhang ◽

Junhao Qian ◽

Min Xia ◽

Yiqing Xu ◽

...

Keyword(s):

Smart Grids ◽

Recognition Rate ◽

Low Frequency ◽

Attention Mechanism ◽

Learning Ability ◽

Residual Network ◽

Multi Scale ◽

Energy Disaggregation ◽

Benchmark Datasets ◽

Load Disaggregation

Non-intrusive load disaggregation (NILD) is of great significance to the development of smart grids. Current energy disaggregation methods extract features from sequences, and this process easily loses load features and makes detection difficult, resulting in a low recognition rate for low-use electrical appliances. To solve this problem, a non-intrusive sequential energy disaggregation method based on a multi-scale attention residual network is proposed. Multi-scale convolutions are used to learn features, and the attention mechanism is used to enhance the learning ability of load features. Residual learning further improves the performance of the algorithm, avoids network degradation, and improves the precision of load decomposition. The experimental results on two benchmark datasets show that the proposed algorithm outperforms existing algorithms in terms of load disaggregation accuracy and judgment of the on/off state, and that the attention mechanism can further improve the disaggregation accuracy of low-frequency electrical appliances.
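How multi-scale convolution, attention, and a residual connection fit together on a 1-D load sequence can be shown with a toy sketch. Everything specific here is an assumption for illustration rather than the paper's architecture: the window sizes, the box filters used in place of learned kernels, and the rule that scores each branch by its mean energy before a softmax.

```python
import math


def moving_average(signal, window):
    """'Same'-length 1-D box filter with zero padding (odd window)."""
    half = window // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for offset in range(-half, half + 1):
            j = i + offset
            if 0 <= j < n:
                acc += signal[j]
        out.append(acc / window)
    return out


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multi_scale_attention_residual(signal, windows=(3, 5, 7)):
    """Fuse multi-scale branches with attention, then add the input back."""
    branches = [moving_average(signal, w) for w in windows]
    # Hypothetical scoring rule: mean energy of each branch.
    scores = [sum(v * v for v in branch) / len(branch) for branch in branches]
    weights = softmax(scores)
    fused = [sum(w * branch[i] for w, branch in zip(weights, branches))
             for i in range(len(signal))]
    output = [x + f for x, f in zip(signal, fused)]  # residual connection
    return output, weights


load = [0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0]
output, weights = multi_scale_attention_residual(load)
```

The residual addition keeps the original sequence available to later layers even when the fused branch output is small, which is the usual motivation for residual learning.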

Attribute Mix: Semantic Data Augmentation for Fine Grained Recognition

2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) ◽

10.1109/vcip49819.2020.9301763 ◽

2020 ◽

Author(s):

Hao Li ◽

Xiaopeng Zhang ◽

Qi Tian ◽

Hongkai Xiong

Keyword(s):

Data Augmentation ◽

Fine Grained ◽

Semantic Data

A Saliency-based Weakly-supervised Network for Fine-Grained Image Categorization

2020 13th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI) ◽

10.1109/cisp-bmei51763.2020.9263683 ◽

2020 ◽

Author(s):

Yawen Han ◽

Fang Meng

Keyword(s):

Image Categorization ◽

Fine Grained ◽

Weakly Supervised