scholarly journals Fine-grained classification based on multi-scale pyramid convolution networks

PLoS ONE ◽  
2021 ◽  
Vol 16 (7) ◽  
pp. e0254054
Author(s):  
Gaihua Wang ◽  
Lei Cheng ◽  
Jinheng Lin ◽  
Yingying Dai ◽  
Tianlun Zhang

The large intra-class variance and small inter-class variance are the key factor affecting fine-grained image classification. Recently, some algorithms have been more accurate and efficient. However, these methods ignore the multi-scale information of the network, resulting in insufficient ability to capture subtle changes. To solve this problem, a weakly supervised fine-grained classification network based on multi-scale pyramid is proposed in this paper. It uses pyramid convolution kernel to replace ordinary convolution kernel in residual network, which can expand the receptive field of the convolution kernel and use complementary information of different scales. Meanwhile, the weakly supervised data augmentation network (WS-DAN) is used to prevent over fitting and improve the performance of the model. In addition, a new attention module, which includes spatial attention and channel attention, is introduced to pay more attention to the object part in the image. The comprehensive experiments are carried out on three public benchmarks. It shows that the proposed method can extract subtle feature and achieve classification effectively.


Diagnostics ◽  
2021 ◽  
Vol 12 (1) ◽  
pp. 16
Author(s):  
Yunfei Liu ◽  
Pu Chen ◽  
Junran Zhang ◽  
Nian Liu ◽  
Yan Liu

Due to the high incidence of acute lymphoblastic leukemia (ALL) worldwide as well as its rapid and fatal progression, timely microscopy screening of peripheral blood smears is essential for the rapid diagnosis of ALL. However, screening manually is time-consuming and tedious and may lead to missed or misdiagnosis due to subjective bias; on the other hand, artificially intelligent diagnostic algorithms are constrained by the limited sample size of the data and are prone to overfitting, resulting in limited applications. Conventional data augmentation is commonly adopted to expand the amount of training data, avoid overfitting, and improve the performance of deep models. However, in practical applications, random data augmentation, such as random image cropping or erasing, is difficult to realistically occur in specific tasks and may instead introduce tremendous background noises that modify actual distribution of data, thereby degrading model performance. In this paper, to assist in the early and accurate diagnosis of acute lymphoblastic leukemia, we present a ternary stream-driven weakly supervised data augmentation classification network (WT-DFN) to identify lymphoblasts in a fine-grained scale using microscopic images of peripheral blood smears. Concretely, for each training image, we first generate attention maps to represent the distinguishable part of the target by weakly supervised learning. Then, guided by these attention maps, we produce the other two streams via attention cropping and attention erasing to obtain the fine-grained distinctive features. The proposed WT-DFN improves the classification accuracy of the model from two aspects: (1) in the images can be seen details since cropping attention regions provide the accurate location of the object, which ensures our model looks at the object closer and discovers certain detailed features; (2) images can be seen more since erasing attention mechanism forces the model to extract more discriminative parts’ features. Validation suggests that the proposed method is capable of addressing the high intraclass variances located in lymphocyte classes, as well as the low interclass variances between lymphoblasts and other normal or reactive lymphocytes. The proposed method yields the best performance on the public dataset and the real clinical dataset among competitive methods.



Electronics ◽  
2021 ◽  
Vol 10 (11) ◽  
pp. 1269
Author(s):  
Jiabin Luo ◽  
Wentai Lei ◽  
Feifei Hou ◽  
Chenghao Wang ◽  
Qiang Ren ◽  
...  

Ground-penetrating radar (GPR), as a non-invasive instrument, has been widely used in civil engineering. In GPR B-scan images, there may exist random noise due to the influence of the environment and equipment hardware, which complicates the interpretability of the useful information. Many methods have been proposed to eliminate or suppress the random noise. However, the existing methods have an unsatisfactory denoising effect when the image is severely contaminated by random noise. This paper proposes a multi-scale convolutional autoencoder (MCAE) to denoise GPR data. At the same time, to solve the problem of training dataset insufficiency, we designed the data augmentation strategy, Wasserstein generative adversarial network (WGAN), to increase the training dataset of MCAE. Experimental results conducted on both simulated, generated, and field datasets demonstrated that the proposed scheme has promising performance for image denoising. In terms of three indexes: the peak signal-to-noise ratio (PSNR), the time cost, and the structural similarity index (SSIM), the proposed scheme can achieve better performance of random noise suppression compared with the state-of-the-art competing methods (e.g., CAE, BM3D, WNNM).



Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 437
Author(s):  
Yuya Onozuka ◽  
Ryosuke Matsumi ◽  
Motoki Shino

Detection of traversable areas is essential to navigation of autonomous personal mobility systems in unknown pedestrian environments. However, traffic rules may recommend or require driving in specified areas, such as sidewalks, in environments where roadways and sidewalks coexist. Therefore, it is necessary for such autonomous mobility systems to estimate the areas that are mechanically traversable and recommended by traffic rules and to navigate based on this estimation. In this paper, we propose a method for weakly-supervised recommended traversable area segmentation in environments with no edges using automatically labeled images based on paths selected by humans. This approach is based on the idea that a human-selected driving path more accurately reflects both mechanical traversability and human understanding of traffic rules and visual information. In addition, we propose a data augmentation method and a loss weighting method for detecting the appropriate recommended traversable area from a single human-selected path. Evaluation of the results showed that the proposed learning methods are effective for recommended traversable area detection and found that weakly-supervised semantic segmentation using human-selected path information is useful for recommended area detection in environments with no edges.



Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3281
Author(s):  
Xu He ◽  
Yong Yin

Recently, deep learning-based techniques have shown great power in image inpainting especially dealing with squared holes. However, they fail to generate plausible results inside the missing regions for irregular and large holes as there is a lack of understanding between missing regions and existing counterparts. To overcome this limitation, we combine two non-local mechanisms including a contextual attention module (CAM) and an implicit diversified Markov random fields (ID-MRF) loss with a multi-scale architecture which uses several dense fusion blocks (DFB) based on the dense combination of dilated convolution to guide the generative network to restore discontinuous and continuous large masked areas. To prevent color discrepancies and grid-like artifacts, we apply the ID-MRF loss to improve the visual appearance by comparing similarities of long-distance feature patches. To further capture the long-term relationship of different regions in large missing regions, we introduce the CAM. Although CAM has the ability to create plausible results via reconstructing refined features, it depends on initial predicted results. Hence, we employ the DFB to obtain larger and more effective receptive fields, which benefits to predict more precise and fine-grained information for CAM. Extensive experiments on two widely-used datasets demonstrate that our proposed framework significantly outperforms the state-of-the-art approaches both in quantity and quality.



2020 ◽  
Vol 10 (24) ◽  
pp. 9132
Author(s):  
Liguo Weng ◽  
Xiaodong Zhang ◽  
Junhao Qian ◽  
Min Xia ◽  
Yiqing Xu ◽  
...  

Non-intrusive load disaggregation (NILD) is of great significance to the development of smart grids. Current energy disaggregation methods extract features from sequences, and this process easily leads to a loss of load features and difficulties in detecting, resulting in a low recognition rate of low-use electrical appliances. To solve this problem, a non-intrusive sequential energy disaggregation method based on a multi-scale attention residual network is proposed. Multi-scale convolutions are used to learn features, and the attention mechanism is used to enhance the learning ability of load features. The residual learning further improves the performance of the algorithm, avoids network degradation, and improves the precision of load decomposition. The experimental results on two benchmark datasets show that the proposed algorithm has more advantages than the existing algorithms in terms of load disaggregation accuracy and judgments of the on/off state, and the attention mechanism can further improve the disaggregation accuracy of low-frequency electrical appliances.



Sign in / Sign up

Export Citation Format

Share Document