Automatic airway tree segmentation based on multi-scale context information

Purpose This paper aims to use fully convolutional network (FCN) to predict pixel-wise antipodal grasp affordances for unknown objects and improve the grasp detection performance through multi-scale feature fusion. Design/methodology/approach A modified FCN network is used as the backbone to extract pixel-wise features from the input image, which are further fused with multi-scale context information gathered by a three-level pyramid pooling module to make more robust predictions. Based on the proposed unify feature embedding framework, two head networks are designed to implement different grasp rotation prediction strategies (regression and classification), and their performances are evaluated and compared with a defined point metric. The regression network is further extended to predict the grasp rectangles for comparisons with previous methods and real-world robotic grasping of unknown objects. Findings The ablation study of the pyramid pooling module shows that the multi-scale information fusion significantly improves the model performance. The regression approach outperforms the classification approach based on same feature embedding framework on two data sets. The regression network achieves a state-of-the-art accuracy (up to 98.9%) and speed (4 ms per image) and high success rate (97% for household objects, 94.4% for adversarial objects and 95.3% for objects in clutter) in the unknown object grasping experiment. Originality/value A novel pixel-wise grasp affordance prediction network based on multi-scale feature fusion is proposed to improve the grasp detection performance. Two prediction approaches are formulated and compared based on the proposed framework. The proposed method achieves excellent performances on three benchmark data sets and real-world robotic grasping experiment.

Download Full-text

A hybrid multi-scale approach to automatic airway tree segmentation from CT scans

2013 IEEE 10th International Symposium on Biomedical Imaging ◽

10.1109/isbi.2013.6556772 ◽

2013 ◽

Cited By ~ 8

Author(s):

Ziyue Xu ◽

Ulas Bagci ◽

Brent Foster ◽

Daniel J. Mollura

Keyword(s):

Ct Scans ◽

Multi Scale ◽

Airway Tree

Download Full-text

Incorporating attentive multi-scale context information for image captioning

Multimedia Tools and Applications ◽

10.1007/s11042-021-11895-9 ◽

2022 ◽

Author(s):

Jeripothula Prudviraj ◽

Yenduri Sravani ◽

C. Krishna Mohan

Keyword(s):

Context Information ◽

Image Captioning ◽

Multi Scale

Download Full-text

Attention-Based Multi-Context Guiding for Few-Shot Semantic Segmentation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33018441 ◽

2019 ◽

Vol 33 ◽

pp. 8441-8448 ◽

Cited By ~ 14

Author(s):

Tao Hu ◽

Pengwan Yang ◽

Chiliang Zhang ◽

Gang Yu ◽

Yadong Mu ◽

...

Keyword(s):

Deep Learning ◽

Feature Fusion ◽

State Of The Art ◽

Semantic Segmentation ◽

Research Topic ◽

Context Information ◽

Multi Scale ◽

Support Set ◽

Segmentation Task ◽

Context Features

Few-shot learning is a nascent research topic, motivated by the fact that traditional deep learning methods require tremendous amounts of data. The scarcity of annotated data becomes even more challenging in semantic segmentation since pixellevel annotation in segmentation task is more labor-intensive to acquire. To tackle this issue, we propose an Attentionbased Multi-Context Guiding (A-MCG) network, which consists of three branches: the support branch, the query branch, the feature fusion branch. A key differentiator of A-MCG is the integration of multi-scale context features between support and query branches, enforcing a better guidance from the support set. In addition, we also adopt a spatial attention along the fusion branch to highlight context information from several scales, enhancing self-supervision in one-shot learning. To address the fusion problem in multi-shot learning, Conv-LSTM is adopted to collaboratively integrate the sequential support features to elevate the final accuracy. Our architecture obtains state-of-the-art on unseen classes in a variant of PASCAL VOC12 dataset and performs favorably against previous work with large gains of 1.1%, 1.4% measured in mIoU in the 1-shot and 5-shot setting.

Download Full-text

Multi-Scale Dense Attention Network for Stereo Matching

Electronics ◽

10.3390/electronics9111881 ◽

2020 ◽

Vol 9 (11) ◽

pp. 1881

Author(s):

Yuhui Chang ◽

Jiangtao Xu ◽

Zhiyuan Gao

Keyword(s):

Feature Extraction ◽

Stereo Matching ◽

State Of The Art ◽

Ground Truth ◽

Context Information ◽

Context Aware ◽

Feature Maps ◽

Attention Network ◽

Multi Scale ◽

Benchmark Datasets

To improve the accuracy of stereo matching, the multi-scale dense attention network (MDA-Net) is proposed. The network introduces two novel modules in the feature extraction stage to achieve better exploit of context information: dual-path upsampling (DU) block and attention-guided context-aware pyramid feature extraction (ACPFE) block. The DU block is introduced to fuse different scale feature maps. It introduces sub-pixel convolution to compensate for the loss of information caused by the traditional interpolation upsampling method. The ACPFE block is proposed to extract multi-scale context information. Pyramid atrous convolution is adopted to exploit multi-scale features and the channel-attention is used to fuse the multi-scale features. The proposed network has been evaluated on several benchmark datasets. The three-pixel-error evaluated over all ground truth pixels is 2.10% on KITTI 2015 dataset. The experiment results prove that MDA-Net achieves state-of-the-art accuracy on KITTI 2012 and 2015 datasets.

Download Full-text

Congested Crowd Counting via Adaptive Multi-Scale Context Learning

Sensors ◽

10.3390/s21113777 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3777

Author(s):

Yani Zhang ◽

Huailin Zhao ◽

Zuodong Duan ◽

Liangjun Huang ◽

Jiahao Deng ◽

...

Keyword(s):

Density Estimation ◽

State Of The Art ◽

Spatial Context ◽

Context Information ◽

Crowd Counting ◽

Multi Scale ◽

Context Learning ◽

Crowd Density Estimation ◽

Crowd Density ◽

Density Results

In this paper, we propose a novel congested crowd counting network for crowd density estimation, i.e., the Adaptive Multi-scale Context Aggregation Network (MSCANet). MSCANet efficiently leverages the spatial context information to accomplish crowd density estimation in a complicated crowd scene. To achieve this, a multi-scale context learning block, called the Multi-scale Context Aggregation module (MSCA), is proposed to first extract different scale information and then adaptively aggregate it to capture the full scale of the crowd. Employing multiple MSCAs in a cascaded manner, the MSCANet can deeply utilize the spatial context information and modulate preliminary features into more distinguishing and scale-sensitive features, which are finally applied to a 1 × 1 convolution operation to obtain the crowd density results. Extensive experiments on three challenging crowd counting benchmarks showed that our model yielded compelling performance against the other state-of-the-art methods. To thoroughly prove the generality of MSCANet, we extend our method to two relevant tasks: crowd localization and remote sensing object counting. The extension experiment results also confirmed the effectiveness of MSCANet.

Download Full-text

Multi-Scale Vehicle Detection in High-Resolution Aerial Images With Context Information

IEEE Access ◽

10.1109/access.2020.3036075 ◽

2020 ◽

Vol 8 ◽

pp. 208643-208657

Author(s):

Xianghui Li ◽

Xinde Li ◽

Hong Pan

Keyword(s):

High Resolution ◽

Vehicle Detection ◽

Aerial Images ◽

Context Information ◽

Multi Scale

Download Full-text

Global-and-Local Context Network for Semantic Segmentation of Street View Images

Sensors ◽

10.3390/s20102907 ◽

2020 ◽

Vol 20 (10) ◽

pp. 2907 ◽

Cited By ~ 4

Author(s):

Chih-Yang Lin ◽

Yi-Cheng Chiu ◽

Hui-Fuang Ng ◽

Timothy K. Shih ◽

Kuan-Hung Lin

Keyword(s):

Network Architecture ◽

Spatial Information ◽

Semantic Segmentation ◽

Local Network ◽

Local Context ◽

Context Information ◽

Test Accuracy ◽

Multi Scale ◽

Street View ◽

Global And Local

Semantic segmentation of street view images is an important step in scene understanding for autonomous vehicle systems. Recent works have made significant progress in pixel-level labeling using Fully Convolutional Network (FCN) framework and local multi-scale context information. Rich global context information is also essential in the segmentation process. However, a systematic way to utilize both global and local contextual information in a single network has not been fully investigated. In this paper, we propose a global-and-local network architecture (GLNet) which incorporates global spatial information and dense local multi-scale context information to model the relationship between objects in a scene, thus reducing segmentation errors. A channel attention module is designed to further refine the segmentation results using low-level features from the feature map. Experimental results demonstrate that our proposed GLNet achieves 80.8% test accuracy on the Cityscapes test dataset, comparing favorably with existing state-of-the-art methods.

Download Full-text

Class-Specific Anchor Based and Context-Guided Multi-Class Object Detection in High Resolution Remote Sensing Imagery with a Convolutional Neural Network

Remote Sensing ◽

10.3390/rs11030272 ◽

2019 ◽

Vol 11 (3) ◽

pp. 272 ◽

Cited By ~ 4

Author(s):

Nan Mo ◽

Li Yan ◽

Ruixi Zhu ◽

Hong Xie

Keyword(s):

Neural Network ◽

Remote Sensing ◽

High Resolution ◽

Object Detection ◽

Convolutional Neural Network ◽

Large Scale ◽

Shape Change ◽

Context Information ◽

False Detection ◽

Multi Scale

In this paper, the problem of multi-scale geospatial object detection in High Resolution Remote Sensing Images (HRRSI) is tackled. The different flight heights, shooting angles and sizes of geographic objects in the HRRSI lead to large scale variance in geographic objects. The inappropriate anchor size to propose the objects and the indiscriminative ability of features for describing the objects are the main causes of missing detection and false detection in multi-scale geographic object detection. To address these challenges, we propose a class-specific anchor based and context-guided multi-class object detection method with a convolutional neural network (CNN), which can be divided into two parts: a class-specific anchor based region proposal network (RPN) and a discriminative feature with a context information classification network. A class-specific anchor block providing better initial values for RPN is proposed to generate the anchor of the most suitable scale for each category in order to increase the recall ratio. Meanwhile, we proposed to incorporate the context information into the original convolutional feature to improve the discriminative ability of the features and increase classification accuracy. Considering the quality of samples for classification, the soft filter is proposed to select effective boxes to improve the diversity of the samples for the classifier and avoid missing or false detection to some extent. We also introduced the focal loss in order to improve the classifier in classifying the hard samples. The proposed method is tested on a benchmark dataset of ten classes to prove the superiority. The proposed method outperforms some state-of-the-art methods with a mean average precision (mAP) of 90.4% and better detects the multi-scale objects, especially when objects show a minor shape change.

Download Full-text

Automatic Stereotype Activation Is Context Dependent

Social Psychology ◽

10.1027/1864-9335/a000019 ◽

2010 ◽

Vol 41 (3) ◽

pp. 131-136 ◽

Cited By ~ 35

Author(s):

Catharina Casper ◽

Klaus Rothermund ◽

Dirk Wentura

Keyword(s):

Lexical Decision ◽

Context Information ◽

Decision Task ◽

Social Categories ◽

Priming Effects ◽

Stereotype Activation ◽

Social Stereotypes ◽

Priming Paradigm ◽

Automatic Activation ◽

Mental Schemas

Processes involving an automatic activation of stereotypes in different contexts were investigated using a priming paradigm with the lexical decision task. The names of social categories were combined with background pictures of specific situations to yield a compound prime comprising category and context information. Significant category priming effects for stereotypic attributes (e.g., Bavarians – beer) emerged for fitting contexts (e.g., in combination with a picture of a marquee) but not for nonfitting contexts (e.g., in combination with a picture of a shop). Findings indicate that social stereotypes are organized as specific mental schemas that are triggered by a combination of category and context information.

Download Full-text