Correlation Matters: Multi-scale Fine-Grained Contextual Information Extraction for Hepatic Tumor Segmentation

Author(s):  
Shuchao Pang ◽  
Anan Du ◽  
Zhenmei Yu ◽  
Mehmet A. Orgun
Author(s):  
Xinhai Liu ◽  
Zhizhong Han ◽  
Yu-Shen Liu ◽  
Matthias Zwicker

Exploring contextual information in local regions is important for shape understanding and analysis. Existing studies often encode the contextual information of local regions in hand-crafted or explicit ways. However, such approaches struggle to capture fine-grained contextual information, such as the correlation between different areas within a local region, which limits the discriminative ability of the learned features. To resolve this issue, we propose a novel deep learning model for 3D point clouds, named Point2Sequence, that learns 3D shape features by capturing fine-grained contextual information in a novel implicit way. Point2Sequence employs a sequence learning model for point clouds that captures these correlations by aggregating the multi-scale areas of each local region with attention. Specifically, Point2Sequence first learns a feature for each area scale in a local region. It then captures the correlations between area scales while aggregating them, using a recurrent neural network (RNN) based encoder-decoder structure in which an attention mechanism highlights the importance of the different area scales. Experimental results show that Point2Sequence achieves state-of-the-art performance in shape classification and segmentation tasks.
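To make the aggregation step concrete, the following is a minimal sketch of attention-weighted RNN aggregation over multi-scale area features, in the spirit of the encoder-decoder described above; the LSTM, the linear attention scoring, and all layer sizes are illustrative assumptions rather than the authors' exact implementation.

```python
import torch
import torch.nn as nn

class MultiScaleAttentionAggregator(nn.Module):
    """Hedged sketch: encode the sequence of area-scale features of one
    local region with an RNN, then fuse them with learned attention."""
    def __init__(self, feat_dim=128, hidden_dim=128):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.attn = nn.Linear(hidden_dim, 1)  # one importance score per area scale

    def forward(self, scale_feats):
        # scale_feats: (batch, num_scales, feat_dim), one feature per area scale
        enc_out, _ = self.encoder(scale_feats)              # (B, T, H)
        weights = torch.softmax(self.attn(enc_out), dim=1)  # (B, T, 1) scale importance
        return (weights * enc_out).sum(dim=1)               # (B, H) aggregated feature
```

A decoder (as in the full encoder-decoder of the paper) would consume this aggregated feature; the sketch stops at the attention-based fusion, which is the part the abstract emphasizes.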


2021 ◽  
Vol 12 ◽  
Author(s):  
Sarahi Rosas-Gonzalez ◽  
Taibou Birgui-Sekou ◽  
Moncef Hidane ◽  
Ilyess Zemmoura ◽  
Clovis Tauber

Accurate brain tumor segmentation is crucial for the clinical assessment, follow-up, and subsequent treatment of gliomas. While convolutional neural networks (CNNs) have become the state of the art in this task, most proposed models either use 2D architectures that ignore 3D contextual information or 3D models that require large memory capacity and extensive learning databases. In this study, an ensemble of two kinds of U-Net-like models, based on 3D and 2.5D convolutions respectively, is proposed to segment multimodal magnetic resonance images (MRI). The 3D model uses concatenated data in a modified U-Net architecture. In contrast, the 2.5D model is based on a multi-input strategy that extracts low-level features from each modality independently and on a new 2.5D Multi-View Inception block that merges features from different views of a 3D image while aggregating multi-scale features. The Asymmetric Ensemble of Asymmetric U-Nets (AE AU-Net), built on both, is designed to balance increased multi-scale and 3D contextual information extraction against low memory consumption. Experiments on the BraTS 2019 dataset show that our model improves the segmentation of the enhancing tumor sub-region. Overall, its performance is comparable with state-of-the-art results, despite requiring less training data and memory. In addition, we provide voxel-wise and structure-wise uncertainties for the segmentation results, and we establish qualitative and quantitative relationships between uncertainty and prediction errors. Dice similarity coefficients for the whole tumor, tumor core, and enhancing tumor regions on the BraTS 2019 validation dataset were 0.902, 0.815, and 0.773, respectively. We also applied our method to BraTS 2018, with corresponding Dice scores of 0.908, 0.838, and 0.800.
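As an illustration of the 2.5D multi-view idea, the sketch below applies plane-wise 3D convolutions along the axial, sagittal, and coronal views of a volume and concatenates the branch outputs; the kernel shapes and channel split are assumptions for illustration, not the paper's exact Multi-View Inception configuration.

```python
import torch
import torch.nn as nn

class MultiViewInceptionBlock(nn.Module):
    """Hedged sketch of a 2.5D multi-view block: each branch convolves
    over one anatomical plane of the 3D volume, so 3D context is gathered
    at roughly 2D cost, then the views are merged by concatenation."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        b = out_ch // 3
        self.axial    = nn.Conv3d(in_ch, b, kernel_size=(1, 3, 3), padding=(0, 1, 1))
        self.sagittal = nn.Conv3d(in_ch, b, kernel_size=(3, 1, 3), padding=(1, 0, 1))
        self.coronal  = nn.Conv3d(in_ch, out_ch - 2 * b, kernel_size=(3, 3, 1), padding=(1, 1, 0))

    def forward(self, x):  # x: (B, C, D, H, W)
        return torch.cat([self.axial(x), self.sagittal(x), self.coronal(x)], dim=1)
```

This is one plausible reading of how a 2.5D block keeps memory low relative to full (3, 3, 3) kernels while still touching all three views of the volume.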


Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3281
Author(s):  
Xu He ◽  
Yong Yin

Recently, deep learning-based techniques have shown great power in image inpainting, especially when dealing with square holes. However, they fail to generate plausible results inside the missing regions of irregular and large holes, because they lack an understanding of the relationship between the missing regions and their existing counterparts. To overcome this limitation, we combine two non-local mechanisms, a contextual attention module (CAM) and an implicit diversified Markov random fields (ID-MRF) loss, with a multi-scale architecture that uses several dense fusion blocks (DFBs) based on densely combined dilated convolutions, to guide the generative network in restoring both discontinuous and continuous large masked areas. To prevent color discrepancies and grid-like artifacts, we apply the ID-MRF loss, which improves the visual appearance by comparing the similarities of long-distance feature patches. To further capture the long-range relationships between different parts of large missing regions, we introduce the CAM. Although the CAM can create plausible results by reconstructing refined features, it depends on the initial predicted results. Hence, we employ the DFB to obtain larger and more effective receptive fields, which helps the CAM predict more precise and fine-grained information. Extensive experiments on two widely used datasets demonstrate that our proposed framework significantly outperforms state-of-the-art approaches both quantitatively and qualitatively.
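The receptive-field argument for the DFB can be illustrated with a small sketch: dilated convolutions with growing rates, densely connected so each layer sees all earlier outputs. The dilation rates, widths, and the 1x1 fusion layer here are illustrative assumptions, not the paper's exact block.

```python
import torch
import torch.nn as nn

class DenseFusionBlock(nn.Module):
    """Hedged sketch of a dense fusion block: stacked dilated convolutions
    with dense connectivity enlarge the effective receptive field, which is
    what lets the initial prediction cover large masked areas."""
    def __init__(self, channels=64, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.layers = nn.ModuleList()
        for i, d in enumerate(dilations):
            self.layers.append(nn.Sequential(
                nn.Conv2d(channels * (i + 1), channels, 3, padding=d, dilation=d),
                nn.ReLU(inplace=True)))
        self.fuse = nn.Conv2d(channels * (len(dilations) + 1), channels, 1)

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))  # dense connectivity
        return self.fuse(torch.cat(feats, dim=1))         # 1x1 fusion of all rates
```

Mixing several dilation rates in one densely connected block is also a common remedy for the gridding artifacts that a single large dilation rate tends to produce.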


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Chih-Wei Lin ◽  
Yu Hong ◽  
Jinfu Liu

Abstract Background Glioma is a malignant brain tumor; its location is complex, and it is difficult to remove surgically. To diagnose a brain tumor, doctors can precisely localize and assess the disease using medical images. However, computer-assisted diagnosis of brain tumors remains problematic, because coarse segmentation of the tumor leads to incorrect estimates of its internal grade. Methods In this paper, we propose an Aggregation-and-Attention Network for brain tumor segmentation. The proposed network takes U-Net as its backbone, aggregates multi-scale semantic information, and focuses on crucial information to perform brain tumor segmentation. To this end, we propose an enhanced down-sampling module and an up-sampling layer to compensate for information loss, and a multi-scale connection module to construct multi-receptive-field semantic fusion between the encoder and decoder. Furthermore, we design a dual-attention fusion module that can extract and enhance the spatial relationships in magnetic resonance images, and we apply a deep supervision strategy in different parts of the proposed network. Results Experimental results show that the proposed framework performs best on the BraTS2020 dataset compared with state-of-the-art networks. It surpasses all the comparison networks, and its average scores on the four evaluation indexes are 0.860, 0.885, 0.932, and 1.2325, respectively. Conclusions The proposed framework and its modules are sound and practical; they can extract and aggregate useful semantic information and enhance the ability to segment gliomas.
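One common way to realize a dual-attention fusion module is to re-weight features with parallel channel and spatial attention branches before fusing them; the sketch below follows that pattern in 2D for brevity (MRI volumes would use 3D convolutions), and its reduction ratio, kernel sizes, and fusion layer are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class DualAttentionFusion(nn.Module):
    """Hedged sketch: a channel-attention branch (squeeze-and-excite style)
    and a spatial-attention branch each gate the input feature map; the two
    gated maps are concatenated and fused with a 1x1 convolution."""
    def __init__(self, channels):
        super().__init__()
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid())
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=7, padding=3), nn.Sigmoid())
        self.fuse = nn.Conv2d(channels * 2, channels, 1)

    def forward(self, x):  # x: (B, C, H, W)
        return self.fuse(torch.cat([x * self.channel_gate(x),
                                    x * self.spatial_gate(x)], dim=1))
```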


2020 ◽  
Vol 13 (1) ◽  
pp. 60
Author(s):  
Chenjie Wang ◽  
Chengyuan Li ◽  
Jun Liu ◽  
Bin Luo ◽  
Xin Su ◽  
...  

Most scenes in practical applications are dynamic and contain moving objects, so accurately segmenting moving objects is crucial for many computer vision applications. In order to efficiently segment all the moving objects in a scene, regardless of whether an object has a predefined semantic label, we propose a two-level nested octave U-structure network with a multi-scale attention mechanism, called U2-ONet. U2-ONet takes two RGB frames, the optical flow between these frames, and the instance segmentation of the frames as inputs. Each stage of U2-ONet is filled with the newly designed octave residual U-block (ORSU block) to enhance the ability to obtain contextual information at different scales while reducing the spatial redundancy of the feature maps. In order to efficiently train this multi-scale deep network, we introduce a hierarchical training supervision strategy that calculates the loss at each level while adding a knowledge-matching loss to keep the optimization consistent. The experimental results show that the proposed U2-ONet method achieves state-of-the-art performance on several general moving object segmentation datasets.
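The hierarchical supervision strategy can be sketched as a per-level segmentation loss plus a term that pushes coarser side outputs to agree with the finest one; the choice of cross-entropy, the KL-based knowledge-matching term, and the weighting below are illustrative assumptions, since the abstract does not specify the exact losses.

```python
import torch
import torch.nn.functional as F

def hierarchical_supervision_loss(side_outputs, target, km_weight=0.1):
    """Hedged sketch of hierarchical supervision with knowledge matching.

    side_outputs: list of (B, C, H, W) logits from coarse to fine levels,
    already upsampled to the target resolution; target: (B, H, W) labels.
    """
    # Segmentation loss computed at every level of the hierarchy.
    seg_loss = sum(F.cross_entropy(out, target) for out in side_outputs)
    # Knowledge matching: coarser levels imitate the finest level's output,
    # keeping the optimization of all levels consistent.
    finest = F.log_softmax(side_outputs[-1].detach(), dim=1)
    km_loss = sum(F.kl_div(F.log_softmax(out, dim=1), finest,
                           log_target=True, reduction='batchmean')
                  for out in side_outputs[:-1])
    return seg_loss + km_weight * km_loss
```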


Sensors ◽  
2021 ◽  
Vol 21 (15) ◽  
pp. 5137
Author(s):  
Elham Eslami ◽  
Hae-Bum Yun

Automated pavement distress recognition is a key step in smart infrastructure assessment. Advances in deep learning and computer vision have improved the automated recognition of pavement distresses in road surface images. This task remains challenging due to the high variation of defects in shapes and sizes, demanding a better incorporation of contextual information into deep networks. In this paper, we show that an attention-based multi-scale convolutional neural network (A+MCNN) improves the automated classification of common distress and non-distress objects in pavement images by (i) encoding contextual information through multi-scale input tiles and (ii) employing a mid-fusion approach with an attention module for heterogeneous image contexts from different input scales. A+MCNN is trained and tested with four distress classes (crack, crack seal, patch, pothole), five non-distress classes (joint, marker, manhole cover, curbing, shoulder), and two pavement classes (asphalt, concrete). A+MCNN is compared with four deep classifiers that are widely used in transportation applications and a generic CNN classifier (as the control model). The results show that A+MCNN consistently outperforms the baselines by 1∼26% on average in terms of the F-score. A comprehensive discussion is also presented regarding how these classifiers perform differently on different road objects, which has been rarely addressed in the existing literature.
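The mid-fusion step with an attention module can be pictured as attention-weighted pooling over per-scale feature descriptors followed by a shared classifier; the sketch below is a minimal reading of that idea, and its dimensions, class count, and attention form are hypothetical, not the exact A+MCNN design.

```python
import torch
import torch.nn as nn

class AttentionMidFusion(nn.Module):
    """Hedged sketch: fuse features extracted from multi-scale input tiles
    with learned per-scale attention weights, then classify the fused
    descriptor into distress / non-distress / pavement classes."""
    def __init__(self, feat_dim=256, num_scales=3, num_classes=11):
        super().__init__()
        self.attn = nn.Linear(feat_dim, 1)            # one score per scale
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, scale_feats):
        # scale_feats: (B, num_scales, feat_dim), one descriptor per tile scale
        weights = torch.softmax(self.attn(scale_feats), dim=1)  # (B, S, 1)
        fused = (weights * scale_feats).sum(dim=1)              # weighted fusion
        return self.classifier(fused)
```

Fusing at this mid-level, rather than averaging final predictions, lets the attention module decide per image which tile scale carries the most useful context, which matches the motivation given in the abstract.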

