Medical Image Segmentation using Squeeze-and-Expansion Transformers

Author(s):  
Shaohua Li ◽  
Xiuchao Sui ◽  
Xiangde Luo ◽  
Xinxing Xu ◽  
Yong Liu ◽  
...  

Medical image segmentation is important for computer-aided diagnosis. Good segmentation requires the model to see the big picture and fine details simultaneously, i.e., to learn image features that incorporate large context while keeping high spatial resolution. To approach this goal, the most widely used methods, U-Net and its variants, extract and fuse multi-scale features. However, the fused features still have small "effective receptive fields" with a focus on local image cues, limiting their performance. In this work, we propose Segtran, an alternative segmentation framework based on transformers, which have unlimited "effective receptive fields" even at high feature resolutions. The core of Segtran is a novel Squeeze-and-Expansion transformer: a squeezed attention block regularizes the self-attention of transformers, and an expansion block learns diversified representations. Additionally, we propose a new positional encoding scheme for transformers, imposing a continuity inductive bias for images. Experiments were performed on 2D and 3D medical image segmentation tasks: optic disc/cup segmentation in fundus images (REFUGE'20 challenge), polyp segmentation in colonoscopy images, and brain tumor segmentation in MRI scans (BraTS'19 challenge). Compared with representative existing methods, Segtran consistently achieved the highest segmentation accuracy and exhibited good cross-domain generalization capabilities.
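The squeezed-attention idea lends itself to a compact sketch. The PyTorch module below is an illustrative reconstruction from the abstract alone, in which attention is routed through a small set of learned inducing vectors rather than computed over all pairs of tokens; the class and parameter names are assumptions, not the authors' released code.

```python
# A minimal sketch of a squeezed attention block (an assumption; the published
# Segtran code may differ). Attention is routed through num_modes learned
# inducing vectors, so the full N x N attention map is never formed directly.
import torch
import torch.nn as nn

class SqueezedAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 8, num_modes: int = 4):
        super().__init__()
        # Learned inducing vectors that squeeze the attention computation.
        self.inducers = nn.Parameter(torch.randn(1, num_modes, dim))
        self.squeeze = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.expand = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, N, dim) flattened image features.
        b = x.size(0)
        ind = self.inducers.expand(b, -1, -1)
        # Squeeze: the few inducing vectors attend to all N tokens.
        summary, _ = self.squeeze(ind, x, x)
        # Expand: every token attends back to the compact summary.
        out, _ = self.expand(x, summary, summary)
        return out
```

Routing every token's attention through a handful of inducing vectors acts as the regularizer the abstract describes; dim must be divisible by num_heads.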

2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Zhuqing Yang

Medical image segmentation (IS) is a research field in image processing. Deep learning methods are used to automatically segment organs, tissues, or tumor regions in medical images, which can assist doctors in diagnosing diseases. Since most IS models based on convolutional neural networks (CNNs) are two-dimensional, they are not suitable for three-dimensional medical imaging. Conversely, three-dimensional segmentation models suffer from complex network structures and a large amount of computation. Therefore, this study introduces the self-excited compressed dilated convolution (SECDC) module on the basis of the 3D U-Net network and proposes an improved 3D U-Net model. In the SECDC module, the computational cost of the model is reduced by 1 × 1 × 1 convolutions. Combining normal convolution with dilated convolution at a dilation rate of 2 extracts multi-view features of the image. At the same time, the 3D squeeze-and-excitation (3D-SE) module automatically learns the importance of each channel. Experimental results on the BraTS2019 dataset show that the model achieves Dice coefficients of 0.87 for the whole tumor, 0.84 for the tumor core, and 0.80 for the enhancing tumor, which is the most difficult region to segment. These evaluation metrics indicate that the improved 3D U-Net model greatly reduces the amount of computation while achieving better segmentation results, and that the model is robust. This model can meet the clinical needs of brain tumor segmentation.
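The SECDC module as described admits a short sketch. The PyTorch module below is an assumption-laden reconstruction from the abstract alone (a 1 × 1 × 1 bottleneck, parallel normal and dilation-2 convolutions, and a 3D squeeze-and-excitation gate); the authors' actual wiring may differ.

```python
# A minimal sketch of the SECDC idea (layer names and exact wiring are
# assumptions, not the authors' code): a 1x1x1 bottleneck cuts FLOPs, parallel
# normal and dilation-2 convolutions capture multi-view features, and a 3D
# squeeze-and-excitation gate reweights channels.
import torch
import torch.nn as nn

class SECDC(nn.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        mid = channels // reduction
        self.bottleneck = nn.Conv3d(channels, mid, kernel_size=1)
        self.conv_normal = nn.Conv3d(mid, mid, 3, padding=1, dilation=1)
        self.conv_dilated = nn.Conv3d(mid, mid, 3, padding=2, dilation=2)
        self.fuse = nn.Conv3d(2 * mid, channels, kernel_size=1)
        # 3D squeeze-and-excitation: global pool -> two 1x1x1 convs -> gates.
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool3d(1),
            nn.Conv3d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv3d(channels // reduction, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.bottleneck(x)
        h = torch.cat([self.conv_normal(h), self.conv_dilated(h)], dim=1)
        h = self.fuse(h)
        return x + h * self.se(h)  # residual connection with SE gating
```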


Author(s):  
Danbing Zou ◽  
Qikui Zhu ◽  
Pingkun Yan

Domain adaptation aims to alleviate the problem of retraining a pre-trained model when applying it to a different domain, which would otherwise require a large amount of additional training data from the target domain. Such an objective is usually achieved by establishing connections between the source domain labels and target domain data. However, this imbalanced source-to-target one-way pass may not eliminate the domain gap, which limits the performance of the pre-trained model. In this paper, we propose an innovative Dual-Scheme Fusion Network (DSFN) for unsupervised domain adaptation. By building both source-to-target and target-to-source connections, this balanced joint information flow helps reduce the domain gap and further improve network performance. The mechanism is also applied at the inference stage, where both the original target image and the generated source-style image are segmented with the proposed joint network, and the results are fused to obtain a more robust segmentation. Extensive experiments on unsupervised cross-modality medical image segmentation are conducted on two tasks: brain tumor segmentation and cardiac structure segmentation. The experimental results show that our method achieves significant performance improvements over other state-of-the-art domain adaptation methods.
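The fused inference step can be made concrete with a short sketch. In the illustrative Python code below, the function and its two model arguments are hypothetical names standing in for the paper's trained translation and segmentation networks; the actual DSFN implementation may differ.

```python
# A sketch of the fused inference step described above (names are illustrative
# assumptions): the target image and its source-style translation are both
# segmented, and the two probability maps are averaged.
import torch

@torch.no_grad()
def fused_inference(target_img, target_to_source, segmenter):
    """target_img: (B, C, H, W); both models are assumed pre-trained."""
    source_like = target_to_source(target_img)        # translated image
    p_target = segmenter(target_img).softmax(dim=1)   # segment original
    p_source = segmenter(source_like).softmax(dim=1)  # segment translation
    probs = 0.5 * (p_target + p_source)               # fuse predictions
    return probs.argmax(dim=1)                        # final label map
```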


2019 ◽  
Author(s):  
Ali Hatamizadeh ◽  
Demetri Terzopoulos ◽  
Andriy Myronenko

Fully convolutional neural networks (CNNs) have proven to be effective at representing and classifying textural information, thus transforming image intensity into output class masks that achieve semantic image segmentation. In medical image analysis, however, expert manual segmentation often relies on the boundaries of anatomical structures of interest. We propose boundary-aware CNNs for medical image segmentation. Our networks are designed to account for organ boundary information, both by providing a special network edge branch and edge-aware loss terms, and they are trainable end-to-end. We validate their effectiveness on the task of brain tumor segmentation using the BraTS 2018 dataset. Our experiments reveal that our approach yields more accurate segmentation results, which makes it promising for more extensive application to medical image segmentation.
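As an illustration of how an edge-aware loss term can be built, the sketch below derives ground-truth boundaries from the segmentation mask with a morphological max-pool trick and supervises a hypothetical edge branch with binary cross-entropy; this is a generic construction under stated assumptions, not necessarily the paper's exact loss.

```python
# A hedged sketch of an edge-aware loss term (the paper's exact loss may
# differ): boundaries are computed as dilation minus erosion of the mask,
# and the edge branch is supervised jointly with the segmentation branch.
import torch
import torch.nn.functional as F

def mask_to_edges(mask: torch.Tensor, width: int = 3) -> torch.Tensor:
    """mask: (B, 1, H, W) binary float mask; returns a binary boundary map."""
    pad = width // 2
    dilated = F.max_pool2d(mask, width, stride=1, padding=pad)
    eroded = 1.0 - F.max_pool2d(1.0 - mask, width, stride=1, padding=pad)
    return (dilated - eroded).clamp(0, 1)  # boundary = dilation - erosion

def boundary_aware_loss(seg_logits, edge_logits, gt_mask, edge_weight=0.5):
    # Joint end-to-end objective: segmentation BCE + weighted edge BCE.
    seg_loss = F.binary_cross_entropy_with_logits(seg_logits, gt_mask)
    edge_loss = F.binary_cross_entropy_with_logits(
        edge_logits, mask_to_edges(gt_mask))
    return seg_loss + edge_weight * edge_loss
```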


Author(s):  
S. Shirly ◽  
K. Ramesh

Background: Magnetic Resonance Imaging is the most widely used modality for early diagnosis of abnormalities in human organs. Due to technical advancements in digital image processing, automatic computer-aided medical image segmentation has been widely used in medical diagnostics. Discussion: Image segmentation is an image processing technique used for extracting image features and for searching and mining medical image records for better and more accurate medical diagnostics. Commonly used segmentation techniques are threshold-based, clustering-based, edge-based, region-based, atlas-based, and artificial neural network-based image segmentation. Conclusion: This survey aims to provide insight into different 2-dimensional and 3-dimensional MRI image segmentation techniques and to facilitate understanding for readers who are new to this field. This comparative study summarizes the benefits and limitations of various segmentation techniques.
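To make the first listed technique concrete, the following minimal example performs threshold-based segmentation with Otsu's method (assuming scikit-image is available); the other families follow analogous patterns with their respective grouping criteria.

```python
# A minimal threshold-based segmentation example: Otsu's method picks the
# threshold that maximizes between-class intensity variance.
import numpy as np
from skimage.filters import threshold_otsu

def threshold_segment(image: np.ndarray) -> np.ndarray:
    """image: 2-D grayscale array; returns a binary foreground mask."""
    t = threshold_otsu(image)  # automatically chosen global threshold
    return image > t           # foreground where intensity exceeds t

# Example: a synthetic image with two intensity populations.
img = np.concatenate([np.random.normal(50, 5, (64, 64)),
                      np.random.normal(150, 5, (64, 64))], axis=0)
mask = threshold_segment(img)
```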


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Dominik Müller ◽  
Frank Kramer

Background: The increased availability and usage of modern medical imaging has induced a strong need for automatic medical image segmentation. Still, current image segmentation platforms do not provide the required functionalities for straightforward setup of medical image segmentation pipelines. Already implemented pipelines are commonly standalone software optimized for a specific public data set. Therefore, this paper introduces the open-source Python library MIScnn. Implementation: The aim of MIScnn is to provide an intuitive API allowing fast building of medical image segmentation pipelines, including data I/O, preprocessing, data augmentation, patch-wise analysis, metrics, a library of state-of-the-art deep learning models, and model utilization such as training, prediction, and fully automatic evaluation (e.g. cross-validation). Similarly, high configurability and multiple open interfaces allow full pipeline customization. Results: Running a cross-validation with MIScnn on the Kidney Tumor Segmentation Challenge 2019 data set (multi-class semantic segmentation with 300 CT scans) resulted in a powerful predictor based on the standard 3D U-Net model. Conclusions: With this experiment, we show that the MIScnn framework enables researchers to rapidly set up a complete medical image segmentation pipeline with just a few lines of code. The source code for MIScnn is available in the Git repository: https://github.com/frankkramer-lab/MIScnn.
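The "few lines of code" claim can be illustrated with a sketch along the lines of the MIScnn README. The class names, import paths, and arguments below are written from memory and should be treated as assumptions; the linked Git repository is the authoritative reference.

```python
# A sketch of a MIScnn pipeline for the KiTS19-style task described above.
# All names and paths are assumptions; consult the repository for exact usage.
from miscnn.data_loading.interfaces import NIFTI_interface
from miscnn import Data_IO, Preprocessor, Neural_Network

interface = NIFTI_interface(channels=1, classes=3)    # data I/O format
data_io = Data_IO(interface, "/path/to/kits19/data")  # dataset access
pp = Preprocessor(data_io, batch_size=2,              # patch-wise analysis
                  analysis="patchwise-crop", patch_shape=(80, 160, 160))
model = Neural_Network(preprocessor=pp)               # default 3D U-Net
sample_list = data_io.get_indiceslist()               # available samples
model.train(sample_list, epochs=50)                   # fit the predictor
```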


Sensors ◽  
2022 ◽  
Vol 22 (2) ◽  
pp. 523
Author(s):  
Kh Tohidul Islam ◽  
Sudanthi Wijewickrema ◽  
Stephen O’Leary

Multi-modal three-dimensional (3-D) image segmentation is used in many medical applications, such as disease diagnosis, treatment planning, and image-guided surgery. Although multi-modal images carry information that no single modality can offer alone, integrating such information into segmentation is a challenging task. Numerous methods have been introduced to solve the problem of multi-modal medical image segmentation in recent years. In this paper, we propose a solution for the task of brain tumor segmentation. To this end, we first introduce a method of enhancing an existing magnetic resonance imaging (MRI) dataset by generating synthetic computed tomography (CT) images. Then, we discuss a process of systematic optimization of a convolutional neural network (CNN) architecture that uses this enhanced dataset, in order to customize it for our task. Using publicly available datasets, we show that the proposed method outperforms similar existing methods.
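One simple way to integrate the synthetic CT with the original MRI is early fusion: the two registered volumes are stacked as input channels of a single CNN, as in the minimal sketch below. This illustrates the general idea rather than the authors' optimized architecture.

```python
# A minimal early-fusion sketch (not the authors' architecture): real MRI and
# synthetic CT volumes are concatenated along the channel axis so one network
# can draw on both modalities.
import torch
import torch.nn as nn

class TwoModalityNet(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(2, 16, 3, padding=1),  # 2 input channels: MRI + CT
            nn.ReLU(inplace=True),
            nn.Conv3d(16, num_classes, 1),   # per-voxel class logits
        )

    def forward(self, mri: torch.Tensor, synth_ct: torch.Tensor):
        # mri, synth_ct: (B, 1, D, H, W) volumes registered to the same grid.
        x = torch.cat([mri, synth_ct], dim=1)
        return self.net(x)
```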


2021 ◽  
Author(s):  
Nabila Abraham

Convolutional neural networks have been asserted to be fast and precise frameworks with great potential in image segmentation. Within the medical domain, image segmentation is a precursor to several applications, including surgical simulation, treatment planning, and patient prognosis. In this thesis, we attempt to address two major limitations of current segmentation practice: 1) dealing with unbalanced classes and 2) dealing with multiple modalities. In medical imaging, unbalanced classes arise because the regions of interest are typically much smaller in volume than the background or other classes. We propose an improvement to the current gold-standard cost function to boost the network's focus on the smaller classes. Another problem within medical imaging is the variation in both anatomy and pathology across patients. Utilizing multiple imaging modalities provides complementary, segmentation-specific information and is commonly employed by radiologists when contouring data. We propose an image fusion strategy for multi-modal data that uses the variation in modality-specific features to guide task-specific learning. Together, our contributions form a framework that maximizes the representational power of the dataset using models with lower complexity and higher generalizability. Our contributions outperform baseline models for multi-class segmentation and are modular enough to be scaled up to deeper networks. We demonstrate the effectiveness of the proposed cost function and multimodal framework, both individually and together, on benchmark datasets including the Breast Ultrasound Dataset B (BUS) [1], the International Skin Imaging Collaboration (ISIC 2018) [2], [3], and the Brain Tumor Segmentation Challenge (BraTS 2018) [4]. In all experiments, the proposed methods match or outperform the baseline methods while employing simpler networks.
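One common way to realize the described reweighting, sketched below as an assumption rather than the thesis's exact cost function, is a Dice loss whose per-class terms are weighted so that small foreground regions dominate the gradient.

```python
# A class-weighted Dice loss sketch for unbalanced segmentation (illustrative;
# not necessarily the thesis's cost function). Rare classes get larger weights
# so the optimizer does not ignore small regions of interest.
import torch

def weighted_dice_loss(probs, target_onehot, class_weights, eps=1e-6):
    """probs, target_onehot: (B, C, ...); class_weights: (C,)."""
    dims = tuple(range(2, probs.dim()))           # sum over spatial dims
    inter = (probs * target_onehot).sum(dims)
    denom = probs.sum(dims) + target_onehot.sum(dims)
    dice = (2 * inter + eps) / (denom + eps)      # per-sample, per-class Dice
    w = class_weights / class_weights.sum()       # normalized class weights
    return (w * (1.0 - dice)).sum(dim=1).mean()   # upweight rare classes
```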



Author(s):  
Mohammed A.-M. Salem ◽  
Alaa Atef ◽  
Alaa Salah ◽  
Marwa Shams

This chapter presents a survey of techniques for medical image segmentation. Image segmentation methods are grouped into three categories based on the image features used by each method. The advantages and disadvantages of existing methods are evaluated, and the motivations for developing new techniques with respect to the addressed problems are given. Digital images and digital videos are pictures and films, respectively, that have been converted into a computer-readable binary format consisting of logical zeros and ones. An image is a still picture that does not change in time, whereas a video evolves in time and generally contains moving and/or changing objects. An important feature of digital images is that they are multidimensional signals, i.e., they are functions of more than a single variable. In the classical study of digital signal processing, signals are usually one-dimensional functions of time. Images, however, are functions of two, or perhaps three, dimensions in the case of colored images, whereas a digital video as a function includes a third (or fourth) time dimension as well. A consequence of this is that digital image processing is data-intensive, meaning that significant computational and storage resources are required.
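The dimensionality point can be illustrated directly: grayscale images, color images, and videos are arrays with two, three, and four axes, respectively.

```python
# Grayscale image, color image, and color video as multidimensional arrays.
import numpy as np

gray  = np.zeros((480, 640))              # 2-D: height x width
color = np.zeros((480, 640, 3))           # 3-D: height x width x channel
video = np.zeros((120, 480, 640, 3))      # 4-D: time x height x width x channel
print(gray.ndim, color.ndim, video.ndim)  # 2 3 4
```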

