3D Cascaded Convolutional Networks for Multi-vertebrae Segmentation

Author(s):  
Liu Xia ◽  
Liu Xiao ◽  
Gan Quan ◽  
Wang Bo

Background: Automatic vertebrae segmentation from computed tomography (CT) images is very important in clinical applications. Because of the intricate appearance and variable architecture of vertebrae across the population, similar structures in close vicinity, pathology, and the interconnection between vertebrae and ribs, proposing a 3D automatic vertebrae CT image segmentation method is a challenge. Objective: The purpose of this study was to propose an automatic multi-vertebrae segmentation method for spinal CT images. Methods: First, CLAHE-Threshold-Expansion preprocessing was applied to improve image quality and reduce the number of input voxel points. Then, a 3D coarse-segmentation fully convolutional network and a cascaded fine-segmentation convolutional neural network were used to complete multi-vertebrae segmentation and classification. Results: The results were compared with those of other methods on the same datasets. Experimental results demonstrated that the Dice similarity coefficient (DSC) of our method is 94.84%, higher than that of V-Net and 3D U-Net. Conclusion: The proposed method has clear advantages in automatically and accurately segmenting vertebra regions in CT images. Because spine CT images are easy to acquire, using our segmentation model to obtain vertebra regions, combined with subsequent 3D reconstruction and printing, was shown to be well suited to clinical treatment applications.
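The DSC used as the evaluation metric above can be computed directly from binary masks; a minimal NumPy sketch on toy volumes (not the paper's data or model):

```python
import numpy as np

def dice_similarity(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|) for binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Toy 3D volumes standing in for a predicted and a ground-truth vertebra mask
pred = np.zeros((10, 10, 10), dtype=bool)
truth = np.zeros((10, 10, 10), dtype=bool)
pred[2:8, 2:8, 2:8] = True     # 6x6x6 cube
truth[4:10, 4:10, 4:10] = True  # overlapping 6x6x6 cube
print(f"DSC = {dice_similarity(pred, truth):.4f}")
```

The small `eps` term keeps the ratio defined when both masks are empty; published DSC values are typically averaged over cases.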

2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Xiaodong Huang ◽  
Hui Zhang ◽  
Li Zhuo ◽  
Xiaoguang Li ◽  
Jing Zhang

Extracting the tongue body accurately from a digital tongue image is a challenge for automated tongue diagnosis, owing to the blurred edge of the tongue body, interference from pathological details, and the large variation in the size and shape of the tongue. In this study, an automated tongue image segmentation method using an enhanced fully convolutional network with an encoder–decoder structure is presented. In the proposed network, a deep residual network is adopted as the encoder to obtain dense feature maps, and a Receptive Field Block is assembled behind the encoder. The Receptive Field Block can capture an adequate global contextual prior owing to its multibranch convolution layers with varying kernel sizes. Moreover, the Feature Pyramid Network is used as the decoder to fuse multiscale feature maps, gathering sufficient positional information to recover a clear contour of the tongue body. Quantitative evaluation of the segmentation results on 300 tongue images from the SIPL-tongue dataset showed that the average Hausdorff Distance, average Symmetric Mean Absolute Surface Distance, average Dice Similarity Coefficient, average precision, average sensitivity, and average specificity were 11.2963, 3.4737, 97.26%, 95.66%, 98.97%, and 98.68%, respectively. The proposed method achieved the best performance compared with four other deep-learning-based segmentation methods (SegNet, FCN, PSPNet, and DeepLab v3+). Similar results were obtained on the HIT-tongue dataset. The experimental results demonstrate that the proposed method achieves accurate tongue image segmentation and meets the practical requirements of automated tongue diagnosis.


2005 ◽  
Author(s):  
Aleksandra Popovic ◽  
Martin Engelhardt ◽  
Klaus Radermacher

Methods for the segmentation of skull-infiltrating tumors in Computed Tomography (CT) images using the Insight Segmentation and Registration Toolkit ITK (www.itk.org) are presented. Pipelines of filters and algorithms from ITK are validated on the basis of different criteria: sensitivity, specificity, Dice similarity coefficient, Chi-squared, and Hausdorff distance. A method for rating segmentation results against these validation metrics is presented, together with an analysis of the importance of the different goodness measures. Results for one simulated dataset and three patients are presented.
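The validation criteria listed above are straightforward to compute from voxel masks; a hedged sketch using NumPy and SciPy's `directed_hausdorff` on toy 2D masks (a simplification, not the ITK pipelines from the paper):

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def sensitivity_specificity(pred: np.ndarray, target: np.ndarray):
    """Voxel-wise sensitivity (true-positive rate) and specificity (true-negative rate)."""
    pred, target = pred.astype(bool), target.astype(bool)
    tp = np.logical_and(pred, target).sum()
    tn = np.logical_and(~pred, ~target).sum()
    fp = np.logical_and(pred, ~target).sum()
    fn = np.logical_and(~pred, target).sum()
    return tp / (tp + fn), tn / (tn + fp)

def hausdorff(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Symmetric Hausdorff distance between the point sets of two binary masks."""
    pts_a, pts_b = np.argwhere(mask_a), np.argwhere(mask_b)
    return max(directed_hausdorff(pts_a, pts_b)[0],
               directed_hausdorff(pts_b, pts_a)[0])

seg = np.zeros((20, 20), dtype=bool)
ref = np.zeros((20, 20), dtype=bool)
seg[5:15, 5:15] = True
ref[5:15, 5:16] = True  # reference extends one column further than the segmentation
sens, spec = sensitivity_specificity(seg, ref)
```

In practice the Hausdorff distance is often taken over boundary voxels only and scaled by the voxel spacing; the sketch above uses all mask voxels in index units.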


Author(s):  
Jialiang Jiang ◽  
Yong Luo ◽  
Feng Wang ◽  
Yuchuan Fu ◽  
Hang Yu ◽  
...  

Purpose: To evaluate the accuracy and dosimetric effects of auto-segmentation of the CTV for GO in CT images based on an FCN. Methods: An FCN-8s network architecture for auto-segmentation was built based on Caffe. CT images of 121 patients with GO who had received radiotherapy at the West China Hospital of Sichuan University were randomly selected for training and testing. Two methods were used to segment the CTV of GO: treating the two-part CTV as a whole anatomical region, or considering the two parts of the CTV as two independent regions. The Dice Similarity Coefficient (DSC) and Hausdorff Distance (HD) were used as evaluation criteria. The auto-segmented contours were imported into the original treatment plan to analyze the dosimetric characteristics. Results: The similarity comparison between manual and auto-segmented contours showed an average DSC value of up to 0.83. The maximum HD value when segmenting the two parts of the CTV separately was slightly smaller than when treating the CTV with one label (8.23±2.80 vs. 9.03±2.78). The dosimetric comparison between manual and auto-segmented contours showed a significant difference (p<0.05), with a lack of dose in the auto-segmented CTV. Conclusion: Based on a deep learning architecture, the automatic segmentation model for small target areas can carry out auto-contouring tasks well. Treating separate parts of one target as different anatomical regions can help improve auto-contouring quality. The dosimetric evaluation provides different perspectives for further exploration of automatic contouring tools.


2019 ◽  
pp. 174749301989570 ◽  
Author(s):  
Kevin J Chung ◽  
Hulin Kuang ◽  
Alyssa Federico ◽  
Hyun Seok Choi ◽  
Linda Kasickova ◽  
...  

Background Manual segmentations of intracranial hemorrhage on non-contrast CT images are the gold standard in measuring hematoma growth but are prone to rater variability. Aims We demonstrate that a convex optimization-based interactive segmentation approach can accurately and reliably measure intracranial hemorrhage growth. Methods Baseline and 16-h follow-up head non-contrast CT images of 46 subjects presenting with intracranial hemorrhage were selected randomly from the ANNEXA-4 trial imaging database. Three users semi-automatically segmented intracranial hemorrhage to measure hematoma volume at each timepoint using our proposed method. Segmentation accuracy was evaluated quantitatively against manual segmentations using the Dice similarity coefficient, Pearson correlation, and Bland–Altman analysis. Intra- and inter-rater reliability of the Dice similarity coefficient, intracranial hemorrhage volumes, and volume change were assessed by the intraclass correlation coefficient and the minimum detectable change. Results Among the three users, the mean Dice similarity coefficient, Pearson correlation, and mean difference ranged from 76.79% to 79.76%, 0.970 to 0.980 (p < 0.001), and −1.5 to −0.4 ml, respectively, for all intracranial hemorrhage segmentations. Inter-rater intraclass correlation coefficients between the three users for the Dice similarity coefficient and intracranial hemorrhage volume were 0.846 and 0.962, respectively, and the corresponding minimum detectable change was 2.51 ml. The inter-rater intraclass correlation coefficient for intracranial hemorrhage volume change ranged from 0.915 to 0.958 for each user compared with manual measurements, yielding a minimum detectable change range of 2.14 to 4.26 ml. Conclusions We spatially and volumetrically validate a novel interactive segmentation method for delineating intracranial hemorrhage on head non-contrast CT images. Good spatial overlap, excellent volume correlation, and good repeatability suggest its usefulness for measuring intracranial hemorrhage volume and volume change on non-contrast CT images.
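The Bland–Altman bias and the conventional 95% limits of agreement used in such volume comparisons can be sketched as follows (with hypothetical volume measurements, not the ANNEXA-4 data):

```python
import numpy as np

def bland_altman(a, b):
    """Return the bias (mean difference) and 95% limits of agreement
    between two sets of paired measurements, e.g. semi-automatic vs.
    manual hematoma volumes."""
    diff = np.asarray(a, float) - np.asarray(b, float)
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical hematoma volumes in ml: semi-automatic vs. manual
semi = [12.1, 30.4, 8.7, 22.0]
manual = [12.5, 30.0, 9.3, 22.2]
bias, lo, hi = bland_altman(semi, manual)
```

A bias near zero with narrow limits of agreement indicates the two measurement methods can be used interchangeably; the plot itself shows each pair's difference against its mean.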


Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1546 ◽  
Author(s):  
Dariusz Kucharski ◽  
Pawel Kleczek ◽  
Joanna Jaworek-Korjakowska ◽  
Grzegorz Dyduch ◽  
Marek Gorgon

In this research, we present a semi-supervised segmentation solution using convolutional autoencoders to address segmentation tasks with a small number of ground-truth images. We evaluate the proposed deep network architecture on the detection of nests of nevus cells in histopathological images of skin specimens, an important step in dermatopathology. Diagnostic criteria based on the degree of uniformity and symmetry of border irregularities are particularly vital in dermatopathology for distinguishing between benign and malignant skin lesions. To the best of our knowledge, this is the first described method for segmenting nest regions. The novelty of our approach lies not only in the area of research but also in addressing the problem of a small ground-truth dataset. We propose an effective computer-vision-based deep learning tool that performs nest segmentation using an autoencoder architecture with two learning steps. Experimental results verified the effectiveness of the proposed approach and its ability to segment nest areas with a Dice similarity coefficient of 0.81, sensitivity of 0.76, and specificity of 0.94, which is a state-of-the-art result.


2021 ◽  
Vol 11 (15) ◽  
pp. 6975
Author(s):  
Tao Zhang ◽  
Lun He ◽  
Xudong Li ◽  
Guoqing Feng

Lipreading aims to recognize sentences being spoken by a talking face. In recent years, lipreading methods have achieved high accuracy on large datasets and made breakthrough progress. However, lipreading is still far from solved: existing methods tend to have high error rates on in-the-wild data and suffer from vanishing training gradients and slow convergence. To overcome these problems, we propose an efficient end-to-end sentence-level lipreading model that uses an encoder based on a 3D convolutional network, ResNet50, and a Temporal Convolutional Network (TCN), with a CTC objective function as the decoder. More importantly, the proposed architecture incorporates the TCN as a feature learner to decode features. It can partly eliminate the defects of RNNs (LSTM, GRU), namely gradient disappearance and insufficient performance, which yields a notable performance improvement as well as faster convergence. Experiments show that training and convergence are 50% faster than the state-of-the-art method, and accuracy improves by 2.4% on the GRID dataset.


Author(s):  
Shengsheng Qian ◽  
Jun Hu ◽  
Quan Fang ◽  
Changsheng Xu

In this article, we focus on the fake news detection task and aim to automatically identify fake news among the vast number of social media posts. To date, many approaches have been proposed to detect fake news, including traditional learning methods and deep learning-based models. However, three challenges remain: (i) how to represent social media posts effectively, since post content is varied and highly complicated; (ii) how to propose a data-driven method that increases the flexibility of the model to deal with samples in different contexts and news backgrounds; and (iii) how to fully utilize the additional auxiliary information (background knowledge and multi-modal information) of posts for better representation learning. To tackle these challenges, we propose a novel Knowledge-aware Multi-modal Adaptive Graph Convolutional Network (KMAGCN) that captures semantic representations by jointly modeling textual information, knowledge concepts, and visual information in a unified framework for fake news detection. We model posts as graphs and use a knowledge-aware multi-modal adaptive graph learning principle for effective feature learning. Compared with existing methods, the proposed KMAGCN addresses the challenges from three aspects: (1) it models posts as graphs to capture non-consecutive and long-range semantic relations; (2) it proposes a novel adaptive graph convolutional network to handle the variability of graph data; and (3) it leverages textual information, knowledge concepts, and visual information jointly for model learning. We have conducted extensive experiments on three public real-world datasets, and the superior results demonstrate the effectiveness of KMAGCN compared with other state-of-the-art algorithms.


Diagnostics ◽  
2021 ◽  
Vol 11 (5) ◽  
pp. 893
Author(s):  
Yazan Qiblawey ◽  
Anas Tahir ◽  
Muhammad E. H. Chowdhury ◽  
Amith Khandakar ◽  
Serkan Kiranyaz ◽  
...  

Detecting COVID-19 at an early stage is essential to reduce the mortality risk of patients. In this study, a cascaded system is proposed to segment the lung and to detect, localize, and quantify COVID-19 infections from computed tomography images. An extensive set of experiments was performed using Encoder–Decoder Convolutional Neural Networks (ED-CNNs), U-Net, and Feature Pyramid Network (FPN), with different backbone (encoder) structures using variants of DenseNet and ResNet. The experiments on lung region segmentation showed a Dice Similarity Coefficient (DSC) of 97.19% and Intersection over Union (IoU) of 95.10% using the U-Net model with the DenseNet161 encoder. Furthermore, the proposed system achieved excellent performance on COVID-19 infection segmentation, with a DSC of 94.13% and IoU of 91.85% using the FPN with the DenseNet201 encoder. The proposed system can reliably localize infections of various shapes and sizes, especially small infection regions, which are rarely considered in recent studies. Moreover, the proposed system achieved high COVID-19 detection performance with 99.64% sensitivity and 98.72% specificity. Finally, the system was able to discriminate between different severity levels of COVID-19 infection over a dataset of 1110 subjects, with sensitivity values of 98.3%, 71.2%, 77.8%, and 100% for mild, moderate, severe, and critical cases, respectively.
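For a single pair of masks, DSC and IoU are deterministically related by DSC = 2·IoU/(1 + IoU), which makes reporting both a useful consistency check (the relation need not hold exactly for values averaged over many cases). A small NumPy sketch on toy masks, not the paper's data:

```python
import numpy as np

def iou(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Intersection over Union (Jaccard index) for binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (inter + eps) / (union + eps)

def dice_from_iou(j: float) -> float:
    """DSC = 2J / (1 + J); the two overlap metrics are monotonically related."""
    return 2 * j / (1 + j)

a = np.zeros((16, 16), dtype=bool)
b = np.zeros((16, 16), dtype=bool)
a[2:10, 2:10] = True   # 8x8 square
b[6:14, 6:14] = True   # overlapping 8x8 square
j = iou(a, b)
```

Here the intersection is 16 pixels and the union is 112, so `j` is 16/112 and the corresponding Dice value is 0.25.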


2021 ◽  
pp. 002203452110053
Author(s):  
H. Wang ◽  
J. Minnema ◽  
K.J. Batenburg ◽  
T. Forouzanfar ◽  
F.J. Hu ◽  
...  

Accurate segmentation of the jaw (i.e., mandible and maxilla) and the teeth in cone beam computed tomography (CBCT) scans is essential for orthodontic diagnosis and treatment planning. Although various (semi)automated methods have been proposed to segment the jaw or the teeth, there is still a lack of fully automated segmentation methods that can simultaneously segment both anatomic structures in CBCT scans (i.e., multiclass segmentation). In this study, we aimed to train and validate a mixed-scale dense (MS-D) convolutional neural network for multiclass segmentation of the jaw, the teeth, and the background in CBCT scans. Thirty CBCT scans were obtained from patients who had undergone orthodontic treatment. Gold standard segmentation labels were manually created by 4 dentists. As a benchmark, we also evaluated MS-D networks that segmented the jaw or the teeth (i.e., binary segmentation). All segmented CBCT scans were converted to virtual 3-dimensional (3D) models. The segmentation performance of all trained MS-D networks was assessed by the Dice similarity coefficient and surface deviation. The CBCT scans segmented by the MS-D network demonstrated a large overlap with the gold standard segmentations (Dice similarity coefficient: 0.934 ± 0.019, jaw; 0.945 ± 0.021, teeth). The MS-D network–based 3D models of the jaw and the teeth showed minor surface deviations when compared with the corresponding gold standard 3D models (0.390 ± 0.093 mm, jaw; 0.204 ± 0.061 mm, teeth). The MS-D network took approximately 25 s to segment 1 CBCT scan, whereas manual segmentation took about 5 h. This study showed that multiclass segmentation of jaw and teeth was accurate and its performance was comparable to binary segmentation. The MS-D network trained for multiclass segmentation would therefore make patient-specific orthodontic treatment more feasible by strongly reducing the time required to segment multiple anatomic structures in CBCT scans.

