Towards Cross-Modality Medical Image Segmentation with Online Mutual Knowledge Distillation

2020 ◽  
Vol 34 (01) ◽  
pp. 775-783
Author(s):  
Kang Li ◽  
Lequan Yu ◽  
Shujun Wang ◽  
Pheng-Ann Heng

The success of deep convolutional neural networks is partially attributed to the massive amount of annotated training data. In practice, however, medical data annotations are usually expensive and time-consuming to obtain. Considering that multi-modality data with the same anatomical structures are widely available in clinical routine, in this paper we aim to exploit the prior knowledge (e.g., shape priors) learned from one modality (the assistant modality) to improve segmentation performance on another modality (the target modality) and thereby compensate for annotation scarcity. To alleviate the learning difficulties caused by modality-specific appearance discrepancies, we first present an Image Alignment Module (IAM) to narrow the appearance gap between assistant- and target-modality data. We then propose a novel Mutual Knowledge Distillation (MKD) scheme to thoroughly exploit the modality-shared knowledge and facilitate target-modality segmentation. Specifically, we formulate our framework as an integration of two individual segmentors. Each segmentor not only explicitly extracts one modality's knowledge from the corresponding annotations, but also implicitly explores the other modality's knowledge from its counterpart in a mutually guided manner. The ensemble of the two segmentors further integrates the knowledge from both modalities and generates reliable segmentation results on the target modality. Experimental results on the public multi-class cardiac segmentation dataset MM-WHS 2017 show that our method achieves large improvements in CT segmentation by utilizing additional MRI data and outperforms other state-of-the-art multi-modality learning methods.
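To make the mutual-guidance objective concrete, below is a minimal PyTorch-style sketch of a per-pixel mutual distillation loss for the two segmentors. The temperature T, mixing weight alpha, and softened KL matching are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def mutual_kd_losses(logits_a, logits_b, labels_a, labels_b, T=2.0, alpha=0.5):
    """Per-segmentor loss: supervised CE on its own modality plus a KL term
    matching the counterpart's softened per-pixel predictions (a sketch)."""
    # Supervised terms: each segmentor fits its own modality's annotations.
    sup_a = F.cross_entropy(logits_a, labels_a)
    sup_b = F.cross_entropy(logits_b, labels_b)
    # Mutual terms: each segmentor mimics the other's (detached) soft output.
    kd_a = F.kl_div(F.log_softmax(logits_a / T, dim=1),
                    F.softmax(logits_b.detach() / T, dim=1),
                    reduction="batchmean") * T * T
    kd_b = F.kl_div(F.log_softmax(logits_b / T, dim=1),
                    F.softmax(logits_a.detach() / T, dim=1),
                    reduction="batchmean") * T * T
    return sup_a + alpha * kd_a, sup_b + alpha * kd_b
```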

Author(s):  
Xin Liu ◽  
Kai Liu ◽  
Xiang Li ◽  
Jinsong Su ◽  
Yubin Ge ◽  
...  

The lack of sufficient training data in many domains poses a major challenge to building domain-specific machine reading comprehension (MRC) models with satisfying performance. In this paper, we propose a novel iterative multi-source mutual knowledge transfer framework for MRC. As an extension of conventional knowledge transfer with one-to-one correspondence, our framework focuses on many-to-many mutual transfer, which involves synchronous executions of multiple many-to-one transfers in an iterative manner. Specifically, to update a target-domain MRC model, we first treat the other domain-specific MRC models as individual teachers and employ knowledge distillation to train a multi-domain MRC model, which is differentially required to fit the training data and match the outputs of these individual models according to their domain-level similarities to the target domain. After being initialized by the multi-domain MRC model, the target-domain MRC model is fine-tuned via knowledge distillation to simultaneously match both its training data and the output of its previous best model. Compared with previous approaches, our framework can continuously enhance all domain-specific MRC models by enabling each model to iteratively and differentially absorb domain-shared knowledge from the others. Experimental results and in-depth analyses on several benchmark datasets demonstrate the effectiveness of our framework.
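The many-to-one step can be illustrated with a short sketch: a student fits its own data while matching several teachers, each weighted by a domain-similarity score. This is a minimal sketch under assumed soft logit matching; sim_weights, the temperature T, and the mixing weight beta are hypothetical parameters, not the paper's.

```python
import torch
import torch.nn.functional as F

def multi_teacher_kd_loss(student_logits, teacher_logits_list, sim_weights,
                          labels, T=2.0, beta=0.5):
    """Fit the training data while matching several domain teachers,
    each weighted by its (assumed) similarity to the target domain."""
    ce = F.cross_entropy(student_logits, labels)
    # Normalize the domain-level similarity scores into mixture weights.
    weights = torch.softmax(torch.tensor(sim_weights), dim=0)
    kd = 0.0
    for w, t_logits in zip(weights, teacher_logits_list):
        kd = kd + w * F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                               F.softmax(t_logits.detach() / T, dim=-1),
                               reduction="batchmean") * T * T
    return (1 - beta) * ce + beta * kd
```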


Author(s):  
Zaid Al-Huda ◽  
Donghai Zhai ◽  
Yan Yang ◽  
Riyadh Nazar Ali Algburi

Deep convolutional neural networks (DCNNs) trained on pixel-level annotated images have achieved improvements in semantic segmentation, but the high cost of labeling training data can greatly limit their application. Weakly supervised segmentation approaches, in contrast, can significantly reduce human labeling effort. In this paper, we introduce a new framework to generate high-quality initial pixel-level annotations. Using a hierarchical image segmentation algorithm to predict the boundary map, we select the optimal scale of high-quality hierarchies. In the initialization step, scribble annotations and the saliency map are combined to construct a graphical model over the optimal-scale segmentation; solving the resulting minimal-cut problem spreads information from the scribbles to unmarked regions. In the training process, the segmentation network is trained on the initial pixel-level annotations. To iteratively optimize the segmentation, we use a graphical model to refine the segmentation masks and retrain the segmentation network to obtain more precise pixel-level annotations. Experimental results on the Pascal VOC 2012 dataset demonstrate that the proposed framework outperforms most weakly supervised semantic segmentation methods and achieves state-of-the-art performance of [Formula: see text] mIoU.
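The scribble-spreading step can be sketched with the PyMaxflow library, using the saliency map for the unary terms and the scribbles as hard seeds. This is a minimal illustration under assumed parameter choices (the smoothness weight lam and log-likelihood unaries), not the paper's exact graphical model.

```python
import numpy as np
import maxflow  # pip install PyMaxflow

def scribbles_to_mask(saliency, fg_scribble, bg_scribble, lam=50.0):
    """Propagate scribble labels to unmarked pixels via a min-cut.
    saliency in [0, 1]; fg/bg_scribble are boolean masks. Illustrative only."""
    g = maxflow.Graph[float]()
    nodes = g.add_grid_nodes(saliency.shape)
    g.add_grid_edges(nodes, lam)                 # 4-connected smoothness term
    eps, big = 1e-6, 1e9
    cost_fg = -np.log(saliency + eps)            # paid if pixel labeled foreground
    cost_bg = -np.log(1.0 - saliency + eps)      # paid if pixel labeled background
    cost_fg[fg_scribble], cost_bg[fg_scribble] = 0.0, big  # hard foreground seeds
    cost_fg[bg_scribble], cost_bg[bg_scribble] = big, 0.0  # hard background seeds
    g.add_grid_tedges(nodes, cost_fg, cost_bg)
    g.maxflow()
    return g.get_grid_segments(nodes)            # True = foreground (sink) side
```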


2022 ◽  
Vol 8 ◽  
Author(s):  
Runnan He ◽  
Shiqi Xu ◽  
Yashu Liu ◽  
Qince Li ◽  
Yang Liu ◽  
...  

Medical imaging provides a powerful tool for medical diagnosis. In computer-aided diagnosis and treatment of liver cancer based on medical imaging, accurate segmentation of the liver region from abdominal CT images is an important step. However, due to defects of liver tissue and limitations of the CT imaging process, the gray level of the liver region in CT images is heterogeneous and the boundary between the liver and adjacent tissues and organs is blurred, which makes liver segmentation an extremely difficult task. In this study, aiming to solve the problem of the low segmentation accuracy of the original 3D U-Net network, an improved network based on the three-dimensional (3D) U-Net is proposed. Moreover, to address the insufficiency of training data caused by the difficulty of acquiring labeled 3D data, the improved 3D U-Net is embedded into a generative adversarial network (GAN) framework, establishing a semi-supervised 3D liver segmentation optimization algorithm. Finally, considering the poor quality of 3D abdominal fake images generated from random noise, a deep convolutional neural network (DCNN) based feature restoration method is designed to generate more realistic fake images. Testing on the LiTS-2017 and KiTS19 datasets shows that the proposed semi-supervised 3D liver segmentation method greatly improves liver segmentation performance, achieving a Dice score of 0.9424 and outperforming other methods.
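A minimal sketch of the semi-supervised objective described above: supervised Dice on labeled volumes plus an adversarial term on unlabeled ones. The network interfaces (seg_net, a sigmoid-output disc) and the weight adv_w are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def dice_loss(probs, target, eps=1e-5):
    """Soft Dice loss over a 3D volume (probs and target in [0, 1])."""
    inter = (probs * target).sum()
    return 1.0 - (2.0 * inter + eps) / (probs.sum() + target.sum() + eps)

def generator_step(seg_net, disc, labeled_vol, labels, unlabeled_vol, adv_w=0.1):
    """Supervised Dice on labeled volumes plus an adversarial term that
    pushes predictions on unlabeled volumes toward the 'real' manifold."""
    probs_l = torch.sigmoid(seg_net(labeled_vol))
    loss_sup = dice_loss(probs_l, labels)
    probs_u = torch.sigmoid(seg_net(unlabeled_vol))
    d_out = disc(probs_u)  # assumed to output probabilities in [0, 1]
    # Non-saturating GAN loss: the discriminator should score these as real.
    loss_adv = F.binary_cross_entropy(d_out, torch.ones_like(d_out))
    return loss_sup + adv_w * loss_adv
```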


Author(s):  
Hengfei Cui ◽  
Chang Yuwen ◽  
Lei Jiang

Tubular structure enhancement plays a crucial role in medical image segmentation as a pre-processing technique. In this work, an unsupervised 3D tubular structure segmentation technique is developed, inspired mainly by the idea of filter combination. Three well-known vessel filters, Frangi's filter, the modified Frangi's filter, and the Multiscale Fractional Anisotropic Tensor (MFAT) filter, separately enhance the original images, and the three enhanced images are then combined. Different categories of vessel filters are complementary, which is the main motivation for combining these three advanced filters: their combination ensures a high diversity of enhancement results. Weighted-mean and median-ranking methods are used to perform the filter combination. Based on the optimized weights for the three individual filters, the fuzzy C-means method is then applied to segment the tubular structures. The proposed technique is tested on the public DRIVE and STARE datasets, the public synthetic vascular models (2011 and 2013 VascuSynth Sample), and real-patient Coronary Computed Tomography Angiography (CCTA) datasets. Experimental results demonstrate that the proposed technique outperforms state-of-the-art filter-combination-based segmentation methods. Moreover, the proposed method yields better tubular structure segmentation results than each individual filter, which exhibits its superiority. In conclusion, the proposed method can be further used to facilitate vessel segmentation in medical practice.
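A hedged sketch of the weighted-mean combination followed by fuzzy C-means, using scikit-image's Frangi filter and scikit-fuzzy. The weights here are placeholders (the paper optimizes them), and the modified-Frangi and MFAT responses are assumed to be computed elsewhere and passed in.

```python
import numpy as np
import skfuzzy as fuzz                      # scikit-fuzzy
from skimage.filters import frangi

def combine_and_segment(image, responses=None, weights=(0.4, 0.3, 0.3)):
    """Weighted-mean fusion of vessel-filter responses, then fuzzy C-means.
    responses: list of three enhancement maps (Frangi, modified Frangi, MFAT)."""
    if responses is None:
        responses = [frangi(image)] * 3     # placeholder for the three filters
    # Normalize each response to [0, 1] before fusing.
    norm = [(r - r.min()) / (np.ptp(r) + 1e-8) for r in responses]
    fused = sum(w * r for w, r in zip(weights, norm))
    # Cluster pixels into vessel / background with fuzzy C-means.
    data = fused.reshape(1, -1)             # (features, samples), as fuzz expects
    cntr, u, *_ = fuzz.cmeans(data, c=2, m=2.0, error=1e-4, maxiter=100)
    vessel_cluster = int(np.argmax(cntr.ravel()))   # brighter centroid = vessel
    return (np.argmax(u, axis=0) == vessel_cluster).reshape(image.shape)
```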


2020 ◽  
Vol 10 (15) ◽  
pp. 5032
Author(s):  
Xiaochang Wu ◽  
Xiaolin Tian

Medical image segmentation is a classic challenging problem. Segmenting the parts of interest in cardiac medical images is a basic task for cardiac image diagnosis and guided surgery, and the effectiveness of cardiac segmentation directly affects subsequent medical applications. Generative adversarial networks have achieved outstanding success in image segmentation compared with classic neural networks by solving the oversegmentation problem. Cardiac X-ray images, however, are prone to weak edges, artifacts, and other defects. This paper proposes an adaptive generative adversarial network for cardiac segmentation to improve the segmentation accuracy of generative adversarial networks on X-ray images. The adaptive generative adversarial network consists of three parts: a feature extractor, a discriminator, and a selector. In this method, multiple generators are trained in the feature extractor, the discriminator scores the features of different dimensions, and the selector selects the appropriate features and adjusts the network for the next iteration. With the help of the discriminator, the method uses multi-network joint feature extraction to achieve network adaptivity, allowing features of multiple dimensions to be combined in joint training to enhance generalization ability. Cardiac segmentation experiments on chest X-ray radiographs show that this method has higher segmentation accuracy and less overfitting than other methods; in addition, the proposed network is more stable.
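The discriminator-driven selection step might look like the following minimal sketch; the scoring rule (mean discriminator output per feature map) and the keep-half policy are illustrative assumptions, since the abstract does not specify them.

```python
import torch

def select_features(feature_maps, disc):
    """Score each generator's feature map with the discriminator and keep
    the top-scoring half for the next training iteration (illustrative)."""
    scores = torch.stack([disc(f).mean() for f in feature_maps])
    keep = torch.topk(scores, k=max(1, len(feature_maps) // 2)).indices
    return [feature_maps[i] for i in keep]
```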


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Saad Naeem ◽  
Noreen Jamil ◽  
Habib Ullah Khan ◽  
Shah Nazir

Neural networks employ massive interconnections of simple computing units, called neurons, to solve problems that are highly nonlinear and cannot be hard-coded into a program. These networks are computation-intensive: training them requires a large amount of training data, and each training example requires heavy computation. We look at different ways to reduce this heavy computation requirement and possibly make neural networks work on mobile devices. In this paper, we survey various techniques that can be matched and combined to improve the training time of neural networks, and we review additional recommendations for making the process work on mobile devices as well. Finally, we survey the deep compression technique, which addresses the problem through network pruning, quantization, and encoding of the network weights. Deep compression reduces the time required for training the network by first pruning the irrelevant connections (the pruning stage), then quantizing the network weights by choosing centroids for each layer, and finally employing the Huffman encoding algorithm to address the storage of the remaining weights.
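The three stages can be sketched compactly in Python. The magnitude threshold, the 16 centroids per layer, and the dense handling of pruned zeros are simplifications for illustration; a real implementation keeps the pruned weights sparse.

```python
import heapq
from collections import Counter
import numpy as np
from sklearn.cluster import KMeans

def prune(weights, threshold=0.05):
    """Stage 1: drop connections whose magnitude falls below a threshold."""
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def quantize(weights, n_centroids=16):
    """Stage 2: weight sharing within a layer via k-means centroids.
    (For brevity, zeros are quantized too; real deep compression keeps them sparse.)"""
    nz = weights[weights != 0].reshape(-1, 1)
    km = KMeans(n_clusters=n_centroids, n_init=10).fit(nz)
    codes = km.predict(weights.reshape(-1, 1)).reshape(weights.shape)
    return codes, km.cluster_centers_.ravel()

def huffman_code(codes):
    """Stage 3: entropy-code the centroid indices with a Huffman tree."""
    heap = [[freq, [sym, ""]] for sym, freq in Counter(codes.ravel()).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo, hi = heapq.heappop(heap), heapq.heappop(heap)
        for pair in lo[1:]:
            pair[1] = "0" + pair[1]      # prefix the lighter subtree with 0
        for pair in hi[1:]:
            pair[1] = "1" + pair[1]      # and the heavier subtree with 1
        heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
    return {sym: code for sym, code in heap[0][1:]}
```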


Sensors ◽  
2020 ◽  
Vol 20 (21) ◽  
pp. 6353
Author(s):  
Pasquale Memmolo ◽  
Pierluigi Carcagnì ◽  
Vittorio Bianco ◽  
Francesco Merola ◽  
Andouglas Goncalves da Silva Junior ◽  
...  

Diatoms are among the dominant phytoplankters in marine and freshwater habitats and important biomarkers of water quality, making their identification and classification one of the current challenges for environmental monitoring. To date, taxonomy of the species populating a water column is still conducted by marine biologists on the basis of their own experience. Deep learning, on the other hand, is recognized as the technique of choice for solving image classification problems, but it usually needs a large amount of training data, requiring synthetic enlargement of the dataset through data augmentation. In the case of microalgae, the large variety of species populating marine environments makes it arduous to perform exhaustive training that considers all possible classes. However, commercial test slides containing one diatom element per class, fixed between two glasses, are available on the market. These are usually prepared by expert diatomists for taxonomy purposes and thus constitute libraries of the populations that can be found in the oceans. Here we show that such test slides are very useful for training accurate deep Convolutional Neural Networks (CNNs). We demonstrate the successful classification of diatoms based on a proper CNN ensemble and a fully augmented dataset, i.e., a dataset created starting from a single image per class taken from a commercial glass slide containing 50 fixed species in a dry setting. This approach avoids the time-consuming steps of water sampling and labeling by skilled marine biologists. To accomplish this goal, we exploit the holographic imaging modality, which gives access to quantitative phase-contrast maps and flexible a posteriori refocusing thanks to its intrinsic 3D imaging capability. The network model is then validated using holographic recordings of live diatoms imaged in water samples, i.e., in their natural wet environmental condition.
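As a sketch of how a single slide image per class might be synthetically expanded, here is a torchvision augmentation pipeline; the specific transforms and copy counts are assumptions, not the authors' pipeline.

```python
import torch
from torchvision import transforms

# Heavy augmentation for growing a training set from one image per class
# (parameter values are illustrative).
augment = transforms.Compose([
    transforms.RandomRotation(degrees=180),
    transforms.RandomResizedCrop(224, scale=(0.7, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.ColorJitter(brightness=0.3, contrast=0.3),
    transforms.ToTensor(),
])

def expand_class(pil_image, n_copies=500):
    """Create n_copies augmented variants of the single slide image."""
    return torch.stack([augment(pil_image) for _ in range(n_copies)])
```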


2020 ◽  
Vol 2020 ◽  
pp. 1-16
Author(s):  
Zhuofu Deng ◽  
Binbin Wang ◽  
Zhiliang Zhu

Maxillary sinus segmentation plays an important role in the choice of therapeutic strategies for nasal disease and in treatment monitoring. Traditional approaches struggle with the extremely heterogeneous intensity caused by lesions, abnormal anatomical structures, and blurred cavity boundaries. 2D and 3D deep convolutional neural networks have grown popular in medical image segmentation thanks to their use of large labeled datasets to learn discriminative features. However, for 3D segmentation in medical images, 2D networks cannot extract the more significant spatial features, while 3D ones suffer from an unbearable computational burden, which poses great challenges for maxillary sinus segmentation. In this paper, we propose an end-to-end deep neural network for fully automatic 3D segmentation. First, our model uses a symmetrical encoder-decoder architecture for the multitask of bounding-box estimation and in-region 3D segmentation, which not only reduces excessive computation requirements but also remarkably eliminates false positives, promoting the application of 3D convolutional neural networks to 3D segmentation. In addition, an overestimation strategy is presented to avoid the overfitting phenomena seen in conventional multitask networks. Meanwhile, we introduce residual dense blocks to increase the depth of the proposed network and an attention excitation mechanism to improve the performance of bounding-box estimation, both of which add little computational cost. In particular, the multilevel feature fusion structure of the pyramid network strengthens the identification of global and local discriminative features in foreground and background, achieving more advanced segmentation results. Finally, to address the problems of blurred boundaries and class imbalance in medical images, a hybrid loss function is designed for the multiple tasks. To illustrate the strength of our proposed model, we evaluated it against state-of-the-art methods; our model performed significantly better, with an average Dice of 0.947±0.031, VOE of 10.23±5.29, and ASD of 2.86±2.11, which denotes a promising and robust technique in practice.
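A generic sketch of such a hybrid loss, combining a soft Dice term (for blurred boundaries) with a focal term (for class imbalance); the exact formulation and weights in the paper may differ.

```python
import torch
import torch.nn.functional as F

def hybrid_loss(logits, target, gamma=2.0, dice_w=0.5, eps=1e-5):
    """Dice term for blurred boundaries + focal term for class imbalance.
    A generic sketch, not the paper's exact hybrid formulation."""
    probs = torch.sigmoid(logits)
    inter = (probs * target).sum()
    dice = 1.0 - (2.0 * inter + eps) / (probs.sum() + target.sum() + eps)
    bce = F.binary_cross_entropy_with_logits(logits, target, reduction="none")
    pt = torch.exp(-bce)                     # model confidence in the true class
    focal = ((1.0 - pt) ** gamma * bce).mean()
    return dice_w * dice + (1.0 - dice_w) * focal
```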


Sensors ◽  
2019 ◽  
Vol 19 (4) ◽  
pp. 935 ◽  
Author(s):  
Yeong-Hyeon Byeon ◽  
Sung-Bum Pan ◽  
Keun-Chang Kwak

This paper conducts a comparative analysis of deep models for biometrics using scalograms of the electrocardiogram (ECG). A scalogram is the absolute value of the continuous wavelet transform coefficients of a signal. Since ECG-based biometrics are sensitive to noise, studies have transformed the signals into the frequency domain, which is efficient for analyzing noisy signals. By transforming the signal from the time domain to the frequency domain with the wavelet transform, the 1-D signal becomes a 2-D matrix that can be analyzed at multiple resolutions. However, this process makes the signal morphologically complex, meaning that existing simple classifiers could perform poorly. We investigate the possibility of using the scalogram of the ECG as input to deep convolutional neural networks, which exhibit optimal performance for the classification of morphological imagery. When training data are scarce or hardware is insufficient for training, transfer learning with pretrained deep models can reduce learning time while still classifying well. In this paper, AlexNet, GoogLeNet, and ResNet are considered as deep convolutional neural network models. Experiments are performed on two databases for performance evaluation: Physikalisch-Technische Bundesanstalt (PTB)-ECG, a well-known database, and Chosun University (CU)-ECG, built directly for this study using the developed ECG sensor. ResNet was 0.73% and 0.27% higher than AlexNet and GoogLeNet, respectively, on PTB-ECG, and 0.94% and 0.12% higher than AlexNet and GoogLeNet, respectively, on CU-ECG.
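A minimal sketch of the two steps: computing a scalogram with PyWavelets and preparing a pretrained model for fine-tuning with torchvision. The Morlet wavelet, the scale range, and the ResNet-18 variant are illustrative choices, not the paper's exact setup.

```python
import numpy as np
import pywt
import torch
from torchvision import models

def ecg_scalogram(signal, n_scales=64, wavelet="morl"):
    """Scalogram = |CWT coefficients| of a 1-D ECG beat (2-D output)."""
    coef, _ = pywt.cwt(signal, np.arange(1, n_scales + 1), wavelet)
    return np.abs(coef)

def make_transfer_model(n_subjects):
    """Pretrained ResNet-18 with the classifier head replaced for the
    biometric classes; the fine-tuning depth is left to the caller."""
    net = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    net.fc = torch.nn.Linear(net.fc.in_features, n_subjects)
    return net
```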

