Optimal Scale of Hierarchical Image Segmentation with Scribbles Guidance for Weakly Supervised Semantic Segmentation

Author(s):  
Zaid Al-Huda ◽  
Donghai Zhai ◽  
Yan Yang ◽  
Riyadh Nazar Ali Algburi

Deep convolutional neural networks (DCNNs) trained on pixel-level annotated images have achieved substantial improvements in semantic segmentation, but the high cost of labeling training data greatly limits their application. Weakly supervised segmentation approaches can significantly reduce human labeling effort. In this paper, we introduce a new framework to generate high-quality initial pixel-level annotations. Using a hierarchical image segmentation algorithm to predict the boundary map, we select the optimal scale among the high-quality hierarchies. In the initialization step, scribble annotations and a saliency map are combined to construct a graphical model over the optimal-scale segmentation; by solving the minimal cut problem, the model spreads information from scribbles to unmarked regions. In the training process, the segmentation network is trained on the initial pixel-level annotations. To iteratively optimize the segmentation, we use the graphical model to refine the segmentation masks and retrain the segmentation network, obtaining more precise pixel-level annotations. Experimental results on the PASCAL VOC 2012 dataset demonstrate that the proposed framework outperforms most weakly supervised semantic segmentation methods and achieves state-of-the-art performance of [Formula: see text] mIoU.
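
As a rough illustration of the initialization step, the sketch below spreads two-class scribble information to unmarked regions by solving a minimal s-t cut over a region adjacency graph. The adjacency list, per-region unary costs (e.g. derived from scribbles and the saliency map), and feature vectors are hypothetical inputs, and the energy terms are assumptions rather than the paper's exact formulation.

```python
import networkx as nx
import numpy as np

def propagate_scribbles(n_regions, adjacency, cost_fg, cost_bg, feats, lam=1.0):
    """Spread foreground/background scribble evidence via a minimal s-t cut.

    cost_fg[r] / cost_bg[r]: penalty for labeling region r foreground /
    background (e.g. -log probabilities from scribbles and the saliency map).
    adjacency: iterable of (i, j) index pairs of neighboring regions.
    feats[r]: feature vector of region r (e.g. mean color).
    """
    G = nx.DiGraph()
    s, t = "src", "snk"
    for r in range(n_regions):
        # Terminal edges: cutting s->r pays the background cost, cutting
        # r->t pays the foreground cost, so each region takes the cheaper label.
        G.add_edge(s, r, capacity=float(cost_bg[r]))
        G.add_edge(r, t, capacity=float(cost_fg[r]))
    for i, j in adjacency:
        # Pairwise term: discourage cutting between similar adjacent regions.
        w = lam * float(np.exp(-np.sum((feats[i] - feats[j]) ** 2)))
        G.add_edge(i, j, capacity=w)
        G.add_edge(j, i, capacity=w)
    _, (src_side, _) = nx.minimum_cut(G, s, t)
    labels = np.zeros(n_regions, dtype=np.uint8)
    labels[[r for r in src_side if r != s]] = 1  # source side = foreground
    return labels
```

Scribbled regions can be forced to their known label by assigning a very large cost to the opposite terminal edge.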

2021 ◽  
Vol 2099 (1) ◽  
pp. 012021
Author(s):  
A V Dobshik ◽  
A A Tulupov ◽  
V B Berikov

This paper presents an automatic algorithm for segmenting areas affected by acute stroke in non-contrast computed tomography (CT) brain images. The proposed algorithm is designed for learning in a weakly supervised scenario in which some images are labeled accurately and some inaccurately; the incorrect labels arise from inaccuracies introduced by radiologists during manual annotation of the CT images. We propose methods for solving the segmentation problem in the case of inaccurately labeled training data, using the U-Net neural network architecture with several modifications. Experiments on real CT scans show that the proposed methods increase segmentation accuracy.
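
The abstract does not spell out the noise-handling modifications, but one simple way to train under a mix of accurate and inaccurate labels is to down-weight the loss on scans flagged as unreliable. The sketch below is a minimal PyTorch illustration of that idea; the `reliable` flag and the weight `w_noisy` are hypothetical, not the authors' method.

```python
import torch
import torch.nn.functional as F

def weighted_seg_loss(logits, masks, reliable, w_noisy=0.3):
    """logits: (B, C, H, W); masks: (B, H, W) int64; reliable: (B,) bool."""
    per_pixel = F.cross_entropy(logits, masks, reduction="none")  # (B, H, W)
    per_image = per_pixel.mean(dim=(1, 2))                        # (B,)
    # Full weight for accurately labeled scans, reduced weight otherwise.
    weights = torch.where(reliable, torch.ones_like(per_image),
                          torch.full_like(per_image, w_noisy))
    return (weights * per_image).mean()
```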


Symmetry ◽  
2020 ◽  
Vol 12 (3) ◽  
pp. 427 ◽  
Author(s):  
Sanxing Zhang ◽  
Zhenhuan Ma ◽  
Gang Zhang ◽  
Tao Lei ◽  
Rui Zhang ◽  
...  

Semantic image segmentation, one of the most popular tasks in computer vision, has been widely used in autonomous driving, robotics, and other fields. Currently, deep convolutional neural networks (DCNNs) are driving major advances in semantic segmentation thanks to their powerful feature representations. However, DCNNs extract high-level features through strided convolution, which makes it difficult to segment foreground objects precisely, especially when locating object boundaries. This paper presents a novel semantic segmentation algorithm combining DeepLab v3+ with the quick shift superpixel segmentation algorithm. DeepLab v3+ is employed to generate a class-indexed score map for the input image, while quick shift segments the input image into superpixels. Their outputs are then fed into a class-voting module that refines the semantic segmentation results. Extensive experiments are performed on the PASCAL VOC 2012 dataset, and the results show that the proposed method provides a more efficient solution.
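
A minimal sketch of the class-voting refinement might look as follows: each quick-shift superpixel takes the majority class of DeepLab v3+'s per-pixel predictions inside it. The input shapes and quick-shift parameters are assumptions; `skimage.segmentation.quickshift` is used as a stand-in implementation.

```python
import numpy as np
from skimage.segmentation import quickshift

def refine_with_superpixels(image, score_map):
    """image: (H, W, 3) float RGB in [0, 1]; score_map: (C, H, W) class scores."""
    pred = score_map.argmax(axis=0)  # per-pixel class indices from DeepLab v3+
    superpixels = quickshift(image, kernel_size=3, max_dist=6, ratio=0.5)
    refined = np.empty_like(pred)
    for sp_id in np.unique(superpixels):
        mask = superpixels == sp_id
        # Majority vote of the pixel-level predictions inside this superpixel.
        refined[mask] = np.bincount(pred[mask]).argmax()
    return refined
```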


Symmetry ◽  
2020 ◽  
Vol 12 (1) ◽  
pp. 145 ◽  
Author(s):  
Zheng Lu ◽  
Dali Chen

Weakly supervised and semi-supervised semantic segmentation are widely used in computer vision, since they require no ground truth, or only a small number of ground-truth masks, for training. Some recent works train the model with pseudo ground truths generated by a classification network; however, this approach is not suitable for medical image segmentation. To tackle this challenging problem, we use the GrabCut method to generate pseudo ground truths, train a network based on a modified U-Net model with the generated pseudo ground truths, and finally fine-tune the model with a small number of real ground truths. Extensive experiments on the challenging RIM-ONE and DRISHTI-GS benchmarks demonstrate the effectiveness of our algorithm, which obtains state-of-the-art results on both databases.
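
A hedged sketch of the pseudo-ground-truth step using OpenCV's GrabCut is shown below. The rough rectangle around the structure of interest (e.g. the optic disc) is a hypothetical input, such as a coarse detection or dataset metadata; the authors' exact initialization may differ.

```python
import cv2
import numpy as np

def grabcut_pseudo_gt(image, rect, iters=5):
    """image: (H, W, 3) uint8 BGR; rect: (x, y, w, h) rough region box."""
    mask = np.zeros(image.shape[:2], np.uint8)
    bgd_model = np.zeros((1, 65), np.float64)
    fgd_model = np.zeros((1, 65), np.float64)
    cv2.grabCut(image, mask, rect, bgd_model, fgd_model, iters,
                cv2.GC_INIT_WITH_RECT)
    # Definite and probable foreground become the pseudo ground truth.
    return ((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)).astype(np.uint8)
```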


Author(s):  
Junsheng Xiao ◽  
Huahu Xu ◽  
Honghao Gao ◽  
Minjie Bian ◽  
Yang Li

Weakly supervised semantic segmentation under image-level annotations is effective for real-world applications. The small, sparse discriminative regions obtained from an image classification network, typically used as the initial localization cues for semantic segmentation, also form its bottleneck. Although deep convolutional neural networks (DCNNs) have exhibited promising performance on single-label image classification tasks, real-world images usually contain multiple categories, and obtaining high-confidence discriminative regions from multi-label classification networks remains an open problem. To solve it, this article proposes an innovative three-step framework built on multi-object proposal generation. First, an image is divided into candidate boxes using an object proposal method, and the candidate boxes are sent to a single-label classification network to obtain discriminative regions. Second, the discriminative regions are aggregated into a high-confidence seed map. Third, the seed cues are grown on the high-level semantic feature maps produced by a backbone segmentation network. Experiments on the PASCAL VOC 2012 dataset verify the effectiveness of our approach, which outperforms the baseline image segmentation methods.
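
The second step, aggregating per-proposal discriminative regions into a high-confidence seed map, could look roughly like the sketch below. The per-box activation maps (`box_cams`), assumed normalized to [0, 1], and the threshold are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def aggregate_seed_map(box_cams, boxes, shape, thresh=0.7):
    """box_cams[i]: (h_i, w_i) activation in [0, 1] for boxes[i] = (x, y, w, h)."""
    acc = np.zeros(shape, dtype=np.float32)
    cnt = np.zeros(shape, dtype=np.float32)
    for cam, (x, y, w, h) in zip(box_cams, boxes):
        acc[y:y + h, x:x + w] += cam   # paste each box's activation in place
        cnt[y:y + h, x:x + w] += 1.0
    mean = np.divide(acc, cnt, out=np.zeros_like(acc), where=cnt > 0)
    return mean >= thresh  # boolean high-confidence seed map
```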


2021 ◽  
Vol 11 (4) ◽  
pp. 1464
Author(s):  
Chang Wook Seo ◽  
Yongduek Seo

There are various challenging issues in automating line art colorization. In this paper, we propose a GAN approach that incorporates semantic segmentation image data. Our GAN-based method, named Seg2pix, can automatically generate high-quality colorized images, aiming to computerize one of the most tedious and repetitive jobs performed by coloring workers in the webtoon industry. The network structure of Seg2pix is largely a modification of the Pix2pix architecture, a convolution-based generative adversarial network for image-to-image translation. With this method, we can generate high-quality colorized images of a particular character from only a small amount of training data. Seg2pix is designed to reproduce a segmented image, which serves as guidance data for line art colorization. The segmented image is generated automatically by a generative network from a line art image and a segmentation ground truth; in the next step, the generative network creates a colorized image from the line art and the segmented image produced in the former step. In summary, only a line art image is required to test the generative model, while an original colorized image and a segmented image are additionally required as ground truth for training. The generation of the segmented image and the colorized image proceeds end to end, sharing the same loss functions. With this method we obtain better qualitative results for the automatic colorization of a particular character's line art, an improvement that is also measured quantitatively with Learned Perceptual Image Patch Similarity (LPIPS). We believe this may help artists exercise their creative expertise mainly in areas where computerization is not yet capable.
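
A minimal sketch of the two-stage generation, with assumed module names rather than the released Seg2pix code, is shown below: one generator maps line art to a segmented image, a second maps line art plus segments to a colorized image, and both outputs feed the shared training losses.

```python
import torch
import torch.nn as nn

class TwoStageGenerator(nn.Module):
    def __init__(self, g_seg: nn.Module, g_color: nn.Module):
        super().__init__()
        self.g_seg = g_seg      # Pix2pix-style generator: line art -> segments
        self.g_color = g_color  # second stage: line art + segments -> color

    def forward(self, line_art):
        seg = self.g_seg(line_art)
        color = self.g_color(torch.cat([line_art, seg], dim=1))
        return seg, color  # both outputs enter the shared training losses
```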


Author(s):  
P. Wang ◽  
W. Yao

Competitive point cloud semantic segmentation results usually rely on a large amount of labeled data. However, data annotation is a time-consuming and labor-intensive task, particularly for three-dimensional point cloud data, so obtaining accurate results with limited ground truth as training data is of considerable importance. As a simple and effective method, pseudo labels can exploit information from unlabeled data when training neural networks. In this study, we propose a pseudo-label-assisted point cloud segmentation method that uses very few sparsely sampled labels, normally selected at random for each class. An adaptive thresholding strategy is proposed to generate pseudo labels based on the prediction probability. Pseudo-label learning is an iterative process, and the pseudo labels are updated solely from the ground-truth weak labels as the model converges, to improve training efficiency. Experiments on the ISPRS 3D semantic labeling benchmark dataset indicate that the proposed method achieves results competitive with a fully supervised scheme while using only up to 2‰ of the labeled points from the original training set, with an overall accuracy of 83.7% and an average F1 score of 70.2%.
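
The sketch below illustrates one plausible form of the adaptive thresholding strategy: a point receives a pseudo label only if its predicted probability exceeds a class-specific threshold derived from that class's confidence distribution. The percentile scheme is an assumption, not necessarily the authors' rule.

```python
import numpy as np

def make_pseudo_labels(probs, percentile=90):
    """probs: (N, C) softmax outputs for N points over C classes."""
    pred = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    pseudo = np.full(pred.shape, -1, dtype=np.int64)  # -1 = ignored in training
    for c in np.unique(pred):
        m = pred == c
        # Threshold adapts to each class's own confidence distribution.
        thr = np.percentile(conf[m], percentile)
        pseudo[m & (conf >= thr)] = c
    return pseudo
```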


2020 ◽  
Vol 19 (01) ◽  
pp. 147-165 ◽  
Author(s):  
Fan Jia ◽  
Jun Liu ◽  
Xue-Cheng Tai

Convolutional neural networks (CNNs) have achieved prominent performance in a range of image processing problems and have become the first choice for dense classification problems such as semantic segmentation. However, CNNs predict the class of each pixel independently in semantic segmentation tasks, so the spatial regularity of the segmented objects remains a problem for these methods. Especially when given few training data, CNNs cannot reproduce details well, and isolated and scattered small regions often appear in all kinds of CNN segmentation results. In this paper, we propose a method to add spatial regularization to the segmented objects. In our method, a spatial regularizer such as total variation (TV) can be easily integrated into the CNN, producing smooth edges and eliminating isolated points. We apply the proposed method to U-Net and SegNet, two well-established CNNs for image segmentation, and test them on the WBC and CamVid datasets, respectively. The results show that the details of the predictions are markedly improved by the regularized networks.
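
A minimal sketch of such a regularizer is an anisotropic total-variation penalty on the softmax output, added to the usual segmentation loss to smooth edges and suppress isolated pixels; the exact variant and weighting the paper integrates into the network may differ.

```python
import torch

def tv_loss(probs):
    """probs: (B, C, H, W) softmax probabilities predicted by the network."""
    dh = (probs[:, :, 1:, :] - probs[:, :, :-1, :]).abs().mean()
    dw = (probs[:, :, :, 1:] - probs[:, :, :, :-1]).abs().mean()
    return dh + dw

# Combined objective (lam balances data fidelity against smoothness):
# loss = F.cross_entropy(logits, target) + lam * tv_loss(logits.softmax(dim=1))
```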


2019 ◽  
Vol 11 (20) ◽  
pp. 2380 ◽  
Author(s):  
Liu ◽  
Luo ◽  
Huang ◽  
Hu ◽  
Sun ◽  
...  

Deep convolutional neural networks have driven significant progress in building extraction from high-resolution remote sensing imagery. Whereas most such work focuses on modifying existing image segmentation networks from computer vision, we propose a new network, the Deep Encoding Network (DE-Net), designed specifically for this problem and based on many recently introduced techniques in image segmentation. Four modules are used to construct DE-Net: inception-style downsampling modules combining a strided convolution layer and a max-pooling layer; encoding modules comprising six linear residual blocks with a scaled exponential linear unit (SELU) activation function; compressing modules that reduce the feature channels; and a densely upsampling module that enables the network to encode spatial information inside feature maps. DE-Net achieves state-of-the-art performance on the WHU Building Dataset in recall, F1 score, and intersection over union (IoU) without pretraining, and it also outperforms several segmentation networks on our self-built Suzhou Satellite Building Dataset. The experimental results validate the effectiveness of DE-Net for building extraction from both aerial and satellite imagery, and suggest that, given enough training data, designing and training a network from scratch may outperform fine-tuning models pre-trained on datasets unrelated to building extraction.
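
A hedged sketch of the inception-style downsampling module described above: a strided 3x3 convolution branch and a max-pooling branch concatenated along the channel axis, followed by SELU. The channel split between the two branches is an assumption, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class DownsampleModule(nn.Module):
    """Halves the spatial size; assumes out_ch > in_ch so both branches fit."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch - in_ch, 3, stride=2, padding=1)
        self.pool = nn.MaxPool2d(2, stride=2)
        self.act = nn.SELU()  # DE-Net uses SELU activations

    def forward(self, x):
        # Concatenate the strided-convolution and max-pooling branches.
        return self.act(torch.cat([self.conv(x), self.pool(x)], dim=1))
```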

