Optimal Scale of Hierarchical Image Segmentation with Scribbles Guidance for Weakly Supervised Semantic Segmentation

Author(s):  
Zaid Al-Huda ◽  
Donghai Zhai ◽  
Yan Yang ◽  
Riyadh Nazar Ali Algburi

Deep convolutional neural networks (DCNNs) trained on pixel-level annotated images have achieved substantial improvements in semantic segmentation, but the high cost of labeling training data greatly limits their application. Weakly supervised segmentation approaches can significantly reduce human labeling effort. In this paper, we introduce a new framework to generate high-quality initial pixel-level annotations. Using a hierarchical image segmentation algorithm to predict the boundary map, we select the optimal scale among the high-quality hierarchies. In the initialization step, scribble annotations and a saliency map are combined to construct a graphical model over the optimal-scale segmentation; by solving the minimal cut problem, the model spreads information from scribbles to unmarked regions. In the training process, the segmentation network is trained on the initial pixel-level annotations. To iteratively optimize the segmentation, we use the graphical model to refine the segmentation masks and retrain the segmentation network, obtaining more precise pixel-level annotations. Experimental results on the PASCAL VOC 2012 dataset demonstrate that the proposed framework outperforms most weakly supervised semantic segmentation methods and achieves state-of-the-art performance of [Formula: see text] mIoU.
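
As a rough illustration of the initialization step, the sketch below spreads two-class scribble information to unmarked regions by solving a minimal s-t cut over a region adjacency graph. The adjacency list, per-region unary costs (e.g. derived from scribbles and the saliency map), and feature vectors are hypothetical inputs, and the energy terms are assumptions rather than the paper's exact formulation.

```python
import networkx as nx
import numpy as np

def propagate_scribbles(n_regions, adjacency, cost_fg, cost_bg, feats, lam=1.0):
    """Spread foreground/background scribble evidence via a minimal s-t cut.

    cost_fg[r] / cost_bg[r]: penalty for labeling region r foreground /
    background (e.g. -log probabilities from scribbles and the saliency map).
    adjacency: iterable of (i, j) index pairs of neighboring regions.
    feats[r]: feature vector of region r (e.g. mean color).
    """
    G = nx.DiGraph()
    s, t = "src", "snk"
    for r in range(n_regions):
        # Terminal edges: cutting s->r pays the background cost, cutting
        # r->t pays the foreground cost, so each region takes the cheaper label.
        G.add_edge(s, r, capacity=float(cost_bg[r]))
        G.add_edge(r, t, capacity=float(cost_fg[r]))
    for i, j in adjacency:
        # Pairwise term: discourage cutting between similar adjacent regions.
        w = lam * float(np.exp(-np.sum((feats[i] - feats[j]) ** 2)))
        G.add_edge(i, j, capacity=w)
        G.add_edge(j, i, capacity=w)
    _, (src_side, _) = nx.minimum_cut(G, s, t)
    labels = np.zeros(n_regions, dtype=np.uint8)
    labels[[r for r in src_side if r != s]] = 1  # source side = foreground
    return labels
```

Scribbled regions can be forced to their known label by assigning a very large cost to the opposite terminal edge.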

2021 ◽  
Vol 2099 (1) ◽  
pp. 012021
Author(s):  
A V Dobshik ◽  
A A Tulupov ◽  
V B Berikov

This paper presents an automatic algorithm for segmenting areas affected by acute stroke in non-contrast computed tomography (CT) brain images. The proposed algorithm is designed for learning in a weakly supervised scenario in which some images are labeled accurately and some inaccurately; the incorrect labels arise from inaccuracies introduced by radiologists during manual annotation of the CT images. We propose methods for solving the segmentation problem in the case of inaccurately labeled training data, using the U-Net neural network architecture with several modifications. Experiments on real CT scans show that the proposed methods increase segmentation accuracy.
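
The abstract does not spell out the noise-handling modifications, but one simple way to train under a mix of accurate and inaccurate labels is to down-weight the loss on scans flagged as unreliable. The sketch below is a minimal PyTorch illustration of that idea; the `reliable` flag and the weight `w_noisy` are hypothetical, not the authors' method.

```python
import torch
import torch.nn.functional as F

def weighted_seg_loss(logits, masks, reliable, w_noisy=0.3):
    """logits: (B, C, H, W); masks: (B, H, W) int64; reliable: (B,) bool."""
    per_pixel = F.cross_entropy(logits, masks, reduction="none")  # (B, H, W)
    per_image = per_pixel.mean(dim=(1, 2))                        # (B,)
    # Full weight for accurately labeled scans, reduced weight otherwise.
    weights = torch.where(reliable, torch.ones_like(per_image),
                          torch.full_like(per_image, w_noisy))
    return (weights * per_image).mean()
```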


Symmetry ◽  
2020 ◽  
Vol 12 (3) ◽  
pp. 427 ◽  
Author(s):  
Sanxing Zhang ◽  
Zhenhuan Ma ◽  
Gang Zhang ◽  
Tao Lei ◽  
Rui Zhang ◽  
...  

Semantic image segmentation, one of the most popular tasks in computer vision, has been widely used in autonomous driving, robotics, and other fields. Currently, deep convolutional neural networks (DCNNs) are driving major advances in semantic segmentation thanks to their powerful feature representations. However, DCNNs extract high-level features through strided convolution, which makes it difficult to segment foreground objects precisely, especially when locating object boundaries. This paper presents a novel semantic segmentation algorithm combining DeepLab v3+ with the quick shift superpixel segmentation algorithm. DeepLab v3+ is employed to generate a class-indexed score map for the input image, while quick shift segments the input image into superpixels. Their outputs are then fed into a class-voting module that refines the semantic segmentation results. Extensive experiments are performed on the PASCAL VOC 2012 dataset, and the results show that the proposed method provides a more efficient solution.
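
A minimal sketch of the class-voting refinement might look as follows: each quick-shift superpixel takes the majority class of DeepLab v3+'s per-pixel predictions inside it. The input shapes and quick-shift parameters are assumptions; `skimage.segmentation.quickshift` is used as a stand-in implementation.

```python
import numpy as np
from skimage.segmentation import quickshift

def refine_with_superpixels(image, score_map):
    """image: (H, W, 3) float RGB in [0, 1]; score_map: (C, H, W) class scores."""
    pred = score_map.argmax(axis=0)  # per-pixel class indices from DeepLab v3+
    superpixels = quickshift(image, kernel_size=3, max_dist=6, ratio=0.5)
    refined = np.empty_like(pred)
    for sp_id in np.unique(superpixels):
        mask = superpixels == sp_id
        # Majority vote of the pixel-level predictions inside this superpixel.
        refined[mask] = np.bincount(pred[mask]).argmax()
    return refined
```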


Symmetry ◽  
2020 ◽  
Vol 12 (1) ◽  
pp. 145 ◽  
Author(s):  
Zheng Lu ◽  
Dali Chen

Weakly supervised and semi-supervised semantic segmentation are widely used in computer vision, since they require no ground truth, or only a small number of ground-truth masks, for training. Some recent works train the model with pseudo ground truths generated by a classification network; however, this approach is not suitable for medical image segmentation. To tackle this challenging problem, we use the GrabCut method to generate pseudo ground truths, train a network based on a modified U-Net model with the generated pseudo ground truths, and finally fine-tune the model with a small number of real ground truths. Extensive experiments on the challenging RIM-ONE and DRISHTI-GS benchmarks demonstrate the effectiveness of our algorithm, which obtains state-of-the-art results on both databases.
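
A hedged sketch of the pseudo-ground-truth step using OpenCV's GrabCut is shown below. The rough rectangle around the structure of interest (e.g. the optic disc) is a hypothetical input, such as a coarse detection or dataset metadata; the authors' exact initialization may differ.

```python
import cv2
import numpy as np

def grabcut_pseudo_gt(image, rect, iters=5):
    """image: (H, W, 3) uint8 BGR; rect: (x, y, w, h) rough region box."""
    mask = np.zeros(image.shape[:2], np.uint8)
    bgd_model = np.zeros((1, 65), np.float64)
    fgd_model = np.zeros((1, 65), np.float64)
    cv2.grabCut(image, mask, rect, bgd_model, fgd_model, iters,
                cv2.GC_INIT_WITH_RECT)
    # Definite and probable foreground become the pseudo ground truth.
    return ((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)).astype(np.uint8)
```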


Author(s):  
Junsheng Xiao ◽  
Huahu Xu ◽  
Honghao Gao ◽  
Minjie Bian ◽  
Yang Li

Weakly supervised semantic segmentation under image-level annotations is effective for real-world applications. The small, sparse discriminative regions obtained from an image classification network, typically used as the initial localization cues for semantic segmentation, also form its bottleneck. Although deep convolutional neural networks (DCNNs) have exhibited promising performance on single-label image classification tasks, real-world images usually contain multiple categories, and obtaining high-confidence discriminative regions from multi-label classification networks remains an open problem. To solve it, this article proposes an innovative three-step framework built on multi-object proposal generation. First, an image is divided into candidate boxes using an object proposal method, and the candidate boxes are sent to a single-label classification network to obtain discriminative regions. Second, the discriminative regions are aggregated into a high-confidence seed map. Third, the seed cues are grown on the high-level semantic feature maps produced by a backbone segmentation network. Experiments on the PASCAL VOC 2012 dataset verify the effectiveness of our approach, which outperforms the baseline image segmentation methods.
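
The second step, aggregating per-proposal discriminative regions into a high-confidence seed map, could look roughly like the sketch below. The per-box activation maps (`box_cams`), assumed normalized to [0, 1], and the threshold are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def aggregate_seed_map(box_cams, boxes, shape, thresh=0.7):
    """box_cams[i]: (h_i, w_i) activation in [0, 1] for boxes[i] = (x, y, w, h)."""
    acc = np.zeros(shape, dtype=np.float32)
    cnt = np.zeros(shape, dtype=np.float32)
    for cam, (x, y, w, h) in zip(box_cams, boxes):
        acc[y:y + h, x:x + w] += cam   # paste each box's activation in place
        cnt[y:y + h, x:x + w] += 1.0
    mean = np.divide(acc, cnt, out=np.zeros_like(acc), where=cnt > 0)
    return mean >= thresh  # boolean high-confidence seed map
```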


2021 ◽  
Vol 11 (4) ◽  
pp. 1464
Author(s):  
Chang Wook Seo ◽  
Yongduek Seo

There are various challenging issues in automating line art colorization. In this paper, we propose a GAN approach that incorporates semantic segmentation image data. Our GAN-based method, named Seg2pix, can automatically generate high-quality colorized images, aiming to computerize one of the most tedious and repetitive jobs performed by coloring workers in the webtoon industry. The network structure of Seg2pix is largely a modification of the Pix2pix architecture, a convolution-based generative adversarial network for image-to-image translation. With this method, we can generate high-quality colorized images of a particular character from only a small amount of training data. Seg2pix is designed to reproduce a segmented image, which serves as guidance data for line art colorization. The segmented image is generated automatically by a generative network from a line art image and a segmentation ground truth; in the next step, the generative network creates a colorized image from the line art and the segmented image produced in the former step. In summary, only a line art image is required to test the generative model, while an original colorized image and a segmented image are additionally required as ground truth for training. The generation of the segmented image and the colorized image proceeds end to end, sharing the same loss functions. With this method we obtain better qualitative results for the automatic colorization of a particular character's line art, an improvement that is also measured quantitatively with Learned Perceptual Image Patch Similarity (LPIPS). We believe this may help artists exercise their creative expertise mainly in areas where computerization is not yet capable.
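
A minimal sketch of the two-stage generation, with assumed module names rather than the released Seg2pix code, is shown below: one generator maps line art to a segmented image, a second maps line art plus segments to a colorized image, and both outputs feed the shared training losses.

```python
import torch
import torch.nn as nn

class TwoStageGenerator(nn.Module):
    def __init__(self, g_seg: nn.Module, g_color: nn.Module):
        super().__init__()
        self.g_seg = g_seg      # Pix2pix-style generator: line art -> segments
        self.g_color = g_color  # second stage: line art + segments -> color

    def forward(self, line_art):
        seg = self.g_seg(line_art)
        color = self.g_color(torch.cat([line_art, seg], dim=1))
        return seg, color  # both outputs enter the shared training losses
```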


Author(s):  
P. Wang ◽  
W. Yao

Competitive point cloud semantic segmentation results usually rely on a large amount of labeled data. However, data annotation is a time-consuming and labor-intensive task, particularly for three-dimensional point cloud data, so obtaining accurate results with limited ground truth as training data is of considerable importance. As a simple and effective method, pseudo labels can exploit information from unlabeled data when training neural networks. In this study, we propose a pseudo-label-assisted point cloud segmentation method that uses very few sparsely sampled labels, normally selected at random for each class. An adaptive thresholding strategy is proposed to generate pseudo labels based on the prediction probability. Pseudo-label learning is an iterative process, and the pseudo labels are updated solely from the ground-truth weak labels as the model converges, to improve training efficiency. Experiments on the ISPRS 3D semantic labeling benchmark dataset indicate that the proposed method achieves results competitive with a fully supervised scheme while using only up to 2‰ of the labeled points from the original training set, with an overall accuracy of 83.7% and an average F1 score of 70.2%.
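
The sketch below illustrates one plausible form of the adaptive thresholding strategy: a point receives a pseudo label only if its predicted probability exceeds a class-specific threshold derived from that class's confidence distribution. The percentile scheme is an assumption, not necessarily the authors' rule.

```python
import numpy as np

def make_pseudo_labels(probs, percentile=90):
    """probs: (N, C) softmax outputs for N points over C classes."""
    pred = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    pseudo = np.full(pred.shape, -1, dtype=np.int64)  # -1 = ignored in training
    for c in np.unique(pred):
        m = pred == c
        # Threshold adapts to each class's own confidence distribution.
        thr = np.percentile(conf[m], percentile)
        pseudo[m & (conf >= thr)] = c
    return pseudo
```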


2020 ◽  
Vol 19 (01) ◽  
pp. 147-165 ◽  
Author(s):  
Fan Jia ◽  
Jun Liu ◽  
Xue-Cheng Tai

Convolutional neural networks (CNNs) have achieved prominent performance in a range of image processing problems and have become the first choice for dense classification problems such as semantic segmentation. However, CNNs predict the class of each pixel independently in semantic segmentation tasks, so the spatial regularity of the segmented objects remains a problem for these methods. Especially when given few training data, CNNs cannot reproduce details well, and isolated and scattered small regions often appear in all kinds of CNN segmentation results. In this paper, we propose a method to add spatial regularization to the segmented objects. In our method, a spatial regularizer such as total variation (TV) can be easily integrated into the CNN, producing smooth edges and eliminating isolated points. We apply the proposed method to U-Net and SegNet, two well-established CNNs for image segmentation, and test them on the WBC and CamVid datasets, respectively. The results show that the details of the predictions are markedly improved by the regularized networks.
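
A minimal sketch of such a regularizer is an anisotropic total-variation penalty on the softmax output, added to the usual segmentation loss to smooth edges and suppress isolated pixels; the exact variant and weighting the paper integrates into the network may differ.

```python
import torch

def tv_loss(probs):
    """probs: (B, C, H, W) softmax probabilities predicted by the network."""
    dh = (probs[:, :, 1:, :] - probs[:, :, :-1, :]).abs().mean()
    dw = (probs[:, :, :, 1:] - probs[:, :, :, :-1]).abs().mean()
    return dh + dw

# Combined objective (lam balances data fidelity against smoothness):
# loss = F.cross_entropy(logits, target) + lam * tv_loss(logits.softmax(dim=1))
```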


2019 ◽  
Vol 11 (20) ◽  
pp. 2380 ◽  
Author(s):  
Liu ◽  
Luo ◽  
Huang ◽  
Hu ◽  
Sun ◽  
...  

Deep convolutional neural networks have driven significant progress in building extraction from high-resolution remote sensing imagery. Whereas most such work focuses on modifying existing image segmentation networks from computer vision, we propose a new network, the Deep Encoding Network (DE-Net), designed specifically for this problem and based on many recently introduced techniques in image segmentation. Four modules are used to construct DE-Net: inception-style downsampling modules combining a strided convolution layer and a max-pooling layer; encoding modules comprising six linear residual blocks with a scaled exponential linear unit (SELU) activation function; compressing modules that reduce the feature channels; and a densely upsampling module that enables the network to encode spatial information inside feature maps. DE-Net achieves state-of-the-art performance on the WHU Building Dataset in recall, F1 score, and intersection over union (IoU) without pretraining, and it also outperforms several segmentation networks on our self-built Suzhou Satellite Building Dataset. The experimental results validate the effectiveness of DE-Net for building extraction from both aerial and satellite imagery, and suggest that, given enough training data, designing and training a network from scratch may outperform fine-tuning models pre-trained on datasets unrelated to building extraction.
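
A hedged sketch of the inception-style downsampling module described above: a strided 3x3 convolution branch and a max-pooling branch concatenated along the channel axis, followed by SELU. The channel split between the two branches is an assumption, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class DownsampleModule(nn.Module):
    """Halves the spatial size; assumes out_ch > in_ch so both branches fit."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch - in_ch, 3, stride=2, padding=1)
        self.pool = nn.MaxPool2d(2, stride=2)
        self.act = nn.SELU()  # DE-Net uses SELU activations

    def forward(self, x):
        # Concatenate the strided-convolution and max-pooling branches.
        return self.act(torch.cat([self.conv(x), self.pool(x)], dim=1))
```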

