Image-level to Pixel-wise Labeling: From Theory to Practice

Author(s):  
Tiezhu Sun ◽  
Wei Zhang ◽  
Zhijie Wang ◽  
Lin Ma ◽  
Zequn Jie

Conventional convolutional neural networks (CNNs) have achieved great success in image semantic segmentation. Existing methods mainly focus on learning pixel-wise labels from an image directly. In this paper, we advocate tackling the pixel-wise segmentation problem by considering the image-level classification labels. Theoretically, we analyze and discuss the effects of image-level labels on pixel-wise segmentation from the perspective of information theory. In practice, an end-to-end segmentation model is built by fusing the image-level and pixel-wise labeling networks. A generative network is included to reconstruct the input image and further boost the segmentation model training with an auxiliary loss. Extensive experimental results on a benchmark dataset demonstrate the effectiveness of the proposed method, where good image-level labels can significantly improve the pixel-wise segmentation accuracy.

2021 ◽  
Vol 4 (1) ◽  
pp. 71-79
Author(s):  
Borys Igorovych Tymchenko

Means of preventive management are being actively developed in many spheres of human life. The task of automated screening is to detect hidden problems at an early stage without human intervention, while the cost of responding to them is still low. Visual inspection is often used for screening, and deep artificial neural networks are especially popular for image processing. One of the main problems in working with them is the need for a large amount of well-labeled training data. In automated screening systems, available neural network approaches are limited in the reliability of their predictions by the lack of accurately labeled training data, because obtaining high-quality labels from professionals is very expensive and sometimes not possible in principle. There is therefore a conflict between the growing requirements on the precision of neural network predictions, without an increase in time spent, on the one hand, and the need to reduce the cost of labeling training data on the other. In this paper, we propose a parametric model of a segmentation dataset, which can be used to generate training data for model selection and benchmarking, and a multi-task learning method for training and inference of deep neural networks for semantic segmentation. Based on the proposed method, we develop a semi-supervised approach for segmenting salient regions for a classification task. The main advantage of the proposed method is that it uses semantically similar, more general tasks that have better labeling than the original one, which allows users to reduce the cost of the labeling process. We propose using classification as a more general task relative to semantic segmentation: whereas semantic segmentation aims to classify each pixel in the input image, classification aims to assign a single class to all of the pixels in the input image.
We evaluate our methods using the proposed dataset model and observe a Dice score improvement of seventeen percent. Additionally, we evaluate the robustness of the proposed method to different amounts of label noise and observe consistent improvement over the baseline version.
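The relation between the two label types and the Dice score mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the function names `dice_score` and `image_level_label`, the nested-list mask format, and the rule "an image is positive if any pixel is foreground" are assumptions made for illustration only.

```python
def dice_score(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    intersection = sum(a * b for a, b in zip(p, t))
    # eps keeps the ratio defined when both masks are empty.
    return (2.0 * intersection + eps) / (sum(p) + sum(t) + eps)


def image_level_label(mask):
    """Assumed rule: the image is positive (1) if any pixel is foreground."""
    return int(any(v for row in mask for v in row))


# Toy 2x3 masks, invented for this example.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

score = dice_score(pred, target)    # 2 shared pixels, 3 + 3 foreground pixels
label = image_level_label(target)   # coarse label derived from the pixel labels
```

The image-level label is far cheaper to obtain than the mask it summarizes, which is the property the abstract relies on when it treats classification as the more general task.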


Symmetry ◽  
2020 ◽  
Vol 12 (3) ◽  
pp. 427 ◽  
Author(s):  
Sanxing Zhang ◽  
Zhenhuan Ma ◽  
Gang Zhang ◽  
Tao Lei ◽  
Rui Zhang ◽  
...  

Semantic image segmentation, as one of the most popular tasks in computer vision, has been widely used in autonomous driving, robotics and other fields. Currently, deep convolutional neural networks (DCNNs) are driving major advances in semantic segmentation due to their powerful feature representation. However, DCNNs extract high-level feature representations by strided convolution, which makes it impossible to segment foreground objects precisely, especially when locating object boundaries. This paper presents a novel semantic segmentation algorithm that combines DeepLab v3+ with the superpixel segmentation algorithm quick shift. DeepLab v3+ is employed to generate a class-indexed score map for the input image. Quick shift is applied to segment the input image into superpixels. Both outputs are then fed into a class voting module to refine the semantic segmentation results. Extensive experiments with the proposed semantic image segmentation method are performed on the PASCAL VOC 2012 dataset, and the results show that the proposed method can provide a more efficient solution.
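The abstract does not spell out how the class voting module works. One plausible reading, sketched here under that assumption, is a per-superpixel majority vote: every pixel in a superpixel takes the class predicted most often inside it. The function name `superpixel_vote` and the toy maps are hypothetical; in practice the class map would come from the network and the superpixel map from quick shift.

```python
from collections import Counter, defaultdict


def superpixel_vote(class_map, superpixel_map):
    """Assign each superpixel the majority class of its pixels.

    class_map: per-pixel predicted class ids (nested lists).
    superpixel_map: per-pixel superpixel ids of the same shape.
    """
    votes = defaultdict(Counter)
    for cls_row, sp_row in zip(class_map, superpixel_map):
        for cls, sp in zip(cls_row, sp_row):
            votes[sp][cls] += 1
    # Ties are broken in favor of the class encountered first.
    winner = {sp: counts.most_common(1)[0][0] for sp, counts in votes.items()}
    return [[winner[sp] for sp in sp_row] for sp_row in superpixel_map]


# Toy prediction with two stray pixels near the boundary.
class_map = [[0, 0, 1, 1],
             [0, 1, 1, 1],
             [0, 0, 2, 1]]
superpixels = [[0, 0, 1, 1],
               [0, 0, 1, 1],
               [0, 0, 1, 1]]

refined = superpixel_vote(class_map, superpixels)
```

Because superpixels tend to follow image edges, a vote of this kind snaps the coarse network output to those edges, which matches the boundary problem the abstract sets out to address.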


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Lai Song ◽  
Jiajin Yi ◽  
Jialin Peng

Semantic segmentation plays a crucial role in cardiac magnetic resonance (MR) image analysis. Although supervised deep learning methods have made significant performance improvements, they rely heavily on a large amount of pixel-wise annotated data, which are often unavailable in clinical practice. Besides, top-performing methods usually have a vast number of parameters, which results in high computational complexity for model training and testing. This study addresses cardiac image segmentation in scenarios where few labeled data are available with a lightweight cross-consistency network named LCC-Net. Specifically, to reduce the risk of overfitting on small labeled datasets, we substitute computationally intensive standard convolutions with a lightweight module. To leverage plenty of unlabeled data, we introduce extreme consistency learning, which enforces equivariant constraints on the predictions of different perturbed versions of the input image. Cutting and mixing different training images, as an extreme perturbation on both the labeled and unlabeled data, is utilized to enhance robust representation learning. Extensive comparisons demonstrate that the proposed model shows promising performance with high annotation and computation efficiency. With only two annotated subjects for model training, the LCC-Net obtains a performance gain of 14.4% in the mean Dice over the baseline U-Net trained from scratch.
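The cut-and-mix perturbation and the equivariance constraint can be sketched generically as follows. This is not LCC-Net itself: `cut_mix`, `consistency_loss`, the rectangular patch, the squared-error penalty, and the thresholding `toy_model` are all assumptions chosen to keep the example self-contained.

```python
def cut_mix(a, b, top, left, height, width):
    """Paste a rectangular patch of b into a copy of a (same-sized 2D lists)."""
    mixed = [row[:] for row in a]
    for r in range(top, top + height):
        for c in range(left, left + width):
            mixed[r][c] = b[r][c]
    return mixed


def consistency_loss(pred_of_mixed, mixed_preds):
    """Mean squared difference between two prediction maps."""
    diffs = [(p - q) ** 2
             for p_row, q_row in zip(pred_of_mixed, mixed_preds)
             for p, q in zip(p_row, q_row)]
    return sum(diffs) / len(diffs)


def toy_model(image):
    """Stand-in for a segmentation network: a per-pixel threshold."""
    return [[1.0 if v > 0.5 else 0.0 for v in row] for row in image]


img_a = [[0.9, 0.8, 0.1],
         [0.7, 0.2, 0.1]]
img_b = [[0.1, 0.2, 0.9],
         [0.3, 0.6, 0.8]]
box = dict(top=0, left=1, height=2, width=2)

mixed_input = cut_mix(img_a, img_b, **box)
# Equivariance: predicting on the mixed input should match mixing the predictions.
pred_of_mixed = toy_model(mixed_input)
mixed_preds = cut_mix(toy_model(img_a), toy_model(img_b), **box)
loss = consistency_loss(pred_of_mixed, mixed_preds)
```

The toy model acts on each pixel independently, so it satisfies the constraint exactly and the loss is zero. A real network has a receptive field that spans the patch border, so the loss is generally nonzero and can serve as a training signal on unlabeled images.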


Mathematics ◽  
2020 ◽  
Vol 8 (4) ◽  
pp. 545 ◽  
Author(s):  
Hsin-Jui Chen ◽  
Shanq-Jang Ruan ◽  
Sha-Wo Huang ◽  
Yan-Tsung Peng

Automatically locating the lung regions effectively and efficiently in digital chest X-ray (CXR) images is important in computer-aided diagnosis. In this paper, we propose an adaptive pre-processing approach for segmenting the lung regions from CXR images using convolutional neural network-based (CNN-based) architectures. It comprises three steps. First, a contrast enhancement method specifically designed for CXR images is adopted. Second, adaptive image binarization is applied to CXR images to separate the image foreground and background. Third, CNN-based architectures are trained on the binarized images for image segmentation. The experimental results show that the proposed pre-processing approach is applicable to various CNN-based architectures and effective with them, and can achieve segmentation accuracy comparable to that of state-of-the-art methods while greatly expediting the model training by up to 20.74 % and reducing storage space for CXR image datasets by down to 94.6 % on average.
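The abstract does not say which binarization rule is used. As a stand-in, the sketch below uses Otsu's method, a common data-driven threshold that picks the gray level maximizing the between-class variance. The function names and the six-pixel toy image are hypothetical, and the paper's actual adaptive scheme may differ.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance.

    pixels: flat list of integer gray levels in [0, levels).
    Pixels <= t are treated as background, pixels > t as foreground.
    """
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    w_bg, sum_bg = 0, 0.0
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue          # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # no foreground pixels left
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(pixels, threshold):
    """Map each pixel to 1 (foreground) or 0 (background)."""
    return [1 if v > threshold else 0 for v in pixels]


# Toy image: three dark and three bright pixels.
pixels = [10, 12, 11, 200, 210, 205]
threshold = otsu_threshold(pixels)
binary = binarize(pixels, threshold)
```

A binarized image needs one bit per pixel instead of eight or more, which is consistent with the storage reduction the abstract reports.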


2021 ◽  
Vol 7 (2) ◽  
pp. 391-394
Author(s):  
Richard Bieck ◽  
David Baur ◽  
Johann Berger ◽  
Tim Stelzner ◽  
Anna Völker ◽  
...  

We introduce a system that allows the immediate identification and inspection of fat and muscle structures around the lumbar spine as a means of orthopaedic diagnostics before surgical treatment. The system comprises a backend component that accepts MRI data from a web-based interactive frontend as REST requests. The MRI data is passed through a U-net model, fine-tuned on lumbar MRI images, to generate segmentation masks of fat and muscle areas. The result is sent back to the frontend, which functions as an inspection tool. For the model training, 4000 MRI images from 108 patients were used in a k-fold cross-validation study with k = 10. The model training was performed over 25-30 epochs. We applied shift, scale, and rotation operations as well as elastic deformation and distortion functions for image augmentation, and a combined objective function using Dice and Focal loss. The trained models reached a mean Dice score of 0.83 and 0.52 and a mean area error of 0.1 and 0.3 for muscle and fat tissue, respectively. The interactive web-based frontend was evaluated by clinicians as suitable for the exploration of patient data as well as the assessment of segmentation results. We developed a system that uses semantic segmentation to identify fat and muscle tissue areas in MRI images of the lumbar spine. Further improvements should focus on the segmentation accuracy of fat tissue, as it is a determining factor in surgical decision-making. To our knowledge, this is the first system that automatically provides semantic information of the respective lumbar tissues.
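A combined Dice and Focal objective of the kind mentioned above can be sketched for the binary case as follows. The equal weights, the focusing parameter `gamma=2.0`, and the function names are assumptions; the abstract does not report the values or formulation the authors used.

```python
import math


def soft_dice_loss(probs, targets, eps=1e-7):
    """1 - soft Dice between predicted probabilities and binary targets."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    return 1.0 - (2.0 * intersection + eps) / (sum(probs) + sum(targets) + eps)


def focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Binary focal loss: cross-entropy down-weighted on easy pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p_t = p if t == 1 else 1.0 - p      # probability of the true class
        p_t = min(max(p_t, eps), 1.0)       # guard against log(0)
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)


def combined_loss(probs, targets, dice_weight=0.5, focal_weight=0.5):
    """Weighted sum of the two terms (the weights are illustrative)."""
    return (dice_weight * soft_dice_loss(probs, targets)
            + focal_weight * focal_loss(probs, targets))


# Flattened toy predictions for four pixels.
targets = [1, 0, 1, 0]
good = combined_loss([0.9, 0.1, 0.9, 0.1], targets)
bad = combined_loss([0.1, 0.9, 0.1, 0.9], targets)
```

The Dice term rewards overlap at the region level, while the Focal term concentrates on hard, misclassified pixels. Combining them is a common choice when one class, such as fat tissue here, covers only a small part of the image.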


Author(s):  
Y. A. Lumban-Gaol ◽  
Z. Chen ◽  
M. Smit ◽  
X. Li ◽  
M. A. Erbaşu ◽  
...  

Point cloud data have rich semantic representations and can benefit various applications towards a digital twin. However, they are unordered and anisotropically distributed, and are thus unsuitable for typical convolutional neural networks (CNNs) to handle. With the advance of deep learning, several neural networks claim to have solved the point cloud semantic segmentation problem. This paper evaluates three different neural networks for semantic segmentation of point clouds, namely PointNet++, PointCNN and DGCNN. A public indoor scene of the Amersfoort railway station is used as the study area. Unlike the typical indoor scenes, and even more unlike the ubiquitous outdoor ones, in currently available datasets, the station consists of objects such as the entrance gates, ticket machines, couches, and garbage cans. For the experiment, we use subsets of the data, remove the noise, and evaluate the performance of the selected neural networks. The results indicate an overall accuracy of more than 90% for all the networks, but the networks vary in terms of mean class accuracy and mean Intersection over Union (IoU). The misclassification mainly occurs in the classes of couch and garbage can. Several factors that may contribute to the errors are analyzed, such as the quality of the data and the proportion of the number of points per class. The adaptability of the networks is also heavily dependent on the training location: the overall characteristics of the train station make a network trained for one location less suitable for another.
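The gap between overall accuracy and the class-averaged metrics can be reproduced with a small confusion matrix. The sketch below uses invented counts for two hypothetical classes (a dominant one and a rare one); they are not results from the paper, and `segmentation_metrics` is an illustrative helper.

```python
def segmentation_metrics(confusion):
    """Overall accuracy, mean class accuracy and mean IoU.

    confusion[i][j] counts points of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    overall_accuracy = correct / total

    class_accuracies, ious = [], []
    for i in range(n):
        tp = confusion[i][i]
        true_count = sum(confusion[i])                        # points of class i
        pred_count = sum(confusion[j][i] for j in range(n))   # predicted as i
        if true_count > 0:
            class_accuracies.append(tp / true_count)
        union = true_count + pred_count - tp
        if union > 0:
            ious.append(tp / union)

    mean_class_accuracy = sum(class_accuracies) / len(class_accuracies)
    mean_iou = sum(ious) / len(ious)
    return overall_accuracy, mean_class_accuracy, mean_iou


# Invented counts: 950 points of a dominant class, 50 of a rare class.
confusion = [[950, 0],
             [40, 10]]
overall_acc, mean_class_acc, mean_iou = segmentation_metrics(confusion)
```

With these counts the overall accuracy is 0.96 while the mean class accuracy is only 0.6, because the rare class is mostly misclassified. This is the effect of the per-class point proportion that the abstract lists as a contributing factor.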


2020 ◽  
Author(s):  
Ali Hatamizadeh ◽  
Demetri Terzopoulos ◽  
Andriy Myronenko

Textures and edges contribute different information to image recognition. Edges and boundaries encode shape information, while textures manifest the appearance of regions. Despite the success of Convolutional Neural Networks (CNNs) in computer vision and medical image analysis applications, predominantly only texture abstractions are learned, which often leads to imprecise boundary delineations. In medical imaging, expert manual segmentation often relies on organ boundaries; for example, to manually segment a liver, a medical practitioner usually identifies edges first and subsequently fills in the segmentation mask. Motivated by these observations, we propose a plug-and-play module, dubbed Edge-Gated CNNs (EG-CNNs), that can be used with existing encoder-decoder architectures to process both edge and texture information. The EG-CNN learns to emphasize the edges in the encoder, to predict crisp boundaries by an auxiliary edge supervision, and to fuse its output with the original CNN output. We evaluate the effectiveness of the EG-CNN with various mainstream CNNs on two publicly available datasets, BraTS 19 and KiTS 19 for brain tumor and kidney semantic segmentation. We demonstrate how the addition of EG-CNN consistently improves segmentation accuracy and generalization performance.


Author(s):  
T. S. Akiyama ◽  
J. Marcato Junior ◽  
W. N. Gonçalves ◽  
P. O. Bressan ◽  
A. Eltner ◽  
...  

The use of deep learning (DL) with convolutional neural networks (CNN) to monitor surface water can be a valuable supplement to costly and labour-intensive standard gauging stations. This paper presents the application of a recent CNN semantic segmentation method (SegNet) to automatically segment river water in imagery acquired by RGB sensors. This approach can be used as a new supporting tool because there are only a few studies using DL techniques to monitor water resources. The study area is a medium-scale river (Wesenitz) located in the East of Germany. The captured images reflect different periods of the day over a period of approximately 50 days, allowing for the analysis of the river in different environmental conditions and situations. In the experiments, we evaluated the input image resolutions of 256 × 256 and 512 × 512 pixels to assess their influence on the performance of river segmentation. The performance of the CNN was measured with the pixel accuracy and IoU metrics, revealing an accuracy of 98% and 97%, respectively, for both resolutions, indicating that our approach is effective at segmenting water in RGB imagery.


2021 ◽  
Vol 19 (3) ◽  
pp. 26-39
Author(s):  
D. E. Shabalina ◽  
K. S. Lanchukovskaya ◽  
T. V. Liakh ◽  
K. V. Chaika

The article evaluates the applicability of existing semantic segmentation algorithms to the “Duckietown” simulator. The article explores classical semantic segmentation algorithms as well as ones based on neural networks. We also examined machine learning frameworks, taking into account all the limitations of the “Duckietown” simulator. According to the research results, we selected neural network algorithms based on U-Net, SegNet, DeepLab-v3, FC-DenseNet and PSPNet networks to solve the segmentation problem in the “Duckietown” project. U-Net and SegNet have been tested on the “Duckietown” simulator.


Author(s):  
Ryan Cotterell ◽  
Hinrich Schütze

Much like sentences are composed of words, words themselves are composed of smaller units. For example, the English word questionably can be analyzed as question+able+ly. However, this structural decomposition of the word does not directly give us a semantic representation of the word’s meaning. Since morphology obeys the principle of compositionality, the semantics of the word can be systematically derived from the meaning of its parts. In this work, we propose a novel probabilistic model of word formation that captures both the analysis of a word w into its constituent segments and the synthesis of the meaning of w from the meanings of those segments. Our model jointly learns to segment words into morphemes and compose distributional semantic vectors of those morphemes. We experiment with the model on English CELEX data and German DErivBase (Zeller et al., 2013) data. We show that jointly modeling semantics increases both segmentation accuracy and morpheme F1 by between 3% and 5%. Additionally, we investigate different models of vector composition, showing that recurrent neural networks yield an improvement over simple additive models. Finally, we study the degree to which the representations correspond to a linguist’s notion of morphological productivity.

