Image-level to Pixel-wise Labeling: From Theory to Practice

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/129 ◽

2018 ◽

Cited By ~ 1

Author(s):

Tiezhu Sun ◽

Wei Zhang ◽

Zhijie Wang ◽

Lin Ma ◽

Zequn Jie

Keyword(s):

Neural Networks ◽

Information Theory ◽

Semantic Segmentation ◽

Input Image ◽

Great Success ◽

Good Image ◽

Theory To Practice ◽

Segmentation Accuracy ◽

Model Training ◽

Segmentation Problem

Conventional convolutional neural networks (CNNs) have achieved great success in image semantic segmentation. Existing methods mainly focus on learning pixel-wise labels from an image directly. In this paper, we advocate tackling the pixel-wise segmentation problem by considering the image-level classification labels. Theoretically, we analyze and discuss the effects of image-level labels on pixel-wise segmentation from the perspective of information theory. In practice, an end-to-end segmentation model is built by fusing the image-level and pixel-wise labeling networks. A generative network is included to reconstruct the input image and further boost the segmentation model training with an auxiliary loss. Extensive experimental results on a benchmark dataset demonstrate the effectiveness of the proposed method, where good image-level labels can significantly improve the pixel-wise segmentation accuracy.
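As a rough illustration of how image-level labels can inform pixel-wise predictions, the sketch below reweights per-pixel class probabilities by image-level class probabilities and renormalizes. This is not the fusion network described in the abstract; the function names `fuse_image_level` and `argmax_labels` and the multiplicative fusion rule are assumptions made purely for illustration.

```python
def fuse_image_level(pixel_probs, image_probs, eps=1e-12):
    """Reweight per-pixel class probabilities by image-level class probabilities.

    pixel_probs: list of rows; each row is a list of pixels; each pixel is a
                 list with one probability per class.
    image_probs: list with one probability per class for the whole image.
    Hypothetical multiplicative fusion, not the paper's architecture.
    """
    fused = []
    for row in pixel_probs:
        fused_row = []
        for probs in row:
            # Scale each class probability by the image-level belief in that class
            weighted = [p * q for p, q in zip(probs, image_probs)]
            total = sum(weighted) + eps
            fused_row.append([w / total for w in weighted])
        fused.append(fused_row)
    return fused


def argmax_labels(prob_map):
    """Return the most probable class index for every pixel."""
    return [[max(range(len(p)), key=p.__getitem__) for p in row] for row in prob_map]
```

With a confident image-level prediction, a pixel whose own scores weakly favor an absent class can flip to a class that the image-level label supports.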


NEURAL NETWORK METHODS FOR PLANAR IMAGE ANALYSIS IN AUTOMATED SCREENING SYSTEMS

Applied Aspects of Information Technology ◽

10.15276/aait.01.2021.6 ◽

2021 ◽

Vol 4 (1) ◽

pp. 71-79

Author(s):

Borys Igorovych Tymchenko

Keyword(s):

Neural Network ◽

Neural Networks ◽

Early Stage ◽

Semantic Segmentation ◽

Input Image ◽

Training Data ◽

Classification Task ◽

Automated Screening ◽

The Cost ◽

Screening Systems

Preventive management tools are being actively developed in many spheres of human life. The task of automated screening is to detect hidden problems at an early stage without human intervention, while the cost of responding to them is still low. Visual inspection is often used to perform a screening task, and deep artificial neural networks are especially popular in image processing. One of the main problems when working with them is the need for a large amount of well-labeled training data. In automated screening systems, available neural network approaches are limited in the reliability of their predictions by the lack of accurately labeled training data, as obtaining quality annotations from professionals is very expensive and sometimes not possible in principle. There is therefore a contradiction between the growing requirements for the precision of neural network predictions without increasing the time spent, on the one hand, and the need to reduce the cost of labeling training data, on the other. In this paper, we propose a parametric model of the segmentation dataset, which can be used to generate training data for model selection and benchmarking, and a multi-task learning method for training and inference of deep neural networks for semantic segmentation. Based on the proposed method, we develop a semi-supervised approach for segmentation of salient regions for the classification task. The main advantage of the proposed method is that it uses semantically similar general tasks that have better labeling than the original one, which allows users to reduce the cost of the labeling process. We propose to use the classification task as a more general counterpart to the problem of semantic segmentation: whereas semantic segmentation aims to classify each pixel in the input image, classification aims to assign a single class to all of the pixels in the input image. We evaluate our methods using the proposed dataset model, observing a Dice score improvement of seventeen percent. Additionally, we evaluate the robustness of the proposed method to different amounts of noise in the labels and observe consistent improvement over the baseline version.
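For readers unfamiliar with the Dice score reported above, a minimal sketch of the standard definition for binary masks follows. The function name `dice_score` and the convention for two empty masks are assumptions; the abstract does not say how the author computed the metric.

```python
def dice_score(pred, target):
    """Dice coefficient 2 * |A and B| / (|A| + |B|) for binary masks.

    pred, target: nested lists of 0/1 values with identical shape.
    """
    intersection = 0
    pred_sum = 0
    target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += p * t
            pred_sum += p
            target_sum += t
    if pred_sum + target_sum == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumed convention)
    return 2.0 * intersection / (pred_sum + target_sum)
```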


Semantic Image Segmentation with Deep Convolutional Neural Networks and Quick Shift

Symmetry ◽

10.3390/sym12030427 ◽

2020 ◽

Vol 12 (3) ◽

pp. 427 ◽

Cited By ~ 1

Author(s):

Sanxing Zhang ◽

Zhenhuan Ma ◽

Gang Zhang ◽

Tao Lei ◽

Rui Zhang ◽

...

Keyword(s):

Neural Networks ◽

Image Segmentation ◽

Convolutional Neural Networks ◽

Semantic Segmentation ◽

Autonomous Driving ◽

Input Image ◽

Feature Representation ◽

Segmentation Algorithm ◽

Deep Convolutional Neural Networks ◽

Semantic Image Segmentation

Semantic image segmentation, as one of the most popular tasks in computer vision, has been widely used in autonomous driving, robotics and other fields. Currently, deep convolutional neural networks (DCNNs) are driving major advances in semantic segmentation due to their powerful feature representation. However, DCNNs extract high-level feature representations by strided convolution, which makes it impossible to segment foreground objects precisely, especially when locating object boundaries. This paper presents a novel semantic segmentation algorithm that combines DeepLab v3+ with the superpixel segmentation algorithm quick shift. DeepLab v3+ is employed to generate a class-indexed score map for the input image. Quick shift is applied to segment the input image into superpixels. The outputs of both are then fed into a class voting module to refine the semantic segmentation results. Extensive experiments with the proposed semantic image segmentation method are performed on the PASCAL VOC 2012 dataset, and the results show that the proposed method can provide a more efficient solution.
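The class voting step can be pictured with a small sketch: every pixel inside a superpixel receives the class that most of that superpixel's pixels were assigned. This is a simplified majority vote over hard labels and is an assumption; the paper's voting module may operate on the score map instead. The name `superpixel_vote` is hypothetical, and the superpixel ids are taken as given rather than computed by quick shift.

```python
from collections import Counter, defaultdict


def superpixel_vote(pixel_labels, superpixel_ids):
    """Assign every pixel the majority class of its superpixel.

    pixel_labels:   nested lists of per-pixel class indices (e.g. argmax of a score map).
    superpixel_ids: nested lists of the same shape holding a superpixel id per pixel.
    """
    votes = defaultdict(Counter)
    for label_row, sp_row in zip(pixel_labels, superpixel_ids):
        for label, sp in zip(label_row, sp_row):
            votes[sp][label] += 1
    # Highest count wins; ties resolve to the smallest class index for determinism
    winner = {sp: min(c, key=lambda k: (-c[k], k)) for sp, c in votes.items()}
    return [[winner[sp] for sp in sp_row] for sp_row in superpixel_ids]
```

Because superpixels tend to follow image edges, this kind of vote snaps ragged network predictions to object boundaries.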


LCC-Net: A Lightweight Cross-Consistency Network for Semisupervised Cardiac MR Image Segmentation

Computational and Mathematical Methods in Medicine ◽

10.1155/2021/9960199 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Lai Song ◽

Jiajin Yi ◽

Jialin Peng

Keyword(s):

Image Segmentation ◽

Semantic Segmentation ◽

Representation Learning ◽

Input Image ◽

Unlabeled Data ◽

Clinical Practices ◽

Mr Image ◽

Vast Number ◽

Computation Efficiency ◽

Model Training

Semantic segmentation plays a crucial role in cardiac magnetic resonance (MR) image analysis. Although supervised deep learning methods have made significant performance improvements, they rely heavily on a large amount of pixel-wise annotated data, which is often unavailable in clinical practice. Besides, top-performing methods usually have a vast number of parameters, which results in high computational complexity for model training and testing. This study addresses cardiac image segmentation in scenarios where few labeled data are available with a lightweight cross-consistency network named LCC-Net. Specifically, to reduce the risk of overfitting on small labeled datasets, we substitute computationally intensive standard convolutions with a lightweight module. To leverage plenty of unlabeled data, we introduce extreme consistency learning, which enforces equivariant constraints on the predictions of different perturbed versions of the input image. Cutting and mixing different training images, as an extreme perturbation on both the labeled and unlabeled data, is utilized to enhance robust representation learning. Extensive comparisons demonstrate that the proposed model shows promising performance with high annotation and computation efficiency. With only two annotated subjects for model training, the LCC-Net obtains a performance gain of 14.4% in the mean Dice over the baseline U-Net trained from scratch.
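The cut-and-mix perturbation can be sketched as pasting a rectangular patch from one image into another. The helper below is a minimal illustration under assumed names and a fixed, caller-chosen patch; it is not the LCC-Net implementation, which the abstract does not detail.

```python
def cut_and_mix(image_a, image_b, top, left, height, width):
    """Paste a rectangular patch of image_b into a copy of image_a.

    Images are nested lists of equal size. Calling the function with the same
    patch on the corresponding masks or predictions keeps them aligned with
    the mixed image.
    """
    mixed = [row[:] for row in image_a]  # copy so image_a is left untouched
    for r in range(top, min(top + height, len(mixed))):
        for c in range(left, min(left + width, len(mixed[r]))):
            mixed[r][c] = image_b[r][c]
    return mixed
```

Under an equivariance constraint of the kind described above, the prediction for the mixed image would be encouraged to match the same mix applied to the two individual predictions.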


Lung X-ray Segmentation using Deep Convolutional Neural Networks on Contrast-Enhanced Binarized Images

Mathematics ◽

10.3390/math8040545 ◽

2020 ◽

Vol 8 (4) ◽

pp. 545 ◽

Cited By ~ 2

Author(s):

Hsin-Jui Chen ◽

Shanq-Jang Ruan ◽

Sha-Wo Huang ◽

Yan-Tsung Peng

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Deep Convolutional Neural Networks ◽

X Ray ◽

Processing Approach ◽

Segmentation Accuracy ◽

Enhancement Method ◽

Digital Chest ◽

Chest X Ray ◽

Model Training

Automatically locating the lung regions effectively and efficiently in digital chest X-ray (CXR) images is important in computer-aided diagnosis. In this paper, we propose an adaptive pre-processing approach for segmenting the lung regions from CXR images using convolutional neural network-based (CNN-based) architectures. It comprises three steps. First, a contrast enhancement method specifically designed for CXR images is adopted. Second, adaptive image binarization is applied to CXR images to separate the image foreground and background. Third, CNN-based architectures are trained on the binarized images for image segmentation. The experimental results show that the proposed pre-processing approach is applicable to and effective for various CNN-based architectures and can achieve segmentation accuracy comparable to that of state-of-the-art methods while expediting model training by up to 20.74% and reducing storage space for CXR image datasets by down to 94.6% on average.


An interactive system for muscle and fat tissue identification of the lumbar spine using semantic segmentation

Current Directions in Biomedical Engineering ◽

10.1515/cdbme-2021-2099 ◽

2021 ◽

Vol 7 (2) ◽

pp. 391-394

Author(s):

Richard Bieck ◽

David Baur ◽

Johann Berger ◽

Tim Stelzner ◽

Anna Völker ◽

...

Keyword(s):

Lumbar Spine ◽

Semantic Segmentation ◽

Interactive System ◽

Fat Tissue ◽

Web Based ◽

Tissue Identification ◽

Distortion Functions ◽

Segmentation Accuracy ◽

Lumbar Mri ◽

Model Training

We introduce a system that allows the immediate identification and inspection of fat and muscle structures around the lumbar spine as a means of orthopaedic diagnostics before surgical treatment. The system comprises a backend component that accepts MRI data from a web-based interactive frontend as REST requests. The MRI data is passed through a U-net model, fine-tuned on lumbar MRI images, to generate segmentation masks of fat and muscle areas. The result is sent back to the frontend that functions as an inspection tool. For the model training, 4000 MRI images from 108 patients were used in a k-fold cross-validation study with k = 10. The model training was performed over 25-30 epochs. We applied shift, scale, and rotation operations as well as elastic deformation and distortion functions for image augmentation and a combined objective function using Dice and Focal loss. The trained models reached a mean Dice score of 0.83 and 0.52 and a mean area error tissue of 0.1 and 0.3 for muscle and fat tissue, respectively. The interactive web-based frontend as an inspection tool was evaluated by clinicians to be suitable for the exploration of patient data as well as the assessment of segmentation results. We developed a system that uses semantic segmentation to identify fat and muscle tissue areas in MRI images of the lumbar spine. Further improvements should focus on the segmentation accuracy of fat tissue, as it is a determining factor in surgical decision-making. To our knowledge, this is the first system that automatically provides semantic information of the respective lumbar tissues.
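A combined Dice and focal objective of the kind mentioned above can be sketched as a weighted sum of the two terms. The equal weighting, the smoothing constant, the focusing parameter `gamma=2.0`, and all function names are assumptions; the abstract does not state how the authors combined the losses.

```python
import math


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss for flat lists of foreground probabilities and 0/1 targets."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)


def focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Mean binary focal loss: -(1 - p_t)^gamma * log(p_t)."""
    total = 0.0
    for p, t in zip(probs, targets):
        p_t = p if t == 1 else 1.0 - p        # probability given to the true class
        p_t = min(max(p_t, eps), 1.0 - eps)   # clamp to avoid log(0)
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)


def combined_loss(probs, targets, dice_weight=0.5):
    """Weighted sum of Dice and focal loss; the weight is an assumed value."""
    return (dice_weight * dice_loss(probs, targets)
            + (1.0 - dice_weight) * focal_loss(probs, targets))
```

The Dice term rewards region overlap, which helps with small structures, while the focal term down-weights easy pixels so that hard ones dominate the gradient.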


A COMPARATIVE STUDY OF POINT CLOUDS SEMANTIC SEGMENTATION USING THREE DIFFERENT NEURAL NETWORKS ON THE RAILWAY STATION DATASET

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xliii-b3-2021-223-2021 ◽

2021 ◽

Vol XLIII-B3-2021 ◽

pp. 223-228

Author(s):

Y. A. Lumban-Gaol ◽

Z. Chen ◽

M. Smit ◽

X. Li ◽

M. A. Erbaşu ◽

...

Keyword(s):

Neural Networks ◽

Point Cloud ◽

Semantic Segmentation ◽

Point Clouds ◽

Railway Station ◽

Train Station ◽

Cloud Data ◽

Indoor Scenes ◽

Segmentation Problem

Point cloud data have rich semantic representations and can benefit various applications towards a digital twin. However, they are unordered and anisotropically distributed, and are thus unsuitable for a typical convolutional neural network (CNN) to handle. With the advance of deep learning, several neural networks claim to have solved the point cloud semantic segmentation problem. This paper evaluates three different neural networks for semantic segmentation of point clouds, namely PointNet++, PointCNN and DGCNN. A public indoor scene of the Amersfoort railway station is used as the study area. Unlike typical indoor scenes, and even more unlike the ubiquitous outdoor ones, in currently available datasets, the station consists of objects such as the entrance gates, ticket machines, couches, and garbage cans. For the experiment, we use subsets of the data, remove the noise, and evaluate the performance of the selected neural networks. The results indicate an overall accuracy of more than 90% for all the networks, but the networks vary in terms of mean class accuracy and mean Intersection over Union (IoU). The misclassification mainly occurs in the classes of couch and garbage can. Several factors that may contribute to the errors are analyzed, such as the quality of the data and the proportion of the number of points per class. The adaptability of the networks is also heavily dependent on the training location: the overall characteristics of the train station make a network trained for one location less suitable for another.
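The three metrics named above can diverge sharply when classes are imbalanced, which a small sketch makes concrete. The function below computes them from a confusion matrix using the standard definitions; the name `segmentation_metrics` and the choice to skip classes absent from the data are assumptions, not details taken from the paper.

```python
def segmentation_metrics(true_labels, pred_labels, num_classes):
    """Return (overall accuracy, mean class accuracy, mean IoU) for flat label lists."""
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, pred_labels):
        confusion[t][p] += 1  # rows: true class, columns: predicted class

    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(num_classes))
    class_acc, class_iou = [], []
    for k in range(num_classes):
        tp = confusion[k][k]
        actual = sum(confusion[k])                    # points whose true class is k
        predicted = sum(row[k] for row in confusion)  # points predicted as k
        union = actual + predicted - tp
        if actual:  # skip classes that never occur in the ground truth
            class_acc.append(tp / actual)
        if union:   # skip classes absent from both truth and prediction
            class_iou.append(tp / union)
    return (correct / total,
            sum(class_acc) / len(class_acc),
            sum(class_iou) / len(class_iou))
```

If nine of ten points belong to one class and a model predicts that class everywhere, overall accuracy is 0.9 while mean class accuracy and mean IoU drop to 0.5 and 0.45, which mirrors the pattern the abstract reports for rare classes.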


Edge-Gated CNNs for Volumetric Semantic Segmentation of Medical Images

10.1101/2020.03.14.992115 ◽

2020 ◽

Author(s):

Ali Hatamizadeh ◽

Demetri Terzopoulos ◽

Andriy Myronenko

Keyword(s):

Neural Networks ◽

Image Analysis ◽

Medical Image ◽

Medical Image Analysis ◽

Medical Practitioner ◽

Semantic Segmentation ◽

Manual Segmentation ◽

Shape Information ◽

Texture Information ◽

Segmentation Accuracy

Textures and edges contribute different information to image recognition. Edges and boundaries encode shape information, while textures manifest the appearance of regions. Despite the success of Convolutional Neural Networks (CNNs) in computer vision and medical image analysis applications, predominantly only texture abstractions are learned, which often leads to imprecise boundary delineations. In medical imaging, expert manual segmentation often relies on organ boundaries; for example, to manually segment a liver, a medical practitioner usually identifies edges first and subsequently fills in the segmentation mask. Motivated by these observations, we propose a plug-and-play module, dubbed Edge-Gated CNNs (EG-CNNs), that can be used with existing encoder-decoder architectures to process both edge and texture information. The EG-CNN learns to emphasize the edges in the encoder, to predict crisp boundaries by an auxiliary edge supervision, and to fuse its output with the original CNN output. We evaluate the effectiveness of the EG-CNN with various mainstream CNNs on two publicly available datasets, BraTS 19 and KiTS 19 for brain tumor and kidney semantic segmentation. We demonstrate how the addition of EG-CNN consistently improves segmentation accuracy and generalization performance.


DEEP LEARNING APPLIED TO WATER SEGMENTATION

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xliii-b2-2020-1189-2020 ◽

2020 ◽

Vol XLIII-B2-2020 ◽

pp. 1189-1193

Author(s):

T. S. Akiyama ◽

J. Marcato Junior ◽

W. N. Gonçalves ◽

P. O. Bressan ◽

A. Eltner ◽

...

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Surface Water ◽

Water Resources ◽

River Water ◽

Environmental Conditions ◽

Semantic Segmentation ◽

Input Image ◽

Segmentation Method ◽

Supporting Tool

The use of deep learning (DL) with convolutional neural networks (CNN) to monitor surface water can be a valuable supplement to costly and labour-intensive standard gauging stations. This paper presents the application of a recent CNN semantic segmentation method (SegNet) to automatically segment river water in imagery acquired by RGB sensors. This approach can be used as a new supporting tool because there are only a few studies using DL techniques to monitor water resources. The study area is a medium-scale river (Wesenitz) located in the East of Germany. The captured images reflect different times of day over a period of approximately 50 days, allowing for the analysis of the river in different environmental conditions and situations. In the experiments, we evaluated the input image resolutions of 256 × 256 and 512 × 512 pixels to assess their influence on the performance of river segmentation. The performance of the CNN was measured with the pixel accuracy and IoU metrics, revealing an accuracy of 98% and 97%, respectively, for both resolutions, indicating that our approach is efficient at segmenting water in RGB imagery.


Semantic Image Segmentation in Duckietown

Vestnik NSU Series Information Technologies ◽

10.25205/1818-7900-2021-19-3-26-39 ◽

2021 ◽

Vol 19 (3) ◽

pp. 26-39

Author(s):

D. E. Shabalina ◽

K. S. Lanchukovskaya ◽

T. V. Liakh ◽

K. V. Chaika

Keyword(s):

Neural Network ◽

Machine Learning ◽

Neural Networks ◽

Image Segmentation ◽

Semantic Segmentation ◽

Network Algorithms ◽

Segmentation Algorithms ◽

Classical Semantic ◽

Learning Frameworks ◽

Segmentation Problem

The article evaluates the applicability of existing semantic segmentation algorithms to the “Duckietown” simulator. It explores classical semantic segmentation algorithms as well as ones based on neural networks. We also examined machine learning frameworks, taking into account all the limitations of the “Duckietown” simulator. According to the research results, we selected neural network algorithms based on the U-Net, SegNet, DeepLab-v3, FC-DenseNet and PSPNet networks to solve the segmentation problem in the “Duckietown” project. U-Net and SegNet have been tested on the “Duckietown” simulator.


Joint Semantic Synthesis and Morphological Analysis of the Derived Word

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00003 ◽

2018 ◽

Vol 6 ◽

pp. 33-48 ◽

Cited By ~ 4

Author(s):

Ryan Cotterell ◽

Hinrich Schütze

Keyword(s):

Neural Networks ◽

Recurrent Neural Networks ◽

Probabilistic Model ◽

Morphological Analysis ◽

Semantic Representation ◽

English Word ◽

Word Formation ◽

Additive Models ◽

Segmentation Accuracy ◽

Morphological Productivity

Much like sentences are composed of words, words themselves are composed of smaller units. For example, the English word questionably can be analyzed as question+able+ly. However, this structural decomposition of the word does not directly give us a semantic representation of the word’s meaning. Since morphology obeys the principle of compositionality, the semantics of the word can be systematically derived from the meaning of its parts. In this work, we propose a novel probabilistic model of word formation that captures both the analysis of a word w into its constituent segments and the synthesis of the meaning of w from the meanings of those segments. Our model jointly learns to segment words into morphemes and compose distributional semantic vectors of those morphemes. We experiment with the model on English CELEX data and German DErivBase (Zeller et al., 2013) data. We show that jointly modeling semantics increases both segmentation accuracy and morpheme F1 by between 3% and 5%. Additionally, we investigate different models of vector composition, showing that recurrent neural networks yield an improvement over simple additive models. Finally, we study the degree to which the representations correspond to a linguist’s notion of morphological productivity.
