Supervoxel Convolution for Online 3D Semantic Segmentation

2021 ◽  
Vol 40 (3) ◽  
pp. 1-15
Author(s):  
Shi-Sheng Huang ◽  
Ze-Yu Ma ◽  
Tai-Jiang Mu ◽  
Hongbo Fu ◽  
Shi-Min Hu

Online 3D semantic segmentation, which aims to perform real-time 3D scene reconstruction along with semantic segmentation, is an important but challenging topic. A key challenge is to strike a balance between efficiency and segmentation accuracy. There are very few deep-learning-based solutions to this problem, since the commonly used deep representations based on volumetric grids or points do not provide an efficient 3D representation or organizational structure for online segmentation. Observing that on-surface supervoxels, i.e., clusters of on-surface voxels, provide a compact representation of 3D surfaces and bring an efficient connectivity structure via supervoxel clustering, we explore a supervoxel-based deep learning solution for this task. To this end, we contribute a novel convolution operation (SVConv) that operates directly on supervoxels. SVConv can efficiently fuse the multi-view 2D features and 3D features projected onto supervoxels during online 3D reconstruction, leading to an effective supervoxel-based convolutional neural network, termed Supervoxel-CNN, which enables joint 2D-3D learning for 3D semantic prediction. With the Supervoxel-CNN, we propose a clustering-then-prediction approach for online 3D semantic segmentation. Extensive evaluations on public 3D indoor scene datasets show that our approach significantly outperforms existing online semantic segmentation systems in terms of efficiency or accuracy.
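For readers unfamiliar with convolutions over irregular clusters, the sketch below shows one plausible form a supervoxel convolution could take: each supervoxel holds a fused 2D/3D feature vector and aggregates over the neighbors given by the clustering adjacency. This is a minimal PyTorch illustration under assumed interfaces (the class layout and the neighbor_idx tensor are ours), not the authors' implementation.

```python
# Minimal sketch of a convolution over supervoxels, assuming PyTorch.
import torch
import torch.nn as nn

class SVConv(nn.Module):
    """Each supervoxel carries a feature vector (fused multi-view 2D + 3D
    features) and aggregates over the neighbors given by the supervoxel
    clustering adjacency."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * in_dim, out_dim), nn.ReLU())

    def forward(self, feats, neighbor_idx):
        # feats: (N, C) per-supervoxel features
        # neighbor_idx: (N, K) indices of K adjacent supervoxels
        neighbors = feats[neighbor_idx]            # (N, K, C)
        pooled = neighbors.max(dim=1).values       # (N, C) context over neighbors
        fused = torch.cat([feats, pooled], dim=-1) # (N, 2C) center + context
        return self.mlp(fused)                     # (N, out_dim)

# Usage: 1000 supervoxels, 64-dim fused features, 8 neighbors each.
feats = torch.randn(1000, 64)
neighbor_idx = torch.randint(0, 1000, (1000, 8))
out = SVConv(64, 128)(feats, neighbor_idx)         # (1000, 128)
```

Max-pooling over neighbors keeps the operation independent of neighbor ordering, which matters because supervoxel adjacency carries no canonical order.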

Author(s):  
Vladimir V. Kniaz ◽  
Vladimir A. Knyaz ◽  
Evgeny V. Ippolitov ◽  
Mikhail M. Novikov ◽  
Lev Grodzitsky ◽  
...  

Author(s):  
E A Dmitriev ◽  
V V Myasnikov

This paper presents a method for pixel-by-pixel estimation of the feasibility of 3D scene reconstruction from multiple images. The method uses convolutional neural networks to estimate the number of conjugate pairs, which are then used for 3D reconstruction with a classical approach. We considered neural networks that have shown good results on the semantic segmentation problem. The efficiency criterion for an algorithm is the accuracy of the resulting estimate. All experiments were conducted on images rendered with the Unity 3D engine. The experimental results demonstrate the effectiveness of our approach for the 3D scene reconstruction problem.
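As a rough illustration of per-pixel estimation with a segmentation-style network, the following sketch regresses a one-channel conjugate-pair count map from an input view; the tiny encoder-decoder and the name PairCountNet are illustrative assumptions, not the paper's model.

```python
# Hedged sketch: per-pixel regression with an encoder-decoder CNN (PyTorch).
import torch
import torch.nn as nn

class PairCountNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 2, stride=2))  # one channel: count map

    def forward(self, x):
        return self.decoder(self.encoder(x))  # (B, 1, H, W) per-pixel estimate

image = torch.randn(1, 3, 256, 256)    # a rendered view (dummy data)
count_map = PairCountNet()(image)      # higher values: more conjugate pairs
```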


2020 ◽  
Vol 9 (10) ◽  
pp. 601
Author(s):  
Ahram Song ◽  
Yongil Kim

Although semantic segmentation of remote-sensing (RS) images using deep-learning networks has recently demonstrated its effectiveness, obtaining RS images under consistent conditions to construct data labels is difficult compared with natural-image datasets. Indeed, small datasets limit the effective learning of deep-learning networks. To address this problem, we propose a combined U-net model that is trained using a combined weighted loss function and can handle heterogeneous datasets. The network consists of encoder and decoder blocks. The convolutional layers that form the encoder blocks are shared across the heterogeneous datasets, while the decoder blocks are assigned separate training weights. Herein, the International Society for Photogrammetry and Remote Sensing (ISPRS) Potsdam and Cityscapes datasets are used as the RS and natural-image datasets, respectively. When the layers are shared, only the visible bands of the ISPRS Potsdam data are used. Experimental results show that when same-sized heterogeneous datasets are used, the semantic segmentation accuracy of the Potsdam data obtained with our proposed method is lower than that obtained using only the Potsdam data (four bands) with other methods, such as SegNet, DeepLab-V3+, and a simplified version of U-net. However, the segmentation accuracy of the Potsdam images improves when the larger Cityscapes dataset is used. The combined U-net model can effectively train on heterogeneous datasets and overcome the problem of insufficient training data in the context of RS-image datasets. Furthermore, the proposed method is expected to be applicable not only to the segmentation of aerial images but also to other tasks that use large heterogeneous datasets.
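The shared-encoder/separate-decoder arrangement with a combined weighted loss can be sketched as follows; layer sizes, class counts, and the equal dataset weights are illustrative assumptions, not the paper's configuration.

```python
# Sketch: one shared encoder, one decoder per dataset, combined weighted loss.
import torch
import torch.nn as nn

encoder = nn.Sequential(               # shared across both datasets
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
decoder_rs = nn.Sequential(            # decoder for the RS (Potsdam) data
    nn.ConvTranspose2d(32, 6, 2, stride=2))   # 6 Potsdam classes
decoder_city = nn.Sequential(          # decoder for Cityscapes
    nn.ConvTranspose2d(32, 19, 2, stride=2))  # 19 Cityscapes classes

ce = nn.CrossEntropyLoss()
x_rs, y_rs = torch.randn(2, 3, 64, 64), torch.randint(0, 6, (2, 64, 64))
x_city, y_city = torch.randn(2, 3, 64, 64), torch.randint(0, 19, (2, 64, 64))

w_rs, w_city = 0.5, 0.5                # per-dataset loss weights (assumed)
loss = w_rs * ce(decoder_rs(encoder(x_rs)), y_rs) \
     + w_city * ce(decoder_city(encoder(x_city)), y_city)
loss.backward()                        # the shared encoder learns from both
```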


Author(s):  
Yang Zhao ◽  
Wei Tian ◽  
Hong Cheng

Abstract: With the fast development of deep learning models in the field of autonomous driving, research on the uncertainty estimation of deep learning models has also prevailed. Herein, a pyramid Bayesian deep learning method is proposed for evaluating the model uncertainty of semantic segmentation. Semantic segmentation is one of the most important perception problems in visual scene understanding, which is critical for autonomous driving. This study aims to optimize Bayesian SegNet for uncertainty evaluation. The paper first simplifies the network structure of Bayesian SegNet by reducing the number of MC-Dropout layers and then introduces a pyramid pooling module to improve the performance of Bayesian SegNet. mIoU and mPAvPU are used as evaluation metrics to test the proposed method on the public Cityscapes dataset. The experimental results show that the proposed method improves the sampling effect of Bayesian SegNet, shortens the sampling time, and improves the network performance.
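Bayesian SegNet's uncertainty estimates come from Monte Carlo dropout, i.e., keeping dropout active at test time and averaging several stochastic forward passes. Below is a minimal sketch of that mechanism, with a toy model standing in for the simplified Bayesian SegNet; the sample count and layer sizes are assumptions.

```python
# Sketch of MC-Dropout sampling for segmentation uncertainty (PyTorch).
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Dropout2d(p=0.5),               # kept active at test time
    nn.Conv2d(16, 19, 3, padding=1))   # 19 Cityscapes classes

model.eval()
for m in model.modules():              # re-enable only the dropout layers
    if isinstance(m, nn.Dropout2d):
        m.train()

x = torch.randn(1, 3, 128, 128)
with torch.no_grad():
    probs = torch.stack([model(x).softmax(dim=1) for _ in range(20)])
mean = probs.mean(dim=0)                    # averaged class probabilities
uncertainty = probs.var(dim=0).sum(dim=1)   # per-pixel predictive variance
```

Fewer MC-Dropout layers means less stochasticity per pass, which is why reducing them (as the paper does) can shorten sampling while preserving useful uncertainty estimates.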


2019 ◽  
Author(s):  
Jungirl Seok ◽  
Jae-Jin Song ◽  
Ja-Won Koo ◽  
Hee Chan Kim ◽  
Byung Yoon Choi

Abstract:
Objectives: The purpose of this study was to create a deep learning model for the detection and segmentation of major structures of the tympanic membrane.
Methods: A total of 920 stored tympanic endoscopic images were obtained retrospectively. We constructed a detection and segmentation model using Mask R-CNN with a ResNet-50 backbone, targeting three clinically meaningful structures: (1) the tympanic membrane (TM); (2) the malleus with the side of the tympanic membrane; and (3) the suspected perforation area. The images were randomly divided into three sets (training, validation, and test) at a ratio of 0.6:0.2:0.2, resulting in 548, 187, and 185 images, respectively. After assignment, the 548 training images were augmented 50 times each, reaching 27,400 images.
Results: At its most optimized point, the model achieved a mean average precision of 92.9% on the test set. Using an Intersection over Union (IoU) score greater than 0.5 as the reference point, the tympanic membrane was 100% detectable, the accuracy of determining the side of the tympanic membrane from the malleus segmentation was 88.6%, and the detection accuracy for suspected perforations was 91.4%.
Conclusions: Anatomical segmentation may allow an explanation provided by deep learning to be included as part of the results. This method is applicable not only to tympanic endoscopy but also to sinus endoscopy, laryngoscopy, and stroboscopy. Finally, it can serve as a starting point for the development of automated medical record descriptors for endoscopic images.
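A Mask R-CNN with a ResNet-50 backbone of the kind described here is available off the shelf in torchvision; the sketch below shows a plausible setup with four classes (background plus the three target structures). The score threshold and input size are assumptions, not values from the study.

```python
# Sketch: Mask R-CNN (ResNet-50 FPN backbone) via torchvision.
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

# 4 classes: background + tympanic membrane + malleus/side + perforation.
model = maskrcnn_resnet50_fpn(weights=None, num_classes=4)
model.eval()

images = [torch.rand(3, 512, 512)]     # one dummy endoscopic frame
with torch.no_grad():
    pred = model(images)[0]            # dict: boxes, labels, scores, masks
keep = pred["scores"] > 0.5            # keep confident detections only
masks = pred["masks"][keep]            # (n, 1, 512, 512) soft instance masks
```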


PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0257001
Author(s):  
Rubi Quiñones ◽  
Francisco Munoz-Arriola ◽  
Sruti Das Choudhury ◽  
Ashok Samal

Cosegmentation is a newly emerging computer vision technique used to segment an object from the background by processing multiple images at the same time. Traditional plant phenotyping analysis uses thresholding segmentation methods, which result in high segmentation accuracy. Although machine learning and deep learning algorithms have been proposed for plant segmentation, their predictions rely on the specific features being present in the training set. The need for a multi-featured dataset and analytics for cosegmentation becomes critical to better understand and predict plants' responses to the environment. High-throughput phenotyping produces an abundance of data that can be leveraged to improve segmentation accuracy and plant phenotyping. This paper introduces four datasets consisting of two plant species, buckwheat and sunflower, each split into control and drought conditions. Each dataset has three modalities (fluorescence, infrared, and visible) with 7 to 14 temporal images collected in a high-throughput facility at the University of Nebraska-Lincoln. The four datasets (referred to collectively as CosegPP in this paper) are evaluated using three cosegmentation algorithms: Markov random fields-based, clustering-based, and deep learning-based cosegmentation, along with one segmentation approach commonly used in plant phenotyping. The integration of CosegPP with advanced cosegmentation methods will provide an up-to-date benchmark for comparing segmentation accuracy and identifying areas of improvement in cosegmentation methodology.
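As a toy illustration of the clustering-based flavor of cosegmentation (not the paper's specific algorithms), the sketch below clusters pixels from several images jointly, so the shared foreground tends to fall into the same cluster across all of them.

```python
# Toy sketch of clustering-based cosegmentation (NumPy + scikit-learn).
import numpy as np
from sklearn.cluster import KMeans

def cosegment(images, k=2):
    """images: list of (H, W, C) float arrays of the same scene/plant."""
    shapes = [im.shape[:2] for im in images]
    # Pool pixels from all images so clusters are shared across them.
    pixels = np.concatenate([im.reshape(-1, im.shape[-1]) for im in images])
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(pixels)
    masks, start = [], 0
    for h, w in shapes:                # unflatten back to per-image masks
        masks.append(labels[start:start + h * w].reshape(h, w))
        start += h * w
    return masks                       # one label map per input image

imgs = [np.random.rand(32, 32, 3) for _ in range(3)]
masks = cosegment(imgs)                # consistent clusters across images
```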


2020 ◽  
Author(s):  
Cefa Karabağ ◽  
Martin L. Jones ◽  
Christopher J. Peddie ◽  
Anne E. Weston ◽  
Lucy M. Collinson ◽  
...  

Abstract: In this work, images of a HeLa cancer cell were semantically segmented with one traditional image-processing algorithm and three deep learning architectures: VGG16, ResNet18, and Inception-ResNet-v2. Three hundred slices, each 2000 × 2000 pixels, of a HeLa cell were acquired with Serial Block Face Scanning Electron Microscopy. The deep learning architectures were pre-trained on ImageNet and then fine-tuned with transfer learning. The image-processing algorithm followed a pipeline of several traditional steps such as edge detection, dilation, and morphological operators. The algorithms were compared by measuring pixel-based segmentation accuracy and the Jaccard index against a labelled ground truth. The results indicated superior performance of the traditional algorithm (accuracy = 99%, Jaccard = 93%) over the deep learning architectures: VGG16 (93%, 90%), ResNet18 (94%, 88%), and Inception-ResNet-v2 (94%, 89%).
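The two reported metrics are straightforward to compute; a small sketch for binary segmentations:

```python
# Pixel-based accuracy and Jaccard index (IoU) against a ground truth.
import numpy as np

def pixel_accuracy(pred, gt):
    return (pred == gt).mean()         # fraction of matching pixels

def jaccard(pred, gt):
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

pred = np.random.rand(2000, 2000) > 0.5   # dummy binary segmentation
gt = np.random.rand(2000, 2000) > 0.5     # dummy ground-truth labels
print(pixel_accuracy(pred, gt), jaccard(pred, gt))
```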


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-16
Author(s):  
Kun Zhang ◽  
JunHong Fu ◽  
Liang Hua ◽  
Peijian Zhang ◽  
Yeqin Shao ◽  
...  

Histological assessment of glands is one of the major concerns in colon cancer grading. Considering that poorly differentiated colorectal glands cannot be accurately segmented, we propose an approach for the segmentation of glands in colon cancer images based on the characteristics of lumens and rough gland boundaries. First, we use a U-net for stain separation to obtain H-stain, E-stain, and background stain intensity maps. Subsequently, epithelial nuclei are identified in the histopathology images, and lumen segmentation is performed on the background intensity map. Then, we use similar triangles based on the axis of least inertia as the spatial characteristics of lumens and epithelial nuclei, and triangle membership is used to select glandular contour candidates from the epithelial nuclei. By connecting lumens and epithelial nuclei, more accurate gland segmentation is performed based on the rough gland boundary. The proposed stain separation approach is unsupervised; it makes the category information contained in the H&E image easy to identify and copes with uneven stain intensity and inconspicuous stain differences. In this project, we use deep learning to achieve stain separation by predicting the stain coefficients. Under the deep learning framework, we design a stain coefficient interval model to improve stain generalization performance. Another innovation is that we propose combining the internal lumen contour of the adenoma and the outer contour of the epithelial cells to obtain a precise gland contour. We compare the performance of the proposed algorithm against several state-of-the-art techniques on publicly available datasets. The results show that the segmentation approach combining the characteristics of lumens and rough gland boundaries achieves better segmentation accuracy.
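Classical H&E stain separation rests on Beer-Lambert optical density and a stain-coefficient matrix; the paper's contribution is predicting such coefficients with a U-net rather than fixing them. The sketch below uses standard fixed reference vectors (the widely cited Ruifrok-Johnston values) purely for illustration.

```python
# Sketch: H&E stain separation via optical density and stain coefficients.
import numpy as np

rgb = np.random.rand(64, 64, 3).clip(0.01, 1.0)   # dummy H&E tile in (0, 1]
od = -np.log(rgb)                                 # Beer-Lambert optical density

# Rows: assumed unit stain vectors for hematoxylin, eosin, and residual;
# the paper instead predicts per-image coefficients with a U-net.
stains = np.array([[0.65, 0.70, 0.29],
                   [0.07, 0.99, 0.11],
                   [0.27, 0.57, 0.78]])
stains /= np.linalg.norm(stains, axis=1, keepdims=True)

coeffs = od.reshape(-1, 3) @ np.linalg.inv(stains)  # per-pixel stain amounts
h_map, e_map, bg_map = coeffs.reshape(64, 64, 3).transpose(2, 0, 1)
```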

