Image Labeling
Recently Published Documents


TOTAL DOCUMENTS

156
(FIVE YEARS 65)

H-INDEX

20
(FIVE YEARS 8)

2022 ◽  
Vol 14 (2) ◽  
pp. 861
Author(s):  
Han-Cheng Dan ◽  
Hao-Fan Zeng ◽  
Zhi-Heng Zhu ◽  
Ge-Wen Bai ◽  
Wei Cao

Image recognition based on deep learning generally demands a huge sample size for training, which makes image labeling inevitably laborious and time-consuming. When evaluating pavement quality condition, many pavement distress patching images need manual screening and labeling, and the subjectivity of the labeling personnel greatly affects the accuracy of image labeling. In this study, to achieve accurate and efficient recognition of pavement patching images, an interactive labeling method is proposed based on the U-Net convolutional neural network, using active learning combined with reverse and correction labeling. According to the calculation results in this paper, the sample size required by interactive labeling is about half that of the traditional labeling method for the same recognition precision. Meanwhile, the accuracy of the interactive labeling method, measured by the mean intersection over union (mean_IOU) index, is 6% higher than that of the traditional method using the same sample size and training epochs. In addition, the accuracy analysis of the noise and boundary of the prediction results shows that this method eliminates 92% of the noise in the predictions (the proportion of noise is reduced from 13.85% to 1.06%), and the image definition is improved by 14.1% in terms of the boundary gray area ratio. Interactive labeling is a significantly valuable approach, as it reduces the sample size in each epoch of active learning, greatly alleviates the demand for manpower, and improves learning efficiency and accuracy.
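The mean_IOU index used above is a standard segmentation metric; as a minimal sketch (the function name and the convention of skipping classes absent from both masks are illustrative, not taken from the paper):

```python
import numpy as np

def mean_iou(pred: np.ndarray, truth: np.ndarray, num_classes: int) -> float:
    """Mean intersection-over-union across classes, averaged over
    classes that appear in either the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        p, t = pred == c, truth == c
        union = np.logical_or(p, t).sum()
        if union == 0:          # class absent from both masks: skip it
            continue
        inter = np.logical_and(p, t).sum()
        ious.append(inter / union)
    return float(np.mean(ious))
```

For a 2-class map where one background pixel and all foreground pixels overlap, this averages the per-class IoU ratios directly.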


2021 ◽  
Vol 13 (24) ◽  
pp. 5100
Author(s):  
Teerapong Panboonyuen ◽  
Kulsawasd Jitkajornwanich ◽  
Siam Lawawirojwong ◽  
Panu Srestasathiern ◽  
Peerapon Vateekul

Transformers have demonstrated remarkable accomplishments in several natural language processing (NLP) tasks as well as image processing tasks. Herein, we present a deep-learning (DL) model that improves the semantic segmentation network in two ways. First, it uses the pre-trained Swin Transformer (SwinTF), a Vision Transformer (ViT) variant, as a backbone, adapting the weights to downstream tasks by attaching task-specific layers to the pre-trained encoder. Second, three decoder designs, U-Net, pyramid scene parsing (PSP) network, and feature pyramid network (FPN), are applied to our DL network to perform pixel-level segmentation. The results are compared with other state-of-the-art (SOTA) image labeling methods, such as the global convolutional network (GCN) and ViT. Extensive experiments show that our Swin Transformer (SwinTF) with decoder designs reached a new state of the art on the Thailand Isan Landsat-8 corpus (89.8% F1 score) and the Thailand North Landsat-8 corpus (63.12% F1 score), with competitive results on ISPRS Vaihingen. Moreover, both our best-proposed methods (SwinTF-PSP and SwinTF-FPN) even outperformed SwinTF with ViT supervised pre-training on ImageNet-1K on the Thailand Landsat-8 and ISPRS Vaihingen corpora.
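The F1 scores reported above combine precision and recall; a minimal binary (per-class) sketch of how such a score is commonly computed over pixel masks (names are illustrative):

```python
import numpy as np

def f1_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """Binary F1: harmonic mean of precision and recall over boolean masks."""
    tp = np.logical_and(pred, truth).sum()   # true positives
    fp = np.logical_and(pred, ~truth).sum()  # false positives
    fn = np.logical_and(~pred, truth).sum()  # false negatives
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return float(2 * precision * recall / (precision + recall))
```

Corpus-level figures such as 89.8% would typically average this over classes or images; the exact averaging scheme is not specified in the abstract.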


2021 ◽  
Vol 1 ◽  
Author(s):  
Patrick Bangert ◽  
Hankyu Moon ◽  
Jae Oh Woo ◽  
Sima Didari ◽  
Heng Hao

To train artificial intelligence (AI) systems on radiology images, an image labeling step is necessary. Labeling radiology images usually involves a human radiologist manually drawing a (polygonal) shape onto the image and attaching a word to it. As datasets are typically large, this task is repetitive, time-consuming, error-prone, and expensive. The AI methodology of active learning (AL) can assist human labelers by continuously sorting the unlabeled images in order of information gain, so that the labeler always labels the most informative image next. We find that after about 10% of the images in a realistic dataset are labeled (the exact fraction depends on the dataset), virtually all the information content has been learned and the remaining images can be labeled automatically. These images can then be checked by the radiologist, which is far easier and faster. In this way, the entire dataset is labeled with much less human effort. We introduce AL in detail and demonstrate its effectiveness using three real-life datasets. We contribute five distinct elements to the standard AL workflow, creating an advanced methodology.
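The abstract does not say how "information gain" is scored; one common proxy in AL is predictive entropy of the current model's softmax outputs. A minimal sketch under that assumption (function name is illustrative):

```python
import numpy as np

def rank_by_entropy(probs: np.ndarray) -> np.ndarray:
    """Rank unlabeled samples by predictive entropy, most uncertain first.

    probs: (n_samples, n_classes) softmax outputs of the current model.
    Returns sample indices sorted so the most informative comes first.
    """
    eps = 1e-12                                       # guard against log(0)
    entropy = -(probs * np.log(probs + eps)).sum(axis=1)
    return np.argsort(-entropy)                       # descending entropy
```

The labeler would then annotate the top-ranked images, retrain, and re-rank, repeating until the model's uncertainty on the remainder is negligible.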


2021 ◽  
Author(s):  
Cong zhong Wu ◽  
Hao Dong ◽  
Xuan jie Lin ◽  
Han tong Jiang ◽  
Li quan Wang ◽  
...  

It is difficult to segment small objects and object edges in remote sensing imagery because of large scale variation, large intra-class variance of the background, and foreground-background imbalance. In convolutional neural networks, high-frequency signals may degenerate into completely different ones after downsampling; we define this phenomenon as aliasing. Meanwhile, although dilated convolution can expand the receptive field of the feature map, a much more complex background can cause serious false alarms. To alleviate these problems, we propose an adaptive filtering segmentation network with an attention mechanism. Experimental results on the DeepGlobe Road Extraction dataset and the Inria Aerial Image Labeling dataset show that our method can effectively improve segmentation accuracy. The F1 scores on the two datasets reached 82.67% and 85.71%, respectively.
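The aliasing phenomenon defined above can be illustrated in one dimension: two distinct signals become indistinguishable after naive stride-2 downsampling with no low-pass filtering (this toy example is not from the paper):

```python
import numpy as np

# A Nyquist-frequency signal and a constant signal.
n = np.arange(8)
sig_a = np.cos(np.pi * n)        # alternates +1, -1, +1, ...
sig_b = np.ones(8)               # constant (zero-frequency) signal

down_a = sig_a[::2]              # keep every second sample (stride-2)
down_b = sig_b[::2]

# After downsampling, the high-frequency signal has "degenerated"
# into the constant one: the two are indistinguishable.
print(np.allclose(down_a, down_b))   # True
```

This is why downsampling layers without adaptive filtering can destroy the high-frequency detail that small objects and edges depend on.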


Mathematics ◽  
2021 ◽  
Vol 9 (12) ◽  
pp. 1393
Author(s):  
Jing Su ◽  
Hongyu Wang ◽  
Bing Yao

A variety of labelings on trees have emerged in order to attack the Graceful Tree Conjecture, but few works show the connections between two labelings. In this paper, we propose two new labelings, vertex image-labeling and edge image-labeling, and combine the new labelings to form matching-type image-labeling with multiple restrictions. The research starts from the set-ordered graceful labeling of trees, and we give several generation methods and relationships for well-known labelings and the two new labelings on trees.
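The Graceful Tree Conjecture mentioned above concerns graceful labelings: a tree with m edges is gracefully labeled when its vertices carry distinct labels from {0, ..., m} and the induced edge labels |f(u) - f(v)| are exactly {1, ..., m}. A small sketch of this standard definition as a check (not the paper's new labelings):

```python
def is_graceful(labels: dict, edges: list) -> bool:
    """Check whether a vertex labeling of a tree is graceful.

    labels: vertex -> integer in {0, ..., m}, where m = number of edges
    edges:  list of (u, v) vertex pairs
    """
    m = len(edges)
    if len(set(labels.values())) != len(labels):     # labels must be distinct
        return False
    edge_labels = {abs(labels[u] - labels[v]) for u, v in edges}
    return edge_labels == set(range(1, m + 1))       # exactly {1, ..., m}

# A graceful labeling of the path a-b-c-d: edge labels are 3, 2, 1.
path = [("a", "b"), ("b", "c"), ("c", "d")]
print(is_graceful({"a": 0, "b": 3, "c": 1, "d": 2}, path))  # True
```

The conjecture asserts every tree admits such a labeling; the paper's vertex and edge image-labelings impose additional matching-type restrictions on top of this framework.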


2021 ◽  
Author(s):  
Xiaofeng Wang

Image and video content analysis is an interesting, meaningful, and challenging topic. In recent years, much of the research effort in the multimedia field has focused on indexing and retrieval. The semantic gap between low-level features and high-level content is a bottleneck in most systems. To bridge the semantic gap, new content analysis models need to be developed. In this thesis, algorithms based on a relatively new graphical model, called the conditional random field (CRF) model, are developed for two closely related problems in content analysis: image labeling and video content analysis. The CRF model can represent spatial interactions in image labeling and temporal interactions in video content analysis. New feature functions are designed to better represent the feature distributions. Mixture feature functions are used in image labeling for databases of natural images, and the independent component analysis (ICA) mixture function is applied in sports video content analysis. The spatial dependence of image parts and the temporal dependence of video frames can be explored more effectively by the CRF model using the new feature functions. For image labeling with large databases, the content-based image retrieval method is successfully combined with the CRF image labeling model.
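The spatial interactions a CRF captures for image labeling are often expressed as pairwise terms between neighboring pixels. A minimal sketch of the energy of a grid CRF with Potts pairwise potentials (a common textbook form, not the thesis's exact model):

```python
import numpy as np

def crf_energy(labels: np.ndarray, unary: np.ndarray, beta: float = 1.0) -> float:
    """Energy of a labeling under a simple grid CRF with a Potts pairwise model.

    labels: (H, W) integer label map
    unary:  (H, W, K) per-pixel costs (e.g., negative log-likelihoods) for K classes
    beta:   penalty added for each pair of 4-connected neighbors that disagree
    """
    h, w = labels.shape
    rows, cols = np.indices((h, w))
    e = unary[rows, cols, labels].sum()                    # data (unary) term
    e += beta * (labels[:, 1:] != labels[:, :-1]).sum()    # horizontal neighbor pairs
    e += beta * (labels[1:, :] != labels[:-1, :]).sum()    # vertical neighbor pairs
    return float(e)
```

Inference then seeks the label map minimizing this energy, trading per-pixel evidence against spatial smoothness.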


2021 ◽  
Author(s):  
Maryam Nematollahi Arani

Object recognition has become a central topic in computer vision applications such as image search, robotics, and vehicle safety systems. However, it is a challenging task due to the limited discriminative power of low-level visual features in describing the considerably diverse range of high-level visual semantics of objects. The semantic gap between low-level visual features and high-level concepts is a bottleneck in most systems. New content analysis models need to be developed to bridge the semantic gap. In this thesis, algorithms based on conditional random fields (CRF), from the class of probabilistic graphical models, are developed to tackle the problem of multiclass image labeling for object recognition. Image labeling assigns a specific semantic category from a predefined set of object classes to each pixel in the image. By capturing spatial interactions of visual concepts well, CRF modeling has proved to be a successful tool for image labeling. This thesis proposes novel approaches to empowering CRF modeling for robust image labeling. Our primary contributions are twofold. To better represent the feature distributions of CRF potentials, new feature functions based on generalized Gaussian mixture models (GGMM) are designed and their efficacy is investigated. Owing to its shape parameter, the GGMM can provide a proper fit to the multi-modal and skewed distributions of data in natural images. The new model proves more successful than Gaussian and Laplacian mixture models, and it also outperforms a deep neural network model on the Corel image set by 1% in accuracy. Further in this thesis, we apply scene-level contextual information to integrate the global visual semantics of the image with the pixel-wise dense inference of a fully connected CRF, to preserve small objects of foreground classes and to make dense inference robust to initial misclassifications of the unary classifier.
The proposed inference algorithm factorizes the joint probability of the labeling configuration and the image scene type to obtain prediction update equations for labeling individual image pixels and also the overall scene type of the image. The proposed context-based dense CRF model outperforms the conventional dense CRF model by about 2% in labeling accuracy on the MSRC image set and by 4% on the SIFT Flow image set. The proposed model also obtains the highest scene classification rate, 86%, on the MSRC dataset.
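The GGMM feature functions above build on the generalized Gaussian density, whose shape parameter interpolates between the Laplacian (beta = 1) and Gaussian (beta = 2) families. A sketch of the single-component density under the standard parameterization (mixture weights and the fitting procedure are omitted):

```python
import math

def gen_gaussian_pdf(x: float, mu: float, alpha: float, beta: float) -> float:
    """Generalized Gaussian density with location mu, scale alpha, shape beta.

    beta = 2 recovers the Gaussian family, beta = 1 the Laplacian;
    other values allow heavier or lighter tails, matching the skewed,
    multi-modal statistics of natural images noted in the abstract.
    """
    coef = beta / (2 * alpha * math.gamma(1 / beta))     # normalizing constant
    return coef * math.exp(-((abs(x - mu) / alpha) ** beta))
```

A GGMM feature function would mix several such components, with weights, locations, scales, and shapes fitted per class.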

