semantic label
Recently Published Documents


TOTAL DOCUMENTS

34
(FIVE YEARS 15)

H-INDEX

7
(FIVE YEARS 2)

Electronics ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 60
Author(s):  
Paolo Andreini ◽  
Giorgio Ciano ◽  
Simone Bonechi ◽  
Caterina Graziani ◽  
Veronica Lachi ◽  
...  

In this paper, we use Generative Adversarial Networks (GANs) to synthesize high-quality retinal images along with the corresponding semantic label-maps, which are used instead of real images when training a segmentation network. Unlike previous proposals, we employ a two-step approach: first, a progressively growing GAN is trained to generate the semantic label-maps, which describe the blood vessel structure (i.e., the vasculature); second, an image-to-image translation approach is used to obtain realistic retinal images from the generated vasculature. The two-stage process simplifies the generation task, so that network training requires fewer images and consequently less memory. Moreover, learning is effective: with only a handful of training samples, our approach generates realistic high-resolution images, which can be successfully used to enlarge small available datasets. Comparable results were obtained by employing only synthetic images in place of real data during training. The practical viability of the proposed approach was demonstrated on two well-established benchmark sets for retinal vessel segmentation (both containing a very small number of training samples), obtaining better performance than state-of-the-art techniques.


2021 ◽  
Author(s):  
David Rozenberszki ◽  
Gabor Soros ◽  
Szilvia Szeier ◽  
Andras Lorincz

2021 ◽  
Vol 13 (16) ◽  
pp. 3211
Author(s):  
Tian Tian ◽  
Zhengquan Chu ◽  
Qian Hu ◽  
Li Ma

Semantic segmentation is a fundamental task in remote sensing image interpretation, which aims to assign a semantic label to every pixel in the given image. Accurate semantic segmentation is still challenging due to the complex distributions of various ground objects. With the development of deep learning, a series of segmentation networks represented by the fully convolutional network (FCN) has made remarkable progress on this problem, but the segmentation accuracy is still far from expectations. This paper focuses on the importance of class-specific features of different land cover objects, and presents a novel end-to-end class-wise processing framework for segmentation. The proposed class-wise FCN (C-FCN) takes the form of an encoder-decoder structure with skip-connections, in which the encoder is shared to produce general features for all categories and the decoder is class-wise to process class-specific features. Specifically, class-wise transition (CT), class-wise up-sampling (CU), class-wise supervision (CS), and class-wise classification (CC) modules are designed to achieve the class-wise transfer, recover the resolution of class-wise feature maps, bridge the encoder and modified decoder, and implement class-wise classifications, respectively. Class-wise and group convolutions are adopted in the architecture to control the number of parameters. The method is tested on the public ISPRS 2D semantic labeling benchmark datasets. Experimental results show that the proposed C-FCN significantly improves segmentation performance compared with many state-of-the-art FCN-based networks, revealing its potential for accurate segmentation of complex remote sensing images.
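As a minimal illustration of what "assigning a semantic label to every pixel" means, the sketch below turns per-class score maps into a label map by taking the highest-scoring class at each pixel. It is not the C-FCN architecture from the abstract; the function name `label_map`, the toy scores, and the three classes are assumptions made for illustration only.

```python
from typing import List


def label_map(scores: List[List[List[float]]]) -> List[List[int]]:
    """Return the per-pixel class index with the highest score.

    scores[c][y][x] is the (hypothetical) score of class c at pixel (y, x),
    such as the per-class output maps of a segmentation network.
    """
    num_classes = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


# Toy 2x2 image with three made-up classes.
scores = [
    [[0.7, 0.1], [0.2, 0.3]],  # class 0
    [[0.2, 0.8], [0.3, 0.3]],  # class 1
    [[0.1, 0.1], [0.5, 0.4]],  # class 2
]
labels = label_map(scores)
print(labels)
```

In a class-wise decoder, each of the score maps would come from its own class-specific branch; the final per-pixel decision can still be read as an argmax over those maps.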


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Hidayaturrahman ◽  
Emmanuel Dave ◽  
Derwin Suhartono ◽  
Aniati Murni Arymurthy

Abstract Arguments help humans deliver their ideas. The outcome of a discussion relies heavily on the validity of the arguments. If an argument is well composed, its core idea is easier to grasp. Machines can be used to grade an argument by decomposing it into semantic label components. In natural language processing, multiple language models are available to perform this task; they are divided into context-free and contextual models. The majority of previous studies used hand-crafted features to perform argument component classification, while state-of-the-art language models utilize machine learning. The majority of these language models ignore the context in an argument. This research paper analyzes whether including context in the classification process improves the accuracy of the language model, which would in turn enhance the argumentation mining process. The same document corpus is fed into several language models. Word2Vec and GloVe represent the context-free models, while BERT and ELMo represent the context-sensitive models. The accuracy and time of each model are then compared to determine the importance of context. The results show that contextual language models boost classification accuracy by approximately 20%. However, this comes at a cost in time: contextual models require longer training and prediction time. The benefit of the increased accuracy outweighs the burden of time. Thus, since argumentation mining is a contextual task, contextual models, which take context into account, are suggested for achieving promising results.


2021 ◽  
Author(s):  
Jin Qu ◽  
Kazuma Hashimoto ◽  
Wenhao Liu ◽  
Caiming Xiong ◽  
Yingbo Zhou

2020 ◽  
Author(s):  
Sjoerd Stuit ◽  
Timo Kootstra ◽  
David Terburg ◽  
Carlijn van den Boomen ◽  
Maarten van der Smagt ◽  
...  

Abstract Emotional facial expressions are important visual communication signals that indicate a sender’s intent and emotional state to an observer. As such, it is not surprising that reactions to different expressions are thought to be automatic and independent of awareness. What is surprising is that studies show inconsistent results concerning such automatic reactions, particularly when using different face stimuli. We argue that automatic reactions to facial expressions can be better explained, and better understood, in terms of quantitative descriptions of their visual features rather than in terms of the semantic labels (e.g. angry) of the expressions. Here, we focused on overall spatial frequency (SF) and localized Histograms of Oriented Gradients (HOG) features. We used machine learning classification to reveal the SF and HOG features that are sufficient to classify which of two simultaneously presented faces is selected first. In other words, we show which visual features predict selection between two faces. Interestingly, the identified features serve as better predictors than the semantic labels of the expressions. We therefore propose that our modelling approach can further specify which visual features drive the behavioural effects related to emotional expressions, which can help resolve the inconsistencies found in this line of research.
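To make the HOG features mentioned above more concrete, here is a minimal sketch of the core step: a magnitude-weighted histogram of unsigned gradient orientations for a single image cell. It is a simplified illustration, not the feature pipeline used in the study; full HOG implementations typically add bin interpolation and block normalization, and the function name, the 9-bin default, and the toy cell are assumptions.

```python
import math


def orientation_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    `cell` is a 2D list of grayscale intensities. Orientations are folded
    into [0, 180) degrees and split into `bins` equal-width bins. Border
    pixels are skipped because central differences need both neighbours.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal central difference
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The extra modulo guards against float rounding up to exactly 180.
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# Toy cell with a vertical edge: intensity rises from left to right.
cell = [[0, 0, 1, 1] for _ in range(4)]
hist = orientation_histogram(cell)
print(hist)
```

For this toy cell all gradient energy falls into the first bin (horizontal gradient direction), which is how a vertical edge shows up in such a histogram.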


2020 ◽  
Vol 34 (07) ◽  
pp. 12709-12716
Author(s):  
Renchun You ◽  
Zhiyao Guo ◽  
Lei Cui ◽  
Xiang Long ◽  
Yingze Bao ◽  
...  

Multi-label image and video classification are fundamental yet challenging tasks in computer vision. The main challenges lie in capturing spatial or temporal dependencies between labels and discovering the locations of discriminative features for each class. To overcome these challenges, we propose to use cross-modality attention with semantic graph embedding for multi-label classification. Based on the constructed label graph, we propose an adjacency-based similarity graph embedding method to learn semantic label embeddings, which explicitly exploit label relationships. Then our novel cross-modality attention maps are generated with the guidance of the learned label embeddings. Experiments on two multi-label image classification datasets (MS-COCO and NUS-WIDE) show that our method outperforms existing state-of-the-art methods. In addition, we validate our method on a large multi-label video classification dataset (YouTube-8M Segments), and the evaluation results demonstrate the generalization capability of our method.
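The abstract does not say how the label graph is constructed. One common choice, assumed here purely for illustration, is to derive edge weights from how often labels co-occur in the training annotations. The sketch below builds such an adjacency matrix, with entry `[i][j]` estimating the probability of label `j` given label `i`; the function name, label names, and annotations are all hypothetical.

```python
from itertools import combinations


def cooccurrence_adjacency(annotations, labels):
    """Build a row-normalised label co-occurrence matrix.

    annotations: iterable of label sets, one per image or video.
    labels: ordered list of label names defining the matrix indices.
    Returns A where A[i][j] approximates P(label j present | label i present).
    """
    index = {name: i for i, name in enumerate(labels)}
    n = len(labels)
    counts = [[0] * n for _ in range(n)]
    occurrences = [0] * n
    for label_set in annotations:
        present = sorted({index[name] for name in label_set})
        for i in present:
            occurrences[i] += 1
        for i, j in combinations(present, 2):
            counts[i][j] += 1
            counts[j][i] += 1
    return [
        [counts[i][j] / occurrences[i] if occurrences[i] else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Made-up annotations for three hypothetical labels.
labels = ["person", "dog", "frisbee"]
annotations = [
    {"person", "dog"},
    {"person", "dog", "frisbee"},
    {"person"},
    {"dog", "frisbee"},
]
adjacency = cooccurrence_adjacency(annotations, labels)
print(adjacency)
```

Note that the matrix is asymmetric: in this toy data "frisbee" always appears with "dog", but "dog" appears with "frisbee" only two times out of three. An embedding method could then be trained so that similarities between label vectors reflect these weights.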


2020 ◽  
Vol 10 (3) ◽  
pp. 1092
Author(s):  
Bilel Benjdira ◽  
Adel Ammar ◽  
Anis Koubaa ◽  
Kais Ouni

Despite the significant advances in semantic segmentation of aerial imagery, a considerable limitation is blocking its adoption in real cases. If we test a segmentation model on a new area that is not included in its initial training set, accuracy decreases markedly. This is caused by the domain shift between the new target domain and the source domain used to train the model. In this paper, we address this challenge and propose a new algorithm that uses a Generative Adversarial Network (GAN) architecture to minimize the domain shift and increase the ability of the model to work on new target domains. The proposed architecture contains two GANs. The first GAN converts the chosen image from the target domain into a semantic label. The second GAN converts this generated semantic label into an image that belongs to the source domain but preserves the semantic map of the target image. The resulting image is then used by the semantic segmentation model to generate a better semantic label of the originally chosen image. Our algorithm was tested on the ISPRS semantic segmentation dataset and improved the global accuracy by a margin of up to 24% when passing from the Potsdam domain to the Vaihingen domain. This margin can be increased by adding other labeled data from the target domain. To minimize the cost of supervision in the translation process, we propose a methodology to use these labeled data efficiently.
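The abstract reports its improvement in terms of global accuracy without defining it. Assuming the usual meaning for semantic segmentation (the fraction of pixels whose predicted label matches the reference label), the metric can be sketched as follows. The function name and the toy label maps are illustrative assumptions, not data from the paper.

```python
def global_accuracy(predicted, reference):
    """Fraction of pixels whose predicted label equals the reference label.

    Both arguments are 2D lists of integer class labels with the same shape.
    """
    total = 0
    correct = 0
    for pred_row, ref_row in zip(predicted, reference):
        for pred_label, ref_label in zip(pred_row, ref_row):
            total += 1
            correct += int(pred_label == ref_label)
    if total == 0:
        raise ValueError("label maps must contain at least one pixel")
    return correct / total


# Made-up 2x3 label maps: one of six pixels is mislabelled.
reference = [[0, 1, 1], [2, 2, 0]]
predicted = [[0, 1, 2], [2, 2, 0]]
accuracy = global_accuracy(predicted, reference)
print(f"global accuracy: {accuracy:.3f}")
```

Under this definition, a domain-adaptation gain would be measured by computing the metric on target-domain images before and after translation and comparing the two values.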

