Depthwise Convolution Is All You Need for Learning Multiple Visual Domains

Author(s):  
Yunhui Guo ◽  
Yandong Li ◽  
Liqiang Wang ◽  
Tajana Rosing

There is growing interest in designing models that can deal with images from different visual domains. If there exists a universal structure in different visual domains that can be captured via a common parameterization, then we can use a single model for all domains rather than one model per domain. A model aware of the relationships between different domains can also be trained to work on new domains with fewer resources. However, identifying the reusable structure in a model is not easy. In this paper, we propose a multi-domain learning architecture based on depthwise separable convolution. The proposed approach is based on the assumption that images from different domains share cross-channel correlations but have domain-specific spatial correlations. The proposed model is compact and has minimal overhead when applied to new domains. Additionally, we introduce a gating mechanism to promote soft sharing between different domains. We evaluate our approach on the Visual Decathlon Challenge, a benchmark for testing the ability of multi-domain models. The experiments show that our approach achieves the highest score while requiring only 50% of the parameters of the state-of-the-art approaches.
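The parameter saving described above can be illustrated with a rough count. This is only a sketch under stated assumptions, not the authors' implementation: it assumes the 1x1 pointwise filters (which mix channels) are shared across domains while the k x k depthwise filters (which act spatially, one per channel) are kept per domain, which is one reading of the abstract's assumption. The function names and layer sizes are hypothetical, and biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in an ordinary k x k convolution (no bias)."""
    return c_in * c_out * k * k


def depthwise_params(c_in, k):
    """Weights in a depthwise k x k convolution: one spatial filter per input channel."""
    return c_in * k * k


def pointwise_params(c_in, c_out):
    """Weights in a 1x1 pointwise convolution that mixes channels."""
    return c_in * c_out


def multi_domain_params(c_in, c_out, k, num_domains):
    """Assumed sharing scheme: one shared pointwise filter bank,
    plus one depthwise filter bank per domain."""
    shared = pointwise_params(c_in, c_out)
    per_domain = depthwise_params(c_in, k)
    return shared + num_domains * per_domain


def independent_models_params(c_in, c_out, k, num_domains):
    """Baseline: a separate standard convolution for every domain."""
    return num_domains * standard_conv_params(c_in, c_out, k)


if __name__ == "__main__":
    # Hypothetical layer: 64 -> 128 channels, 3x3 kernels, 10 domains.
    print(multi_domain_params(64, 128, 3, 10))
    print(independent_models_params(64, 128, 3, 10))
```

Under this assumed scheme, each new domain adds only the depthwise weights for the layer, which is why the per-domain overhead stays small compared with training a separate model.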

2020 ◽  
Vol 34 (07) ◽  
pp. 11394-11401
Author(s):  
Shuzhao Li ◽  
Huimin Yu ◽  
Haoji Hu

In this paper, we propose an Appearance and Motion Enhancement Model (AMEM) for video-based person re-identification to enrich the two kinds of information contained in the backbone network in a more interpretable way. Concretely, human attribute recognition under the supervision of pseudo labels is exploited in an Appearance Enhancement Module (AEM) to help enrich the appearance and semantic information. A Motion Enhancement Module (MEM) is designed to capture identity-discriminative walking patterns by predicting future frames. Although the model is complex during training, with several auxiliary modules, only the backbone model plus two small branches are kept for similarity evaluation, which yields a simple but effective final model. Extensive experiments on three popular video-based person ReID benchmarks demonstrate the effectiveness of the proposed model and its state-of-the-art performance compared with existing methods.


Author(s):  
Wei Ji ◽  
Xi Li ◽  
Yueting Zhuang ◽  
Omar El Farouk Bourahla ◽  
Yixin Ji ◽  
...  

Clothing segmentation is a challenging vision problem typically implemented within a fine-grained semantic segmentation framework. Unlike conventional segmentation, clothing segmentation has some domain-specific properties such as texture richness, diverse appearance variations, non-rigid geometry deformations, and small-sample learning. To address these properties, we propose a semantic locality-aware segmentation model, which adaptively pairs an original clothing image with a semantically similar (e.g., in appearance or pose) auxiliary exemplar found by search. By considering the interactions between the clothing image and its exemplar, more intrinsic knowledge about the locality manifold structures of clothing images is discovered, making learning in the small-sample setting more stable and tractable. Furthermore, we present a CNN model based on deformable convolutions to extract non-rigid geometry-aware features for clothing images. Experimental results demonstrate the effectiveness of the proposed model against the state-of-the-art approaches.


2012 ◽  
Vol 22 (3) ◽  
pp. 375-377 ◽  
Author(s):  
JURRIAAN HAGE

My main reason for wanting to read this book was to find out what a well-known publicist from the world of OO would have to say about the state of the art of domain-specific languages (DSLs), in particular when it comes to type error feedback, functional programming, and their combination. As most readers will be aware, languages like Scheme and Haskell are very well suited to embedding DSLs: Scheme can be considered a core language to which new language facilities can be easily added by means of hygienic syntax macros (Abelson et al. 1998), and there are so many papers on embedded DSLs in Haskell (Hudak, 1998) that any realistic selection would aggravate more people than I would please. Great was my disappointment when I read on page XXV that these topics were not discussed at all in the book. Although I can imagine that Fowler does not feel comfortable writing about subjects he is not sufficiently at home with, the question does arise whether the title of this book is sufficiently covered by its contents.


Cells ◽  
2019 ◽  
Vol 8 (12) ◽  
pp. 1635 ◽  
Author(s):  
Hilal Tayara ◽  
Kil To Chong

It is known that over 98% of the human genome is non-coding, and 93% of disease-associated variants are located in these regions. Therefore, understanding the function of these regions is important. However, this task is challenging as most of these regions are not well understood in terms of their functions. In this paper, we introduce a novel computational model based on deep neural networks, called DQDNN, for quantifying the function of non-coding DNA regions. This model combines convolution layers for capturing regularity motifs at multiple scales and recurrent layers for capturing long-term dependencies between the captured motifs. In addition, we show that integrating evolutionary information with raw genomic sequences improves the performance of the predictor significantly. The proposed model outperforms the state-of-the-art models both when using raw genomic sequences only and when integrating evolutionary information with raw genomic sequences. More specifically, the proposed model improves 96.9% and 98% of the targets in terms of area under the receiver operating characteristic curve and the precision-recall curve, respectively. In addition, the proposed model improves the prioritization of functional variants of expression quantitative trait loci (eQTLs) compared with the state-of-the-art models.
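For readers unfamiliar with the reported metric, the following is a minimal sketch of how the area under the ROC curve can be computed for one target, and how a "fraction of targets improved" figure could be derived from per-target scores. It uses the standard pairwise (rank-based) definition of AUROC; the function names and the toy numbers are hypothetical and are not taken from the paper.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the pairwise definition:
    the probability that a random positive is scored above a random
    negative, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def fraction_improved(new_values, old_values):
    """Share of targets whose metric is strictly higher for the new model."""
    if len(new_values) != len(old_values) or not new_values:
        raise ValueError("need two non-empty lists of equal length")
    better = sum(1 for n, o in zip(new_values, old_values) if n > o)
    return better / len(new_values)


if __name__ == "__main__":
    # Toy data for a single target (made up for illustration).
    print(auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
    # Toy per-target AUROC values for two hypothetical models.
    print(fraction_improved([0.91, 0.85, 0.70, 0.88], [0.90, 0.80, 0.75, 0.80]))
```

The pairwise form is quadratic in the number of examples, so it suits small illustrations; library implementations typically use a sort-based computation instead.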


Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 1938
Author(s):  
Linling Qiu ◽  
Han Li ◽  
Meihong Wang ◽  
Xiaoli Wang

With its increasing incidence, cancer has become one of the main causes of worldwide mortality. In this work, we propose a novel attention-based neural network model named Gated Graph ATtention network (GGAT) for cancer prediction, where a gating mechanism (GM) is introduced to work with the attention mechanism (AM) to overcome the limitation of previous work to 1-hop neighbourhood reasoning. In this way, our GGAT is capable of fully mining the potential correlation between related samples, which helps improve cancer prediction accuracy. Additionally, to simplify the datasets, we propose a hybrid feature selection algorithm to strictly select gene features, which significantly reduces training time without affecting prediction accuracy. To the best of our knowledge, our proposed GGAT achieves state-of-the-art results on the cancer prediction task on LIHC, LUAD, and KIRC compared to other traditional machine learning methods and neural network models, and improves the accuracy by 1% to 2% on the Cora dataset compared to the state-of-the-art graph neural network methods.


Author(s):  
Kaiqi Wang ◽  
Ke Chen ◽  
Kui Jia

This paper proposes a deep cascade network to generate the 3D geometry of an object as a point cloud, consisting of a set of permutation-insensitive points. Such a surface representation is easy to learn from, but inhibits exploiting rich low-dimensional topological manifolds of the object shape due to the lack of geometric connectivity. To benefit from its simple structure while utilizing rich neighborhood information across points, this paper proposes a two-stage cascade model on point sets. Specifically, our method adopts the state-of-the-art point set autoencoder to generate a sparse coarse shape first, and then locally refines it by encoding neighborhood connectivity on a graph representation. An ensemble of sparse refined surfaces is designed to alleviate the local minima caused by modeling complex geometric manifolds. Moreover, our model develops a dynamically-weighted loss function for jointly penalizing the generation output of cascade levels at different training stages in a coarse-to-fine manner. Comparative evaluation on the public ShapeNet benchmark dataset demonstrates superior performance of the proposed model over the state-of-the-art methods on both single-view shape reconstruction and shape autoencoding applications.


2020 ◽  
Vol 34 (04) ◽  
pp. 4107-4114 ◽  
Author(s):  
Masoumeh Heidari Kapourchali ◽  
Bonny Banerjee

We propose an agent model capable of actively and selectively communicating with other agents to predict its environmental state efficiently. Selecting whom to communicate with is a challenge when the internal model of other agents is unobservable. Our agent learns a communication policy as a mapping from its belief state to the agent with whom to communicate, in an online and unsupervised manner, without any reinforcement. Human activity recognition from multimodal, multisource and heterogeneous sensor data is used as a testbed to evaluate the proposed model, where each sensor is assumed to be monitored by an agent. The recognition accuracy on benchmark datasets is comparable to the state of the art even though our model uses significantly fewer parameters and infers the state in a localized manner. The learned policy reduces the number of communications. The agent is tolerant to communication failures and can recognize unreliable agents through their communication messages. To the best of our knowledge, this is the first work on learning communication policies by an agent for predicting its environmental state.


Author(s):  
G. Bellitto ◽  
F. Proietto Salanitri ◽  
S. Palazzo ◽  
F. Rundo ◽  
D. Giordano ◽  
...  

In this work, we propose a 3D fully convolutional architecture for video saliency prediction that employs hierarchical supervision on intermediate maps (referred to as conspicuity maps) generated using features extracted at different abstraction levels. We provide the base hierarchical learning mechanism with two techniques for domain adaptation and domain-specific learning. For the former, we encourage the model to learn hierarchical general features in an unsupervised manner using gradient reversal at multiple scales, to enhance generalization capabilities on datasets for which no annotations are provided during training. As for domain specialization, we employ domain-specific operations (namely, priors, smoothing and batch normalization) by specializing the learned features on individual datasets in order to maximize performance. The results of our experiments show that the proposed model yields state-of-the-art accuracy on supervised saliency prediction. When the base hierarchical model is empowered with domain-specific modules, performance improves, outperforming state-of-the-art models on three out of five metrics on the DHF1K benchmark and reaching the second-best results on the other two. When, instead, we test it in an unsupervised domain adaptation setting, by enabling hierarchical gradient reversal layers, we obtain performance comparable to the supervised state of the art. Source code, trained models and example outputs are publicly available at https://github.com/perceivelab/hd2s.
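The gradient reversal mentioned above is a generic technique that can be sketched without any framework: the layer passes activations through unchanged in the forward pass and flips the sign of the gradient in the backward pass, so the feature extractor is pushed to confuse a domain classifier. The class below is a minimal illustration only; it is not taken from the authors' repository, and the scaling factor `lam` is a hypothetical parameter.

```python
class GradientReversal:
    """Identity in the forward pass; negated, scaled gradient in the backward pass."""

    def __init__(self, lam=1.0):
        # lam scales the reversed gradient (hypothetical hyperparameter).
        self.lam = lam

    def forward(self, x):
        # Activations pass through unchanged (returned as a copy).
        return list(x)

    def backward(self, grad_output):
        # The gradient flowing back to earlier layers has its sign flipped.
        return [-self.lam * g for g in grad_output]


if __name__ == "__main__":
    layer = GradientReversal(lam=2.0)
    print(layer.forward([1.0, -2.0]))
    print(layer.backward([0.5, -1.0]))
```

In an autograd framework the same behaviour would be expressed as a custom differentiable function; applying it at several feature scales would correspond to the "multiple scales" wording in the abstract.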


2011 ◽  
Vol 26 (4) ◽  
pp. 365-410 ◽  
Author(s):  
Pietro Baroni ◽  
Martin Caminada ◽  
Massimiliano Giacomin

This paper presents an overview of the state of the art of semantics for abstract argumentation, covering both some of the most influential literature proposals and some general issues concerning semantics definition and evaluation. As to the former point, the paper reviews Dung's original notions of complete, grounded, preferred, and stable semantics, as well as subsequently proposed notions like semi-stable, ideal, stage, and CF2 semantics, considering both the extension-based and the labelling-based approaches with respect to their definitions. As to the latter point, the paper presents an extensive set of general properties for semantics evaluation and analyzes the notions of argument justification and skepticism. The final part of the paper is focused on the discussion of some relationships between semantics properties and domain-specific requirements.
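As a small illustration of one of the semantics named above, the grounded extension of a finite argumentation framework can be computed by iterating Dung's characteristic function from the empty set until it stops changing. The sketch below assumes a framework given as a set of argument names and a list of (attacker, target) pairs; the function name and the example frameworks are hypothetical.

```python
def grounded_extension(arguments, attacks):
    """Least fixed point of the characteristic function for a finite framework.

    arguments: iterable of hashable argument names
    attacks:   iterable of (attacker, target) pairs over those arguments
    """
    arguments = set(arguments)
    attackers = {a: set() for a in arguments}
    for src, dst in attacks:
        attackers[dst].add(src)

    def is_defended(arg, ext):
        # arg is acceptable w.r.t. ext if every attacker of arg
        # is itself attacked by some member of ext.
        return all(attackers[b] & ext for b in attackers[arg])

    extension = set()
    while True:
        new_ext = {a for a in arguments if is_defended(a, extension)}
        if new_ext == extension:
            return extension
        extension = new_ext


if __name__ == "__main__":
    # a attacks b, b attacks c: a is unattacked and defends c.
    print(sorted(grounded_extension({"a", "b", "c"}, [("a", "b"), ("b", "c")])))
    # Mutual attack: nothing is unattacked, so the grounded extension is empty.
    print(sorted(grounded_extension({"a", "b"}, [("a", "b"), ("b", "a")])))
```

The mutual-attack case shows the skeptical character of grounded semantics: neither argument is accepted, whereas preferred semantics would yield two alternative extensions.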


2020 ◽  
Vol 35 ◽  
Author(s):  
Mariela Morveli-Espinoza ◽  
Juan Carlos Nieves ◽  
Cesar Augusto Tacla

The aim of this article is to propose a model for measuring the strength of rhetorical arguments (i.e., threats, rewards, and appeals), which are used in persuasive negotiation dialogues when a proponent agent tries to convince his opponent to accept a proposal. Related articles propose a calculation based on the components of the rhetorical arguments, that is, the importance of the goal of the opponent and the certainty level of the beliefs that make up the argument. Our proposed model is based on the pre-conditions of credibility and preferability stated by Guerini and Castelfranchi. Thus, we suggest the use of two new criteria for the strength calculation: the credibility of the proponent and the status of the goal of the opponent in the goal processing cycle. We use three scenarios to illustrate our proposal. In addition, the model is empirically evaluated, and the results demonstrate that the proposed model is more efficient than previous state-of-the-art approaches in terms of the number of negotiation cycles, the number of exchanged arguments, and the number of reached agreements.

