scholarly journals Zero-Shot Learning from Adversarial Feature Residual to Compact Visual Feature

2020 ◽  
Vol 34 (07) ◽  
pp. 11547-11554
Author(s):  
Bo Liu ◽  
Qiulei Dong ◽  
Zhanyi Hu

Recently, many zero-shot learning (ZSL) methods focused on learning discriminative object features in an embedding feature space, however, the distributions of the unseen-class features learned by these methods are prone to be partly overlapped, resulting in inaccurate object recognition. Addressing this problem, we propose a novel adversarial network to synthesize compact semantic visual features for ZSL, consisting of a residual generator, a prototype predictor, and a discriminator. The residual generator is to generate the visual feature residual, which is integrated with a visual prototype predicted via the prototype predictor for synthesizing the visual feature. The discriminator is to distinguish the synthetic visual features from the real ones extracted from an existing categorization CNN. Since the generated residuals are generally numerically much smaller than the distances among all the prototypes, the distributions of the unseen-class features synthesized by the proposed network are less overlapped. In addition, considering that the visual features from categorization CNNs are generally inconsistent with their semantic features, a simple feature selection strategy is introduced for extracting more compact semantic visual features. Extensive experimental results on six benchmark datasets demonstrate that our method could achieve a significantly better performance than existing state-of-the-art methods by ∼1.2-13.2% in most cases.

Author(s):  
Yang Liu ◽  
Quanxue Gao ◽  
Jin Li ◽  
Jungong Han ◽  
Ling Shao

Zero-shot learning (ZSL) has been widely researched and get successful in machine learning. Most existing ZSL methods aim to accurately recognize objects of unseen classes by learning a shared mapping from the feature space to a semantic space. However, such methods did not investigate in-depth whether the mapping can precisely reconstruct the original visual feature. Motivated by the fact that the data have low intrinsic dimensionality e.g. low-dimensional subspace. In this paper, we formulate a novel framework named Low-rank Embedded Semantic AutoEncoder (LESAE) to jointly seek a low-rank mapping to link visual features with their semantic representations. Taking the encoder-decoder paradigm, the encoder part aims to learn a low-rank mapping from the visual feature to the semantic space, while decoder part manages to reconstruct the original data with the learned mapping. In addition, a non-greedy iterative algorithm is adopted to solve our model. Extensive experiments on six benchmark datasets demonstrate its superiority over several state-of-the-art algorithms.


BMC Genomics ◽  
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Zhixun Zhao ◽  
Xiaocai Zhang ◽  
Fang Chen ◽  
Liang Fang ◽  
Jinyan Li

Abstract Background DNA N4-methylcytosine (4mC) is a critical epigenetic modification and has various roles in the restriction-modification system. Due to the high cost of experimental laboratory detection, computational methods using sequence characteristics and machine learning algorithms have been explored to identify 4mC sites from DNA sequences. However, state-of-the-art methods have limited performance because of the lack of effective sequence features and the ad hoc choice of learning algorithms to cope with this problem. This paper is aimed to propose new sequence feature space and a machine learning algorithm with feature selection scheme to address the problem. Results The feature importance score distributions in datasets of six species are firstly reported and analyzed. Then the impact of the feature selection on model performance is evaluated by independent testing on benchmark datasets, where ACC and MCC measurements on the performance after feature selection increase by 2.3% to 9.7% and 0.05 to 0.19, respectively. The proposed method is compared with three state-of-the-art predictors using independent test and 10-fold cross-validations, and our method outperforms in all datasets, especially improving the ACC by 3.02% to 7.89% and MCC by 0.06 to 0.15 in the independent test. Two detailed case studies by the proposed method have confirmed the excellent overall performance and correctly identified 24 of 26 4mC sites from the C.elegans gene, and 126 out of 137 4mC sites from the D.melanogaster gene. Conclusions The results show that the proposed feature space and learning algorithm with feature selection can improve the performance of DNA 4mC prediction on the benchmark datasets. The two case studies prove the effectiveness of our method in practical situations.


2018 ◽  
Vol 2018 ◽  
pp. 1-8 ◽  
Author(s):  
Sujin Lee ◽  
Incheol Kim

Video captioning refers to the task of generating a natural language sentence that explains the content of the input video clips. This study proposes a deep neural network model for effective video captioning. Apart from visual features, the proposed model learns additionally semantic features that describe the video content effectively. In our model, visual features of the input video are extracted using convolutional neural networks such as C3D and ResNet, while semantic features are obtained using recurrent neural networks such as LSTM. In addition, our model includes an attention-based caption generation network to generate the correct natural language captions based on the multimodal video feature sequences. Various experiments, conducted with the two large benchmark datasets, Microsoft Video Description (MSVD) and Microsoft Research Video-to-Text (MSR-VTT), demonstrate the performance of the proposed model.


Author(s):  
Kai Zhu ◽  
Wei Zhai ◽  
Yang Cao

Few-shot segmentation aims at assigning a category label to each image pixel with few annotated samples. It is a challenging task since the dense prediction can only be achieved under the guidance of latent features defined by sparse annotations. Existing meta-learning based method tends to fail in generating category-specifically discriminative descriptor when the visual features extracted from support images are marginalized in embedding space. To address this issue, this paper presents an adaptive tuning framework, in which the distribution of latent features across different episodes is dynamically adjusted based on a self-segmentation scheme, augmenting category-specific descriptors for label prediction. Specifically, a novel self-supervised inner-loop is firstly devised as the base learner to extract the underlying semantic features from the support image. Then, gradient maps are calculated by back-propagating self-supervised loss through the obtained features, and leveraged as guidance for augmenting the corresponding elements in the embedding space. Finally, with the ability to continuously learn from different episodes, an optimization-based meta-learner is adopted as outer loop of our proposed framework to gradually refine the segmentation results. Extensive experiments on benchmark PASCAL-5i and COCO-20i datasets demonstrate the superiority of our proposed method over state-of-the-art.


2021 ◽  
Author(s):  
Faizan Ur Rahman ◽  
Soosan Beheshti

Transforming data to feature space using a kernel function can result in better expression of its features, resulting in better separability for some datasets. The parameters of the kernel function govern the structure of data in feature space and need to be optimized simultaneously while also estimating the number of clusters in a dataset. The proposed method denoted by kernel k-Minimum Average Central Error (kernel k-MACE), esti- mates the number of clusters in a dataset while simultaneously clustering the dataset in feature space by finding the optimum value of the Gaussian kernel parameter σk. A cluster initialization technique has also been proposed based on an existing method for k-means clustering. Simulations show that for self-generated datasets with Gaus- sian clusters having 10% - 50% overlap and for real benchmark datasets, the proposed method outperforms multiple state-of-the-art unsupervised clustering methods including k-MACE, the clustering scheme that inspired kernel k-MACE.


2021 ◽  
Author(s):  
Faizan Ur Rahman ◽  
Soosan Beheshti

Transforming data to feature space using a kernel function can result in better expression of its features, resulting in better separability for some datasets. The parameters of the kernel function govern the structure of data in feature space and need to be optimized simultaneously while also estimating the number of clusters in a dataset. The proposed method denoted by kernel k-Minimum Average Central Error (kernel k-MACE), esti- mates the number of clusters in a dataset while simultaneously clustering the dataset in feature space by finding the optimum value of the Gaussian kernel parameter σk. A cluster initialization technique has also been proposed based on an existing method for k-means clustering. Simulations show that for self-generated datasets with Gaus- sian clusters having 10% - 50% overlap and for real benchmark datasets, the proposed method outperforms multiple state-of-the-art unsupervised clustering methods including k-MACE, the clustering scheme that inspired kernel k-MACE.


2022 ◽  
Vol 40 (1) ◽  
pp. 1-22
Author(s):  
Lianghao Xia ◽  
Chao Huang ◽  
Yong Xu ◽  
Huance Xu ◽  
Xiang Li ◽  
...  

As the deep learning techniques have expanded to real-world recommendation tasks, many deep neural network based Collaborative Filtering (CF) models have been developed to project user-item interactions into latent feature space, based on various neural architectures, such as multi-layer perceptron, autoencoder, and graph neural networks. However, the majority of existing collaborative filtering systems are not well designed to handle missing data. Particularly, in order to inject the negative signals in the training phase, these solutions largely rely on negative sampling from unobserved user-item interactions and simply treating them as negative instances, which brings the recommendation performance degradation. To address the issues, we develop a C ollaborative R eflection-Augmented A utoencoder N etwork (CRANet), that is capable of exploring transferable knowledge from observed and unobserved user-item interactions. The network architecture of CRANet is formed of an integrative structure with a reflective receptor network and an information fusion autoencoder module, which endows our recommendation framework with the ability of encoding implicit user’s pairwise preference on both interacted and non-interacted items. Additionally, a parametric regularization-based tied-weight scheme is designed to perform robust joint training of the two-stage CRANetmodel. We finally experimentally validate CRANeton four diverse benchmark datasets corresponding to two recommendation tasks, to show that debiasing the negative signals of user-item interactions improves the performance as compared to various state-of-the-art recommendation techniques. Our source code is available at https://github.com/akaxlh/CRANet.


2020 ◽  
Author(s):  
Chong Wu ◽  
Zhenan Feng ◽  
Jiangbin Zheng ◽  
Houwang Zhang ◽  
Jiawang Cao ◽  
...  

<div><div><div><p>We present a novel graph convolutional method called star topology convolution (STC). This method makes graph convolution more similar to conventional convolutional neural networks (CNNs) in Euclidean feature space. Unlike most existing spectral convolutional methods, this method learns subgraphs which have a star topology rather than a fixed graph. It has fewer parameters in its convolutional filter and is inductive so that it is more flexible and can be applied to large and evolving graphs. As for CNNs in Euclidean feature space, the convolutional filter is localized and maintains a good weight sharing property. By introducing deep layers, the method can learn global features like a CNN. To validate the method, STC was compared to state-of-the-art spectral convolutional and spatial convolutional methods in a supervised learning setting on three benchmark datasets: Cora, Citeseer and Pubmed. The experimental results show that STC outperforms the other methods. STC was also applied to protein identification tasks and outperformed traditional and advanced protein identification methods.</p></div></div></div>


Author(s):  
Wei Li ◽  
Haiyu Song ◽  
Hongda Zhang ◽  
Houjie Li ◽  
Pengjie Wang

The ever-increasing size of images has made automatic image annotation one of the most important tasks in the fields of machine learning and computer vision. Despite continuous efforts in inventing new annotation algorithms and new models, results of the state-of-the-art image annotation methods are often unsatisfactory. In this paper, to further improve annotation refinement performance, a novel approach based on weighted mutual information to automatically refine the original annotations of images is proposed. Unlike the traditional refinement model using only visual feature, the proposed model use semantic embedding to properly map labels and visual features to a meaningful semantic space. To accurately measure the relevance between the particular image and its original annotations, the proposed model utilize all available information including image-to-image, label-to-label and image-to-label. Experimental results conducted on three typical datasets show not only the validity of the refinement, but also the superiority of the proposed algorithm over existing ones. The improvement largely benefits from our proposed mutual information method and utilizing all available information.


Author(s):  
Jinfu Ren ◽  
Yang Liu ◽  
Jiming Liu

In this paper, we propose a novel oversampling strategy dubbed Entropy-based Wasserstein Generative Adversarial Network (EWGAN) to generate data samples for minority classes in imbalanced learning. First, we construct an entropyweighted label vector for each class to characterize the data imbalance in different classes. Then we concatenate this entropyweighted label vector with the original feature vector of each data sample, and feed it into the WGAN model to train the generator. After the generator is trained, we concatenate the entropy-weighted label vector with random noise feature vectors, and feed them into the generator to generate data samples for minority classes. Experimental results on two benchmark datasets show that the samples generated by the proposed oversampling strategy can help to improve the classification performance when the data are highly imbalanced. Furthermore, the proposed strategy outperforms other state-of-the-art oversampling algorithms in terms of the classification accuracy.


Sign in / Sign up

Export Citation Format

Share Document