scholarly journals Long-tailed Visual Recognition via Gaussian Clouded Logit Adjustment

Author(s):  
Mengke Li ◽  
Yiu-ming Cheung ◽  
Yang Lu

<p>Long-tailed data is still a big challenge for deep neural networks, even though they have achieved great success on balanced data. We observe that vanilla training on long-tailed data with cross-entropy loss makes the instance-rich head classes severely squeeze the spatial distribution of the tail classes, which leads to difficulty in classifying tail class samples. Furthermore, the original cross-entropy loss can only propagate gradient short-lively because the gradient in softmax form rapidly approaches zero as the logit difference increases. This phenomenon is called softmax saturation. It is unfavorable for training on balanced data, but can be utilized to adjust the validity of the samples in long-tailed data, thereby solving the distorted embedding space of long-tailed problems. To this end, this paper therefore proposes the Gaussian clouded logit adjustment by Gaussian perturbing different class logits with varied amplitude. We define the amplitude of perturbation as cloud size and set relatively large cloud sizes to tail classes. The large cloud size can reduce the softmax saturation and thereby making tail class samples more active as well as enlarging the embedding space. To alleviate the bias in the classifier, we accordingly propose the class-based effective number sampling strategy with classifier re-training. Extensive experiments on benchmark datasets validate the superior performance of the proposed method.</p><br>

2021 ◽  
Author(s):  
Mengke Li ◽  
Yiu-ming Cheung ◽  
Yang Lu

<p>Long-tailed data is still a big challenge for deep neural networks, even though they have achieved great success on balanced data. We observe that vanilla training on long-tailed data with cross-entropy loss makes the instance-rich head classes severely squeeze the spatial distribution of the tail classes, which leads to difficulty in classifying tail class samples. Furthermore, the original cross-entropy loss can only propagate gradient short-lively because the gradient in softmax form rapidly approaches zero as the logit difference increases. This phenomenon is called softmax saturation. It is unfavorable for training on balanced data, but can be utilized to adjust the validity of the samples in long-tailed data, thereby solving the distorted embedding space of long-tailed problems. To this end, this paper therefore proposes the Gaussian clouded logit adjustment by Gaussian perturbing different class logits with varied amplitude. We define the amplitude of perturbation as cloud size and set relatively large cloud sizes to tail classes. The large cloud size can reduce the softmax saturation and thereby making tail class samples more active as well as enlarging the embedding space. To alleviate the bias in the classifier, we accordingly propose the class-based effective number sampling strategy with classifier re-training. Extensive experiments on benchmark datasets validate the superior performance of the proposed method.</p><br>


Author(s):  
Chen Qi ◽  
Shibo Shen ◽  
Rongpeng Li ◽  
Zhifeng Zhao ◽  
Qing Liu ◽  
...  

AbstractNowadays, deep neural networks (DNNs) have been rapidly deployed to realize a number of functionalities like sensing, imaging, classification, recognition, etc. However, the computational-intensive requirement of DNNs makes it difficult to be applicable for resource-limited Internet of Things (IoT) devices. In this paper, we propose a novel pruning-based paradigm that aims to reduce the computational cost of DNNs, by uncovering a more compact structure and learning the effective weights therein, on the basis of not compromising the expressive capability of DNNs. In particular, our algorithm can achieve efficient end-to-end training that transfers a redundant neural network to a compact one with a specifically targeted compression rate directly. We comprehensively evaluate our approach on various representative benchmark datasets and compared with typical advanced convolutional neural network (CNN) architectures. The experimental results verify the superior performance and robust effectiveness of our scheme. For example, when pruning VGG on CIFAR-10, our proposed scheme is able to significantly reduce its FLOPs (floating-point operations) and number of parameters with a proportion of 76.2% and 94.1%, respectively, while still maintaining a satisfactory accuracy. To sum up, our scheme could facilitate the integration of DNNs into the common machine-learning-based IoT framework and establish distributed training of neural networks in both cloud and edge.


2021 ◽  
Vol 2050 (1) ◽  
pp. 012006
Author(s):  
Xili Dai ◽  
Chunmei Ma ◽  
Jingwei Sun ◽  
Tao Zhang ◽  
Haigang Gong ◽  
...  

Abstract Training deep neural networks from only a few examples has been an interesting topic that motivated few shot learning. In this paper, we study the fine-grained image classification problem in a challenging few-shot learning setting, and propose the Self-Amplificated Network (SAN), a method based on meta-learning to tackle this problem. The SAN model consists of three parts, which are the Encoder, Amplification and Similarity Modules. The Encoder Module encodes a fine-grained image input into a feature vector. The Amplification Module is used to amplify subtle differences between fine-grained images based on the self attention mechanism which is composed of multi-head attention. The Similarity Module measures how similar the query image and the support set are in order to determine the classification result. In-depth experiments on three benchmark datasets have showcased that our network achieves superior performance over the competing baselines.


2020 ◽  
Vol 34 (04) ◽  
pp. 5085-5092 ◽  
Author(s):  
Wan-Duo Kurt Ma ◽  
J. P. Lewis ◽  
W. Bastiaan Kleijn

We introduce the HSIC (Hilbert-Schmidt independence criterion) bottleneck for training deep neural networks. The HSIC bottleneck is an alternative to the conventional cross-entropy loss and backpropagation that has a number of distinct advantages. It mitigates exploding and vanishing gradients, resulting in the ability to learn very deep networks without skip connections. There is no requirement for symmetric feedback or update locking. We find that the HSIC bottleneck provides performance on MNIST/FashionMNIST/CIFAR10 classification comparable to backpropagation with a cross-entropy target, even when the system is not encouraged to make the output resemble the classification labels. Appending a single layer trained with SGD (without backpropagation) to reformat the information further improves performance.


Author(s):  
Lei Feng ◽  
Senlin Shu ◽  
Zhuoyi Lin ◽  
Fengmao Lv ◽  
Li Li ◽  
...  

Trained with the standard cross entropy loss, deep neural networks can achieve great performance on correctly labeled data. However, if the training data is corrupted with label noise, deep models tend to overfit the noisy labels, thereby achieving poor generation performance. To remedy this issue, several loss functions have been proposed and demonstrated to be robust to label noise. Although most of the robust loss functions stem from Categorical Cross Entropy (CCE) loss, they fail to embody the intrinsic relationships between CCE and other loss functions. In this paper, we propose a general framework dubbed Taylor cross entropy loss to train deep models in the presence of label noise. Specifically, our framework enables to weight the extent of fitting the training labels by controlling the order of Taylor Series for CCE, hence it can be robust to label noise. In addition, our framework clearly reveals the intrinsic relationships between CCE and other loss functions, such as Mean Absolute Error (MAE) and Mean Squared Error (MSE). Moreover, we present a detailed theoretical analysis to certify the robustness of this framework. Extensive experimental results on benchmark datasets demonstrate that our proposed approach significantly outperforms the state-of-the-art counterparts.


2021 ◽  
pp. 78-89
Author(s):  
Panle Li ◽  
Xiaohui He ◽  
Dingjun Song ◽  
Zihao Ding ◽  
Mengjia Qiao ◽  
...  

Sensors ◽  
2021 ◽  
Vol 21 (1) ◽  
pp. 255
Author(s):  
Yi Zhang ◽  
Shizhou Zhang ◽  
Ying Li ◽  
Yanning Zhang

Recently, both single modality and cross modality near-duplicate image detection tasks have received wide attention in the community of pattern recognition and computer vision. Existing deep neural networks-based methods have achieved remarkable performance in this task. However, most of the methods mainly focus on the learning of each image from the image pair, thus leading to less use of the information between the near duplicate image pairs to some extent. In this paper, to make more use of the correlations between image pairs, we propose a spatial transformer comparing convolutional neural network (CNN) model to compare near-duplicate image pairs. Specifically, we firstly propose a comparing CNN framework, which is equipped with a cross-stream to fully learn the correlation information between image pairs, while considering the features of each image. Furthermore, to deal with the local deformations led by cropping, translation, scaling, and non-rigid transformations, we additionally introduce a spatial transformer comparing CNN model by incorporating a spatial transformer module to the comparing CNN architecture. To demonstrate the effectiveness of the proposed method on both the single-modality and cross-modality (Optical-InfraRed) near-duplicate image pair detection tasks, we conduct extensive experiments on three popular benchmark datasets, namely CaliforniaND (ND means near duplicate), Mir-Flickr Near Duplicate, and TNO Multi-band Image Data Collection. The experimental results show that the proposed method can achieve superior performance compared with many state-of-the-art methods on both tasks.


Author(s):  
Devendra Singh Sachan ◽  
Manzil Zaheer ◽  
Ruslan Salakhutdinov

In this paper, we study bidirectional LSTM network for the task of text classification using both supervised and semisupervised approaches. Several prior works have suggested that either complex pretraining schemes using unsupervised methods such as language modeling (Dai and Le 2015; Miyato, Dai, and Goodfellow 2016) or complicated models (Johnson and Zhang 2017) are necessary to achieve a high classification accuracy. However, we develop a training strategy that allows even a simple BiLSTM model, when trained with cross-entropy loss, to achieve competitive results compared with more complex approaches. Furthermore, in addition to cross-entropy loss, by using a combination of entropy minimization, adversarial, and virtual adversarial losses for both labeled and unlabeled data, we report state-of-theart results for text classification task on several benchmark datasets. In particular, on the ACL-IMDB sentiment analysis and AG-News topic classification datasets, our method outperforms current approaches by a substantial margin. We also show the generality of the mixed objective function by improving the performance on relation extraction task.1


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Narjes Rohani ◽  
Changiz Eslahchi

Abstract Drug-Drug Interaction (DDI) prediction is one of the most critical issues in drug development and health. Proposing appropriate computational methods for predicting unknown DDI with high precision is challenging. We proposed "NDD: Neural network-based method for drug-drug interaction prediction" for predicting unknown DDIs using various information about drugs. Multiple drug similarities based on drug substructure, target, side effect, off-label side effect, pathway, transporter, and indication data are calculated. At first, NDD uses a heuristic similarity selection process and then integrates the selected similarities with a nonlinear similarity fusion method to achieve high-level features. Afterward, it uses a neural network for interaction prediction. The similarity selection and similarity integration parts of NDD have been proposed in previous studies of other problems. Our novelty is to combine these parts with new neural network architecture and apply these approaches in the context of DDI prediction. We compared NDD with six machine learning classifiers and six state-of-the-art graph-based methods on three benchmark datasets. NDD achieved superior performance in cross-validation with AUPR ranging from 0.830 to 0.947, AUC from 0.954 to 0.994 and F-measure from 0.772 to 0.902. Moreover, cumulative evidence in case studies on numerous drug pairs, further confirm the ability of NDD to predict unknown DDIs. The evaluations corroborate that NDD is an efficient method for predicting unknown DDIs. The data and implementation of NDD are available at https://github.com/nrohani/NDD.


Sign in / Sign up

Export Citation Format

Share Document