Learning Compact Model for Large-Scale Multi-Label Data

Author(s):  
Tong Wei ◽  
Yu-Feng Li

Large-scale multi-label learning (LMLL) aims to annotate unseen data with relevant labels drawn from a very large candidate set. Because both the feature and label spaces in LMLL are high-dimensional, the storage overhead of LMLL models is often substantial. This paper proposes POP (joint label and feature Parameter OPtimization), a method that filters out redundant model parameters to produce compact models. Our key insights are as follows. First, we identify labels that have little impact on the commonly used LMLL performance metrics and preserve only a small number of dominant parameters for them. Second, for the remaining influential labels, we remove spurious feature parameters that contribute little to the generalization capability of the model, preserving parameters only for discriminative features. The overall problem is formulated as a constrained optimization problem that pursues minimal model size. To solve the resulting difficult optimization, we show that a relaxation of it can be solved efficiently using binary search and greedy strategies. Experiments verify that the proposed method clearly reduces model size compared to state-of-the-art LMLL approaches while achieving highly competitive predictive performance.
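The binary-search step can be sketched in a few lines. Here `prune_to_budget`, the fixed bisection count, and the plain magnitude criterion are illustrative assumptions, not the paper's exact joint label/feature procedure:

```python
import numpy as np

def prune_to_budget(W, budget):
    """Bisect a magnitude threshold so that the number of surviving
    (non-zero) parameters in W is at most `budget`. A simplified
    sketch of threshold search by binary search."""
    lo, hi = 0.0, float(np.abs(W).max())
    for _ in range(50):                      # bisection on the threshold
        mid = (lo + hi) / 2
        kept = int((np.abs(W) > mid).sum())
        if kept > budget:
            lo = mid                         # threshold too small: prune more
        else:
            hi = mid                         # budget met: try a smaller threshold
    return W * (np.abs(W) > hi)

rng = np.random.default_rng(0)
W = rng.normal(size=(100, 50))               # features x labels weight matrix
W_pruned = prune_to_budget(W, budget=500)
```

A greedy pass over labels or features could then reallocate the surviving budget, as the abstract suggests.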

Author(s):  
Tong Wei ◽  
Yu-Feng Li

Large-scale multi-label learning annotates unseen data with relevant labels drawn from a huge number of candidates. It is well known that in large-scale multi-label learning, labels exhibit a long-tail distribution in which a significant fraction of labels are tail labels. Nonetheless, how much tail labels affect the performance metrics of large-scale multi-label learning has not been explicitly quantified. In this paper, we show that whether labels are randomly missing or misclassified, tail labels have far less impact than common labels on the commonly used performance metrics (top-$k$ precision and nDCG@$k$). Building on this observation, we develop a low-complexity large-scale multi-label learning algorithm that yields fast prediction and compact models by trimming tail labels adaptively. Experiments clearly verify that both prediction time and model size are significantly reduced without sacrificing much predictive performance relative to state-of-the-art approaches.
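A minimal version of tail trimming keeps the most frequent labels until a chosen share of all positive assignments is covered. The function name, the coverage criterion, and the synthetic long-tailed data below are assumptions for illustration, not the paper's adaptive rule:

```python
import numpy as np

def trim_tail_labels(Y, coverage=0.95):
    """Drop the rarest labels while retaining `coverage` of all positive
    label assignments. Y is a binary instance x label matrix; returns
    the indices of the retained (head) labels."""
    freq = Y.sum(axis=0)
    order = np.argsort(freq)[::-1]            # most frequent labels first
    cum = np.cumsum(freq[order])
    k = int(np.searchsorted(cum, coverage * cum[-1])) + 1
    return np.sort(order[:k])                 # indices of head labels

rng = np.random.default_rng(1)
# synthetic long tail: label j is positive with probability ~ 1/(j+1)
probs = 1.0 / np.arange(1, 201)
Y = (rng.random((1000, 200)) < probs).astype(int)
keep = trim_tail_labels(Y, coverage=0.95)
```

Prediction then only scores the `keep` columns, which is where the model-size and prediction-time savings come from.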


Author(s):  
Guiying Li ◽  
Chao Qian ◽  
Chunhui Jiang ◽  
Xiaofen Lu ◽  
Ke Tang

Layer-wise magnitude-based pruning (LMP) is a very popular method for deep neural network (DNN) compression. However, tuning the layer-specific thresholds is difficult, since the space of threshold candidates is exponentially large and each evaluation is expensive. Previous methods tune thresholds mainly by hand and require expertise. In this paper, we propose an optimization-based automatic tuning approach named OLMP. The idea is to transform threshold tuning into a constrained optimization problem (i.e., minimizing the size of the pruned model subject to a constraint on the accuracy loss), and then to solve it with powerful derivative-free optimization algorithms. To compress a trained DNN, OLMP is conducted within a new iterative pruning-and-adjusting pipeline. Empirical results show that OLMP achieves the best pruning ratios on LeNet-style models (114 times for LeNet-300-100 and 298 times for LeNet-5) compared with several state-of-the-art DNN pruning methods, and can reduce the size of an AlexNet-style network by up to 82 times without accuracy loss.
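The constrained problem has a simple shape: minimize surviving parameters over per-layer thresholds, subject to bounded accuracy loss. The sketch below uses plain random search and a magnitude-retention proxy in place of real accuracy; both are stand-in assumptions, since OLMP uses a stronger derivative-free optimizer and a real evaluation:

```python
import numpy as np

def lmp(weights, thresholds):
    """Layer-wise magnitude pruning: zero entries below each layer's threshold."""
    return [W * (np.abs(W) >= t) for W, t in zip(weights, thresholds)]

def search_thresholds(weights, accuracy_of, max_loss, iters=200, seed=0):
    """Toy random search over layer thresholds: minimize the surviving
    parameter count subject to an accuracy-loss constraint."""
    rng = np.random.default_rng(seed)
    base = accuracy_of(weights)
    best_t = [0.0] * len(weights)                     # no pruning is always feasible
    best_size = sum(int(W.size) for W in weights)
    for _ in range(iters):
        t = [float(rng.uniform(0.0, 1.0)) for _ in weights]
        pruned = lmp(weights, t)
        size = sum(int((W != 0).sum()) for W in pruned)
        if base - accuracy_of(pruned) <= max_loss and size < best_size:
            best_t, best_size = t, size
    return best_t, best_size

rng = np.random.default_rng(1)
weights = [rng.normal(size=(100, 100)) for _ in range(3)]
total_mag = sum(np.abs(W).sum() for W in weights)
# stand-in "accuracy": fraction of total weight magnitude retained
acc = lambda ws: sum(np.abs(W).sum() for W in ws) / total_mag
best_t, best_size = search_thresholds(weights, acc, max_loss=0.2)
```

The iterative pruning-and-adjusting pipeline would alternate this search with retraining steps.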


2020 ◽  
Vol 34 (01) ◽  
pp. 19-26 ◽  
Author(s):  
Chong Chen ◽  
Min Zhang ◽  
Yongfeng Zhang ◽  
Weizhi Ma ◽  
Yiqun Liu ◽  
...  

Recent studies on recommendation have largely focused on exploring state-of-the-art neural networks to improve model expressiveness, while typically applying the Negative Sampling (NS) strategy for efficient learning. Despite their effectiveness, existing methods leave two important issues unaddressed: 1) NS suffers from dramatic fluctuation, making it difficult for sampling-based methods to achieve optimal ranking performance in practical applications; 2) although heterogeneous feedback (e.g., view, click, and purchase) is widespread in many online systems, most existing methods leverage only one primary type of user feedback, such as purchase. In this work, we propose a novel non-sampling transfer learning solution, named Efficient Heterogeneous Collaborative Filtering (EHCF), for Top-N recommendation. It not only models fine-grained user-item relations but also efficiently learns model parameters from the whole heterogeneous dataset (including all unlabeled data) with rather low time complexity. Extensive experiments on three real-world datasets show that EHCF significantly outperforms state-of-the-art recommendation methods in both traditional (single-behavior) and heterogeneous scenarios. Moreover, EHCF shows significant improvements in training efficiency, making it more applicable to real-world large-scale systems. Our implementation has been released to facilitate further development of efficient whole-data-based neural methods.
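The core of non-sampling learning is evaluating a weighted squared loss over every user-item pair without enumerating the pairs. Below is a generic sketch of that Gram-matrix trick in the single-behavior setting; the uniform unlabeled weight `w0` and the plain dot-product model are simplifying assumptions, not EHCF's exact objective:

```python
import numpy as np

def whole_data_loss(X, V, pos, w0=0.1):
    """Weighted squared loss over ALL user-item pairs: positives get
    weight 1 and target 1, every other pair weight w0 and target 0.
    The sum over all pairs of (x_u . v_i)^2 is computed in O(d^2)
    via the identity sum_{u,i} (x_u.v_i)^2 = sum(X'X * V'V),
    instead of O(|users| * |items|)."""
    all_sq = float((X.T @ X * (V.T @ V)).sum())
    u, i = pos
    r = (X[u] * V[i]).sum(axis=1)             # predictions on positive pairs
    pos_part = float(((1 - w0) * r ** 2 - 2 * r).sum())
    return w0 * all_sq + pos_part + len(u)    # + |pos| completes the (r-1)^2 terms

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))                  # user embeddings
V = rng.normal(size=(40, 8))                  # item embeddings
pos = (np.array([0, 3, 7]), np.array([5, 1, 2]))
loss = whole_data_loss(X, V, pos)
```

Because the expensive term depends only on d x d Gram matrices, the cost no longer scales with the number of unlabeled pairs.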


2016 ◽  
Vol 56 (1) ◽  
pp. 67 ◽  
Author(s):  
Amanda Prorok ◽  
M. Ani Hsieh ◽  
Vijay Kumar

We present a method that distributes a swarm of heterogeneous robots among a set of tasks that require specialized capabilities in order to be completed. We model the system of heterogeneous robots as a community of species, where each species (robot type) is defined by the traits (capabilities) that it owns. Our method is based on a continuous, macroscopic abstraction of the swarm in which robots switch between tasks. We formulate an optimization problem that produces an optimal set of transition rates for each species, so that the desired trait distribution is reached as quickly as possible. Since our method is based on the derivation of an analytical gradient, it is very efficient compared to state-of-the-art methods. Building on this result, we propose a real-time optimization method that enables online adaptation of the transition rates. Our approach is well suited for real-time applications that rely on online redistribution of large-scale robotic systems.
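The macroscopic abstraction for a single species is a linear rate equation, dx/dt = K x, where x holds the fraction of robots at each task and K collects the transition rates. The rates and Euler integration below are hand-picked assumptions for illustration; the paper optimizes the rates:

```python
import numpy as np

def evolve(x0, K, dt=0.01, steps=2000):
    """Euler integration of dx/dt = K x, the macroscopic model of robots
    switching between tasks. K's columns sum to zero, so the total
    population fraction is conserved at every step."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        x = x + dt * (K @ x)
    return x

# 3 tasks; k[i, j] is the transition rate from task j to task i
k = np.array([[0.0, 1.0, 0.5],
              [2.0, 0.0, 0.5],
              [1.0, 1.0, 0.0]])
K = k - np.diag(k.sum(axis=0))          # subtract outflow: columns sum to zero
x0 = np.array([1.0, 0.0, 0.0])          # all robots start at task 0
x = evolve(x0, K)
```

Choosing the off-diagonal rates to minimize the time to reach a target x is the optimization the abstract describes; the analytical gradient is taken with respect to those rates.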


Energies ◽  
2020 ◽  
Vol 13 (19) ◽  
pp. 5141
Author(s):  
Andrzej J. Osiadacz ◽  
Niccolo Isoli

The main goal of this paper is to prove that bi-objective optimization of high-pressure gas networks ensures greater system efficiency than scalar optimization. The proposed algorithm searches for a trade-off between minimizing the running costs of compressors and maximizing gas network capacity (security of gas supply to customers). The bi-criteria algorithm was developed using a gradient projection method to solve the nonlinear constrained optimization problem, together with a hierarchical vector optimization method. To prove the correctness of the algorithm, three existing networks were solved. A comparison between the scalar and bi-criteria optimization results confirmed the advantages of the bi-criteria approach.
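The trade-off such a bi-criteria algorithm explores is a Pareto front between the two objectives. The filter below and the toy quadratic objectives are generic illustrations of that idea, not a gas-network model:

```python
def pareto_front(f1, f2, candidates):
    """Return the Pareto-optimal candidates for two objectives to be
    minimized: a point survives iff no other point is at least as good
    in both objectives and strictly better in at least one."""
    pts = [(float(f1(x)), float(f2(x)), x) for x in candidates]
    front = [p for p in pts
             if not any(q[0] <= p[0] and q[1] <= p[1] and q[:2] != p[:2]
                        for q in pts)]
    return sorted(front)

# toy objectives: "running cost" vs. "capacity shortfall", minimized over x
f1 = lambda x: x ** 2
f2 = lambda x: (x - 2) ** 2
front = pareto_front(f1, f2, [x / 10 for x in range(-10, 31)])
front_xs = [x for _, _, x in front]
```

For these objectives every x between the two individual minimizers (0 and 2) is Pareto-optimal, which is exactly the set a scalar weighted-sum method would collapse to a single point.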


2012 ◽  
Vol 2012 ◽  
pp. 1-5
Author(s):  
A. V. Wildemann ◽  
A. A. Tashkinov ◽  
V. A. Bronnikov

This paper introduces an approach for identifying the parameters of a statistical predictive model from available individual data. The unknown parameters are separated into two groups: those specifying the average trend over a large set of individuals and those describing the particulars of a specific person. To calculate the vector of unknown parameters, a multidimensional constrained optimization problem is solved, minimizing the discrepancy between the real data and the model prediction over the set of feasible solutions. Both individual retrospective data and factors influencing individual dynamics are taken into account. The application of the method to predicting the movement of a patient with congenital motility disorders is considered.
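A toy version of "minimize the discrepancy over the feasible set" can be written as a search over a bounded parameter grid. The recovery-curve model, the grid search, and all parameter names below are hypothetical stand-ins for the paper's model and optimizer:

```python
import numpy as np

def fit(t, y, a_grid, b_grid):
    """Search the feasible grid for parameters (a, b) of a hypothetical
    individual model y = a * (1 - exp(-b * t)), minimizing the squared
    discrepancy to the observed data."""
    best, best_err = None, np.inf
    for a in a_grid:
        for b in b_grid:
            err = float(((a * (1 - np.exp(-b * t)) - y) ** 2).sum())
            if err < best_err:
                best, best_err = (a, b), err
    return best, best_err

t = np.linspace(0, 10, 50)
y = 2.0 * (1 - np.exp(-0.5 * t))          # synthetic "individual" trajectory
(a, b), err = fit(t, y,
                  np.linspace(0.5, 3, 26),   # feasible range for a
                  np.linspace(0.1, 1, 19))   # feasible range for b
```

In the paper's setting, the population-level trend parameters would constrain this feasible set while the person-specific parameters are fitted within it.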


2020 ◽  
Vol 12 (5) ◽  
pp. 27
Author(s):  
Bouchta RHANIZAR

We consider the constrained optimization problem defined by: $$f(x^*) = \min_{x \in X} f(x) \eqno (1)$$ where the function $f: \pmb{\mathbb{R}}^{n} \longrightarrow \pmb{\mathbb{R}}$ is convex on a closed convex set $X$. In this work, we give a new method to solve problem (1) without reducing it to an unconstrained problem. We study the convergence of this new method and give numerical examples.
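A classical baseline for problem (1) is projected gradient descent, which alternates a gradient step with a Euclidean projection onto $X$. This is the standard method, not the paper's new one; the unit-ball example is an assumption for illustration:

```python
import numpy as np

def projected_gradient(grad, project, x0, step=0.1, iters=500):
    """Projected gradient descent for min f(x) over a closed convex set X:
    take a gradient step, then project back onto X."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = project(x - step * grad(x))
    return x

# example: minimize f(x) = ||x - c||^2 over the unit ball X = {x : ||x|| <= 1}
c = np.array([2.0, 0.0])
grad = lambda x: 2 * (x - c)
project = lambda x: x / max(1.0, np.linalg.norm(x))  # projection onto unit ball
x_star = projected_gradient(grad, project, np.zeros(2))
```

The minimizer is the projection of c onto the ball, here (1, 0); methods like the paper's aim to avoid the repeated projection or the reduction to an unconstrained problem.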


Sensors ◽  
2021 ◽  
Vol 21 (24) ◽  
pp. 8337
Author(s):  
Hyeokhyen Kwon ◽  
Gregory D. Abowd ◽  
Thomas Plötz

Supervised training of human activity recognition (HAR) systems based on body-worn inertial measurement units (IMUs) is often constrained by the typically rather small amounts of labeled sample data. Systems like IMUTube have been introduced that employ cross-modality transfer approaches to convert videos of activities of interest into virtual IMU data. We demonstrate for the first time how such large-scale virtual IMU datasets can be used to train HAR systems that are substantially more complex than the state of the art. Complexity here means the number of model parameters that can be trained robustly. Our models contain components dedicated to capturing the essentials of IMU data as they are relevant for activity recognition, which increases the number of trainable parameters by a factor of 1100 compared to state-of-the-art model architectures. We evaluate the new model architecture on the challenging task of analyzing free-weight gym exercises, specifically classifying 13 dumbbell exercises. We collected around 41 h of virtual IMU data using IMUTube from exercise videos available on YouTube. The proposed model is trained with this large amount of virtual IMU data and calibrated with a mere 36 min of real IMU data. The trained model was evaluated on a real IMU dataset, and we demonstrate substantial performance improvements of 20% absolute F1 score over state-of-the-art convolutional models for HAR.


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Qiang Ma ◽  
Ling Xing

Perceptual video hashing represents video perceptual content with a compact hash. The binary hash is sensitive to content-distorting manipulations but robust to content-preserving operations. Currently, the boundary between sensitivity and robustness is often ambiguous and is decided by an empirically defined threshold. This can result in large false positive rates when a received video must be judged similar or dissimilar in some circumstances, e.g., video content authentication. In this paper, we propose a novel perceptual hashing method for video content authentication based on maximized robustness. The idea of maximized robustness means that robustness is maximized on the condition that the security requirement of the hash is met first. We formulate video hashing as a constrained optimization problem in which the coefficients of feature offsets and robustness are learned. We then adopt a stochastic optimization method to solve the optimization. Experimental results show that the proposed hashing is well suited to video content authentication in terms of security and robustness.
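The threshold decision the paper criticizes is simple to state: a received hash is accepted iff its Hamming distance to the reference hash is within tau. The function and the flipped-bit examples below are illustrative assumptions, not the paper's learned boundary:

```python
import numpy as np

def authenticate(h_received, h_reference, tau):
    """Judge a received video's binary hash against the reference:
    authentic iff the Hamming distance is at most tau. tau is the
    empirically set sensitivity/robustness boundary that the paper
    replaces with a learned, robustness-maximized one."""
    dist = int((h_received != h_reference).sum())
    return dist <= tau

rng = np.random.default_rng(0)
h_ref = rng.integers(0, 2, size=64)       # 64-bit reference hash
h_ok = h_ref.copy();  h_ok[:2] ^= 1       # content-preserving op: 2 bits flip
h_bad = h_ref.copy(); h_bad[:20] ^= 1     # content distortion: 20 bits flip
```

With a fixed tau, borderline distances produce the false positives the abstract mentions; maximizing robustness subject to a security constraint moves that boundary in a principled way.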


2020 ◽  
Vol 10 (2) ◽  
pp. 36-55 ◽  
Author(s):  
Hamid A Jadad ◽  
Abderezak Touzene ◽  
Khaled Day

Recently, much research has focused on improving mobile app performance and power consumption by offloading computation from mobile devices to public cloud computing platforms. However, scaling these offloading services remains a challenge. This article addresses the scalability problem by proposing a middleware that provides offloading as a service (OAS) to large populations of mobile users and apps. The proposed middleware uses adaptive VM allocation and deallocation algorithms based on a CPU-rate prediction model. Furthermore, it dynamically schedules requests using a load-balancing algorithm to meet QoS requirements at lower cost. The authors tested the proposed approach in multiple simulations and compared the results with state-of-the-art algorithms on various performance metrics under multiple load conditions. The results show that OAS achieves better response time with a minimum number of VMs and reduces cost by 50% compared to existing approaches.

