Projective Quadratic Regression for Online Learning

2020 ◽  
Vol 34 (04) ◽  
pp. 5093-5100
Author(s):  
Wenye Ma

This paper considers online convex optimization (OCO) problems, the paramount framework for online learning algorithm design. In the OCO setting, the loss function of the learning task is based on streaming data, which makes OCO a powerful tool for modelling large-scale applications such as online recommender systems. Meanwhile, real-world data are usually extremely high-dimensional due to modern feature engineering techniques, so quadratic regression is impractical. Factorization Machines and their variants are efficient models for capturing feature interactions with a low-rank matrix model, but they cannot fulfill the OCO setting due to their non-convexity. In this paper, we propose a projective quadratic regression (PQR) model. First, it captures the important second-order feature information. Second, it is a convex model, so the requirements of OCO are fulfilled and the globally optimal solution can be attained. Moreover, existing online optimization methods such as Online Gradient Descent (OGD) and Follow-The-Regularized-Leader (FTRL) can be applied directly. In addition, by choosing a proper hyper-parameter, we show that it has the same order of space and time complexity as the linear model and thus can handle high-dimensional data. Experimental results demonstrate the accuracy and efficiency of the proposed PQR model in comparison with state-of-the-art methods.
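As an illustration of the OGD update this abstract refers to, here is a minimal sketch applied to a generic convex loss; the squared loss, the 1/√t step size, and all names below are illustrative assumptions, not the PQR specifics.

```python
import numpy as np

def ogd(stream, dim, eta0=0.1):
    """Online Gradient Descent on a stream of (x, y) pairs.

    Squared loss and a 1/sqrt(t) step size are illustrative choices,
    not details taken from the PQR paper.
    """
    w = np.zeros(dim)
    for t, (x, y) in enumerate(stream, start=1):
        grad = (w @ x - y) * x          # gradient of 0.5 * (w.x - y)^2
        w -= eta0 / np.sqrt(t) * grad   # step size decays as 1/sqrt(t)
    return w

# toy usage: learn a linear target from a simulated stream
rng = np.random.default_rng(0)
w_true = rng.normal(size=5)
stream = ((x, w_true @ x) for x in rng.normal(size=(1000, 5)))
print(ogd(stream, dim=5))
```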

2021 ◽  
Vol 15 (3) ◽  
pp. 1-28
Author(s):  
Xueyan Liu ◽  
Bo Yang ◽  
Hechang Chen ◽  
Katarzyna Musial ◽  
Hongxu Chen ◽  
...  

Stochastic blockmodel (SBM) is a widely used statistical network representation model with good interpretability, expressiveness, generalization, and flexibility, and it has become prevalent and important in the field of network science over recent years. However, learning an optimal SBM for a given network is an NP-hard problem. This significantly limits the application of SBMs to large-scale networks because of the computational overhead of existing SBM models and their learning methods. Reducing the cost of SBM learning and making it scalable to large-scale networks, while maintaining the good theoretical properties of SBM, remains an unresolved problem. In this work, we address this challenging task from the novel perspective of model redefinition. We propose a redefined SBM with Poisson distribution, together with a block-wise learning algorithm, that can efficiently analyse large-scale networks. Extensive validation conducted on both artificial and real-world data shows that our proposed method significantly outperforms the state-of-the-art methods in terms of a reasonable trade-off between accuracy and scalability.
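For intuition about the model family, here is a minimal sketch of the generative side of a Poisson SBM; the uniform block assignments and the rate matrix below are illustrative assumptions, and the paper's block-wise learning algorithm is not reproduced.

```python
import numpy as np

def sample_poisson_sbm(n, k, rates, rng=None):
    """Sample an undirected network from a Poisson SBM.

    rates is a k x k symmetric matrix of expected edge counts between
    blocks; node-to-block assignments are drawn uniformly here, an
    illustrative simplification.
    """
    rng = rng or np.random.default_rng()
    z = rng.integers(k, size=n)               # block assignment per node
    lam = rates[z[:, None], z[None, :]]       # pairwise Poisson rates
    a = rng.poisson(lam)
    a = np.triu(a, 1)                         # keep upper triangle only
    return z, a + a.T                         # symmetrize, zero diagonal

rates = np.array([[3.0, 0.2], [0.2, 3.0]])   # assortative toy rates
z, adj = sample_poisson_sbm(n=100, k=2, rates=rates)
```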


2020 ◽  
Vol 2 (2) ◽  
pp. 96-136
Author(s):  
Navoneel Chakrabarty ◽  
Sanket Biswas

Imbalanced data refers to a problem in machine learning where the instances of the classes are unequally distributed. Performing a classification task on such data can often bias the model in favour of the majority class, and the bias is amplified for high-dimensional data. To address this problem, there exist many data mining techniques, such as over-sampling and under-sampling, which can reduce the data imbalance. The Synthetic Minority Oversampling Technique (SMOTe) provided one such state-of-the-art and popular solution to tackle class imbalance, even on high-dimensional data. In this work, a novel and consistent oversampling algorithm has been proposed that can further enhance classification performance, especially on binary imbalanced datasets. It is named NMOTe (Navo Minority Oversampling Technique), an upgraded and superior alternative to the existing techniques. A critical analysis and comprehensive overview of the literature has been done to get a deeper insight into the problem statement and to establish the need for an optimal solution. The performance of NMOTe on some standard datasets has been established in this work to provide a statistical understanding of why it edges out the existing state of the art to become the most robust technique for solving the two-class data imbalance problem.
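For context, a bare-bones sketch of the SMOTe-style interpolation that NMOTe builds on: synthetic minority points are generated on segments between a minority sample and one of its nearest minority neighbours. This is the classic technique, not the NMOTe algorithm itself, and all names below are illustrative.

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=5, rng=None):
    """SMOTE-style synthesis: interpolate each picked minority sample
    toward one of its k nearest minority neighbours."""
    rng = rng or np.random.default_rng()
    # pairwise distances within the minority class
    d = np.linalg.norm(X_min[:, None] - X_min[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)               # exclude self-matches
    nbrs = np.argsort(d, axis=1)[:, :k]       # k nearest neighbours
    synth = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        j = nbrs[i, rng.integers(k)]
        gap = rng.random()                    # interpolation factor in [0, 1]
        synth.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synth)
```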


2021 ◽  
pp. 1-10
Author(s):  
Lei Shu ◽  
Kun Huang ◽  
Wenhao Jiang ◽  
Wenming Wu ◽  
Hongling Liu

Using real-world data directly in machine learning tasks easily leads to poor generalization, since such data is usually high-dimensional and limited in quantity. By learning low-dimensional representations of high-dimensional data, feature selection can retain the features that are useful for machine learning tasks, and these features in turn allow models to be trained effectively. Hence, feature selection from high-dimensional data is a challenge. To address this issue, this paper proposes a novel feature selection method: a hybrid approach consisting of an autoencoder and Bayesian methods. First, Bayesian methods are embedded in the proposed autoencoder as a special hidden layer; this increases the precision of selecting non-redundant features. Then, the other hidden layers of the autoencoder are used for non-redundant feature selection. Finally, the proposed method is compared with mainstream feature selection approaches and outperforms them. We find that combining autoencoders with probabilistic correction methods is more effective for feature selection than stacking architectures or adding constraints to autoencoders. We also demonstrate that stacked autoencoders are more suitable for large-scale feature selection, whereas sparse autoencoders are beneficial when selecting a smaller number of features. The proposed method thus provides a theoretical reference for analyzing the optimality of feature selection.
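The paper's Bayesian hidden layer is not reproducible from the abstract; as a stand-in, here is a common heuristic for autoencoder-based feature ranking: train a small linear autoencoder and score each input feature by the norm of its encoder weights. All names and hyper-parameters below are illustrative assumptions.

```python
import numpy as np

def rank_features_by_autoencoder(X, h=16, lr=0.01, epochs=200, rng=None):
    """Train a one-hidden-layer linear autoencoder by gradient descent
    and rank input features by the L2 norm of their encoder weights."""
    rng = rng or np.random.default_rng(0)
    n, d = X.shape
    W = rng.normal(scale=0.1, size=(d, h))    # encoder weights
    V = rng.normal(scale=0.1, size=(h, d))    # decoder weights
    for _ in range(epochs):
        Z = X @ W                             # encode
        R = Z @ V - X                         # reconstruction residual
        gV = Z.T @ R / n                      # gradient w.r.t. decoder
        gW = X.T @ (R @ V.T) / n              # gradient w.r.t. encoder
        W -= lr * gW
        V -= lr * gV
    scores = np.linalg.norm(W, axis=1)        # per-feature importance
    return np.argsort(scores)[::-1]           # best features first
```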


Author(s):  
Tingting Ren ◽  
Xiuyi Jia ◽  
Weiwei Li ◽  
Shu Zhao

Label distribution learning (LDL) can be viewed as a generalization of multi-label learning. This novel paradigm focuses on the relative importance of different labels to a particular instance. Most previous LDL methods either ignore the correlation among labels or only exploit label correlations in a global way. In this paper, we utilize both the global and local relevance among labels to provide more information for model training and propose a novel label distribution learning algorithm. In particular, a label correlation matrix based on low-rank approximation is applied to capture the global label correlations. In addition, the label correlation among local samples is used to modify the label correlation matrix. Experimental results on real-world data sets show that the proposed algorithm outperforms state-of-the-art LDL methods.
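A minimal sketch of the global low-rank step the abstract mentions, assuming the correlation matrix is approximated via truncated SVD (the local-sample modification is omitted, and the function name is illustrative):

```python
import numpy as np

def low_rank_label_correlation(Y, rank):
    """Rank-r approximation of the label correlation matrix.

    Y is an n x c matrix of label distributions; by Eckart-Young the
    truncated SVD gives the best rank-r approximation in Frobenius norm.
    """
    C = np.corrcoef(Y, rowvar=False)             # c x c label correlations
    U, s, Vt = np.linalg.svd(C)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]  # best rank-r approximation
```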


2019 ◽  
Author(s):  
Diego Galeano ◽  
Alberto Paccanaro

Pair-input associations for drug side effects are obtained through expensive placebo-controlled experiments in human clinical trials. An important challenge in computational pharmacology is to predict missing associations given a few entries in the drug-side effect matrix, as these predictions can be used to direct further clinical trials. Here we introduce the Geometric Sparse Matrix Completion (GSMC) model for predicting drug side effects. Our high-rank matrix completion model learns non-negative sparse matrices of coefficients for drugs and side effects by imposing smoothness priors that exploit a set of pharmacological side information graphs, including information about drug chemical structures, drug interactions, molecular targets, and disease indications. Our learning algorithm is based on the diagonally rescaled gradient descent principle of non-negative matrix factorization. We prove that it converges to a globally optimal solution with a first-order rate of convergence. Experiments on large-scale side effect data from human clinical trials show that our method achieves better prediction performance than six state-of-the-art methods for side effect prediction while offering biological interpretability and favouring explainable predictions.
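The "diagonally rescaled gradient descent" the abstract cites is the principle behind the classic Lee-Seung multiplicative updates for non-negative matrix factorization; a minimal sketch of those updates follows. GSMC's smoothness priors and side-information graphs are not included, and all names are illustrative.

```python
import numpy as np

def nmf_multiplicative(X, r, iters=200, eps=1e-9, rng=None):
    """Lee-Seung multiplicative updates for X ~ W @ H with W, H >= 0.

    Each update is a gradient step rescaled by a diagonal factor,
    which keeps the factors non-negative automatically.
    """
    rng = rng or np.random.default_rng(0)
    n, m = X.shape
    W = rng.random((n, r))
    H = rng.random((r, m))
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)  # rescaled gradient step on H
        W *= (X @ H.T) / (W @ H @ H.T + eps)  # rescaled gradient step on W
    return W, H
```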


2019 ◽  
Vol 36 (04) ◽  
pp. 1950016
Author(s):  
Zhiyong Huang ◽  
Ziyan Luo ◽  
Naihua Xiu

Least-squares is a common and important method in linear regression. However, it often leads to overfitting when dealing with high-dimensional problems, and various regularization schemes that encode prior information for specific problems have been studied to make up for this deficiency. In the sense of Kendall's τ from the community of nonparametric analysis, we establish a new model in which ordinary least-squares is equipped with a perfect positive correlation constraint, sought to maintain the concordance of the rankings of the observations and the systematic components. By sorting the observations into ascending order, we reduce the perfect positive correlation constraint to a linear inequality system. The resulting linearly constrained least-squares problem, together with its dual problem, is shown to be solvable. In particular, we introduce a mild assumption on the observations and the measurement matrix which rules out the zero vector from the optimal solution set; this indicates that our proposed model is statistically meaningful. To handle large-scale instances, we propose an efficient alternating direction method of multipliers (ADMM) to solve the proposed model from the dual perspective. The effectiveness of our model compared to ordinary least-squares is evaluated in terms of the rank correlation coefficient between outputs and the systematic components, and the efficiency of our dual algorithm is demonstrated by comparison with three efficient solvers via CVX in terms of computation time, solution accuracy, and rank correlation coefficient.
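A sketch of the reduction the abstract describes: after sorting the rows by the observations, perfect positive rank correlation between y and Xβ amounts to asking that consecutive fitted values be non-decreasing, which is a linear inequality system. Tie handling is glossed over, the names are illustrative, and the ADMM solver itself is not reproduced; the matrix A below could be handed to any QP solver.

```python
import numpy as np

def concordance_constraints(X, y):
    """Build the linear inequality system A @ beta >= 0 encoding
    concordance between the rankings of y and X @ beta."""
    order = np.argsort(y)      # sort observations into ascending order
    Xs = X[order]
    A = Xs[1:] - Xs[:-1]       # (n-1) x d matrix of consecutive differences
    return A                   # feasible beta satisfies A @ beta >= 0
```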


2019 ◽  
Vol 29 (07) ◽  
pp. 2050112
Author(s):  
Renuka Kamdar ◽  
Priyanka Paliwal ◽  
Yogendra Kumar

The goal of providing faster and optimal solutions to complex, high-dimensional problems is pushing the technical envelope of new algorithms. While many approaches use centralized strategies, the concept of multi-agent systems (MAS) is creating a new option of distributed analyses for optimization problems. A novel learning algorithm for solving global numerical optimization problems is proposed. The proposed learning algorithm integrates a multi-agent system with the hybrid butterfly–particle swarm optimization (BFPSO) algorithm; it is thus named multi-agent-based BFPSO (MABFPSO). In order to obtain the optimal solution quickly, each agent competes and cooperates with its neighbors, and it can also learn by using its knowledge. Making use of these agent–agent interactions and the sensitivity and probability mechanisms of BFPSO, MABFPSO optimizes the value of the objective function. The designed MABFPSO algorithm is tested on specific benchmark functions. Simulations of the proposed algorithm have been performed for the optimization of functions of 2, 20 and 30 dimensions. Comparative simulation results with conventional PSO approaches demonstrate that the proposed algorithm is a potential candidate for optimization of both low- and high-dimensional functions. The optimization strategy is general and can be used to solve other power system optimization problems as well.
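For reference, the conventional PSO baseline the comparison is made against looks like the sketch below; the butterfly sensitivity/probability mechanism and the agent-grid interactions of MABFPSO are omitted, and all hyper-parameters are illustrative.

```python
import numpy as np

def pso(f, dim, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, rng=None):
    """Plain particle swarm optimization minimizing f: R^dim -> R."""
    rng = rng or np.random.default_rng(0)
    x = rng.uniform(-5, 5, (n_particles, dim))   # initial positions
    v = np.zeros_like(x)                         # initial velocities
    pbest, pbest_val = x.copy(), np.apply_along_axis(f, 1, x)
    g = pbest[pbest_val.argmin()].copy()         # global best
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        val = np.apply_along_axis(f, 1, x)
        better = val < pbest_val                 # update personal bests
        pbest[better], pbest_val[better] = x[better], val[better]
        g = pbest[pbest_val.argmin()].copy()     # update global best
    return g

# toy usage: 20-dimensional sphere function
print(pso(lambda z: np.sum(z * z), dim=20))
```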


2012 ◽  
Vol 24 (12) ◽  
pp. 3371-3394 ◽  
Author(s):  
Guangcan Liu ◽  
Shuicheng Yan

We address the scalability issues in low-rank matrix learning problems. These problems usually resort to solving nuclear norm regularized optimization problems (NNROPs), which often suffer from high computational complexity with existing solvers, especially in large-scale settings. Based on the fact that the optimal solution matrix to an NNROP is often low-rank, we revisit the classic mechanism of low-rank matrix factorization, based on which we present an active subspace algorithm that efficiently solves NNROPs by transforming large-scale NNROPs into small-scale problems. The transformation is achieved by factorizing the large solution matrix into the product of a small orthonormal matrix (the active subspace) and another small matrix. Although such a transformation generally leads to nonconvex problems, we show that a suboptimal solution can be found by the augmented Lagrange alternating direction method. For the robust PCA (RPCA) (Candès, Li, Ma, & Wright, 2009) problem, a typical example of an NNROP, theoretical results verify the suboptimality of the solution produced by our algorithm. For general NNROPs, we empirically show that our algorithm significantly reduces the computational complexity without loss of optimality.
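To see where the cost comes from, here is the basic step standard NNROP solvers repeat on the full matrix: singular value thresholding, the proximal operator of the nuclear norm. Its full SVD is what the active-subspace method avoids by working with a small factor instead; this sketch is for scale intuition only and is not the paper's algorithm.

```python
import numpy as np

def svt(X, tau):
    """Proximal operator of tau * nuclear norm: soft-threshold the
    singular values. Costs a full SVD, O(min(n, m) * n * m)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s = np.maximum(s - tau, 0.0)               # shrink singular values
    return (U * s) @ Vt
```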


Author(s):  
Yuanyu Wan ◽  
Nan Wei ◽  
Lijun Zhang

By employing time-varying proximal functions, adaptive subgradient methods (ADAGRAD) have improved the regret bound and been widely used in online learning and optimization. However, ADAGRAD with full-matrix proximal functions (ADA-FULL) cannot deal with large-scale problems due to its impractical time and space complexities, even though it performs better when gradients are correlated. In this paper, we propose ADA-FD, an efficient variant of ADA-FULL based on a deterministic matrix sketching technique called frequent directions. Following ADA-FULL, we incorporate ADA-FD into both the primal-dual subgradient method and the composite mirror descent method to develop two efficient methods. By maintaining and manipulating low-rank matrices, at each iteration the space complexity is reduced from $O(d^2)$ to $O(\tau d)$ and the time complexity is reduced from $O(d^3)$ to $O(\tau^2 d)$, where $d$ is the dimensionality of the data and $\tau \ll d$ is the sketching size. Theoretical analysis reveals that the regret of our methods is close to that of ADA-FULL as long as the outer product matrix of the gradients is approximately low-rank. Experimental results show that ADA-FD is comparable to ADA-FULL and outperforms other state-of-the-art algorithms in online convex optimization as well as in training convolutional neural networks (CNNs).
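The frequent-directions primitive itself is standard and short; a minimal sketch follows, assuming the sketch size is smaller than the data dimensionality. How ADA-FD wires the sketch into ADAGRAD's proximal function is not shown.

```python
import numpy as np

def frequent_directions(G, ell):
    """Frequent-directions sketch of a row stream G (T x d) into
    B (ell x d), with the guarantee that the spectral norm of
    G.T @ G - B.T @ B is at most 2 * ||G||_F^2 / ell.  Assumes ell <= d.
    """
    T, d = G.shape
    B = np.zeros((ell, d))
    zero_rows = list(range(ell))
    for g in G:
        B[zero_rows.pop()] = g                 # fill an empty sketch row
        if not zero_rows:                      # sketch full: shrink it
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            delta = s[ell // 2] ** 2           # median squared singular value
            s = np.sqrt(np.maximum(s**2 - delta, 0.0))
            B = s[:, None] * Vt                # at least half the rows zero
            zero_rows = [i for i in range(ell) if not B[i].any()]
    return B
```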


Author(s):  
Dezhong Yao ◽  
Peilin Zhao ◽  
Tuan-Anh Nguyen Pham ◽  
Gao Cong

We investigate how to adopt dual random projection for high-dimensional similarity learning. For a high-dimensional similarity learning problem, projection is usually adopted to map high-dimensional features into a low-dimensional space in order to reduce the computational cost. However, dimensionality reduction methods sometimes yield unstable performance due to the suboptimality of the solution in the original space. In this paper, we propose a dual random projection framework for similarity learning that recovers the original optimal solution from the subspace optimal solution. Previous dual random projection methods usually make strong assumptions about the data, which must be low-rank or have a large margin; those assumptions limit the application of dual random projection to similarity learning. We therefore adopt a dual-sparse regularized random projection method that introduces a sparse regularizer into the reduced dual problem. As the original dual solution is sparse, applying a sparse regularizer in the reduced space relaxes the low-rank assumption. Experimental results show that our method enjoys higher effectiveness and efficiency than state-of-the-art solutions.
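A minimal sketch of the dual-random-projection idea, with ridge regression standing in for the similarity-learning objective: solve the dual in an m-dimensional random subspace, then map the dual variables back to recover a full-dimensional solution. The sparse regularizer of the reduced dual is omitted, and all names are illustrative.

```python
import numpy as np

def dual_random_projection_ridge(X, y, lam, m, rng=None):
    """Recover a full-dimensional ridge solution from a problem solved
    in an m-dimensional random subspace, via the dual variables."""
    rng = rng or np.random.default_rng(0)
    n, d = X.shape
    R = rng.normal(size=(d, m)) / np.sqrt(m)   # random projection matrix
    Z = X @ R                                  # n x m projected features
    K_hat = Z @ Z.T                            # approximates X @ X.T
    alpha = np.linalg.solve(K_hat + lam * np.eye(n), y)  # reduced dual
    return X.T @ alpha                         # recover w in original R^d
```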

