Parallel Training for Large-Scale Echo State Networks via Alternating Direction Method of Multipliers

Author(s): Wu Ai, Shuling Li, Jian Wu, Huazhou Chen, Quanxi Feng, ...

Author(s): Krešimir Mihić, Mingxi Zhu, Yinyu Ye

Abstract: The Alternating Direction Method of Multipliers (ADMM) has gained a lot of attention for solving large-scale, objective-separable constrained optimization problems. However, the two-block variable structure of ADMM still limits the practical computational efficiency of the method, because at least one large matrix factorization is needed even for linear and convex quadratic programming. This drawback may be overcome by enforcing a multi-block structure on the decision variables in the original optimization problem. Unfortunately, multi-block ADMM, with more than two blocks, is not guaranteed to converge. On the other hand, two positive developments have been made. First, if in each cyclic loop one randomly permutes the updating order of the multiple blocks, then the method converges in expectation for solving any system of linear equations with any number of blocks. Second, such a randomly permuted ADMM also works for equality-constrained convex quadratic programming even when the objective function is not separable. The goal of this paper is twofold. First, we add more randomness to ADMM by developing a randomly assembled cyclic ADMM (RAC-ADMM), in which the decision variables in each block are randomly assembled. We discuss the theoretical properties of RAC-ADMM, show when random assembling helps and when it hurts, and develop a criterion that guarantees almost-sure convergence. Second, using this theoretical guidance on RAC-ADMM, we conduct multiple numerical tests on both randomly generated and large-scale benchmark quadratic optimization problems, including continuous problems, binary graph-partition and quadratic-assignment problems, and selected machine learning problems. Our numerical tests show that RAC-ADMM, with a variable-grouping strategy, can significantly improve the computational efficiency of solving most quadratic optimization problems.
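To make the multi-block structure described above concrete, the following is a minimal Python sketch of one possible RAC-ADMM-style sweep for an equality-constrained QP (minimize 0.5 x'Hx + c'x subject to Ax = b). The block count, penalty parameter, and all variable names are illustrative assumptions, not notation or settings from the paper.

```python
# Hypothetical sketch of a randomly assembled cyclic ADMM (RAC-ADMM) sweep for
# an equality-constrained QP: minimize 0.5*x'Hx + c'x  subject to  Ax = b.
# H, c, A, b, beta, n_blocks are illustrative names, not taken from the paper.
import numpy as np

def rac_admm_qp(H, c, A, b, n_blocks=4, beta=1.0, n_iters=200, seed=0):
    rng = np.random.default_rng(seed)
    n = H.shape[0]
    x = np.zeros(n)
    y = np.zeros(A.shape[0])              # multipliers for Ax = b
    for _ in range(n_iters):
        # Random assembly: draw a fresh random partition of the variables
        # into blocks at every sweep (the "RAC" step).
        order = rng.permutation(n)
        blocks = np.array_split(order, n_blocks)
        for S in blocks:
            rest = np.setdiff1d(np.arange(n), S)
            A_S, A_r = A[:, S], A[:, rest]
            # Minimize the augmented Lagrangian over x_S with x_rest, y fixed:
            # (H_SS + beta*A_S'A_S) x_S = -(c_S + H_{S,rest} x_rest - A_S'y
            #                               + beta*A_S'(A_rest x_rest - b))
            lhs = H[np.ix_(S, S)] + beta * A_S.T @ A_S
            rhs = -(c[S] + H[np.ix_(S, rest)] @ x[rest]
                    - A_S.T @ y + beta * A_S.T @ (A_r @ x[rest] - b))
            x[S] = np.linalg.solve(lhs, rhs)
        # Dual (multiplier) update on the equality-constraint residual.
        y = y - beta * (A @ x - b)
    return x, y
```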


2021, Vol. 2021, pp. 1-19
Author(s): Hansi K. Abeynanda, G. H. J. Lanel

Distributed optimization is a very important concept with applications in control theory and many related fields, as it is highly fault-tolerant and far more scalable than centralized optimization. Centralized solution methods are not suitable for many application domains that consist of a large number of networked systems. In general, these large-scale networked systems cooperatively find an optimal solution to a common global objective during the optimization process. This motivates the analysis of the distributed optimization techniques demanded in most distributed settings. This paper presents an overview of decomposition methods as well as existing distributed methods and techniques employed in large-scale networked systems. A detailed analysis of gradient-like methods, subgradient methods, and methods of multipliers, including the alternating direction method of multipliers, is presented. These methods are analyzed empirically using numerical examples. Moreover, an example highlighting the fact that the gradient method fails to solve distributed problems in some circumstances is discussed in the numerical results. A numerical implementation demonstrates that the alternating direction method of multipliers can solve this particular problem, revealing its robustness compared with the gradient method. Finally, we conclude the paper with possible future research directions.
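As a rough illustration of the consensus formulation on which distributed ADMM rests, here is a minimal Python sketch of consensus ADMM for a distributed least-squares problem. The data layout, penalty parameter, and iteration count are illustrative assumptions and not the setup of the paper's numerical examples.

```python
# Minimal consensus-ADMM sketch for distributed least squares: each of m agents
# holds local data (A_i, b_i) and they jointly minimize sum_i 0.5*||A_i x - b_i||^2.
# local_data, rho, and n_iters are illustrative names, not from the paper.
import numpy as np

def consensus_admm(local_data, rho=1.0, n_iters=100):
    n = local_data[0][0].shape[1]
    m = len(local_data)
    x = [np.zeros(n) for _ in range(m)]   # local primal copies
    u = [np.zeros(n) for _ in range(m)]   # scaled dual variables
    z = np.zeros(n)                       # global consensus variable
    for _ in range(n_iters):
        # Local updates: each agent solves its own regularized least-squares
        # subproblem; these can run fully in parallel across the network.
        for i, (A_i, b_i) in enumerate(local_data):
            lhs = A_i.T @ A_i + rho * np.eye(n)
            rhs = A_i.T @ b_i + rho * (z - u[i])
            x[i] = np.linalg.solve(lhs, rhs)
        # Consensus (averaging) step, followed by the dual updates.
        z = np.mean([x[i] + u[i] for i in range(m)], axis=0)
        for i in range(m):
            u[i] = u[i] + x[i] - z
    return z
```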


2019, Vol. 2019, pp. 1-12
Author(s): Yu Li, Qiming Zou, Xiaoru Ji, Chanyuan Zhang, Ke Lu

Model Predictive Control (MPC) can effectively handle control problems with disturbances, multiple control variables, and complex constraints, and is widely used in various control systems. In MPC, the control input at each time step is obtained by solving an online optimization problem, which causes a real-time delay on embedded computers with limited computational resources. In this paper, we use the adaptive Alternating Direction Method of Multipliers (a-ADMM) to accelerate the solution of MPC. This method adaptively adjusts the penalty parameter to balance the primal and dual residuals. The performance of the approach is profiled on the control of a quadcopter with 12 states, 4 controls, and a prediction horizon ranging from 10 to 40. The simulation results demonstrate that MPC based on a-ADMM achieves a significant improvement in real-time and convergence performance and is thus more suitable for solving large-scale optimal control problems.
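The residual-balancing idea behind adaptive penalty selection can be sketched as follows. The thresholds mu and tau below are common default choices for this kind of rule; the exact adaptation rule of the paper's a-ADMM may differ.

```python
# Sketch of a residual-balancing penalty update used by adaptive ADMM variants.
# mu and tau are common defaults, not values taken from the paper.
def update_penalty(rho, primal_residual_norm, dual_residual_norm, mu=10.0, tau=2.0):
    if primal_residual_norm > mu * dual_residual_norm:
        return rho * tau      # primal residual too large: penalize infeasibility more
    if dual_residual_norm > mu * primal_residual_norm:
        return rho / tau      # dual residual too large: relax the penalty
    return rho                # residuals are balanced: keep rho unchanged
```

When ADMM is run in scaled form, the scaled dual variable must be rescaled by the same factor whenever the penalty parameter changes, otherwise the iteration no longer corresponds to the same augmented Lagrangian.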


2019, Vol. 9 (20), pp. 4291
Author(s): Mahammad Humayoo, Xueqi Cheng

Regularization is a popular technique in machine learning for model estimation and for avoiding overfitting. Prior studies have found that modern ordered regularization can be more effective at handling highly correlated, high-dimensional data than traditional regularization, because ordered regularization can reject irrelevant variables and yield an accurate estimate of the parameters. How to scale ordered regularization problems up to large-scale training data remains an open question. This paper explores parameter estimation with ordered ℓ2-regularization via the Alternating Direction Method of Multipliers (ADMM), called ADMM-Oℓ2. The advantages of ADMM-Oℓ2 include (i) scaling the ordered ℓ2 penalty up to large-scale datasets, (ii) estimating parameters correctly by excluding irrelevant variables automatically, and (iii) a fast convergence rate. Experimental results on both synthetic and real data indicate that ADMM-Oℓ2 performs better than, or comparably to, several state-of-the-art baselines.
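The generic two-block ADMM split that such regularized estimators build on can be sketched as follows. The proximal step of the ordered ℓ2 penalty is derived in the paper and is not reproduced here; in this sketch it is left as an abstract callable (prox_g), and all parameter names are illustrative.

```python
# Generic two-block ADMM split for a regularized regression problem:
# minimize 0.5*||A x - b||^2 + g(z)  subject to  x = z,
# with the regularizer handled through its proximal operator. For ADMM-O l2,
# prox_g would be the proximal step of the ordered l2 penalty (from the paper);
# here it is simply a user-supplied callable prox_g(v, rho).
import numpy as np

def admm_regularized_regression(A, b, prox_g, rho=1.0, n_iters=200):
    n = A.shape[1]
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)                       # scaled dual variable
    AtA, Atb = A.T @ A, A.T @ b
    lhs = AtA + rho * np.eye(n)           # constant across iterations (could be factored once)
    for _ in range(n_iters):
        x = np.linalg.solve(lhs, Atb + rho * (z - u))   # smooth least-squares step
        z = prox_g(x + u, rho)            # proximal step on the regularizer
        u = u + x - z                     # dual update on the consensus constraint
    return z
```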


Author(s): Xuchao Zhang, Liang Zhao, Zhiqian Chen, Chang-Tien Lu

Self-paced learning (SPL) mimics the cognitive process of humans, who generally learn from easy samples to hard ones. One key issue in SPL is that each instance weight depends on the other samples during training, so the method cannot easily be run in a distributed manner on a large-scale dataset. In this paper, we reformulate the self-paced learning problem in a distributed setting and propose a novel Distributed Self-Paced Learning method (DSPL) to handle large-scale datasets. Specifically, both the model and the instance weights can be optimized in parallel for each batch based on a consensus alternating direction method of multipliers. We also prove the convergence of our algorithm under mild conditions. Extensive experiments on both synthetic and real datasets demonstrate that our approach is superior to existing methods.
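The self-paced alternation that DSPL distributes can be sketched, in simplified form, as follows. The hard-threshold weighting rule and the pace schedule are standard SPL choices rather than the paper's exact formulation, and fit_weighted_model / compute_losses are hypothetical callables standing in for the batch-level consensus-ADMM model update and the per-sample loss evaluation.

```python
# Simplified sketch of a self-paced training loop of the kind DSPL distributes.
# fit_weighted_model and compute_losses are hypothetical user-supplied callables;
# in DSPL the model update is carried out per batch via consensus ADMM
# (see the consensus sketch earlier in this listing).
import numpy as np

def self_paced_weights(losses, pace):
    # Classical hard SPL rule: keep samples whose loss is below the current pace.
    return (losses < pace).astype(float)

def spl_training_loop(batches, fit_weighted_model, compute_losses,
                      pace=0.5, growth=1.3, n_rounds=10):
    # Start from uniform weights so the first model sees every sample.
    weights = [np.ones(len(B)) for B in batches]
    model = fit_weighted_model(batches, weights)
    for _ in range(n_rounds):
        # (a) Instance-weight update, independent per batch (parallelizable).
        weights = [self_paced_weights(compute_losses(model, B), pace) for B in batches]
        # (b) Model update from the currently "easy" samples; DSPL performs this
        #     step with a consensus ADMM across batches.
        model = fit_weighted_model(batches, weights)
        pace *= growth                    # gradually admit harder samples
    return model
```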

