Parallel Training for Large-Scale Echo State Networks via Alternating Direction Method of Multipliers

Author(s): Wu Ai, Shuling Li, Jian Wu, Huazhou Chen, Quanxi Feng, ...

Author(s): Krešimir Mihić, Mingxi Zhu, Yinyu Ye

Abstract: The Alternating Direction Method of Multipliers (ADMM) has gained a lot of attention for solving large-scale, objective-separable constrained optimization problems. However, the two-block variable structure of ADMM still limits the practical computational efficiency of the method, because at least one large matrix factorization is needed even for linear and convex quadratic programming. This drawback may be overcome by enforcing a multi-block structure on the decision variables in the original optimization problem. Unfortunately, multi-block ADMM, with more than two blocks, is not guaranteed to converge. On the other hand, two positive developments have been made. First, if in each cyclic loop one randomly permutes the updating order of the multiple blocks, then the method converges in expectation for solving any system of linear equations with any number of blocks. Second, such a randomly permuted ADMM also works for equality-constrained convex quadratic programming even when the objective function is not separable. The goal of this paper is twofold. First, we add more randomness to ADMM by developing a randomly assembled cyclic ADMM (RAC-ADMM), in which the decision variables in each block are randomly assembled. We discuss the theoretical properties of RAC-ADMM, show when random assembling helps and when it hurts, and develop a criterion that guarantees almost-sure convergence. Second, using this theoretical guidance on RAC-ADMM, we conduct multiple numerical tests on both randomly generated and large-scale benchmark quadratic optimization problems, including continuous problems, binary graph-partition and quadratic-assignment problems, and selected machine learning problems. Our numerical tests show that RAC-ADMM, with a variable-grouping strategy, can significantly improve the computational efficiency of solving most quadratic optimization problems.
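To make the multi-block structure described above concrete, the following is a minimal Python sketch of one possible RAC-ADMM-style sweep for an equality-constrained QP (minimize 0.5 x'Hx + c'x subject to Ax = b). The block count, penalty parameter, and all variable names are illustrative assumptions, not notation or settings from the paper.

```python
# Hypothetical sketch of a randomly assembled cyclic ADMM (RAC-ADMM) sweep for
# an equality-constrained QP: minimize 0.5*x'Hx + c'x  subject to  Ax = b.
# H, c, A, b, beta, n_blocks are illustrative names, not taken from the paper.
import numpy as np

def rac_admm_qp(H, c, A, b, n_blocks=4, beta=1.0, n_iters=200, seed=0):
    rng = np.random.default_rng(seed)
    n = H.shape[0]
    x = np.zeros(n)
    y = np.zeros(A.shape[0])              # multipliers for Ax = b
    for _ in range(n_iters):
        # Random assembly: draw a fresh random partition of the variables
        # into blocks at every sweep (the "RAC" step).
        order = rng.permutation(n)
        blocks = np.array_split(order, n_blocks)
        for S in blocks:
            rest = np.setdiff1d(np.arange(n), S)
            A_S, A_r = A[:, S], A[:, rest]
            # Minimize the augmented Lagrangian over x_S with x_rest, y fixed:
            # (H_SS + beta*A_S'A_S) x_S = -(c_S + H_{S,rest} x_rest - A_S'y
            #                               + beta*A_S'(A_rest x_rest - b))
            lhs = H[np.ix_(S, S)] + beta * A_S.T @ A_S
            rhs = -(c[S] + H[np.ix_(S, rest)] @ x[rest]
                    - A_S.T @ y + beta * A_S.T @ (A_r @ x[rest] - b))
            x[S] = np.linalg.solve(lhs, rhs)
        # Dual (multiplier) update on the equality-constraint residual.
        y = y - beta * (A @ x - b)
    return x, y
```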


2021, Vol. 2021, pp. 1-19
Author(s): Hansi K. Abeynanda, G. H. J. Lanel

Distributed optimization is a very important concept with applications in control theory and many related fields, as it is highly fault-tolerant and far more scalable than centralized optimization. Centralized solution methods are not suitable for many application domains that consist of a large number of networked systems. In general, these large-scale networked systems cooperatively find an optimal solution to a common global objective during the optimization process. This motivates the analysis of the distributed optimization techniques demanded in most distributed settings. This paper presents an overview of decomposition methods as well as existing distributed methods and techniques employed in large-scale networked systems. A detailed analysis of gradient-like methods, subgradient methods, and methods of multipliers, including the alternating direction method of multipliers, is presented. These methods are analyzed empirically using numerical examples. Moreover, an example highlighting the fact that the gradient method fails to solve distributed problems in some circumstances is discussed in the numerical results. A numerical implementation demonstrates that the alternating direction method of multipliers can solve this particular problem, revealing its robustness compared with the gradient method. Finally, we conclude the paper with possible future research directions.
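As a rough illustration of the consensus formulation on which distributed ADMM rests, here is a minimal Python sketch of consensus ADMM for a distributed least-squares problem. The data layout, penalty parameter, and iteration count are illustrative assumptions and not the setup of the paper's numerical examples.

```python
# Minimal consensus-ADMM sketch for distributed least squares: each of m agents
# holds local data (A_i, b_i) and they jointly minimize sum_i 0.5*||A_i x - b_i||^2.
# local_data, rho, and n_iters are illustrative names, not from the paper.
import numpy as np

def consensus_admm(local_data, rho=1.0, n_iters=100):
    n = local_data[0][0].shape[1]
    m = len(local_data)
    x = [np.zeros(n) for _ in range(m)]   # local primal copies
    u = [np.zeros(n) for _ in range(m)]   # scaled dual variables
    z = np.zeros(n)                       # global consensus variable
    for _ in range(n_iters):
        # Local updates: each agent solves its own regularized least-squares
        # subproblem; these can run fully in parallel across the network.
        for i, (A_i, b_i) in enumerate(local_data):
            lhs = A_i.T @ A_i + rho * np.eye(n)
            rhs = A_i.T @ b_i + rho * (z - u[i])
            x[i] = np.linalg.solve(lhs, rhs)
        # Consensus (averaging) step, followed by the dual updates.
        z = np.mean([x[i] + u[i] for i in range(m)], axis=0)
        for i in range(m):
            u[i] = u[i] + x[i] - z
    return z
```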


2019, Vol. 2019, pp. 1-12
Author(s): Yu Li, Qiming Zou, Xiaoru Ji, Chanyuan Zhang, Ke Lu

Model Predictive Control (MPC) can effectively handle control problems with disturbances, multiple control variables, and complex constraints, and is widely used in various control systems. In MPC, the control input at each time step is obtained by solving an online optimization problem, which causes a real-time delay on embedded computers with limited computational resources. In this paper, we use the adaptive Alternating Direction Method of Multipliers (a-ADMM) to accelerate the solution of MPC. This method adaptively adjusts the penalty parameter to balance the primal and dual residuals. The performance of the approach is profiled on the control of a quadcopter with 12 states, 4 controls, and a prediction horizon ranging from 10 to 40. The simulation results demonstrate that MPC based on a-ADMM achieves a significant improvement in real-time and convergence performance and is thus more suitable for solving large-scale optimal control problems.
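The residual-balancing idea behind adaptive penalty selection can be sketched as follows. The thresholds mu and tau below are common default choices for this kind of rule; the exact adaptation rule of the paper's a-ADMM may differ.

```python
# Sketch of a residual-balancing penalty update used by adaptive ADMM variants.
# mu and tau are common defaults, not values taken from the paper.
def update_penalty(rho, primal_residual_norm, dual_residual_norm, mu=10.0, tau=2.0):
    if primal_residual_norm > mu * dual_residual_norm:
        return rho * tau      # primal residual too large: penalize infeasibility more
    if dual_residual_norm > mu * primal_residual_norm:
        return rho / tau      # dual residual too large: relax the penalty
    return rho                # residuals are balanced: keep rho unchanged
```

When ADMM is run in scaled form, the scaled dual variable must be rescaled by the same factor whenever the penalty parameter changes, otherwise the iteration no longer corresponds to the same augmented Lagrangian.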


2019, Vol. 9 (20), pp. 4291
Author(s): Mahammad Humayoo, Xueqi Cheng

Regularization is a popular technique in machine learning for model estimation and for avoiding overfitting. Prior studies have found that modern ordered regularization can be more effective at handling highly correlated, high-dimensional data than traditional regularization, because ordered regularization can reject irrelevant variables and yield an accurate estimate of the parameters. How to scale ordered regularization problems up to large-scale training data remains an open question. This paper explores parameter estimation with ordered ℓ2-regularization via the Alternating Direction Method of Multipliers (ADMM), called ADMM-Oℓ2. The advantages of ADMM-Oℓ2 include (i) scaling the ordered ℓ2 penalty up to large-scale datasets, (ii) estimating parameters correctly by excluding irrelevant variables automatically, and (iii) a fast convergence rate. Experimental results on both synthetic and real data indicate that ADMM-Oℓ2 performs better than, or comparably to, several state-of-the-art baselines.
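The generic two-block ADMM split that such regularized estimators build on can be sketched as follows. The proximal step of the ordered ℓ2 penalty is derived in the paper and is not reproduced here; in this sketch it is left as an abstract callable (prox_g), and all parameter names are illustrative.

```python
# Generic two-block ADMM split for a regularized regression problem:
# minimize 0.5*||A x - b||^2 + g(z)  subject to  x = z,
# with the regularizer handled through its proximal operator. For ADMM-O l2,
# prox_g would be the proximal step of the ordered l2 penalty (from the paper);
# here it is simply a user-supplied callable prox_g(v, rho).
import numpy as np

def admm_regularized_regression(A, b, prox_g, rho=1.0, n_iters=200):
    n = A.shape[1]
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)                       # scaled dual variable
    AtA, Atb = A.T @ A, A.T @ b
    lhs = AtA + rho * np.eye(n)           # constant across iterations (could be factored once)
    for _ in range(n_iters):
        x = np.linalg.solve(lhs, Atb + rho * (z - u))   # smooth least-squares step
        z = prox_g(x + u, rho)            # proximal step on the regularizer
        u = u + x - z                     # dual update on the consensus constraint
    return z
```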


Author(s): Xuchao Zhang, Liang Zhao, Zhiqian Chen, Chang-Tien Lu

Self-paced learning (SPL) mimics the cognitive process of humans, who generally learn from easy samples to hard ones. One key issue in SPL is that each instance weight depends on the other samples during training, so the method cannot easily be run in a distributed manner on a large-scale dataset. In this paper, we reformulate the self-paced learning problem in a distributed setting and propose a novel Distributed Self-Paced Learning method (DSPL) to handle large-scale datasets. Specifically, both the model and the instance weights can be optimized in parallel for each batch based on a consensus alternating direction method of multipliers. We also prove the convergence of our algorithm under mild conditions. Extensive experiments on both synthetic and real datasets demonstrate that our approach is superior to existing methods.
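The self-paced alternation that DSPL distributes can be sketched, in simplified form, as follows. The hard-threshold weighting rule and the pace schedule are standard SPL choices rather than the paper's exact formulation, and fit_weighted_model / compute_losses are hypothetical callables standing in for the batch-level consensus-ADMM model update and the per-sample loss evaluation.

```python
# Simplified sketch of a self-paced training loop of the kind DSPL distributes.
# fit_weighted_model and compute_losses are hypothetical user-supplied callables;
# in DSPL the model update is carried out per batch via consensus ADMM
# (see the consensus sketch earlier in this listing).
import numpy as np

def self_paced_weights(losses, pace):
    # Classical hard SPL rule: keep samples whose loss is below the current pace.
    return (losses < pace).astype(float)

def spl_training_loop(batches, fit_weighted_model, compute_losses,
                      pace=0.5, growth=1.3, n_rounds=10):
    # Start from uniform weights so the first model sees every sample.
    weights = [np.ones(len(B)) for B in batches]
    model = fit_weighted_model(batches, weights)
    for _ in range(n_rounds):
        # (a) Instance-weight update, independent per batch (parallelizable).
        weights = [self_paced_weights(compute_losses(model, B), pace) for B in batches]
        # (b) Model update from the currently "easy" samples; DSPL performs this
        #     step with a consensus ADMM across batches.
        model = fit_weighted_model(batches, weights)
        pace *= growth                    # gradually admit harder samples
    return model
```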

