A FE-ADMM algorithm for Lavrentiev-regularized state-constrained elliptic control problem

2019 ◽  
Vol 25 ◽  
pp. 5
Author(s):  
Zixuan Chen ◽  
Xiaoliang Song ◽  
Xuping Zhang ◽  
Bo Yu

In this paper, elliptic control problems with pointwise box constraints on the state are considered, where the corresponding Lagrange multipliers are in general only regular Borel measures. To tackle this difficulty, Lavrentiev regularization is employed to deal with the state constraints. The resulting problem is discretized with a full piecewise linear finite element discretization, and the error produced by regularization and discretization is estimated. Because of the Lavrentiev regularization, the error order of full discretization is not inferior to that of variational discretization. Taking the discretization error into account, high-precision algorithms do not make much sense; using efficient first-order algorithms to solve the discretized problems to moderate accuracy is sufficient. A heterogeneous alternating direction method of multipliers (hADMM) is therefore proposed. Unlike the classical ADMM, the hADMM adopts two different weighted norms in its two subproblems. Additionally, to obtain a more accurate solution, a two-phase strategy is presented, in which the primal-dual active set (PDAS) method is used as a postprocessor of the hADMM. Numerical results not only verify the error estimates but also show the efficiency of the hADMM and the two-phase strategy.
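For orientation, the classical ADMM that the abstract contrasts with hADMM can be sketched on a small box-constrained quadratic problem. This is a minimal illustration under stated assumptions, not the authors' hADMM: it uses the same (unweighted) norm in both subproblems, the tridiagonal matrix with stencil (-1, 2, -1) is only a stand-in for a finite element stiffness matrix, and the names `solve_tridiag`, `admm_box`, `rho`, and `iters` are hypothetical.

```python
def solve_tridiag(diag, off, rhs):
    """Thomas algorithm for a symmetric tridiagonal system with a
    constant diagonal `diag` and a constant off-diagonal `off`."""
    n = len(rhs)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = off / diag
    dp[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * cp[i - 1]
        cp[i] = off / denom
        dp[i] = (rhs[i] - off * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def admm_box(f, lo, hi, rho=1.0, iters=500):
    """Classical scaled-form ADMM for
        min 0.5 * x^T K x - f^T x   subject to  lo <= x <= hi,
    where K is tridiagonal with stencil (-1, 2, -1) (an assumed
    stand-in for a stiffness matrix). The splitting is x = z, with
    the box constraint placed on z."""
    n = len(f)
    z = [0.0] * n
    u = [0.0] * n  # scaled dual variable
    for _ in range(iters):
        # x-update: solve (K + rho * I) x = f + rho * (z - u)
        rhs = [f[i] + rho * (z[i] - u[i]) for i in range(n)]
        x = solve_tridiag(2.0 + rho, -1.0, rhs)
        # z-update: project x + u onto the box [lo, hi]
        z = [min(hi, max(lo, x[i] + u[i])) for i in range(n)]
        # dual update: accumulate the constraint residual x - z
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return z
```

With `f = [1.0] * 4` and an upper bound of 2.5, the two middle components sit on the bound and the iterates approach `[1.75, 2.5, 2.5, 1.75]`. A first-order method of this kind reaches moderate accuracy cheaply, which is the regime the abstract argues is sufficient given the discretization error.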

2021 ◽  
Author(s):  
Yiming Peng

Reinforcement Learning (RL) problems appear in diverse real-world applications and are gaining substantial attention in academia and industry. Policy Direct Search (PDS) is widely recognized as an effective approach to RL problems. However, existing PDS algorithms have some major limitations. First, many step-wise Policy Gradient Search (PGS) algorithms cannot effectively utilize informative historical gradients to accurately estimate policy gradients. Second, although evolutionary PDS algorithms do not rely on accurate policy gradient estimations and can explore learning environments effectively, they are not sample-efficient at learning policies in the form of deep neural networks. Third, existing PGS algorithms often diverge easily due to the lack of reliable and flexible techniques for value function learning. Fourth, existing PGS algorithms have not provided suitable mechanisms to learn proper state features automatically.
To address these limitations, the overall goal of this thesis is to develop effective policy direct search algorithms for tackling challenging RL problems through technical innovations in four key areas. First, the thesis aims to improve the accuracy of policy gradient estimation by utilizing historical gradients through a Primal-Dual Approximation technique. Second, the thesis aims to surpass the state-of-the-art performance by properly balancing the exploration-exploitation trade-off via the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) and Proximal Policy Optimization (PPO). Third, the thesis seeks to stabilize value function learning via a self-organized Sandpile Model (SM) while generalizing the compatible condition to support flexible value function learning. Fourth, the thesis endeavors to develop innovative evolutionary feature learning techniques that are capable of automatically extracting useful state features so as to enhance various cutting-edge PGS algorithms.
In the thesis, we explore the four key technical areas by studying policies of increasing complexity. We start from a simple linear policy representation and then proceed to a complex neural-network-based policy representation. Next, we consider a more complicated situation where policy learning is coupled with value function learning. Subsequently, we consider policies modeled as a concatenation of two interrelated networks, one for feature learning and one for action selection.
To achieve the first goal, this thesis proposes a new policy gradient learning framework in which a series of historical gradients are jointly exploited to obtain accurate policy gradient estimations via the Primal-Dual Approximation technique. Under the framework, three new PGS algorithms for step-wise policy training have been derived from three widely used PGS algorithms, and the convergence properties of these new algorithms have been theoretically analyzed. The empirical results on several benchmark control problems further show that the newly proposed algorithms can significantly outperform their base algorithms.
To achieve the second goal, this thesis develops a new sample-efficient evolutionary deep policy optimization algorithm based on CMA-ES and PPO. The algorithm has a layer-wise learning mechanism to improve computational efficiency in comparison to CMA-ES. Additionally, it uses a surrogate model based on a performance lower bound for fitness evaluation to significantly reduce the sample cost to the state-of-the-art level. More importantly, the best policy found by CMA-ES at every generation is further improved by PPO to properly balance exploration and exploitation. The experimental results confirm that the proposed algorithm outperforms various cutting-edge algorithms on many benchmark continuous control problems.
To achieve the third goal, this thesis develops new value function learning methods that are both reliable and flexible so as to further enhance the effectiveness of policy gradient search. Two Actor-Critic (AC) algorithms have been developed from a commonly used PGS algorithm, i.e., Regular Actor-Critic (RAC). The first algorithm adopts SM to stabilize value function learning, and the second algorithm generalizes the logarithm function used by the compatible condition to provide a flexible family of new compatible functions. The experimental results show that, with the help of reliable and flexible value function learning, the newly developed algorithms are more effective than RAC on several benchmark control problems.
To achieve the fourth goal, this thesis develops innovative NeuroEvolution algorithms for automated feature learning to enhance various cutting-edge PGS algorithms. The newly developed algorithms can not only extract useful state features but also learn good policies. The experimental analysis demonstrates that the newly proposed algorithms can achieve better performance on large-scale RL problems in comparison to both well-known PGS algorithms and NeuroEvolution techniques. Our experiments also confirm that the state features learned by NeuroEvolution on one RL task can be easily transferred to boost learning performance on similar but different tasks.


Robotica ◽  
2009 ◽  
Vol 28 (4) ◽  
pp. 525-537 ◽  
Author(s):  
Yunong Zhang ◽  
Kene Li

SUMMARY: In this paper, to diminish discontinuity points arising in the infinity-norm velocity minimization scheme, a bi-criteria velocity minimization scheme is presented based on a new neural network solver, i.e., an LVI-based primal-dual neural network. Such a kinematic planning scheme for redundant manipulators can incorporate joint physical limits, such as joint limits and joint velocity limits, simultaneously. Moreover, the presented kinematic planning scheme can be reformulated as a quadratic programming (QP) problem. As a real-time QP solver, the LVI-based primal-dual neural network is developed with a simple piecewise linear structure and high computational efficiency. Computer simulations based on a PUMA560 manipulator model are presented to illustrate the validity and advantages of such a bi-criteria velocity minimization neural planning scheme for redundant robot arms.


Author(s):  
F Bakhtar ◽  
S. Y. Rassam ◽  
G Zhang

In the course of expansion of steam in turbines the state path crosses the saturation line and the fluid nucleates to become a two-phase mixture. These conditions can be reproduced under blow-down conditions by the equipment employed. This paper is the fourth of a set describing an investigation into the performance of a cascade of rotor tip section blading in wet steam and presents the results of droplet measurements which have been carried out by light extinction.


1999 ◽  
Vol 09 (06) ◽  
pp. 799-823 ◽  
Author(s):  
BARBARA CECON ◽  
MAURIZIO PAOLINI ◽  
MARIANGELA ROMEO

In this paper we consider the so-called prescribed curvature problem approximated by a singularly perturbed double obstacle variational inequality. We extend Ref. 10 with the introduction of the same nonregular potential used for the evolution problem in Ref. 9 and prove an optimal [Formula: see text] error estimate for nondegenerate minimizers (where ε represents the perturbation parameter). Following Ref. 10 the result relies on the construction of precise barriers suggested by formal asymptotics combined with the use of the maximum principle. Key ingredients are the construction of a sub(super)solution containing appropriate shape corrections and the use of a modified distance function based on the principal eigenfunction of the second variation of the prescribed curvature functional. This analysis is next extended to a piecewise linear finite element discretization of the elliptic PDE of bistable type to prove the same error estimate for discrete minima using the Rannacher–Scott L∞-estimates and under appropriate restrictions on the mesh size ([Formula: see text] with σ>5/2).


Author(s):  
Z. Mansoori ◽  
Z. Tayarani Yoosefabadi ◽  
M. Saffar-Avval ◽  
F. Behzad ◽  
G. Ahmadi

A numerical model for solving a fully developed, turbulent, smooth stratified two-phase gas-liquid flow in a pipeline is developed. The model is capable of determining the pressure drop and liquid height. In addition, the wall and interfacial shear stress, flow field, and temperature field for both phases can be predicted. The method solves the two-dimensional momentum and energy equations for both phases and accounts for the effects of turbulence through the high-Reynolds-number k–ε two-equation turbulence model. A bipolar coordinate system is used to fit the pipe wall and the interface. Grid refinement near the interface and near the pipe wall is also used to obtain an accurate solution near the boundaries. The predictions of this method are compared with some available experimental data as well as with one- and two-dimensional modeling results. It is concluded that a perfect agreement between this model and the experimental results could be achieved and that the predictive ability of this model is better than that of one-dimensional and some two-dimensional models. This model could therefore have important applications in the optimization of transportation rates and the estimation of corrosion in pipelines.


1995 ◽  
Vol 05 (01) ◽  
pp. 271-273
Author(s):  
M. KOCH ◽  
R. TETZLAFF ◽  
D. WOLF

We studied the power spectrum of the normalized voltage across the capacitor parallel to the piecewise-linear resistor of Chua’s circuit in the “chaos-chaos intermittency” state [Anishchenko et al., 1992]. The investigations included various initial conditions and circuit parameter values, with and without external excitation. In all cases we found spectra showing a 1/ω²-decay over more than four decades.
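A common way to check a decay of this kind is to fit a straight line to the spectrum in log-log coordinates and read off the slope, which should be close to -2 for a 1/ω² law. The sketch below is only an illustration: the Lorentzian `1 / (1 + w**2)` is an assumed synthetic stand-in, not data from Chua's circuit, and `loglog_slope` is a hypothetical helper name.

```python
import math


def loglog_slope(omegas, spectrum):
    """Least-squares slope of log10(spectrum) against log10(omega)."""
    xs = [math.log10(w) for w in omegas]
    ys = [math.log10(s) for s in spectrum]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic Lorentzian spectrum (an assumption for illustration),
# sampled on a logarithmic grid covering four decades, 10 to 1e5.
omegas = [10 ** (1 + 4 * k / 200) for k in range(201)]
spectrum = [1.0 / (1.0 + w * w) for w in omegas]
slope = loglog_slope(omegas, spectrum)
print(f"fitted log-log slope: {slope:.4f}")
```

For measured data the fit would normally be restricted to the frequency band where the decay is expected, since low-frequency plateaus and the noise floor both bias the slope.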

