Reinforcement-learning-based control of confined cylinder wakes with stability analyses

2021 ◽  
Vol 932 ◽  
Author(s):  
Jichao Li ◽  
Mengqi Zhang

This work applies a reinforcement learning (RL)-based flow control strategy to suppress vortex shedding in the flow past a cylinder confined between two walls. The control action is blowing and suction through two synthetic jets on the cylinder. The study investigates how to use and embed physical information about the flow in the RL-based control. First, global linear stability and sensitivity analyses based on the time-mean flow and the steady flow (which is a solution to the Navier–Stokes equations) are conducted over a range of blockage ratios and Reynolds numbers. The most sensitive region in the wake is found to extend as either parameter increases within the range investigated. Then, we use these physical results to help design RL-based control policies. We find that the controlled wake converges to the unstable steady base flow, where the vortex shedding can be successfully suppressed. A persistent oscillating control seems necessary to maintain this unstable state. The RL algorithm is able to outperform a gradient-based optimisation method (optimised in a certain period of time) in the long run. Furthermore, when the flow stability information is embedded in the reward function to penalise the instability, the controlled flow may become more stable. Finally, according to the sensitivity analyses, the control is most efficient when the probes are placed in the most sensitive region. The control can be successful even with only a few probes, provided they are placed in this manner.
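As a rough illustration of what embedding stability information in the reward could look like, the sketch below subtracts a penalty proportional to the leading growth rate from a drag-based reward. The function name, the drag and lift terms, and both weights are assumptions made for illustration; the abstract does not state the form of the reward.

```python
def stability_shaped_reward(mean_drag, mean_lift, growth_rate,
                            lift_weight=0.2, stability_weight=1.0):
    """Hypothetical reward: penalise drag, lift fluctuations and instability.

    growth_rate is assumed to be the real part of the leading global
    eigenvalue of the controlled flow; only positive (unstable) values
    are penalised.
    """
    # Baseline term: lower drag and smaller lift give a higher reward.
    base = -mean_drag - lift_weight * abs(mean_lift)
    # Stability term: zero when the leading mode is stable.
    penalty = stability_weight * max(growth_rate, 0.0)
    return base - penalty
```

With this shaping, two controlled states with identical drag and lift are ranked by their stability, so the agent is nudged towards the more stable one.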

2018 ◽  
Vol 859 ◽  
pp. 516-542 ◽  
Author(s):  
Calum S. Skene ◽  
Peter J. Schmid

A linear numerical study is conducted to quantify the effect of swirl on the response behaviour of premixed lean flames to general harmonic excitation in the inlet, upstream of combustion. This study considers axisymmetric M-flames and is based on the linearised compressible Navier–Stokes equations augmented by a simple one-step irreversible chemical reaction. Optimal frequency response gains for both axisymmetric and non-axisymmetric perturbations are computed via a direct–adjoint methodology and singular value decompositions. The high-dimensional parameter space, containing perturbation and base-flow parameters, is explored by taking advantage of generic sensitivity information gained from the adjoint solutions. This information is then tailored to specific parametric sensitivities by first-order perturbation expansions of the singular triplets about the respective parameters. Valuable flow information, at a negligible computational cost, is gained by simple weighted scalar products between direct and adjoint solutions. We find that for non-swirling flows, a mode with azimuthal wavenumber $m=2$ is the most efficiently driven structure. The structural mechanism underlying the optimal gains is shown to be the Orr mechanism for $m=0$ and a blend of Orr and other mechanisms, such as lift-up, for other azimuthal wavenumbers. Further to this, velocity and pressure perturbations are shown to make up the optimal input and output showing that the thermoacoustic mechanism is crucial in large energy amplifications. For $m=0$ these velocity perturbations are mainly longitudinal, but for higher wavenumbers azimuthal velocity fluctuations become prominent, especially in the non-swirling case. Sensitivity analyses are carried out with respect to the Mach number, Reynolds number and swirl number, and the accuracy of parametric gradients of the frequency response curve is assessed. 
The sensitivity analysis reveals that increases in Reynolds and Mach numbers yield higher gains, through a decrease in temperature diffusion. A rise in mean-flow swirl is shown to diminish the gain, with increased damping for higher azimuthal wavenumbers. This leads to a reordering of the most effectively amplified mode, with the axisymmetric ($m=0$) mode becoming the dominant structure at moderate swirl numbers.
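To make the notion of an optimal frequency response gain concrete, here is a minimal sketch for a toy 2x2 linear operator, not the linearised compressible Navier–Stokes operator used in the paper. The gain at forcing frequency `omega` is the largest singular value of the resolvent `(i*omega*I - A)^{-1}`; for a 2x2 matrix this follows in closed form from its Frobenius norm and determinant. The function name and the example matrices are assumptions.

```python
import math

def optimal_gain_2x2(a, omega):
    """Largest singular value of R = (i*omega*I - A)^{-1} for a 2x2 matrix A."""
    (a11, a12), (a21, a22) = a
    s = 1j * omega
    # Entries of L = i*omega*I - A.
    l11, l12, l21, l22 = s - a11, -a12, -a21, s - a22
    det = l11 * l22 - l12 * l21
    # Closed-form inverse of the 2x2 matrix L.
    r = [l22 / det, -l12 / det, -l21 / det, l11 / det]
    frob2 = sum(abs(x) ** 2 for x in r)   # sigma_1^2 + sigma_2^2
    det2 = abs(1.0 / det) ** 2            # (sigma_1 * sigma_2)^2
    disc = max(frob2 ** 2 - 4.0 * det2, 0.0)
    return math.sqrt(0.5 * (frob2 + math.sqrt(disc)))

# Same eigenvalues (-1 and -2), with and without non-normal coupling.
normal_gain = optimal_gain_2x2(((-1.0, 0.0), (0.0, -2.0)), omega=0.0)
nonnormal_gain = optimal_gain_2x2(((-1.0, 10.0), (0.0, -2.0)), omega=0.0)
```

Both operators share the same spectrum, yet the non-normal one amplifies forcing far more strongly, which is why gains are computed from singular values rather than read off the eigenvalues.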


2019 ◽  
Vol 868 ◽  
pp. 26-65 ◽  
Author(s):  
Colin Leclercq ◽  
Fabrice Demourant ◽  
Charles Poussot-Vassal ◽  
Denis Sipp

This work proposes a feedback-loop strategy to suppress intrinsic oscillations of resonating flows in the fully nonlinear regime. The frequency response of the flow is obtained from the resolvent operator about the mean flow, extending the framework initially introduced by McKeon & Sharma (J. Fluid Mech., vol. 658, 2010, pp. 336–382) to study receptivity mechanisms in turbulent flows. Using this linear time-invariant model of the nonlinear flow, modern control methods such as structured ${\mathcal{H}}_{\infty }$-synthesis can be used to design a controller. The approach is successful in damping self-sustained oscillations associated with specific eigenmodes of the mean-flow spectrum. Despite excellent performance, the linear controller is however unable to completely suppress flow oscillations, and the controlled flow is effectively attracted towards a new dynamical equilibrium. This new attractor is characterized by a different mean flow, which can in turn be used to design a second controller. The method can then be iterated on subsequent mean flows, until the coupled system eventually converges to the base flow. An intuitive parallel can be drawn with Newton’s iteration: at each step, a linearized model of the flow response to a perturbation of the input is sought, and a new linear controller is designed, aiming at further reducing the fluctuations. The method is illustrated on the well-known case of two-dimensional incompressible open-cavity flow at Reynolds number $Re=7500$, where the fully developed flow is initially quasiperiodic (2-torus state). The base flow is reached after five iterations. The present work demonstrates that nonlinear control problems may be solved without resorting to nonlinear reduced-order models. 
It also shows that physically relevant linear models can be systematically derived for nonlinear flows, without resorting to black-box identification from input–output data; the key ingredient being frequency-domain models based on the linearized Navier–Stokes equations about the mean flow. Applicability to amplifier flows and turbulent dynamics has, however, yet to be investigated.


2013 ◽  
Vol 726 ◽  
pp. 439-475 ◽  
Author(s):  
H. Posson ◽  
N. Peake

This paper is concerned with modelling the effects of swirling flow on turbomachinery noise. We develop an acoustic analogy to predict sound generation in a swirling and sheared base flow in an annular duct, including the presence of moving solid surfaces to account for blade rows. In so doing we have extended a number of classical earlier results, including Ffowcs Williams & Hawkings’ equation in a medium at rest with moving surfaces, and Lilley’s equation for a sheared but non-swirling jet. By rearranging the Navier–Stokes equations we find a single equation, in the form of a sixth-order differential operator acting on the fluctuating pressure field on the left-hand side and a series of volume and surface source terms on the right-hand side; the form of these source terms depends strongly on the presence of swirl and radial shear. The integral form of this equation is then derived, using the Green’s function tailored to the base flow in the (rigid) duct. As is often the case in duct acoustics, it is then convenient to move into temporal, axial and azimuthal Fourier space, where the Green’s function is computed numerically. This formulation can then be applied to a number of turbomachinery noise sources. For definiteness here we consider the noise produced downstream when a steady distortion flow is incident on the fan from upstream, and compare our results with those obtained using a simplistic but commonly used Doppler correction method. We show that in all but the simplest case the full inclusion of swirl within an acoustic analogy, as described in this paper, is required.


2010 ◽  
Vol 660 ◽  
pp. 37-54 ◽  
Author(s):  
Olaf Marxen ◽  
Ulrich Rist

The mutual interaction of laminar–turbulent transition and mean flow evolution is studied in a pressure-induced laminar separation bubble on a flat plate. The flat-plate boundary layer is subjected to a sufficiently strong adverse pressure gradient that a separation bubble develops. Upstream of the bubble a small-amplitude disturbance is introduced which causes transition. Downstream of transition, the mean flow strongly changes and, due to viscous–inviscid interaction, the overall pressure distribution is changed as well. As a consequence, the mean flow also changes upstream of the transition location. The difference in the mean flow between the forced and the unforced flows is denoted the mean flow deformation. Two different effects are caused by the mean flow deformation in the upstream, laminar part: a reduction of the size of the separation region and a stabilization of the flow with respect to small, linear perturbations. By carrying out numerical simulations based on the original base flow and the time-averaged deformed base flow, we are able to distinguish between direct and indirect nonlinear effects. Direct effects are caused by the quadratic nonlinearity of the Navier–Stokes equations, are associated with the generation of higher harmonics and are predominantly local. In contrast, the stabilization of the flow is an indirect effect, because it is independent of the Reynolds stress terms in the laminar region and is solely governed by the non-local alteration of the mean flow via the pressure.


2017 ◽  
Vol 815 ◽  
pp. 435-480 ◽  
Author(s):  
Benoît Pier ◽  
Peter J. Schmid

The dynamics of small-amplitude perturbations, as well as the regime of fully developed nonlinear propagating waves, is investigated for pulsatile channel flows. The time-periodic base flows are known analytically and completely determined by the Reynolds number $Re$ (based on the mean flow rate), the Womersley number $Wo$ (a dimensionless expression of the frequency) and the flow-rate waveform. This paper considers pulsatile flows with a single oscillating component and hence only three non-dimensional control parameters are present. Linear stability characteristics are obtained both by Floquet analyses and by linearized direct numerical simulations. In particular, the long-term growth or decay rates and the intracyclic modulation amplitudes are systematically computed. At large frequencies (mainly $Wo\geqslant 14$), increasing the amplitude of the oscillating component is found to have a stabilizing effect, while it is destabilizing at lower frequencies; strongest destabilization is found for $Wo\simeq 7$. Whether stable or unstable, perturbations may undergo large-amplitude intracyclic modulations; these intracyclic modulation amplitudes reach huge values at low pulsation frequencies. For linearly unstable configurations, the resulting saturated fully developed finite-amplitude solutions are computed by direct numerical simulations of the complete Navier–Stokes equations. Essentially two types of nonlinear dynamics have been identified: ‘cruising’ regimes for which nonlinearities are sustained throughout the entire pulsation cycle and which may be interpreted as modulated Tollmien–Schlichting waves, and ‘ballistic’ regimes that are propelled into a nonlinear phase before subsiding again to small amplitudes within every pulsation cycle. 
Cruising regimes are found to prevail for weak base-flow pulsation amplitudes, while ballistic regimes are selected at larger pulsation amplitudes; at larger pulsation frequencies, however, the ballistic regime may be bypassed due to the stabilizing effect of the base-flow pulsating component. By investigating extended regions of a multi-dimensional parameter space and considering both two-dimensional and three-dimensional perturbations, the linear and nonlinear dynamics are systematically explored and characterized.
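The idea of a Floquet analysis can be sketched on a much simpler time-periodic problem than pulsatile channel flow. The example below, an illustration that is not taken from the paper, integrates the Mathieu equation `x'' + (delta + eps*cos(2t)) x = 0` over one period with a classical fourth-order Runge–Kutta scheme, assembles the monodromy matrix, and returns its eigenvalues (the Floquet multipliers). A multiplier with modulus greater than one indicates growth from cycle to cycle. The function name and parameter values are assumptions.

```python
import cmath
import math

def floquet_multipliers(delta, eps, steps=2000):
    """Floquet multipliers of x'' + (delta + eps*cos(2t)) x = 0 over one period (pi)."""
    h = math.pi / steps

    def rhs(t, x, v):
        # First-order form: x' = v, v' = -(delta + eps*cos(2t)) * x.
        return v, -(delta + eps * math.cos(2.0 * t)) * x

    def integrate(x, v):
        # Classical RK4 over one full period.
        t = 0.0
        for _ in range(steps):
            k1x, k1v = rhs(t, x, v)
            k2x, k2v = rhs(t + h / 2, x + h / 2 * k1x, v + h / 2 * k1v)
            k3x, k3v = rhs(t + h / 2, x + h / 2 * k2x, v + h / 2 * k2v)
            k4x, k4v = rhs(t + h, x + h * k3x, v + h * k3v)
            x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
            v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
            t += h
        return x, v

    # Columns of the monodromy matrix from two independent initial conditions.
    m11, m21 = integrate(1.0, 0.0)
    m12, m22 = integrate(0.0, 1.0)
    trace = m11 + m22
    det = m11 * m22 - m12 * m21
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2, (trace - root) / 2

neutral = floquet_multipliers(1.0, 0.0)    # unforced oscillator
unstable = floquet_multipliers(1.0, 0.5)   # parametric resonance
```

The long-term growth rate per unit time follows as `log(abs(mu)) / period` for each multiplier `mu`.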


2008 ◽  
Vol 615 ◽  
pp. 221-252 ◽  
Author(s):  
Olivier Marquet ◽  
Denis Sipp ◽  
Laurent Jacquin

A general theoretical formalism is developed to assess how base-flow modifications may alter the stability properties of flows studied in a global approach of linear stability theory. It also comprises a systematic approach to the passive control of globally unstable flows by the use of small control devices. This formalism is based on a sensitivity analysis of any global eigenvalue to base-flow modifications. The base-flow modifications investigated are either arbitrary or specific ones induced by a steady force. This leads to a definition of the so-called sensitivity to base-flow modifications and sensitivity to a steady force. These sensitivity analyses are applied to the unstable global modes responsible for the onset of vortex shedding in the wake of a cylinder for Reynolds numbers in the range $47\leqslant Re\leqslant 80$. First, it is demonstrated how the sensitivity to arbitrary base-flow modifications may be used to identify regions and properties of the base flow that contribute to the onset of vortex shedding. Secondly, the sensitivity to a steady force determines the regions of the flow where a steady force acting on the base flow stabilizes the unstable global modes. Upon modelling the presence of a control device by a steady force acting on the base flow, these predictions are then extensively compared with the experimental results of Strykowski & Sreenivasan (J. Fluid Mech., vol. 218, 1990, p. 71). A physical interpretation of the suppression of vortex shedding by use of a control cylinder is proposed in the light of the sensitivity analysis.
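The identity underlying such a sensitivity analysis is the first-order perturbation of an eigenvalue. In a generic discretised setting (the notation here is an assumption, not the paper's), let $A(U)$ be the linearised operator about the base flow $U$, with direct mode $\hat{q}$, adjoint mode $\hat{q}^{\dagger}$ and eigenvalue $\lambda$. A small base-flow modification $\delta U$ then shifts the eigenvalue by:

```latex
A\hat{q} = \lambda\hat{q}, \qquad
A^{H}\hat{q}^{\dagger} = \lambda^{*}\hat{q}^{\dagger}, \qquad
\delta\lambda = \frac{\hat{q}^{\dagger H}\,\delta A\,\hat{q}}{\hat{q}^{\dagger H}\hat{q}}, \qquad
\delta A = \frac{\partial A}{\partial U}\,\delta U .
```

Because $\delta\lambda$ is linear in $\delta U$, a single direct and adjoint eigenvalue computation gives the sensitivity to every possible modification at once.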


2021 ◽  
Author(s):  
Stav Belogolovsky ◽  
Philip Korsunsky ◽  
Shie Mannor ◽  
Chen Tessler ◽  
Tom Zahavy

We consider the task of Inverse Reinforcement Learning in Contextual Markov Decision Processes (MDPs). In this setting, contexts, which define the reward and transition kernel, are sampled from a distribution. In addition, although the reward is a function of the context, it is not provided to the agent. Instead, the agent observes demonstrations from an optimal policy. The goal is to learn the reward mapping, such that the agent will act optimally even when encountering previously unseen contexts, also known as zero-shot transfer. We formulate this problem as a non-differentiable convex optimization problem and propose a novel algorithm to compute its subgradients. Based on this scheme, we analyze several methods both theoretically, where we compare the sample complexity and scalability, and empirically. Most importantly, we show both theoretically and empirically that our algorithms perform zero-shot transfer (generalize to new and unseen contexts). Specifically, we present empirical experiments in a dynamic treatment regime, where the goal is to learn a reward function which explains the behavior of expert physicians based on recorded data of them treating patients diagnosed with sepsis.


2021 ◽  
Author(s):  
Amarildo Likmeta ◽  
Alberto Maria Metelli ◽  
Giorgia Ramponi ◽  
Andrea Tirinzoni ◽  
Matteo Giuliani ◽  
...  

In real-world applications, inferring the intentions of expert agents (e.g., human operators) can be fundamental to understanding how possibly conflicting objectives are managed, helping to interpret the demonstrated behavior. In this paper, we discuss how inverse reinforcement learning (IRL) can be employed to retrieve the reward function implicitly optimized by expert agents acting in real applications. Scaling IRL to real-world cases has proved challenging as typically only a fixed dataset of demonstrations is available and further interactions with the environment are not allowed. For this reason, we resort to a class of truly batch model-free IRL algorithms and we present three application scenarios: (1) high-level decision-making in highway driving, (2) inferring user preferences in a social network (Twitter), and (3) managing the water release from Lake Como. For each of these scenarios, we provide formalization, experiments and a discussion to interpret the obtained results.


Minerals ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 587
Author(s):  
Joao Pedro de Carvalho ◽  
Roussos Dimitrakopoulos

This paper presents a new truck dispatching policy approach that is adaptive given different mining complex configurations in order to deliver supply material extracted by the shovels to the processors. The method aims to improve adherence to the operational plan and fleet utilization in a mining complex context. Several sources of operational uncertainty arising from the loading, hauling and dumping activities can influence the dispatching strategy. Given a fixed sequence of extraction of the mining blocks provided by the short-term plan, a discrete event simulator model emulates the interaction arising from these mining operations. The continuous repetition of this simulator and a reward function, associating a score value to each dispatching decision, generate sample experiences to train a deep Q-learning reinforcement learning model. The model learns from past dispatching experience, such that when a new task is required, a well-informed decision can be quickly taken. The approach is tested at a copper–gold mining complex, characterized by uncertainties in equipment performance and geological attributes, and the results show improvements in terms of production targets, metal production, and fleet management.
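The learning step behind such a dispatching policy can be sketched with the tabular Q-learning update. This swaps the paper's deep Q-network for a plain lookup table, and the state names, action names, and hyperparameters are hypothetical; it only illustrates how a scored dispatching decision feeds back into the value estimates.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """One tabular Q-learning step; q maps state -> {action: value}."""
    # Bootstrapped target: immediate score plus discounted best future value.
    target = reward + gamma * max(q[next_state].values())
    # Move the current estimate a fraction alpha towards the target.
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]

# Hypothetical two-state example: a truck leaving a shovel chooses a destination.
q_table = {
    "truck_at_shovel": {"to_mill": 0.0, "to_stockpile": 0.0},
    "truck_at_mill": {"to_shovel_a": 1.0, "to_shovel_b": 2.0},
}
updated = q_update(q_table, "truck_at_shovel", "to_mill",
                   reward=1.0, next_state="truck_at_mill",
                   alpha=0.5, gamma=0.5)
```

In a deep Q-learning setting the table is replaced by a neural network trained to regress onto the same target, with the simulator supplying the sampled transitions.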

