A Reinforcement Learning - Great-Deluge Hyper-Heuristic for Examination Timetabling

2010 ◽  
Vol 1 (1) ◽  
pp. 39-59 ◽  
Author(s):  
Ender Özcan ◽  
Mustafa Misir ◽  
Gabriela Ochoa ◽  
Edmund K. Burke

Hyper-heuristics can be defined as methodologies that search the space generated by a finite set of low-level heuristics for solving search problems. An iterative hyper-heuristic framework can be thought of as requiring a single candidate solution and multiple perturbative low-level heuristics. An initially generated complete solution goes through two successive processes (heuristic selection and move acceptance) until a set of termination criteria is satisfied. A motivating goal of hyper-heuristic research is to create automated techniques that are applicable to a wide range of problems with different characteristics. Some previous studies show that different combinations of heuristic selection and move acceptance components can yield different performances. This study investigates whether learning-based heuristic selection can improve the performance of a great-deluge-based hyper-heuristic, using an examination timetabling problem as a case study.
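The great deluge acceptance criterion at the heart of such a hyper-heuristic is simple to state: accept any candidate that improves on the current solution, or whose cost falls below a steadily lowered "water level". A minimal sketch follows; the linear level schedule and all parameter names are assumptions for illustration, not taken from the paper:

```python
import random

def great_deluge(initial, neighbor, cost, max_iters=10000):
    """Great deluge acceptance for a minimization problem: a candidate
    is accepted if it improves on the current solution OR its cost is
    below a steadily falling 'water level'."""
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    level = current_cost              # initial water level
    decay = level / max_iters         # linear lowering rate (assumed)
    for _ in range(max_iters):
        candidate = neighbor(current)
        c = cost(candidate)
        if c <= current_cost or c <= level:
            current, current_cost = candidate, c
            if c < best_cost:
                best, best_cost = candidate, c
        level -= decay                # lower the water level each step
    return best

# Toy usage: minimize f(x) = x^2 with a random-step neighborhood.
random.seed(0)
solution = great_deluge(10.0,
                        neighbor=lambda x: x + random.uniform(-1.0, 1.0),
                        cost=lambda x: x * x,
                        max_iters=2000)
```

A learning heuristic selector, as studied in the paper, would replace the single `neighbor` function with a choice among several low-level heuristics whose selection probabilities are adapted from past performance.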



2019 ◽  
Author(s):  
Niclas Ståhl ◽  
Göran Falkman ◽  
Alexander Karlsson ◽  
Gunnar Mathiason ◽  
Jonas Boström

In medicinal chemistry programs it is key to design and make compounds that are efficacious and safe. This is a long, complex and difficult multi-parameter optimization process, often including several properties with orthogonal trends. New methods for the automated design of compounds against profiles of multiple properties are thus of great value. Here we present a fragment-based reinforcement learning approach based on an actor-critic model for the generation of novel molecules with optimal properties. The actor and the critic are both modelled with bidirectional long short-term memory (LSTM) networks. The method learns how to generate new compounds with desired properties by starting from an initial set of lead molecules and then improving them by replacing some of their fragments. A balanced binary tree built on the similarity of fragments is used in the generative process to bias the output towards structurally similar molecules. The method is demonstrated by a case study showing that 93% of the generated molecules are chemically valid and a third satisfy the targeted objectives, while none of the molecules in the initial set did.
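The fragment-replacement loop can be caricatured in a few lines. The fragment pool, the scoring function, and the greedy accept rule below are toy stand-ins for the paper's LSTM actor-critic, similarity tree, and multi-property objective:

```python
import random

FRAGMENT_POOL = ["A", "B", "C", "D"]   # stand-in fragment library

def score(molecule):
    """Hypothetical multi-property objective (higher is better)."""
    return molecule.count("A") - 0.5 * molecule.count("D")

def improve(molecule, steps=200):
    """Greedy fragment replacement: propose a substitution at a random
    site and keep it only if the objective does not get worse."""
    mol = list(molecule)
    for _ in range(steps):
        i = random.randrange(len(mol))            # pick a fragment site
        candidate = mol[:i] + [random.choice(FRAGMENT_POOL)] + mol[i + 1:]
        if score(candidate) >= score(mol):        # critic-like accept rule
            mol = candidate
    return mol
```

In the actual method, the replacement fragment is proposed by the trained actor network and biased by the similarity tree rather than drawn uniformly at random.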


2013 ◽  
Vol 16 (1) ◽  
pp. 59-67

The Soil Science Institute of Thessaloniki produces new digitized Soil Maps that provide a useful electronic database for the spatial representation of soil variation within a region, based on in situ soil sampling, laboratory analyses, GIS techniques and plant-nutrition mathematical models, coupled with the local land cadastre. The novelty of these studies is that local agronomists gain immediate access to a wide range of soil information by clicking on a field parcel shown in the digital interface, and can therefore suggest an appropriate treatment (e.g. liming, manure incorporation, desalination, application of the proper type and quantity of fertilizer) depending on the field conditions and cultivated crops. A specific case study is presented in the current work regarding the construction of the digitized Soil Map of the regional unit of Kastoria. The map's potential is illustrated by the fact that mapping the physicochemical properties of the soils in this region provided delineation zones for differential fertilization management. An experiment was also conducted using remote sensing techniques to enhance the fertilization advisory software database, which is a component of the digitized map, and to optimize nitrogen management in agricultural areas.


Oxford Studies in Ancient Philosophy provides, twice each year, a collection of the best current work in the field of ancient philosophy. Each volume features original essays that contribute to an understanding of a wide range of themes and problems in all periods of ancient Greek and Roman philosophy, from the beginnings to the threshold of the Middle Ages. From its first volume in 1983, OSAP has been a highly influential venue for work in the field, and has often featured essays of substantial length as well as critical essays on books of distinctive importance. Volume LV contains: a methodological examination of how the evidence for Presocratic thought is shaped through its reception by later thinkers, using discussions of a world soul as a case study; an article on Plato's conception of flux and the way in which sensible particulars maintain a kind of continuity while undergoing constant change; a discussion of J. L. Austin's unpublished lecture notes on Aristotle's Nicomachean Ethics and his treatment of loss of control (akrasia); an article on the Stoics' theory of time, in particular Chrysippus' conception of the present and of events; and two pieces on Plotinus: one that identifies a distinct argument to show that there is a single, ultimate metaphysical principle, and a review essay discussing E. K. Emilsson's recent book, Plotinus.


Energies ◽  
2021 ◽  
Vol 14 (5) ◽  
pp. 1377
Author(s):  
Musaab I. Magzoub ◽  
Raj Kiran ◽  
Saeed Salehi ◽  
Ibnelwaleed A. Hussein ◽  
Mustafa S. Nasser

The traditional way to mitigate loss circulation in drilling operations is to use preventative and curative materials. However, it is difficult to quantify the amount of material from every possible combination needed to produce customized rheological properties. In this study, machine learning (ML) is used to develop a framework that identifies material compositions for loss circulation applications based on the desired rheological characteristics. The relation between the rheological properties and the mud components for polyacrylamide/polyethyleneimine (PAM/PEI)-based mud is assessed experimentally. Four different ML algorithms were implemented to model the rheological data for various mud components at different concentrations and testing conditions: (a) k-Nearest Neighbor, (b) Random Forest, (c) Gradient Boosting, and (d) AdaBoost. The Gradient Boosting model showed the highest accuracy (91% and 74% for plastic and apparent viscosity, respectively) and can be further used for hydraulic calculations. Overall, the experimental study presented in this paper, together with the proposed ML-based framework, adds valuable information to the design of PAM/PEI-based mud. The ML models allowed a wide range of rheology assessments for various drilling fluid formulations, with a mean accuracy of up to 91%. The case study showed that, with the appropriate combination of materials, reasonable rheological properties could be achieved to prevent loss circulation by managing the equivalent circulating density (ECD).
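As a sketch of how such a rheology model might be fitted, the snippet below trains scikit-learn's GradientBoostingRegressor on synthetic stand-in data; the feature layout and the noisy linear response are invented for illustration and bear no relation to the paper's measurements:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic stand-in features: [PAM conc., PEI conc., temperature]
X = rng.uniform(0.0, 1.0, size=(200, 3))
# Invented noisy response standing in for plastic viscosity
y = 10 * X[:, 0] + 5 * X[:, 1] - 2 * X[:, 2] + rng.normal(0.0, 0.5, 200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
r2 = model.score(X_te, y_te)   # coefficient of determination on held-out data
```

Inverting such a fitted model, i.e. searching composition space for a target viscosity, is the step that turns a property predictor into the design framework the abstract describes.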


2011 ◽  
Vol 2011 ◽  
pp. 1-12 ◽  
Author(s):  
Karim El-Laithy ◽  
Martin Bogdan

An integration of the Hebbian and reinforcement learning (RL) rules is presented for dynamic synapses. The proposed framework permits the Hebbian rule to update the hidden synaptic model parameters that regulate the synaptic response, rather than the synaptic weights. This is performed using both the value and the sign of the temporal difference in the reward signal after each trial. Applying this framework, a spiking network with spike-timing-dependent synapses is tested on learning the exclusive-OR computation on a temporally coded basis. Reward values are calculated from the distance between the output spike train of the network and a reference target train. Results show that the network is able to capture the required dynamics and that the proposed framework indeed represents an integrated version of the Hebbian and RL rules. The framework is tractable, computationally inexpensive, and applicable to a wide class of synaptic models rather than being restricted to the neural representation used here. This generality, along with the reported results, supports adopting the introduced approach to benefit from biologically plausible synaptic models in a wide range of signal processing applications.
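The core update described above, a Hebbian coincidence term gated by the temporal difference in reward, can be sketched in one function; the function name, parameter names, and learning rate are illustrative assumptions, not the paper's notation:

```python
def update_synapse(param, pre, post, reward, prev_reward, lr=0.01):
    """Update a hidden synaptic parameter (not a weight) with a
    Hebbian coincidence term scaled by the temporal difference of
    the reward; the sign of that difference decides whether the
    correlated activity is reinforced or weakened."""
    td = reward - prev_reward          # temporal difference in reward
    hebb = pre * post                  # Hebbian coincidence of activity
    return param + lr * td * hebb
```

With a positive reward difference, correlated pre/post activity pushes the parameter up; with a negative difference, the same correlation pushes it down, which is the sign-sensitivity the abstract refers to.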


2021 ◽  
Vol 35 (2) ◽  
Author(s):  
Nicolas Bougie ◽  
Ryutaro Ichise

Deep reinforcement learning methods have achieved significant successes in complex decision-making problems. However, they traditionally rely on well-designed extrinsic rewards, which limits their applicability to the many real-world tasks where rewards are naturally sparse. While cloning behaviors provided by an expert is a promising approach to the exploration problem, learning from a fixed set of demonstrations may be impracticable due to lack of state coverage or distribution mismatch, i.e. when the learner's goal deviates from the demonstrated behaviors. Moreover, we are interested in learning how to reach a wide range of goals from the same set of demonstrations. In this work we propose a novel goal-conditioned method that leverages very small sets of goal-driven demonstrations to massively accelerate the learning process. Crucially, we introduce the concept of active goal-driven demonstrations, querying the demonstrator only in hard-to-learn and uncertain regions of the state space. We further present a strategy for prioritizing the sampling of goals where the disagreement between the expert and the policy is maximized. We evaluate our method on a variety of benchmark environments from the MuJoCo domain. Experimental results show that our method outperforms prior imitation learning approaches in most of the tasks, in terms of both exploration efficiency and average score.
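The disagreement-based goal prioritization can be sketched as follows; the norm-based disagreement measure and all function names are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def prioritize_goals(goals, policy, expert, k=5):
    """Rank candidate goals by the distance between the learner's and
    the expert's actions for that goal, returning the k goals on which
    they disagree the most (the most informative ones to query)."""
    disagreement = [np.linalg.norm(policy(g) - expert(g)) for g in goals]
    order = np.argsort(disagreement)[::-1]     # largest disagreement first
    return [goals[i] for i in order[:k]]
```

Querying the demonstrator only on these high-disagreement goals concentrates the (expensive) demonstrations where the policy is weakest, which is the "active" part of the method.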


Author(s):  
Laura Ballerini ◽  
Sylvia I. Bergh

Official data are not sufficient for monitoring the United Nations Sustainable Development Goals (SDGs): they do not reach remote locations or marginalized populations and can be manipulated by governments. Citizen science data (CSD), defined as data that citizens voluntarily gather by employing a wide range of technologies and methodologies, could help to tackle these problems and ultimately improve SDG monitoring. However, the link between CSD and the SDGs is still understudied. This article aims to develop an empirical understanding of the CSD-SDG link by focusing on the perspective of projects that employ CSD. Specifically, the article presents primary and secondary qualitative data collected on 30 of these projects and an explorative comparative case study analysis. It finds that projects which use CSD recognize that the SDGs can provide a valuable framework and legitimacy, as well as attract funding, visibility, and partnerships. At the same time, however, the article reveals that these projects also encounter several barriers with respect to the SDGs: a widespread lack of knowledge of the goals, combined with frustration and political resistance towards the UN, may deter these projects from contributing their data to the SDG monitoring apparatus.

