Emergent Solutions to High-Dimensional Multitask Reinforcement Learning

Algorithms that learn through environmental interaction and delayed rewards, or reinforcement learning (RL), increasingly face the challenge of scaling to dynamic, high-dimensional, and partially observable environments. Significant attention is being paid to frameworks from deep learning, which scale to high-dimensional data by decomposing the task through multilayered neural networks. While effective, the representation is complex and computationally demanding. In this work, we propose a framework based on genetic programming which adaptively complexifies policies through interaction with the task. We make a direct comparison with several deep reinforcement learning frameworks in the challenging Atari video game environment as well as more traditional reinforcement learning frameworks based on a priori engineered features. Results indicate that the proposed approach matches the quality of deep learning while being a minimum of three orders of magnitude simpler with respect to model complexity. This results in real-time operation of the champion RL agent without recourse to specialized hardware support. Moreover, the approach is capable of evolving solutions to multiple game titles simultaneously with no additional computational cost. In this case, agent behaviours for an individual game as well as single agents capable of playing all games emerge from the same evolutionary run.

A Review of Mobile Robot Path Planning Based on Deep Reinforcement Learning Algorithm

Journal of Physics Conference Series ◽ 10.1088/1742-6596/2138/1/012011 ◽ 2021 ◽ Vol 2138 (1) ◽ pp. 012011

Author(s): Yanwei Zhao ◽ Yinong Zhang ◽ Shuying Wang

Keyword(s): Deep Learning ◽ Reinforcement Learning ◽ Path Planning ◽ Mobile Robot ◽ Video Game ◽ Autonomous Navigation ◽ Learning Algorithm ◽ Basic Knowledge ◽ Target Point ◽ Reinforcement Learning Algorithm

Path planning refers to a mobile robot obtaining information about its surroundings and its own state through on-board sensors, so that it can avoid obstacles and move towards the target point. Deep reinforcement learning, which combines reinforcement learning and deep learning and is mainly used for perception and decision-making problems, has become an important research branch of artificial intelligence. This paper first introduces the basic knowledge of deep learning and reinforcement learning. It then describes the research status of deep reinforcement learning algorithms based on value functions and policy gradients in path planning, along with applications of deep reinforcement learning in computer games, video games and autonomous navigation. Finally, a brief summary and outlook on the algorithms and applications of deep reinforcement learning is given.

Convergence rates of high dimensional Smolyak quadrature

ESAIM Mathematical Modelling and Numerical Analysis ◽ 10.1051/m2an/2020003 ◽ 2020 ◽ Vol 54 (4) ◽ pp. 1259-1307

Author(s): Jakob Zech ◽ Christoph Schwab

Keyword(s): Convergence Rates ◽ A Priori ◽ Computational Cost ◽ High Dimensional ◽ Taylor Polynomial ◽ Approximation Rates ◽ Moderate Number ◽ Asymptotic Rate ◽ Smolyak Algorithm ◽ Parametric Maps

We analyse convergence rates of Smolyak integration for parametric maps u: U → X taking values in a Banach space X, defined on the parameter domain U = [−1,1]^ℕ. For parametric maps which are sparse, as quantified by summability of their Taylor polynomial chaos coefficients, dimension-independent convergence rates superior to N-term approximation rates under the same sparsity are achievable. We propose a concrete Smolyak algorithm to a priori identify integrand-adapted sets of active multiindices (and thereby unisolvent sparse grids of quadrature points) via upper bounds for the integrands’ Taylor gpc coefficients. For so-called “(b,ε)-holomorphic” integrands u with b ∈ ℓ^p(ℕ) for some p ∈ (0, 1), we prove the dimension-independent convergence rate 2/p − 1 in terms of the number of quadrature points. The proposed Smolyak algorithm is proved to yield (essentially) the same rate in terms of the total computational cost for both nested and non-nested univariate quadrature points. Numerical experiments and a mathematical sparsity analysis accounting for cancellations in quadratures and in the combination formula demonstrate that the asymptotic rate 2/p − 1 is realized computationally for a moderate number of quadrature points under certain circumstances. By a refined analysis of model integrand classes we show that a generally large preasymptotic range otherwise precludes reaching the asymptotic rate 2/p − 1 for practically relevant numbers of quadrature points.
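
As a reading aid, the stated rate can be written as an error bound. This is only a sketch of what "convergence rate 2/p − 1 in terms of the number of quadrature points" usually means; the symbols Q_N (a Smolyak quadrature using N points), μ (the probability measure on U) and C (a constant independent of N) are notational assumptions and are not taken from the abstract.

```latex
\[
  \Bigl\| \int_U u(y)\,\mathrm{d}\mu(y) - Q_N u \Bigr\|_X
  \;\le\; C\, N^{-\left(\frac{2}{p} - 1\right)},
  \qquad p \in (0,1).
\]
% Worked values of the exponent: p = 2/3 gives 2/p - 1 = 2,
% and p = 1/2 gives 2/p - 1 = 3, so sparser coefficient
% sequences (smaller p) give faster convergence.
```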

A deep learning model for gas storage optimization

Decisions in Economics and Finance ◽ 10.1007/s10203-021-00363-6 ◽ 2021

Author(s): Nicolas Curin ◽ Michael Kettler ◽ Xi Kleisinger-Yu ◽ Vlatka Komaric ◽ Thomas Krabichler ◽ ...

Keyword(s): Risk Management ◽ Monte Carlo ◽ Deep Learning ◽ Reinforcement Learning ◽ State Of The Art ◽ Gas Storage ◽ High Dimensional ◽ Forward Market ◽ Storage Optimization ◽ Deep Learning Model

To the best of our knowledge, the application of deep learning in the field of quantitative risk management is still a relatively recent phenomenon. In this article, we utilize techniques inspired by reinforcement learning in order to optimize the operation plans of underground natural gas storage facilities. We provide a theoretical framework and assess the performance of the proposed method numerically in comparison to a state-of-the-art least-squares Monte-Carlo approach. Due to the inherent intricacy originating from the high-dimensional forward market as well as the numerous constraints and frictions, the optimization exercise can hardly be tackled by means of traditional techniques.

Emergent Tangled Program Graphs in Multi-Task Learning

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2018/740 ◽ 2018

Author(s): Stephen Kelly ◽ Malcolm Heywood

Keyword(s): Decision Making ◽ Deep Learning ◽ Reinforcement Learning ◽ Video Games ◽ Genetic Programming ◽ Control Policy ◽ Task Environment ◽ High Dimensional ◽ Collective Decision Making ◽ Specialized Hardware

We propose a Genetic Programming (GP) framework to address high-dimensional Multi-Task Reinforcement Learning (MTRL) through emergent modularity. A bottom-up process is assumed in which multiple programs self-organize into collective decision-making entities, or teams, which then further develop into multi-team policy graphs, or Tangled Program Graphs (TPG). The framework learns to play three Atari video games simultaneously, producing a single control policy that matches or exceeds leading results from (game-specific) deep reinforcement learning in each game. More importantly, unlike the representation assumed for deep learning, TPG policies start simple and adaptively complexify through interaction with the task environment, resulting in agents that are exceedingly simple, operating in real-time without specialized hardware support such as GPUs.
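
The team and policy-graph idea described above can be illustrated with a small sketch. It is not the authors' implementation: it assumes that each program in a team produces a numeric bid for the current state, that the highest bidder decides, and that its decision is either an atomic action or a pointer to another team. The names `Program`, `Team` and `act` are hypothetical, and plain Python callables stand in for evolved programs.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Union


@dataclass
class Program:
    # Hypothetical stand-in for an evolved program: maps a state to a bid.
    bid: Callable[[Sequence[float]], float]
    # Either an atomic action label or a pointer to another team.
    action: Union[str, "Team"]


@dataclass
class Team:
    programs: List[Program] = field(default_factory=list)


def act(root: Team, state: Sequence[float]) -> str:
    """Follow winning bids from the root team until an atomic action is reached.

    Assumption: every team holds at least one program with an atomic action,
    so a decision is always available even when cycles are skipped.
    """
    visited = set()
    team = root
    while True:
        visited.add(id(team))
        # Ignore pointers to teams already visited, to avoid infinite loops.
        candidates = [
            p for p in team.programs
            if not (isinstance(p.action, Team) and id(p.action) in visited)
        ]
        winner = max(candidates, key=lambda p: p.bid(state))
        if isinstance(winner.action, Team):
            team = winner.action  # defer the decision to the next team
        else:
            return winner.action


# Toy two-team graph; the bids are hand-written, not evolved.
fire_team = Team([
    Program(lambda s: s[0], "FIRE"),
    Program(lambda s: -s[0], "NOOP"),
])
root_team = Team([
    Program(lambda s: s[1], fire_team),
    Program(lambda s: 0.5, "LEFT"),
])
```

In this toy graph, `act(root_team, [1.0, 0.9])` passes control to `fire_team`, whereas `act(root_team, [1.0, 0.1])` is decided at the root. Only the teams on the path taken are evaluated for a given state, which is one plausible reading of why such policies can stay cheap to execute.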

Automated Dysarthria Severity Classification Using Deep Learning Frameworks

2020 28th European Signal Processing Conference (EUSIPCO) ◽ 10.23919/eusipco47968.2020.9287741 ◽ 2021

Author(s): Amlu Anna Joshy ◽ Rajeev Rajan

Keyword(s): Deep Learning ◽ Severity Classification ◽ Learning Frameworks

Author response for "Deep learning and reinforcement learning approach on microgrid"

10.1002/2050-7038.12531/v2/response1 ◽ 2020

Author(s): Kumar Chandrasekaran ◽ Prabaakaran Kandasamy ◽ Srividhya Ramanathan

Keyword(s): Deep Learning ◽ Reinforcement Learning ◽ Author Response ◽ Learning Approach

Sequence-Based Deep Learning Frameworks on Enhancer-Promoter Interactions Prediction

Current Pharmaceutical Design ◽ 10.2174/1381612826666201124112710 ◽ 2020 ◽ Vol 26

Author(s): Xiaoping Min ◽ Fengqing Lu ◽ Chunyan Li

Keyword(s): Gene Expression ◽ Deep Learning ◽ Dna Sequences ◽ Experimental Methods ◽ Learning Methods ◽ Comprehensive Review ◽ Genomic Features ◽ Disease Mechanisms ◽ Evaluation Strategies ◽ Learning Frameworks

Enhancer-promoter interactions (EPIs) in the human genome are of great significance to transcriptional regulation, which tightly controls gene expression. Identification of EPIs can help us better decipher gene regulation and understand disease mechanisms. However, experimental methods to identify EPIs are constrained by funding, time and manpower, while computational methods using DNA sequences and genomic features are viable alternatives. Deep learning methods have shown promising prospects in classification, and efforts have been made to use them to identify EPIs. In this survey, we focus specifically on sequence-based deep learning methods and conduct a comprehensive review of the literature on them. We first briefly introduce existing sequence-based frameworks for EPI prediction and their technical details. After that, we elaborate on the datasets, pre-processing methods and evaluation strategies. Finally, we discuss the challenges these methods face and suggest several future opportunities.

Review of the Applications of Deep Learning in Bioinformatics

Current Bioinformatics ◽ 10.2174/1574893615999200711165743 ◽ 2021 ◽ Vol 15 (8) ◽ pp. 898-911

Author(s): Yongqing Zhang ◽ Jianrong Yan ◽ Siyu Chen ◽ Meiqin Gong ◽ Dongrui Gao ◽ ...

Keyword(s): Deep Learning ◽ Drug Discovery ◽ Biomedical Imaging ◽ State Of The Art ◽ Black Box ◽ Medical Data ◽ Biological Data ◽ High Dimensional ◽ Biological Research ◽ Process Data

Rapid advances in biological research over recent years have significantly enriched biological and medical data resources. Deep learning-based techniques have been successfully utilized to process data in this field, and they have exhibited state-of-the-art performances even on high-dimensional, nonstructural, and black-box biological data. The aim of the current study is to provide an overview of the deep learning-based techniques used in biology and medicine and their state-of-the-art applications. In particular, we introduce the fundamentals of deep learning and then review the success of applying such methods to bioinformatics, biomedical imaging, biomedicine, and drug discovery. We also discuss the challenges and limitations of this field, and outline possible directions for further research.

Improving Sentiment Analysis using Hybrid Deep Learning Model

Recent Advances in Computer Science and Communications ◽ 10.2174/2213275912666190328200012 ◽ 2020 ◽ Vol 13 (4) ◽ pp. 627-640 ◽ Cited By ~ 1

Author(s): Avinash Chandra Pandey ◽ Dharmveer Singh Rajpoot

Keyword(s): Neural Network ◽ Deep Learning ◽ Sentiment Analysis ◽ Classification Accuracy ◽ Short Term Memory ◽ Computational Cost ◽ Extraction Process ◽ Learning Model ◽ Sentiment Classification ◽ Deep Learning Model

Background: Sentiment analysis is the contextual mining of text to determine the viewpoint of users on sentiment-bearing topics commonly found on social networking websites. Twitter is one of the social sites where people express their opinions about any topic in the form of tweets. These tweets can be examined using various sentiment classification methods to find the opinion of users. Traditional sentiment analysis methods use manually extracted features for opinion classification. The manual feature extraction process is a complicated task since it requires predefined sentiment lexicons. On the other hand, deep learning methods automatically extract relevant features from data; hence they provide better performance and richer representation competency than the traditional methods. Objective: The main aim of this paper is to enhance sentiment classification accuracy and to reduce the computational cost. Method: To achieve the objective, a hybrid deep learning model based on a convolutional neural network and a bi-directional long short-term memory neural network has been introduced. Results: The proposed sentiment classification method achieves the highest accuracy for most of the datasets. Further, the efficacy of the proposed method has been validated by statistical analysis. Conclusion: Sentiment classification accuracy can be improved by creating veracious hybrid models. Moreover, performance can also be enhanced by tuning the hyperparameters of deep learning models.

The Role of Global Appearance of Omnidirectional Images in Relative Distance and Orientation Retrieval

Sensors ◽ 10.3390/s21103327 ◽ 2021 ◽ Vol 21 (10) ◽ pp. 3327

Author(s): Vicente Román ◽ Luis Payá ◽ Adrián Peidró ◽ Mónica Ballesta ◽ Oscar Reinoso

Keyword(s): Mobile Robotics ◽ A Priori ◽ Computational Cost ◽ Relevant Information ◽ Local Features ◽ New Family ◽ Lighting Conditions ◽ Omnidirectional Images ◽ Great Development

Over the last few years, mobile robotics has developed considerably thanks to the wide variety of problems that can be solved with this technology. An autonomous mobile robot must be able to operate in a priori unknown environments, planning its trajectory and navigating to the required target points. To this end, it is crucial to solve the mapping and localization problems accurately and at acceptable computational cost. The use of omnidirectional vision systems has emerged as a robust choice thanks to the large quantity of information they can extract from the environment. The images must be processed to obtain relevant information that permits solving the mapping and localization problems robustly. The classical frameworks to address this problem are based on the extraction, description and tracking of local features or landmarks. However, more recently, a new family of methods has emerged as a robust alternative in mobile robotics. It consists of describing each image as a whole, which leads to conceptually simpler algorithms. While methods based on local features have been extensively studied and compared in the literature, those based on global appearance still merit a deep study to uncover their performance. In this work, a comparative evaluation of six global-appearance description techniques in localization tasks is carried out, both in terms of accuracy and computational cost. Sets of images captured in a real environment are used for this purpose, including typical phenomena such as changes in lighting conditions, visual aliasing, partial occlusions and noise.
