Unsupervised learning and clustered connectivity enhance reinforcement learning in spiking neural networks

Abstract: Reinforcement learning is a learning paradigm that can account for how organisms learn to adapt their behavior in complex environments with sparse rewards. However, implementations in spiking neuronal networks typically rely on input architectures involving place cells or receptive fields. This is problematic, as such approaches either scale badly as the environment grows in size or complexity, or presuppose knowledge on how the environment should be partitioned. Here, we propose a learning architecture that combines unsupervised learning on the input projections with clustered connectivity within the representation layer. This combination allows input features to be mapped to clusters; thus the network self-organizes to produce task-relevant activity patterns that can serve as the basis for reinforcement learning on the output projections. On the basis of the MNIST and Mountain Car tasks, we show that our proposed model performs better than either a comparable unclustered network or a clustered network with static input projections. We conclude that the combination of unsupervised learning and clustered connectivity provides a generic representational substrate suitable for further computation.

Unsupervised Learning and Clustered Connectivity Enhance Reinforcement Learning in Spiking Neural Networks

Frontiers in Computational Neuroscience ◽

10.3389/fncom.2021.543872 ◽

2021 ◽

Vol 15 ◽

Author(s):

Philipp Weidel ◽

Renato Duarte ◽

Abigail Morrison

Keyword(s):

Reinforcement Learning ◽

Unsupervised Learning ◽

Ad Hoc ◽

Activity Patterns ◽

Receptive Fields ◽

Complex Environments ◽

Self Organized ◽

Proposed Model ◽

The Brain ◽

Clustered Network

Reinforcement learning is a paradigm that can account for how organisms learn to adapt their behavior in complex environments with sparse rewards. To partition an environment into discrete states, implementations in spiking neuronal networks typically rely on input architectures involving place cells or receptive fields specified ad hoc by the researcher. This is problematic as a model for how an organism can learn appropriate behavioral sequences in unknown environments, as it fails to account for the unsupervised and self-organized nature of the required representations. Additionally, this approach presupposes knowledge on the part of the researcher on how the environment should be partitioned and represented and scales poorly with the size or complexity of the environment. To address these issues and gain insights into how the brain generates its own task-relevant mappings, we propose a learning architecture that combines unsupervised learning on the input projections with biologically motivated clustered connectivity within the representation layer. This combination allows input features to be mapped to clusters; thus the network self-organizes to produce clearly distinguishable activity patterns that can serve as the basis for reinforcement learning on the output projections. On the basis of the MNIST and Mountain Car tasks, we show that our proposed model performs better than either a comparable unclustered network or a clustered network with static input projections. We conclude that the combination of unsupervised learning and clustered connectivity provides a generic representational substrate suitable for further computation.

A strategy learning model for autonomous agents based on classification

International Journal of Applied Mathematics and Computer Science ◽

10.1515/amcs-2015-0035 ◽

2015 ◽

Vol 25 (3) ◽

pp. 471-482 ◽

Cited By ~ 7

Author(s):

Bartłomiej Śnieżyński

Keyword(s):

Reinforcement Learning ◽

Supervised Learning ◽

Learning Process ◽

Autonomous Agents ◽

Good Alternative ◽

Learning Model ◽

Learning Method ◽

Complex Environments ◽

Agent Based ◽

Proposed Model

Abstract: In this paper we propose a strategy learning model for autonomous agents based on classification. In the literature, the most commonly used learning method in agent-based systems is reinforcement learning. In our opinion, classification can be considered a good alternative. This type of supervised learning can be used to generate a classifier that allows the agent to choose an appropriate action for execution. Experimental results show that this model can be successfully applied for strategy generation even if rewards are delayed. We compare the efficiency of the proposed model and reinforcement learning using the farmer-pest domain and configurations of varying complexity. In complex environments, supervised learning can improve the performance of agents much faster than reinforcement learning. If an appropriate knowledge representation is used, the learned knowledge may be analyzed by humans, which allows tracking the learning process.

A Method of Personalized Driving Decision for Smart Car Based on Deep Reinforcement Learning

Information ◽

10.3390/info11060295 ◽

2020 ◽

Vol 11 (6) ◽

pp. 295 ◽

Cited By ~ 1

Author(s):

Xinpeng Wang ◽

Chaozhong Wu ◽

Jie Xue ◽

Zhijun Chen

Keyword(s):

Reinforcement Learning ◽

Decision Model ◽

Gradient Algorithm ◽

Learning Goals ◽

Learning Method ◽

Automatic Driving ◽

Proposed Model ◽

Policy Gradient ◽

Self Learning ◽

Better Than

Automatic driving technology has become a hotspot in academia, and automatic driving decisions need to be personalized for each passenger. The purpose of this paper is to propose a self-learning method for personalized driving decisions. First, driving data from different drivers are collected and analyzed to set learning goals. Then, the Deep Deterministic Policy Gradient algorithm is used to design a driving decision system. Furthermore, personalized factors are introduced for some observed parameters to build a personalized driving decision model. Finally, the proposed method is compared with classic Deep Reinforcement Learning algorithms. The results show that the personalized driving decision model performs better than the classic algorithms and behaves similarly to manual driving. Therefore, the proposed model can effectively learn the human-like personalized driving decisions of different drivers on structured roads. Based on this model, the smart car can accomplish personalized driving.

Learn to Steer through Deep Reinforcement Learning

Sensors ◽

10.3390/s18113650 ◽

2018 ◽

Vol 18 (11) ◽

pp. 3650 ◽

Cited By ~ 13

Author(s):

Keyu Wu ◽

Mahdi Esfahani ◽

Shenghai Yuan ◽

Han Wang

Keyword(s):

Reinforcement Learning ◽

Real World ◽

High Efficiency ◽

Feature Representation ◽

Fine Tuning ◽

Computational Time ◽

Complex Environments ◽

Learning Efficiency ◽

Depth Images ◽

Proposed Model

It is crucial for robots to autonomously steer in complex environments safely without colliding with any obstacles. Compared to conventional methods, deep reinforcement learning-based methods are able to learn from past experiences automatically and enhance the generalization capability to cope with unseen circumstances. Therefore, we propose an end-to-end deep reinforcement learning algorithm in this paper to improve the performance of autonomous steering in complex environments. By embedding a branching noisy dueling architecture, the proposed model is capable of deriving steering commands directly from raw depth images with high efficiency. Specifically, our learning-based approach extracts the feature representation from depth inputs through convolutional neural networks and maps it to both linear and angular velocity commands simultaneously through different streams of the network. Moreover, the training framework is also meticulously designed to improve the learning efficiency and effectiveness. It is worth noting that the developed system is readily transferable from virtual training scenarios to real-world deployment without any fine-tuning by utilizing depth images. The proposed method is evaluated and compared with a series of baseline methods in various virtual environments. Experimental results demonstrate the superiority of the proposed model in terms of average reward, learning efficiency, success rate as well as computational time. Moreover, a variety of real-world experiments are also conducted which reveal the high adaptability of our model to both static and dynamic obstacle-cluttered environments. A video of our experiments is available at https://youtu.be/yixnmFXIKf4 and http://v.youku.com/vshow/idXMzg1ODYwMzM5Ng.

How environmental movement constraints shape the neural code for space

Cognitive Processing ◽

10.1007/s10339-021-01045-2 ◽

2021 ◽

Author(s):

Kate J. Jeffery

Keyword(s):

Activity Patterns ◽

Cell Types ◽

Environmental Movement ◽

Place Cells ◽

Neural Code ◽

Complex Environments ◽

Representation Of Space ◽

Functional Consequences ◽

Self Motion ◽

Spatial Cell

Abstract: Study of the neural code for space in rodents has many insights to offer for how mammals, including humans, construct a mental representation of space. This code is centered on the hippocampal place cells, which are active in particular places in the environment. Place cells are informed by numerous other spatial cell types including grid cells, which provide a signal for distance and direction and are thought to help anchor the place cell signal. These neurons combine self-motion and environmental information to create and update their map-like representation. Study of their activity patterns in complex environments of varying structure has revealed that this "cognitive map" of space is not a fixed and rigid entity that permeates space, but rather is variably affected by the movement constraints of the environment. These findings are pointing toward a more flexible spatial code in which the map is adapted to the movement possibilities of the space. An as-yet-unanswered question is whether these different forms of representation have functional consequences, as suggested by an enactivist view of spatial cognition.

Dilated Convolution with Dilated GRU for Music Source Separation

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/655 ◽

2019 ◽

Cited By ~ 3

Author(s):

Jen-Yu Liu ◽

Yi-Hsuan Yang

Keyword(s):

High Resolution ◽

Source Separation ◽

Receptive Fields ◽

Audio Signals ◽

Sound Sources ◽

Dilated Convolution ◽

Proposed Model ◽

Music Audio ◽

Better Than

Stacked dilated convolutions used in Wavenet have been shown to be effective for generating high-quality audio. By replacing pooling/striding with dilation in convolution layers, they can preserve high-resolution information and still reach distant locations. Producing high-resolution predictions is also crucial in music source separation, whose goal is to separate different sound sources while maintaining the quality of the separated sounds. Therefore, in this paper, we use stacked dilated convolutions as the backbone for music source separation. Although stacked dilated convolutions can reach a wider context than standard convolutions do, their effective receptive fields are still fixed and might not be wide enough for complex music audio signals. To reach information at even more remote locations, we propose to combine a dilated convolution with a modified GRU called Dilated GRU to form a block. For a fixed k, a Dilated GRU receives information from k steps earlier instead of from the previous step. This modification allows a GRU unit to reach a location with fewer recurrent steps and to run faster because it can execute partially in parallel. We show that the proposed model with a stack of such blocks performs equally well or better than the state-of-the-art for separating both vocals and accompaniment.
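
The dilated recurrence described in this abstract can be illustrated with a minimal sketch. It assumes a scalar GRU cell with a standard gate formulation and made-up weights (`PARAMS`, `gru_step`, and `dilated_gru` are hypothetical names, not taken from the paper); the authors' actual block also includes a dilated convolution, which is omitted here.

```python
import math

# Hypothetical scalar weights, chosen only for illustration.
PARAMS = {
    "wz": 0.7, "uz": -0.3, "bz": 0.1,   # update gate
    "wr": -0.5, "ur": 0.4, "br": 0.0,   # reset gate
    "wh": 0.9, "uh": 0.6, "bh": -0.2,   # candidate state
}

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gru_step(x, h_prev, p):
    """One scalar GRU update from input x and a previous state h_prev."""
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h_prev) + p["bh"])
    return (1.0 - z) * h_prev + z * h_cand

def dilated_gru(xs, k, p):
    """Run the cell so that step t reads the state from step t - k.

    With k = 1 this is an ordinary GRU; states before the start of the
    sequence are taken to be 0.0 (an assumption of this sketch).
    """
    hs = []
    for t, x in enumerate(xs):
        h_prev = hs[t - k] if t >= k else 0.0
        hs.append(gru_step(x, h_prev, p))
    return hs
```

Because step t depends only on step t - k, a sequence processed with dilation k splits into k independent chains (t mod k), which is one way to read the abstract's remark that the unit can execute partially in parallel.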

Reinforcement learning versus swarm intelligence for autonomous multi-HAPS coordination

SN Applied Sciences ◽

10.1007/s42452-021-04658-6 ◽

2021 ◽

Vol 3 (6) ◽

Author(s):

Ogbonnaya Anicho ◽

Philip B. Charlesworth ◽

Gurvinder S. Baicher ◽

Atulya K. Nagar

Keyword(s):

Reinforcement Learning ◽

State Space ◽

Swarm Intelligence ◽

Performance Indicators ◽

Convergence Rates ◽

Tuning Parameters ◽

Continuous State Space ◽

Continuous State ◽

User Coverage ◽

Better Than

Abstract: This work analyses the performance of Reinforcement Learning (RL) versus Swarm Intelligence (SI) for coordinating multiple unmanned High Altitude Platform Stations (HAPS) for communications area coverage. It builds upon previous work which looked at various elements of both algorithms. The main aim of this paper is to address the continuous state-space challenge within this work by using partitioning to manage the high dimensionality problem. This enabled comparing the performance of the classical cases of both RL and SI, establishing a baseline for future comparisons of improved versions. From previous work, SI was observed to perform better across various key performance indicators. However, after tuning parameters and empirically choosing a suitable partitioning ratio for the RL state space, it was observed that the SI algorithm still maintained superior coordination capability by achieving higher mean overall user coverage (about 20% better than the RL algorithm), in addition to faster convergence rates. Though the RL technique showed better average peak user coverage, the unpredictable coverage dip was a key weakness, making SI a more suitable algorithm within the context of this work.
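
As a rough illustration of partitioning a continuous state space for tabular RL, the sketch below maps each state dimension onto a fixed number of equal-width bins. The abstract does not describe the partitioning scheme actually used, so the uniform binning, the function name `discretize`, and the example ranges are all assumptions.

```python
def discretize(state, lows, highs, bins):
    """Map a continuous state to a tuple of bin indices.

    state, lows, highs, bins are equal-length sequences; dimension d is
    split into bins[d] equal-width cells over [lows[d], highs[d]].
    Values outside the range are clamped to the nearest edge cell.
    """
    index = []
    for s, lo, hi, n in zip(state, lows, highs, bins):
        frac = (s - lo) / (hi - lo)      # position within the range
        cell = int(frac * n)             # truncate to a cell number
        index.append(min(max(cell, 0), n - 1))
    return tuple(index)

# Hypothetical 2-D example: both dimensions range over [-1, 1], 4 cells each.
LOWS, HIGHS, BINS = (-1.0, -1.0), (1.0, 1.0), (4, 4)
cell = discretize((0.0, 0.0), LOWS, HIGHS, BINS)
```

The resulting tuple can index a Q-table directly. The number of cells per dimension plays the role of a partitioning ratio: coarser cells shrink the table but blur distinct states together.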

An Unequal-Sized Unidirectional Loop Layout Design Problem Considering Empty Vehicle Trip

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.37-38.116 ◽

2010 ◽

Vol 37-38 ◽

pp. 116-121

Author(s):

Yu Lan Li ◽

Bo Li ◽

Su Jun Luo

Keyword(s):

Material Handling ◽

Design Principle ◽

Facility Layout ◽

Original Model ◽

Layout Problem ◽

Proposed Model ◽

Loop Layout ◽

Material Handling Costs ◽

Loop Layout Problem ◽

Better Than

In facility layout decisions, the general design principle has been to minimize material handling costs, and the objective of earlier models considers only the cost of loaded trips and ignores the cost of empty vehicle trips, which does not meet actual demand. In this paper, the unequal-sized unidirectional loop layout problem is analyzed, and the facility layout model is improved. The objective of the new model is to minimize the total cost of loaded and empty vehicle trips. To solve this model, a heuristic algorithm based on partheno-genetic algorithms is designed. Finally, an unequal-sized unidirectional loop layout problem with 12 devices is simulated. Comparison shows that the result obtained using the proposed model is 20.4% better than that obtained using the original model.
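
A minimal sketch of an objective that sums loaded and empty trip costs on a unidirectional loop follows. The abstract gives no formulas, so everything here is an assumption: each device has one station at its centre, distance is measured in the direction of travel only, cost is proportional to distance, and the empty-trip counts are supplied as input rather than derived. The names `station_positions`, `forward_distance`, and `total_trip_cost` are hypothetical.

```python
def station_positions(order, lengths):
    """Place devices of unequal length around the loop in the given order.

    Returns (positions, loop_length), with each station assumed to sit at
    the centre of its device.
    """
    positions, cursor = {}, 0.0
    for device in order:
        positions[device] = cursor + lengths[device] / 2.0
        cursor += lengths[device]
    return positions, cursor

def forward_distance(pos_from, pos_to, loop_length):
    """Distance travelled in the single allowed direction around the loop."""
    return (pos_to - pos_from) % loop_length

def total_trip_cost(order, lengths, loaded, empty):
    """Sum trip count times forward distance over loaded and empty trips.

    loaded and empty map (origin, destination) pairs to trip counts.
    """
    positions, loop_length = station_positions(order, lengths)
    cost = 0.0
    for trips in (loaded, empty):
        for (i, j), count in trips.items():
            cost += count * forward_distance(positions[i], positions[j], loop_length)
    return cost

# Hypothetical three-device instance.
LENGTHS = {"A": 2.0, "B": 4.0, "C": 2.0}
LOADED = {("A", "B"): 3, ("B", "C"): 1}
EMPTY = {("B", "A"): 3}
cost_abc = total_trip_cost(["A", "B", "C"], LENGTHS, LOADED, EMPTY)
cost_acb = total_trip_cost(["A", "C", "B"], LENGTHS, LOADED, EMPTY)
```

A search procedure such as the genetic heuristic mentioned in the abstract would evaluate a function of this kind for each candidate device order and keep the cheapest one.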

Optimization of Multilayer Optical Films with Unsupervised Learning, reinforcement learning and genetic algorithm

Frontiers in Optics / Laser Science ◽

10.1364/fio.2020.jm6a.5 ◽

2020 ◽

Author(s):

Jiang Anqing ◽

Osamu Yoshie

Keyword(s):

Genetic Algorithm ◽

Reinforcement Learning ◽

Unsupervised Learning ◽

Optical Films ◽

Learning Reinforcement

The coalescent process in models with selection, recombination and geographic subdivision

Genetics Research ◽

10.1017/s0016672300029074 ◽

1991 ◽

Vol 57 (1) ◽

pp. 83-91 ◽

Cited By ~ 40

Author(s):

Norman Kaplan ◽

Richard R. Hudson ◽

Masaru Iizuka

Keyword(s):

Genetic Variation ◽

Population Genetic ◽

Genetic Model ◽

Sequence Data ◽

Balancing Selection ◽

Similar Model ◽

Proposed Model ◽

Coalescent Approach ◽

Neutral Mutations ◽

Better Than

Summary: A population genetic model with a single locus at which balancing selection acts and many linked loci at which neutral mutations can occur is analysed using the coalescent approach. The model incorporates geographic subdivision with migration, as well as mutation, recombination, and genetic drift of neutral variation. It is found that geographic subdivision can affect genetic variation even with high rates of migration, providing that selection is strong enough to maintain different allele frequencies at the selected locus. Published sequence data from the alcohol dehydrogenase locus of Drosophila melanogaster are found to fit the proposed model slightly better than a similar model without subdivision.
