Playing Atari with few neurons

Autonomous Agents and Multi-Agent Systems ◽

10.1007/s10458-021-09497-8 ◽

2021 ◽

Vol 35 (2) ◽

Author(s):

Giuseppe Cuccu ◽

Julian Togelius ◽

Philippe Cudré-Mauroux

Keyword(s):

Neural Network ◽

State Of The Art ◽

Reconstruction Error ◽

Learning Context ◽

Natural Evolution ◽

The Neural Network ◽

New Variant ◽

Compact State ◽

Novel Algorithms ◽

Selection Of

We propose a new method for learning compact state representations and policies separately but simultaneously for policy approximation in vision-based applications such as Atari games. Approaches based on deep reinforcement learning typically map pixels directly to actions to enable end-to-end training. Internally, however, the deep neural network bears the responsibility of both extracting useful information and making decisions based on it, two objectives which can be addressed independently. Separating the image processing from the action selection allows for a better understanding of either task individually, as well as potentially finding smaller policy representations, which is inherently interesting. Our approach learns state representations using a compact encoder based on two novel algorithms: (i) Increasing Dictionary Vector Quantization builds a dictionary of state representations which grows in size over time, allowing our method to address new observations as they appear in an open-ended online-learning context; and (ii) Direct Residuals Sparse Coding encodes observations as a function of the dictionary, aiming for highest information inclusion by disregarding reconstruction error and maximizing code sparsity. As the dictionary size increases, however, the encoder produces increasingly larger inputs for the neural network; this issue is addressed with a new variant of the Exponential Natural Evolution Strategies algorithm which adapts the dimensionality of its probability distribution along the run. We test our system on a selection of Atari games using tiny neural networks of only 6 to 18 neurons (depending on each game’s controls). These are still capable of achieving results that are not much worse than, and occasionally superior to, the state of the art in direct policy search, which uses two orders of magnitude more neurons.
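The interplay between a growing dictionary and a sparse, residual-driven code can be illustrated with a small sketch. This is not the authors' implementation: the function names `encode` and `observe`, the dot-product similarity, the thresholds `eps` and `delta`, the cap `max_active`, and the choice to mark a newly added atom as active are all assumptions made for illustration. The adaptation of the evolution strategy to the growing input size is not covered.

```python
def encode(obs, dictionary, eps=0.1, max_active=3):
    """Greedy binary code: repeatedly pick the atom most similar to the residual."""
    residual = list(obs)
    code = [0] * len(dictionary)
    for _ in range(max_active):
        if sum(residual) <= eps or not dictionary:
            break
        # Similarity (assumed): dot product between the residual and each unused atom.
        best, best_sim = None, 0.0
        for i, atom in enumerate(dictionary):
            if code[i]:
                continue
            sim = sum(r * a for r, a in zip(residual, atom))
            if sim > best_sim:
                best, best_sim = i, sim
        if best is None:
            break
        code[best] = 1
        # Keep only the information that the chosen atom does not explain.
        residual = [max(r - a, 0.0) for r, a in zip(residual, dictionary[best])]
    return code, residual


def observe(obs, dictionary, delta=0.5):
    """Encode obs and grow the dictionary if too much of it stays unexplained."""
    code, residual = encode(obs, dictionary)
    if sum(residual) > delta:
        dictionary.append(residual)  # new atom = the unexplained part of obs
        code.append(1)               # assumed: the new atom is active for obs
    return code
```

Calling `observe` repeatedly on a stream of flattened, non-negative observations makes the code length grow only when something genuinely new appears, which is why the downstream policy network must cope with an input size that changes over time.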

Playing Atari with Six Neurons (Extended Abstract)

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/651 ◽

2020 ◽

Author(s):

Giuseppe Cuccu ◽

Julian Togelius ◽

Philippe Cudré-Mauroux

Keyword(s):

Reinforcement Learning ◽

Vector Quantization ◽

Sparse Coding ◽

Deep Neural Network ◽

State Of The Art ◽

Compact State ◽

Learning Policies ◽

Novel Algorithms ◽

Over Time ◽

Selection Of

Deep reinforcement learning applied to vision-based problems like Atari games maps pixels directly to actions; internally, the deep neural network bears the responsibility of both extracting useful information and making decisions based on it. By separating image processing from decision-making, one could better understand the complexity of each task, as well as potentially find smaller policy representations that are easier for humans to understand and may generalize better. To this end, we propose a new method for learning policies and compact state representations separately but simultaneously for policy approximation in reinforcement learning. State representations are generated by an encoder based on two novel algorithms: Increasing Dictionary Vector Quantization makes the encoder capable of growing its dictionary size over time, to address new observations; and Direct Residuals Sparse Coding encodes observations by aiming for highest information inclusion. We test our system on a selection of Atari games using tiny neural networks of only 6 to 18 neurons (depending on the game's controls). These are still capable of achieving results comparable, and occasionally superior, to state-of-the-art techniques which use two orders of magnitude more neurons.

Adaptive Control Structure with Neural Data Processing Applied for Electrical Drive with Elastic Shaft

Energies ◽

10.3390/en14123389 ◽

2021 ◽

Vol 14 (12) ◽

pp. 3389

Author(s):

Marcin Kamiński ◽

Krzysztof Szabat

Keyword(s):

Neural Network ◽

Adaptive Control ◽

Dynamic Properties ◽

Experimental Studies ◽

Drive System ◽

Particle Swarm Optimizer ◽

The Neural Network ◽

Control Concept ◽

Constant Coefficients ◽

Selection Of

This paper presents issues related to the adaptive control of the drive system with an elastic clutch connecting the main motor and the load machine. Firstly, the problems and the main algorithms often implemented for the mentioned object are analyzed. Then, the control concept based on the RNN (recurrent neural network) for the drive system with the flexible coupling is thoroughly described. For this purpose, an adaptive model inspired by the Elman model is selected, which is related to internal feedback in the neural network. The indicated feature improves the processing of dynamic signals. During the design process, for the selection of constant coefficients of the controller, the PSO (particle swarm optimizer) is applied. Moreover, in order to obtain better dynamic properties and improve work in real conditions, one model based on the ADALINE (adaptive linear neuron) is introduced into the structure. Details of the algorithm used for the weights’ adaptation are presented (including stability analysis) to perform the shaft torque signal filtering. The effectiveness of the proposed approach is examined through simulation and experimental studies.
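As a rough illustration of how a single adaptive linear neuron can filter a signal, the sketch below adapts its weights with the classic least-mean-squares (LMS) rule. The abstract does not give the paper's adaptation law or stability conditions, so the function name `adaline_filter`, the number of taps, and the learning rate `lr` are assumptions.

```python
def adaline_filter(signal, taps=4, lr=0.05):
    """One-step-ahead LMS predictor whose output serves as a filtered signal."""
    w = [0.0] * taps
    out = []
    for t in range(len(signal)):
        # Input vector: the previous `taps` samples (zeros before the start).
        x = [signal[t - k - 1] if t - k - 1 >= 0 else 0.0 for k in range(taps)]
        y = sum(wi * xi for wi, xi in zip(w, x))  # neuron output
        err = signal[t] - y                       # prediction error
        # LMS update: move each weight along its input, scaled by the error.
        w = [wi + lr * err * xi for wi, xi in zip(w, x)]
        out.append(y)
    return out, w
```

With a small learning rate the output tracks slow components of the input while averaging out fast noise; too large a rate makes the update diverge, which is the kind of behavior a stability analysis is meant to rule out.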

Methods for the selection of parameters and structure of the neural network model

Semi-Empirical Neural Network Modeling and Digital Twins Development ◽

10.1016/b978-0-12-815651-3.00003-1 ◽

2020 ◽

pp. 73-103

Author(s):

Dmitriy Tarkhov ◽

Alexander Vasilyev

Keyword(s):

Neural Network ◽

Network Model ◽

Neural Network Model ◽

The Neural Network ◽

Selection Of

Selection of the optimal type of thermal insulation structure based on the neural network modelling

E3S Web of Conferences ◽

10.1051/e3sconf/202021601037 ◽

2020 ◽

Vol 216 ◽

pp. 01037

Author(s):

Irina Akhmetova ◽

Elena Balzamova ◽

Veronika Bronskaya ◽

Denis Balzamov ◽

Konstantin Lapin ◽

...

Keyword(s):

Neural Network ◽

Web Application ◽

District Heating ◽

Heat Network ◽

Network Modelling ◽

Query Log ◽

Neural Network Modelling ◽

The Neural Network ◽

Selection Of ◽

Optimal Type

A software package with a user interface for calculating, analyzing and predicting the parameters of cogeneration-based district heating using neural network modelling is presented in order to optimize heat networks and ensure their reliability. The package is the basis for a web application that allows users to calculate the characteristics of the heat network in accordance with the model, keeps a query log and provides administration functions.

Enhancement of Multimodal Microwave-Ultrasound Breast Imaging Using a Deep-Learning Technique

Sensors ◽

10.3390/s19184050 ◽

2019 ◽

Vol 19 (18) ◽

pp. 4050 ◽

Cited By ~ 2

Author(s):

Vahab Khoshdel ◽

Ahmed Ashraf ◽

Joe LoVetri

Keyword(s):

Neural Network ◽

Deep Learning ◽

Breast Imaging ◽

Reconstruction Error ◽

Tumor Segmentation ◽

Source Inversion ◽

The Neural Network ◽

Contrast Source Inversion ◽

Imaging Artifacts ◽

Complex Valued

We present a deep learning method used in conjunction with dual-modal microwave-ultrasound imaging to produce tomographic reconstructions of the complex-valued permittivity of numerical breast phantoms. We also assess tumor segmentation performance using the reconstructed permittivity as a feature. The contrast source inversion (CSI) technique is used to create the complex-permittivity images of the breast with ultrasound-derived tissue regions utilized as prior information. However, imaging artifacts make the detection of tumors difficult. To overcome this issue we train a convolutional neural network (CNN) that takes in, as input, the dual-modal CSI reconstruction and attempts to produce the true image of the complex tissue permittivity. The neural network consists of successive convolutional and downsampling layers, followed by successive deconvolutional and upsampling layers based on the U-Net architecture. To train the neural network, the input-output pairs consist of CSI’s dual-modal reconstructions, along with the true numerical phantom images from which the microwave scattered field was synthetically generated. The reconstructed permittivity images produced by the CNN show that the network is not only able to remove the artifacts that are typical of CSI reconstructions, but can also improve the detectability of tumors. The performance of the CNN is assessed using a four-fold cross-validation on our dataset that shows improvement over CSI both in terms of reconstruction error and tumor segmentation performance.

A Hybrid Swarm Intelligent Neural Network Model for Customer Churn Prediction and Identifying the Influencing Factors

Information ◽

10.3390/info9110288 ◽

2018 ◽

Vol 9 (11) ◽

pp. 288 ◽

Cited By ~ 5

Author(s):

Hossam Faris

Keyword(s):

Neural Network ◽

State Of The Art ◽

Churn Prediction ◽

Hybrid Swarm ◽

Customer Churn ◽

The Neural Network ◽

Proposed Model ◽

Telecommunication Companies ◽

Technique Evaluation ◽

Imbalanced Class Distribution

Customer churn is one of the most challenging problems for telecommunication companies. In fact, this is because customers are considered as the real asset for the companies. Therefore, more companies are increasing their investments in developing practical solutions that aim at predicting customer churn before it happens. Identifying which customer is about to churn will significantly help the companies in providing solutions to keep their customers and optimize their marketing campaigns. In this work, an intelligent hybrid model based on Particle Swarm Optimization and Feedforward neural network is proposed for churn prediction. PSO is used to tune the weights of the input features and optimize the structure of the neural network simultaneously to increase the prediction power. In addition, the proposed model handles the imbalanced class distribution of the data using an advanced oversampling technique. Evaluation results show that the proposed model can significantly improve the coverage rate of churn customers in comparison with other state-of-the-art classifiers. Moreover, the model has high interpretability, where the assigned feature weights can give an indicator about the importance of their corresponding features in the classification process.

THE SELECTION OF THE OPTIMAL ARCHITECTURE AND CONFIGURATION OF THE NEURAL NETWORK FOR A SHORT-TERM LOAD FORECASTING OF DEFAULT PROVIDER

Vesti vysshikh uchebnykh zavedenii Chernozem'ya ◽

10.53015/18159958_2021_2_26 ◽

2021 ◽

pp. 26-42

Author(s):

Nikolay Aleksandrovich Serebryakov

Keyword(s):

Neural Network ◽

Load Forecasting ◽

Short Term ◽

The Neural Network ◽

Short Term Load Forecasting ◽

Selection Of

Clinical Application of Artificial Intelligence in Patients with Chronic Myeloid Leukemia in Chronic Phase

Blood ◽

10.1182/blood.v128.22.940.940 ◽

2016 ◽

Vol 128 (22) ◽

pp. 940-940 ◽

Cited By ~ 1

Author(s):

Koji Sasaki ◽

Hagop M. Kantarjian ◽

Elias J. Jabbour ◽

Susan O'Brien ◽

Farhad Ravandi ◽

...

Keyword(s):

Neural Network ◽

Propensity Score ◽

Propensity Score Matching ◽

Multilayer Perceptron ◽

Research Funding ◽

Correct Prediction ◽

Test Dataset ◽

Free Survival ◽

The Neural Network ◽

Selection Of

Introduction: Artificial intelligence (AI) has been applied to a wide range of daily activities to assist in decision-making. Randomized clinical trials can compare the efficacy of treatment between patient groups. However, the best treatment decision for each individual patient, with their own clinical and biological features, and in the context of comparable treatment options, is more difficult to predict. The integrated consideration of various prognostic features can reach the point beyond human recognition. An AI-assisted approach may help with decision-making in complex clinical situations. The aim of this study is to introduce a prototype of AI to predict outcome such as achievement of major molecular response (MMR) within 1 year of the start of tyrosine kinase inhibitor (TKI).

Methods: Response data for 630 patients with newly diagnosed CML-CP in consecutive prospective clinical trials of frontline imatinib (n=73; NCT00048672), high-dose imatinib (n=208; NCT00038469 and NCT00050531), nilotinib (n=148; NCT00129740), dasatinib (n=150; NCT00254423), and ponatinib (n=51; NCT01570868) were analyzed. After multiple imputation for missing variables, neural network analysis with a multilayer perceptron model using the statistically significant variables by stepwise multivariate analysis was performed to predict the cumulative incidence of MMR within 1 year. The hyperbolic tangent and softmax activation functions were used to create the architecture of hidden layers and output layers, respectively. Batch training with scaled conjugate gradient optimization algorithm with learning parameters (initial Lambda of 0.0000005, initial Sigma of 0.00005, interval center of 0, and interval offset of ±0.5) was used to train the neural network. To evaluate the accuracy of prediction, the entire cohort was randomly divided into training dataset (70%) and test dataset (30%). The correct prediction in the test dataset was repeatedly assessed 1,000 times to validate this approach. The whole cohort was subsequently used to create the AI model for MMR prediction, and was divided into two cohorts based on the prediction by the AI: AI-predicted response and AI-predicted nonresponse. Hypothetical choice of TKI was assumed to rank the selection of TKI among imatinib 400 mg/day, imatinib 800 mg/day, dasatinib, nilotinib, and ponatinib to calculate the estimated percentage of MMR within 1 year for each patient. The Kaplan-Meier method with a log-rank test was used for failure-free survival (FFS), transformation-free survival (TFS), event-free survival (EFS), and overall survival (OS). To balance baseline patient characteristics between cohorts, propensity score matching after propensity score calculation by logistic regression was performed with nearest neighbor matching method with a caliper of 0.20. Exact matching was used for the type of cytogenetic, transcript, and TKI.

Results: Of 630 patients treated, 464 (74%) achieved MMR within 1 year. The stepwise multivariate analysis identified the selection of TKI, type of transcript, white blood cell count, albumin, and spleen size at diagnosis as the predictors for MMR within 1 year. Neural network analysis with a multilayer perceptron model is shown in figure 1. Through repeated random selection for training set (70%) and test set (30%), the mean correct prediction for MMR within 1 year was 77.4% (95% confidence interval [CI], 74.2-80.5), and 76.9% (95% CI, 71.4-82.3), respectively. Of 630 patients, the neural network model predicted 539 patients (86%) as responders, and 91 patients (14%) as nonresponders (table 1). Before propensity score matching, the AI-response cohort had higher rates of CCyR, MMR, MR4, MR4.5, and CMR as well as FFS, TFS, EFS, and OS compared to those of the AI-nonresponse cohort (figure 2). After propensity score matching, 25 patients in each cohort were identified, and the baseline differences were minimized (table 1). The AI-response cohort had higher rates of MMR, MR4, and FFS than those of the AI-nonresponse cohort (figure 2).

Conclusion: AI with a multilayer perceptron model can predict target outcome. Incorporation of additional clinical and biological variables may improve the prediction rates to suggest the best treatment option in each patient with CML-CP. Such a strategy is ongoing.

Disclosures: Kantarjian: ARIAD: Research Funding; Bristol-Myers Squibb: Research Funding; Amgen: Research Funding; Pfizer Inc: Research Funding; Delta-Fly Pharma: Research Funding; Novartis: Research Funding. Jabbour: ARIAD: Consultancy, Research Funding; Pfizer: Consultancy, Research Funding; Novartis: Research Funding; BMS: Consultancy. Ravandi: BMS: Research Funding; Seattle Genetics: Consultancy, Honoraria, Research Funding. Konopleva: AbbVie: Research Funding; Genentech: Research Funding. Wierda: Novartis: Research Funding; Abbvie: Research Funding; Acerta: Research Funding; Gilead: Research Funding; Genentech: Research Funding. Daver: Pfizer: Consultancy, Research Funding; Kiromic: Research Funding; BMS: Research Funding; Otsuka: Consultancy, Honoraria; Sunesis: Consultancy, Research Funding; Karyopharm: Honoraria, Research Funding; Ariad: Research Funding.
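The nearest-neighbor matching step with a caliper can be sketched as follows. This is a simplified illustration, not the study's procedure: the function name `match_with_caliper` is hypothetical, the caliper is treated here as an absolute difference in propensity score (the abstract does not say which scale the 0.20 refers to), matching is greedy and without replacement, and the exact-matching constraints on cytogenetics, transcript and TKI are omitted.

```python
def match_with_caliper(treated, control, caliper=0.20):
    """Greedy 1:1 nearest-neighbor matching on propensity scores.

    treated, control: dicts mapping a patient id to a propensity score.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # controls not yet matched
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # match without replacement
    return pairs
```

Subjects with no counterpart inside the caliper are dropped, which is consistent with a large cohort shrinking to a small matched subset when the two groups differ strongly at baseline.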

Features of the Development of a Neural Network to Automate the Recognition of Steel Defects

PROGRAMMNAYA INGENERIA ◽

10.17587/prin.11.330-334 ◽

2020 ◽

Vol 11 (6) ◽

pp. 330-334

Author(s):

R. A. Karelova ◽

E. E. Ignatov

Keyword(s):

Neural Network ◽

Network Architecture ◽

Cross Entropy ◽

Neural Network Architecture ◽

Development Environment ◽

Steel Sheets ◽

Python Language ◽

The Neural Network ◽

Metal Sheets ◽

Selection Of

The article presents an implementation of an artificial neural network for recognizing defects in images of steel sheets. Several stages of solving the problem are described: the choice of a development environment, a programming language, and the libraries necessary for the implementation; data analysis, including plotting graphs and histograms and finding dependencies; the selection of a suitable neural network and its architecture, and of an algorithm for assessing quality and accuracy; writing the neural network; training and checking accuracy and quality, including checking for overfitting. The proposed development tools are the Python language, the PyTorch library, the Jupyter development environment, and the U-Net convolutional neural network architecture. Features of the analysis of input images of steel sheets and of the implementation of the neural network itself are described. Binary cross-entropy was chosen as the criterion for assessing accuracy, since it pulls the distribution of the network's predictions toward the target and penalizes not only erroneous predictions but also uncertain ones. For additional evaluation, the Dice method was also used. The accuracy of the resulting model is 84 %. The proposed solution can become part of a hardware-software system for automating the recognition of defects on metal sheets.
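The remark that binary cross-entropy penalizes uncertain predictions as well as wrong ones can be checked numerically. The sketch below uses plain Python rather than PyTorch to stay self-contained; the function names, the clipping constant `eps`, and the 0.5 threshold for the Dice score are assumptions, and the abstract does not specify how its Dice evaluation was computed.

```python
import math


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(pred)


def dice(pred, target, threshold=0.5):
    """Dice coefficient between the thresholded prediction and the target mask."""
    mask = [1 if p >= threshold else 0 for p in pred]
    inter = sum(m * t for m, t in zip(mask, target))
    denom = sum(mask) + sum(target)
    return 1.0 if denom == 0 else 2.0 * inter / denom
```

For a target of `[1, 0, 1, 0]`, the predictions `[0.9, 0.1, 0.9, 0.1]` and `[0.6, 0.4, 0.6, 0.4]` give the same thresholded mask and therefore the same Dice score, but the second has a clearly higher cross-entropy because it is less certain.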

Neural Network Classifiers for Local Wind Prediction

Journal of Applied Meteorology ◽

10.1175/2057.1 ◽

2004 ◽

Vol 43 (5) ◽

pp. 727-738 ◽

Cited By ~ 32

Author(s):

Ralf Kretzschmar ◽

Pierre Eckert ◽

Daniel Cattani ◽

Fritz Eggimann

Keyword(s):

Neural Network ◽

Time Series ◽

Performance Measure ◽

Lead Times ◽

Wind Gust ◽

Model Data ◽

The Neural Network ◽

Neural Network Classifiers ◽

Selection Of

This paper evaluates the quality of neural network classifiers for wind speed and wind gust prediction with prediction lead times between +1 and +24 h. The predictions were realized based on local time series and model data. The selection of appropriate input features was initiated by time series analysis and completed by empirical comparison of neural network classifiers trained on several choices of input features. The selected input features involved day time, yearday, features from a single wind observation device at the site of interest, and features derived from model data. The quality of the resulting classifiers was benchmarked against persistence for two different sites in Switzerland. The neural network classifiers exhibited superior quality when compared with persistence judged on a specific performance measure, hit and false-alarm rates.
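A minimal sketch of the benchmark described above: hit and false-alarm rates for a binary event forecast, with persistence as the baseline. The function names are hypothetical, and the false-alarm rate is taken here as false alarms divided by all non-events, which is one common definition; the abstract does not state which definition the paper uses.

```python
def hit_and_false_alarm_rates(forecast, observed):
    """Return (hit rate, false-alarm rate) for paired binary forecasts and events."""
    pairs = list(zip(forecast, observed))
    hits = sum(1 for f, o in pairs if f and o)
    misses = sum(1 for f, o in pairs if not f and o)
    false_alarms = sum(1 for f, o in pairs if f and not o)
    correct_neg = sum(1 for f, o in pairs if not f and not o)
    events = hits + misses
    non_events = false_alarms + correct_neg
    hit_rate = hits / events if events else 0.0
    fa_rate = false_alarms / non_events if non_events else 0.0
    return hit_rate, fa_rate


def persistence(observed, lead=1):
    """Persistence baseline: the forecast for step t is the observation at t - lead."""
    return observed[:-lead]
```

To score the baseline, compare `persistence(observed, lead)` against `observed[lead:]`; a classifier is worth using only if it achieves a better trade-off between the two rates than this trivial forecast.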
