Playing Atari with few neurons

2021 ◽  
Vol 35 (2) ◽  
Author(s):  
Giuseppe Cuccu ◽  
Julian Togelius ◽  
Philippe Cudré-Mauroux

Abstract We propose a new method for learning compact state representations and policies separately but simultaneously for policy approximation in vision-based applications such as Atari games. Approaches based on deep reinforcement learning typically map pixels directly to actions to enable end-to-end training. Internally, however, the deep neural network bears the responsibility of both extracting useful information and making decisions based on it, two objectives which can be addressed independently. Separating the image processing from the action selection allows for a better understanding of either task individually, as well as potentially finding smaller policy representations, which is inherently interesting. Our approach learns state representations using a compact encoder based on two novel algorithms: (i) Increasing Dictionary Vector Quantization builds a dictionary of state representations which grows in size over time, allowing our method to address new observations as they appear in an open-ended online-learning context; and (ii) Direct Residuals Sparse Coding encodes observations as a function of the dictionary, aiming for highest information inclusion by disregarding reconstruction error and maximizing code sparsity. As the dictionary size increases, however, the encoder produces increasingly large inputs for the neural network; this issue is addressed with a new variant of the Exponential Natural Evolution Strategies algorithm which adapts the dimensionality of its probability distribution along the run. We test our system on a selection of Atari games using tiny neural networks of only 6 to 18 neurons (depending on each game’s controls). These are still capable of achieving results that are not much worse than, and occasionally superior to, the state-of-the-art in direct policy search which uses two orders of magnitude more neurons.

Author(s):  
Giuseppe Cuccu ◽  
Julian Togelius ◽  
Philippe Cudré-Mauroux

Deep reinforcement learning applied to vision-based problems like Atari games maps pixels directly to actions; internally, the deep neural network bears the responsibility of both extracting useful information and making decisions based on it. By separating image processing from decision-making, one could better understand the complexity of each task, as well as potentially find smaller policy representations that are easier for humans to understand and may generalize better. To this end, we propose a new method for learning policies and compact state representations separately but simultaneously for policy approximation in reinforcement learning. State representations are generated by an encoder based on two novel algorithms: Increasing Dictionary Vector Quantization makes the encoder capable of growing its dictionary size over time, to address new observations; and Direct Residuals Sparse Coding encodes observations by aiming for highest information inclusion. We test our system on a selection of Atari games using tiny neural networks of only 6 to 18 neurons (depending on the game's controls). These are still capable of achieving results comparable, and occasionally superior, to state-of-the-art techniques which use two orders of magnitude more neurons.


Energies ◽  
2021 ◽  
Vol 14 (12) ◽  
pp. 3389
Author(s):  
Marcin Kamiński ◽  
Krzysztof Szabat

This paper presents issues related to the adaptive control of the drive system with an elastic clutch connecting the main motor and the load machine. Firstly, the problems and the main algorithms often implemented for the mentioned object are analyzed. Then, the control concept based on the RNN (recurrent neural network) for the drive system with the flexible coupling is thoroughly described. For this purpose, an adaptive model inspired by the Elman model is selected, which is related to internal feedback in the neural network. The indicated feature improves the processing of dynamic signals. During the design process, for the selection of constant coefficients of the controller, the PSO (particle swarm optimizer) is applied. Moreover, in order to obtain better dynamic properties and improve work in real conditions, one model based on the ADALINE (adaptive linear neuron) is introduced into the structure. Details of the algorithm used for the weights’ adaptation are presented (including stability analysis) to perform the shaft torque signal filtering. The effectiveness of the proposed approach is examined through simulation and experimental studies.
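As a rough illustration of the ADALINE component mentioned above, the following Python sketch shows an adaptive linear neuron trained with the standard least-mean-squares (LMS) rule and used as a one-step signal predictor. It is a generic textbook form, not the adaptation law or stability analysis from the paper; the function name `adaline_filter`, the tap count, and the step size `mu` are assumptions chosen for the example.

```python
def adaline_filter(signal, n_taps=4, mu=0.05):
    """One-step predictor built from a single adaptive linear neuron (LMS rule).

    Hypothetical helper for illustration; n_taps and mu are assumed values.
    """
    weights = [0.0] * n_taps   # adaptive weights, start at zero
    window = [0.0] * n_taps    # most recent past samples, newest first
    estimates = []
    for sample in signal:
        # Linear neuron output: weighted sum of the past samples.
        estimate = sum(w * x for w, x in zip(weights, window))
        error = sample - estimate
        # LMS update: move each weight along its input, scaled by the error.
        weights = [w + mu * error * x for w, x in zip(weights, window)]
        # Shift the new sample into the delay line.
        window = [sample] + window[:-1]
        estimates.append(estimate)
    return estimates, weights


estimates, weights = adaline_filter([1.0] * 200)
print(round(estimates[-1], 3))
```

For a constant input the prediction error shrinks geometrically, because each update moves the weight sum a fixed fraction of the way toward the target; larger `mu` speeds this up until the update becomes unstable.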


2020 ◽  
Vol 216 ◽  
pp. 01037
Author(s):  
Irina Akhmetova ◽  
Elena Balzamova ◽  
Veronika Bronskaya ◽  
Denis Balzamov ◽  
Konstantin Lapin ◽  
...  

A software package with a user interface for calculating, analyzing and predicting the parameters of cogeneration-based district heating, based on neural network modelling, is presented in order to optimize heat networks and ensure their reliability. The package is the basis for a web application that allows users to calculate the characteristics of the heat network in accordance with the model, keeps a query log, and provides administration facilities.


Sensors ◽  
2019 ◽  
Vol 19 (18) ◽  
pp. 4050 ◽  
Author(s):  
Vahab Khoshdel ◽  
Ahmed Ashraf ◽  
Joe LoVetri

We present a deep learning method used in conjunction with dual-modal microwave-ultrasound imaging to produce tomographic reconstructions of the complex-valued permittivity of numerical breast phantoms. We also assess tumor segmentation performance using the reconstructed permittivity as a feature. The contrast source inversion (CSI) technique is used to create the complex-permittivity images of the breast with ultrasound-derived tissue regions utilized as prior information. However, imaging artifacts make the detection of tumors difficult. To overcome this issue we train a convolutional neural network (CNN) that takes in, as input, the dual-modal CSI reconstruction and attempts to produce the true image of the complex tissue permittivity. The neural network consists of successive convolutional and downsampling layers, followed by successive deconvolutional and upsampling layers based on the U-Net architecture. To train the neural network, the input-output pairs consist of CSI’s dual-modal reconstructions, along with the true numerical phantom images from which the microwave scattered field was synthetically generated. The reconstructed permittivity images produced by the CNN show that the network is not only able to remove the artifacts that are typical of CSI reconstructions, but can also improve the detectability of tumors. The performance of the CNN is assessed using a four-fold cross-validation on our dataset that shows improvement over CSI both in terms of reconstruction error and tumor segmentation performance.


Information ◽  
2018 ◽  
Vol 9 (11) ◽  
pp. 288 ◽  
Author(s):  
Hossam Faris

Customer churn is one of the most challenging problems for telecommunication companies. In fact, this is because customers are considered as the real asset for the companies. Therefore, more companies are increasing their investments in developing practical solutions that aim at predicting customer churn before it happens. Identifying which customer is about to churn will significantly help the companies in providing solutions to keep their customers and optimize their marketing campaigns. In this work, an intelligent hybrid model based on Particle Swarm Optimization and Feedforward neural network is proposed for churn prediction. PSO is used to tune the weights of the input features and optimize the structure of the neural network simultaneously to increase the prediction power. In addition, the proposed model handles the imbalanced class distribution of the data using an advanced oversampling technique. Evaluation results show that the proposed model can significantly improve the coverage rate of churn customers in comparison with other state-of-the-art classifiers. Moreover, the model has high interpretability, where the assigned feature weights can give an indicator about the importance of their corresponding features in the classification process.
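For readers unfamiliar with Particle Swarm Optimization, a minimal Python sketch of the basic (global-best) variant is given here. It minimizes a toy objective rather than tuning feature weights or network structure, so it should not be read as the hybrid model from the abstract; the function name `pso_minimize` and the inertia and acceleration coefficients are assumptions chosen for illustration.

```python
import random


def pso_minimize(objective, dim, n_particles=20, n_iters=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5,
                 bounds=(-5.0, 5.0), seed=0):
    """Basic global-best PSO; all hyperparameters are assumed example values."""
    rng = random.Random(seed)
    lo, hi = bounds
    positions = [[rng.uniform(lo, hi) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in positions]            # personal best positions
    best_val = [objective(p) for p in positions]    # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    global_pos, global_val = best_pos[g][:], best_val[g]
    history = [global_val]                          # best value per iteration
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to global best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_personal * r1 * (best_pos[i][d] - positions[i][d])
                    + c_global * r2 * (global_pos[d] - positions[i][d])
                )
                # Move and clamp to the search bounds.
                positions[i][d] = min(max(positions[i][d] + velocities[i][d], lo), hi)
            value = objective(positions[i])
            if value < best_val[i]:
                best_pos[i], best_val[i] = positions[i][:], value
                if value < global_val:
                    global_pos, global_val = positions[i][:], value
        history.append(global_val)
    return global_pos, global_val, history


def sphere(x):
    """Toy objective: sum of squares, minimized at the origin."""
    return sum(v * v for v in x)


position, value, history = pso_minimize(sphere, dim=2)
print(value <= history[0])
```

In a setting like the one described, each particle position would instead encode candidate feature weights and network settings, and the objective would be a validation error; that mapping is not specified in the abstract.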


Blood ◽  
2016 ◽  
Vol 128 (22) ◽  
pp. 940-940 ◽  
Author(s):  
Koji Sasaki ◽  
Hagop M. Kantarjian ◽  
Elias J. Jabbour ◽  
Susan O'Brien ◽  
Farhad Ravandi ◽  
...  

Abstract
Introduction: Artificial intelligence (AI) has been applied to a wide range of daily activities to assist in decision-making. Randomized clinical trials can compare the efficacy of treatment between patient groups. However, the best treatment decision for each individual patient, with their own clinical and biological features, and in the context of comparable treatment options, is more difficult to predict. The integrated consideration of various prognostic features can reach the point beyond human recognition. An AI-assisted approach may help with decision-making in complex clinical situations. The aim of this study is to introduce a prototype of AI to predict outcomes such as achievement of major molecular response (MMR) within 1 year of the start of tyrosine kinase inhibitor (TKI).
Methods: Response data for 630 patients with newly diagnosed CML-CP in consecutive prospective clinical trials of frontline imatinib (n=73; NCT00048672), high-dose imatinib (n=208; NCT00038469 and NCT00050531), nilotinib (n=148; NCT00129740), dasatinib (n=150; NCT00254423), and ponatinib (n=51; NCT01570868) were analyzed. After multiple imputation for missing variables, neural network analysis with a multilayer perceptron model using the statistically significant variables by stepwise multivariate analysis was performed to predict the cumulative incidence of MMR within 1 year. The hyperbolic tangent and softmax activation functions were used in the hidden layers and output layers, respectively. Batch training with a scaled conjugate gradient optimization algorithm with learning parameters (initial Lambda of 0.0000005, initial Sigma of 0.00005, interval center of 0, and interval offset of ±0.5) was used to train the neural network. To evaluate the accuracy of prediction, the entire cohort was randomly divided into a training dataset (70%) and a test dataset (30%). The correct prediction in the test dataset was repeatedly assessed 1,000 times to validate this approach. The whole cohort was subsequently used to create the AI model for MMR prediction, and was divided into two cohorts based on the prediction by the AI: AI-predicted response and AI-predicted nonresponse. Hypothetical choice of TKI was assumed to rank the selection of TKI among imatinib 400 mg/day, imatinib 800 mg/day, dasatinib, nilotinib, and ponatinib to calculate the estimated percentage of MMR within 1 year for each patient. The Kaplan-Meier method with a log-rank test was used for failure-free survival (FFS), transformation-free survival (TFS), event-free survival (EFS), and overall survival (OS). To balance baseline patient characteristics between cohorts, propensity score matching after propensity score calculation by logistic regression was performed with the nearest neighbor matching method with a caliper of 0.20. Exact matching was used for the type of cytogenetic, transcript, and TKI.
Results: Of 630 patients treated, 464 (74%) achieved MMR within 1 year. The stepwise multivariate analysis identified the selection of TKI, type of transcript, white blood cell count, albumin, and spleen size at diagnosis as the predictors for MMR within 1 year. Neural network analysis with a multilayer perceptron model is shown in figure 1. Through repeated random selection for training set (70%) and test set (30%), the mean correct prediction for MMR within 1 year was 77.4% (95% confidence interval [CI], 74.2-80.5), and 76.9% (95% CI, 71.4-82.3), respectively. Of 630 patients, the neural network model predicted 539 patients (86%) as responders, and 91 patients (14%) as nonresponders (table 1). Before propensity score matching, the AI-response cohort had higher rates of CCyR, MMR, MR4, MR4.5, and CMR as well as FFS, TFS, EFS, and OS compared to those of the AI-nonresponse cohort (figure 2). After propensity score matching, 25 patients in each cohort were identified, and the baseline differences were minimized (table 1). The AI-response cohort had higher rates of MMR, MR4, and FFS than those of the AI-nonresponse cohort (figure 2).
Conclusion: AI with a multilayer perceptron model can predict target outcome. Incorporation of additional clinical and biological variables may improve the prediction rates to suggest the best treatment option in each patient with CML-CP. Such a strategy is ongoing.
Disclosures: Kantarjian: ARIAD: Research Funding; Bristol-Myers Squibb: Research Funding; Amgen: Research Funding; Pfizer Inc: Research Funding; Delta-Fly Pharma: Research Funding; Novartis: Research Funding. Jabbour: ARIAD: Consultancy, Research Funding; Pfizer: Consultancy, Research Funding; Novartis: Research Funding; BMS: Consultancy. Ravandi: BMS: Research Funding; Seattle Genetics: Consultancy, Honoraria, Research Funding. Konopleva: AbbVie: Research Funding; Genentech: Research Funding. Wierda: Novartis: Research Funding; Abbvie: Research Funding; Acerta: Research Funding; Gilead: Research Funding; Genentech: Research Funding. Daver: Pfizer: Consultancy, Research Funding; Kiromic: Research Funding; BMS: Research Funding; Otsuka: Consultancy, Honoraria; Sunesis: Consultancy, Research Funding; Karyopharm: Honoraria, Research Funding; Ariad: Research Funding.


2020 ◽  
Vol 11 (6) ◽  
pp. 330-334
Author(s):  
R. A. Karelova ◽  
E. E. Ignatov

The article presents an implementation of an artificial neural network for recognizing defects in images of steel sheets. Several stages of solving the problem are described: the choice of a development environment, a programming language, and the libraries necessary for the implementation; data analysis, including plotting graphs and histograms and finding dependencies; the selection of a suitable neural network and its architecture, and of an algorithm for assessing quality and accuracy; writing the neural network code; and training, checking accuracy and quality, and checking for overfitting. The proposed development tools are the Python language, the PyTorch library, the Jupyter development environment, and the U-Net convolutional neural network architecture. Features of the analysis of input images of steel sheets and of the implementation of the neural network itself are described. Binary cross-entropy was chosen as the criterion for assessing accuracy, since it pulls the distribution of the network's predictions toward the target and penalizes not only erroneous predictions but also uncertain ones. For additional evaluation, the Dice coefficient was also used. The accuracy of the resulting model is 84 %. The proposed solution can become part of a hardware-software system for automating the recognition of defects on metal sheets.
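The point about binary cross-entropy penalizing uncertain predictions can be checked with a small pure-Python sketch. It uses the standard definitions of binary cross-entropy and the Dice coefficient on flat lists of pixel labels; the helper names, the 0.5 threshold, and the toy data are assumptions for illustration and are not taken from the article's PyTorch code.

```python
import math


def binary_cross_entropy(targets, probs, eps=1e-7):
    """Mean binary cross-entropy between 0/1 targets and predicted probabilities."""
    total = 0.0
    for t, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(targets)


def dice_coefficient(targets, probs, threshold=0.5):
    """Dice overlap between 0/1 targets and thresholded predictions."""
    preds = [1 if p >= threshold else 0 for p in probs]
    intersection = sum(t * q for t, q in zip(targets, preds))
    denom = sum(targets) + sum(preds)
    return 1.0 if denom == 0 else 2.0 * intersection / denom


targets = [1, 0, 1, 0]                   # toy defect mask, flattened
confident = [0.99, 0.01, 0.99, 0.01]     # correct and confident
uncertain = [0.60, 0.40, 0.60, 0.40]     # correct after thresholding, but unsure

print(dice_coefficient(targets, confident), dice_coefficient(targets, uncertain))
print(binary_cross_entropy(targets, confident) < binary_cross_entropy(targets, uncertain))
```

Both prediction sets give the same thresholded mask, so their Dice scores are identical, while the cross-entropy is clearly larger for the uncertain one. This is why the two measures complement each other.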


2004 ◽  
Vol 43 (5) ◽  
pp. 727-738 ◽  
Author(s):  
Ralf Kretzschmar ◽  
Pierre Eckert ◽  
Daniel Cattani ◽  
Fritz Eggimann

Abstract This paper evaluates the quality of neural network classifiers for wind speed and wind gust prediction with prediction lead times between +1 and +24 h. The predictions were realized based on local time series and model data. The selection of appropriate input features was initiated by time series analysis and completed by empirical comparison of neural network classifiers trained on several choices of input features. The selected input features involved day time, yearday, features from a single wind observation device at the site of interest, and features derived from model data. The quality of the resulting classifiers was benchmarked against persistence for two different sites in Switzerland. The neural network classifiers exhibited superior quality when compared with persistence judged on a specific performance measure, hit and false-alarm rates.
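To make the benchmark concrete, the following Python sketch computes hit and false-alarm rates for a binary event forecast and builds a persistence baseline. The definitions are assumptions: the hit rate is taken as hits / (hits + misses) and the false-alarm rate as false alarms / (false alarms + correct negatives); some verification texts instead report a false-alarm ratio, false alarms / (hits + false alarms), and the abstract does not say which form the paper uses. The function names and the toy event series are hypothetical.

```python
def hit_and_false_alarm_rates(observed, predicted):
    """Hit rate and false-alarm rate from paired 0/1 observations and forecasts."""
    hits = misses = false_alarms = correct_negatives = 0
    for obs, pred in zip(observed, predicted):
        if obs and pred:
            hits += 1
        elif obs:
            misses += 1
        elif pred:
            false_alarms += 1
        else:
            correct_negatives += 1
    events = hits + misses
    non_events = false_alarms + correct_negatives
    hit_rate = hits / events if events else 0.0
    false_alarm_rate = false_alarms / non_events if non_events else 0.0
    return hit_rate, false_alarm_rate


def persistence_pairs(events, lead):
    """Persistence baseline: forecast that the state `lead` steps ago still holds.

    Requires lead >= 1. Returns (observed, predicted) lists of equal length.
    """
    return events[lead:], events[:-lead]


gusts = [0, 0, 1, 1, 1, 0, 0, 0]   # toy hourly series: 1 = gust event
observed, predicted = persistence_pairs(gusts, lead=1)
print(hit_and_false_alarm_rates(observed, predicted))
```

A classifier is then judged against persistence by computing the same two rates on its own predictions for the same observed series.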

