Artificial intelligence‐based radiotherapy machine parameter optimization using reinforcement learning

2020 ◽  
Author(s):  
William Thomas Hrinivich ◽  
Junghoon Lee
2014 ◽  
Vol 571-572 ◽  
pp. 105-108
Author(s):  
Lin Xu

This paper proposes a new framework that combines reinforcement learning with a cloud computing digital library. Unified self-learning algorithms, which include reinforcement learning and other artificial intelligence techniques, have led to many essential advances. Given the current status of highly-available models, analysts urgently desire the deployment of write-ahead logging. In this paper we examine how DNS can be applied to the investigation of superblocks, and introduce reinforcement learning to improve the quality of current cloud computing digital libraries. The experimental results show that the method works more efficiently.


2019 ◽  
Vol 1 (2) ◽  
pp. 74-84
Author(s):  
Evan Kusuma Susanto ◽  
Yosi Kristian

Asynchronous Advantage Actor-Critic (A3C) is a deep reinforcement learning algorithm developed by Google DeepMind. The algorithm can be used to build an artificial intelligence architecture that masters many different kinds of games through trial and error, learning from the game's screen display and the score obtained as a result of its actions, without human intervention. An A3C network consists of a Convolutional Neural Network (CNN) at the front, a Long Short-Term Memory network (LSTM) in the middle, and an Actor-Critic network at the back. The CNN summarizes the screen output image by extracting the important features present on the screen. The LSTM serves as a memory of previous game states. The Actor-Critic network determines the best action to take when faced with a given condition. In the experiments conducted, the method proved fairly effective and could beat novice players in the 5 games used as test material.


Author(s):  
Abdelghafour Harraz ◽  
Mostapha Zbakh

Artificial intelligence makes it possible to create engines that can explore and learn environments and then derive policies that control them in real time with no human intervention. Through its reinforcement learning techniques, using frameworks such as temporal differences, State-Action-Reward-State-Action (SARSA), and Q-learning, to name a few, it can be applied to systems that can be modeled as a Markov Decision Process. This opens the door to applying reinforcement learning to cloud load balancing, so that load can be dispatched dynamically to a given cloud system. The authors describe different techniques that can be used to implement a reinforcement learning based engine in a cloud system.
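
As a rough illustration of the temporal-difference methods named in this abstract, the following sketch shows tabular Q-learning and SARSA updates applied to a toy two-server dispatcher. The environment, the three-value state encoding, the reward, and all function names are assumptions made for illustration only; they are not taken from the chapter.

```python
import random


def q_learning_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Off-policy TD update: bootstrap from the best action in the next state."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """On-policy TD update: bootstrap from the action actually taken next."""
    target = reward + gamma * q[next_state][next_action]
    q[state][action] += alpha * (target - q[state][action])


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    return max(range(len(q[state])), key=lambda a: q[state][a])


def observe(loads):
    """Hypothetical state: 0 if server 0 is lighter, 1 if equal, 2 if heavier."""
    if loads[0] < loads[1]:
        return 0
    if loads[0] == loads[1]:
        return 1
    return 2


def train(steps=5000, epsilon=0.1, seed=0):
    """Toy dispatcher: each step one job arrives and one job completes."""
    rng = random.Random(seed)
    loads = [0, 0]
    q = [[0.0, 0.0] for _ in range(3)]  # 3 states x 2 actions (target server)
    state = observe(loads)
    for _ in range(steps):
        action = epsilon_greedy(q, state, epsilon, rng)
        loads[action] += 1                       # dispatch the job
        reward = -abs(loads[0] - loads[1])       # penalize imbalance
        busy = [i for i in (0, 1) if loads[i] > 0]
        loads[rng.choice(busy)] -= 1             # a random busy server finishes a job
        next_state = observe(loads)
        q_learning_update(q, state, action, reward, next_state)
        state = next_state
    return q
```

Calling `train()` returns the learned Q-table; replacing `q_learning_update` with `sarsa_update` (and choosing the next action before updating) would give the on-policy variant.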


Author(s):  
Grzegorz Musiolik

Artificial intelligence is evolving rapidly and will have a great impact on society in the future. One important question that still cannot be answered satisfactorily is whether the decisions of an intelligent agent can be predicted. As a consequence, the general question arises whether such agents can be controlled and whether future robotic applications can be safe. This chapter shows that unpredictable systems are very common in mathematics and physics even though the underlying mathematical structure can be very simple. It also shows that such unpredictability can emerge for intelligent agents in reinforcement learning, especially for complex tasks with various input parameters. An observer would not be able to distinguish this unpredictability from free will on the part of the agent. This raises ethical questions and safety issues, which are briefly presented.
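
One standard textbook example of a simple rule producing unpredictable behavior is the logistic map. The sketch below is an assumption chosen for illustration and is not necessarily an example used in the chapter; the function name, the parameter `r = 4.0`, and the starting values are likewise illustrative. It iterates the map from two nearly identical starting points and records how far the trajectories drift apart.

```python
def logistic_map(x0, r=4.0, steps=100):
    """Iterate x -> r * x * (1 - x) and return the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two starting points that differ by only 1e-10.
a = logistic_map(0.2)
b = logistic_map(0.2 + 1e-10)

# Absolute gap between the two trajectories at each step.
gap = [abs(x - y) for x, y in zip(a, b)]
```

Inspecting `gap` shows the initially negligible difference growing until the two trajectories are effectively unrelated, which is the kind of sensitivity that makes long-term prediction impractical despite the one-line update rule.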


2020 ◽  
Vol 107 (4) ◽  
pp. 853-857 ◽  
Author(s):  
Benjamin Ribba ◽  
Sherri Dudal ◽  
Thierry Lavé ◽  
Richard W. Peck

Science ◽  
2018 ◽  
Vol 362 (6419) ◽  
pp. 1140-1144 ◽  
Author(s):  
David Silver ◽  
Thomas Hubert ◽  
Julian Schrittwieser ◽  
Ioannis Antonoglou ◽  
Matthew Lai ◽  
...  

The game of chess is the longest-studied domain in the history of artificial intelligence. The strongest programs are based on a combination of sophisticated search techniques, domain-specific adaptations, and handcrafted evaluation functions that have been refined by human experts over several decades. By contrast, the AlphaGo Zero program recently achieved superhuman performance in the game of Go by reinforcement learning from self-play. In this paper, we generalize this approach into a single AlphaZero algorithm that can achieve superhuman performance in many challenging games. Starting from random play and given no domain knowledge except the game rules, AlphaZero convincingly defeated a world champion program in the games of chess and shogi (Japanese chess), as well as Go.


Data ◽  
2021 ◽  
Vol 6 (4) ◽  
pp. 42
Author(s):  
Diana Kafkes ◽  
Jason St. John

The Booster Operation Optimization Sequential Time-series for Regression (BOOSTR) dataset was created to provide a cycle-by-cycle time series of readings and settings from instruments and controllable devices of the Booster, Fermilab’s Rapid-Cycling Synchrotron (RCS) operating at 15 Hz. BOOSTR provides a time series from 55 device readings and settings that pertain most directly to the high-precision regulation of the Booster’s gradient magnet power supply (GMPS). To our knowledge, this is one of the first well-documented datasets of accelerator device parameters made publicly available. We are releasing it in the hopes that it can be used to demonstrate aspects of artificial intelligence for advanced control systems, such as reinforcement learning and autonomous anomaly detection.

