Quantum-accessible reinforcement learning beyond strictly epochal environments

2021 ◽  
Vol 3 (2) ◽  
Author(s):  
A. Hamann ◽  
V. Dunjko ◽  
S. Wölk

Abstract: In recent years, quantum-enhanced machine learning has emerged as a particularly fruitful application of quantum algorithms, covering aspects of supervised, unsupervised and reinforcement learning. Reinforcement learning offers numerous ways in which quantum theory can be applied, yet it is arguably the least explored from a quantum perspective. Here, an agent explores an environment and tries to find a behavior optimizing some figure of merit. Some of the first approaches investigated settings where this exploration can be sped up by considering quantum analogs of classical environments, which can then be queried in superposition. If the environments have a strict periodic structure in time (i.e. are strictly episodic), such environments can be effectively converted to conventional oracles encountered in quantum information. In general environments, however, we obtain scenarios that generalize standard oracle tasks. In this work, we consider one such generalization, where the environment is not strictly episodic, and map it to an oracle identification setting with a changing oracle. We analyze this case and show that standard amplitude-amplification techniques can, with minor modifications, still be applied to achieve quadratic speed-ups. In addition, we prove that an algorithm based on Grover iterations is optimal for oracle identification even if the oracle changes over time in a way that the “rewarded space” is monotonically increasing. This result constitutes one of the first generalizations of quantum-accessible reinforcement learning.
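A minimal sketch of the standard Grover-type amplitude amplification this abstract builds on, as a classical simulation with illustrative sizes; it does not reproduce the paper's changing-oracle construction, and the "rewarded" set here is invented for illustration.

```python
# Classical simulation of Grover amplitude amplification on n qubits. A hypothetical
# "rewarded" set of basis states plays the role of the oracle's marked elements.
import numpy as np

n_qubits = 6
N = 2 ** n_qubits
rewarded = {3, 17, 42}          # hypothetical rewarded ("marked") basis states

# Uniform superposition over all N basis states.
state = np.full(N, 1 / np.sqrt(N))

# Optimal number of Grover iterations ~ (pi/4) * sqrt(N / |rewarded|): the quadratic
# speed-up over the ~N/|rewarded| classical queries needed on average.
k = int(np.floor(np.pi / 4 * np.sqrt(N / len(rewarded))))

for _ in range(k):
    # Oracle: flip the phase of rewarded states.
    for m in rewarded:
        state[m] *= -1
    # Diffusion: reflect every amplitude about the mean amplitude.
    state = 2 * state.mean() - state

p_rewarded = sum(state[m] ** 2 for m in rewarded)
print(f"{k} iterations, P(rewarded) = {p_rewarded:.3f}")
```

With N = 64 and three rewarded states, three iterations already push the probability of measuring a rewarded state close to 1.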

2014 ◽  
Vol 11 (92) ◽  
pp. 20131091 ◽  
Author(s):  
Oren Kolodny ◽  
Shimon Edelman ◽  
Arnon Lotem

Continuous, ‘always on’, learning of structure from a stream of data is studied mainly in the fields of machine learning and language acquisition, but its evolutionary roots may go back to the first organisms that were internally motivated to learn and represent their environment. Here, we study under what conditions such continuous learning (CL) may be more adaptive than simple reinforcement learning and examine how it could have evolved from the same basic associative elements. We use agent-based computer simulations to compare three learning strategies: simple reinforcement learning; reinforcement learning with chaining (RL-chain); and CL, which applies the same associative mechanisms used by the other strategies but also seeks statistical regularities in the relations among all items in the environment, regardless of their initial association with food. We show that a sufficiently structured environment favours the evolution of both RL-chain and CL, and that CL outperforms the other strategies when food is relatively rare and the time for learning is limited. This advantage of internally motivated CL stems from its ability to capture statistical patterns in the environment even before they are associated with food, at which point they immediately become useful for planning.
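A toy sketch of the contrast the abstract describes, not the authors' simulation: a "simple RL" learner only strengthens item-to-food associations when food follows an item, while a "continuous learner" additionally records item-to-item statistics, so structure is captured before it ever pays off in food. The environment, items and parameters below are all illustrative.

```python
# Two toy learners observing the same item stream from a structured environment.
import random
from collections import defaultdict

# Hypothetical structured environment: "A" tends to be followed by "B", "B" by "C",
# and "C" occasionally by food.
transitions = {"A": ["B"] * 8 + ["C", "food"],
               "B": ["C"] * 8 + ["A", "food"],
               "C": ["food"] * 3 + ["A"] * 7,
               "food": ["A", "B", "C"]}

rl_value = defaultdict(float)                       # item -> learned association with food
cl_counts = defaultdict(lambda: defaultdict(int))   # item -> next item -> observed count
alpha = 0.1

prev = "A"
for _ in range(5000):
    nxt = random.choice(transitions[prev])
    reward = 1.0 if nxt == "food" else 0.0
    # Simple reinforcement learning: update only the food value of the preceding item.
    rl_value[prev] += alpha * (reward - rl_value[prev])
    # Continuous learning: also track all item-item regularities, food-related or not.
    cl_counts[prev][nxt] += 1
    prev = nxt

print({k: round(v, 2) for k, v in rl_value.items()})
print({k: dict(v) for k, v in cl_counts.items()})
```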


2021 ◽  
Vol 14 (1) ◽  
pp. 5
Author(s):  
J.M. Calabuig ◽  
L.M. Garcia-Raffi ◽  
E.A. Sánchez-Pérez

Artificial intelligence is present in the everyday environment of every secondary-school student. However, the general population, and students in particular, do not know how these algorithmic techniques work, even though they often rely on very simple mechanisms that can be explained at an elementary level in mathematics or technology classes in secondary schools (Institutos de Enseñanza Secundaria, IES). These contents will probably take many years to become part of the official curricula of these subjects, but they can already be introduced as part of the algebra content taught in mathematics, or alongside algorithms in computer-science classes, especially if they are presented as a game in which different groups of students can compete, as we propose in this article. We therefore present a very simple example of a reinforcement learning algorithm (Machine Learning-Reinforcement Learning) that condenses, in a playful activity, the fundamental elements of an artificial intelligence algorithm.
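One plausible "very simple" classroom example of this kind, assuming an epsilon-greedy multi-armed bandit (the article's actual activity may differ): teams play the agent, pick an "arm" each round, and update its estimated value from the reward they receive.

```python
# Epsilon-greedy bandit: explore occasionally, otherwise exploit the best estimate so far.
import random

true_payout = [0.2, 0.5, 0.8]      # hidden reward probabilities of three "arms"
estimates = [0.0, 0.0, 0.0]        # the agent's learned value estimates
pulls = [0, 0, 0]
epsilon = 0.1                      # exploration rate

for _ in range(1000):
    if random.random() < epsilon:
        arm = random.randrange(3)                              # explore
    else:
        arm = max(range(3), key=lambda a: estimates[a])        # exploit
    reward = 1.0 if random.random() < true_payout[arm] else 0.0
    pulls[arm] += 1
    # Incremental average: the estimate moves toward the observed rewards.
    estimates[arm] += (reward - estimates[arm]) / pulls[arm]

print("estimates:", [round(e, 2) for e in estimates], "pulls:", pulls)
```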


Author(s):  
Ahmad Roihan ◽  
Po Abas Sunarya ◽  
Ageng Setiani Rafika

Abstract: Machine learning is a branch of artificial intelligence that is widely used to solve a variety of problems. This article reviews problem solving in recent studies by classifying machine learning into three categories: supervised learning, unsupervised learning, and reinforcement learning. The review shows that all three categories remain applicable to current problems and can be improved to reduce computational cost and accelerate performance, achieving high levels of accuracy and precision. The aim of this review is to identify gaps and to serve as a guideline for future research.
Keywords: machine learning, reinforcement learning, supervised learning, unsupervised learning
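A compact sketch of the three categories the review names, on invented toy data and not taken from the article: supervised learning fits labelled examples, unsupervised learning finds structure without labels, and reinforcement learning learns from rewards received while acting.

```python
import random

# Supervised: 1-nearest-neighbour prediction from labelled points (x, label).
train = [(1.0, "low"), (2.0, "low"), (8.0, "high"), (9.0, "high")]
def predict(x):
    return min(train, key=lambda p: abs(p[0] - x))[1]

# Unsupervised: split unlabelled points into two clusters around the midrange.
points = [1.1, 1.9, 8.2, 9.1]
threshold = (min(points) + max(points)) / 2
clusters = [[p for p in points if p <= threshold], [p for p in points if p > threshold]]

# Reinforcement: estimate the value of a single action from sampled rewards.
value, n = 0.0, 0
for _ in range(100):
    reward = 1.0 if random.random() < 0.7 else 0.0   # unknown reward probability
    n += 1
    value += (reward - value) / n

print(predict(5.0), clusters, round(value, 2))
```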


2020 ◽  
Vol 4 (1) ◽  
Author(s):  
Maria Hügle ◽  
Patrick Omoumi ◽  
Jacob M van Laar ◽  
Joschka Boedecker ◽  
Thomas Hügle

Abstract: Machine learning, as a field of artificial intelligence, is increasingly applied in medicine to assist patients and physicians. Growing datasets provide a sound basis with which to apply machine learning methods that learn from previous experience. This review explains the basics of machine learning and its subfields of supervised learning, unsupervised learning, reinforcement learning and deep learning. We provide an overview of current machine learning applications in rheumatology, mainly supervised learning methods for e-diagnosis, disease detection and medical image analysis. In the future, machine learning will likely assist rheumatologists in predicting the course of the disease and identifying important disease factors. Even more interestingly, machine learning will probably be able to make treatment propositions and estimate their expected benefit (e.g. by reinforcement learning). Thus, in the future, shared decision-making will include not only the patient’s opinion and the rheumatologist’s empirical and evidence-based experience, but also machine-learned evidence.
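A hypothetical sketch of the kind of supervised e-diagnosis model the review describes: a logistic-regression classifier on synthetic "patient" features. The features (swollen joint count, CRP level), labels and dataset are invented purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
swollen_joints = rng.poisson(4, n)
crp = rng.gamma(2.0, 5.0, n)
# Synthetic label: higher joint counts and CRP make "active disease" more likely.
p_active = 1 / (1 + np.exp(-(0.4 * swollen_joints + 0.1 * crp - 3)))
y = rng.random(n) < p_active

X = np.column_stack([swollen_joints, crp])
model = LogisticRegression().fit(X, y)

# Predicted probability of active disease for a new (hypothetical) patient.
print(model.predict_proba([[6, 25.0]])[0, 1])
```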


VASA ◽  
2015 ◽  
Vol 44 (5) ◽  
pp. 355-362 ◽  
Author(s):  
Marie Urban ◽  
Alban Fouasson-Chailloux ◽  
Isabelle Signolet ◽  
Christophe Colas Ribas ◽  
Mathieu Feuilloy ◽  
...  

Abstract: Background: We aimed to estimate the agreement between the Medicap® (photo-optical) and Radiometer® (electro-chemical) sensors during exercise transcutaneous oxygen pressure (tcpO2) tests. Our hypothesis was that although absolute starting values (tcpO2rest: mean over 2 minutes) might differ, tcpO2 changes over time and the minimal value of the decrease from rest of oxygen pressure (DROPmin) during exercise would be concordant between the two systems. Patients and methods: Forty-seven patients with arterial claudication (65 ± 7 years) performed a treadmill test with five probes from each of the electro-chemical and photo-optical devices applied simultaneously: one of each system on the chest, on each buttock and on each calf. Results: Seventeen Medicap® probes disconnected during the tests. tcpO2rest and DROPmin values were higher with Medicap® than with Radiometer®, by 13.7 ± 17.1 mmHg and 3.4 ± 11.7 mmHg, respectively. Despite the differences in absolute starting values, changes over time were similar between the two systems. The concordance between the two systems was approximately 70 % for classification of test results from DROPmin. Conclusions: Photo-optical sensors are promising alternatives to electro-chemical sensors for exercise oximetry, provided that miniaturisation and weight reduction of the new sensors are possible.
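A minimal sketch of the DROP computation as described in the abstract: tcpO2rest is the mean over the first 2 minutes, DROP(t) = tcpO2(t) − tcpO2rest, and DROPmin is the lowest value reached during the test. The sampling rate and the example series below are illustrative, not the study's data.

```python
import numpy as np

def drop_min(tcpo2, sample_period_s=1.0, rest_s=120.0):
    """Return (tcpO2rest, DROPmin) for one probe's tcpO2 time series in mmHg."""
    n_rest = int(rest_s / sample_period_s)
    tcpo2_rest = np.mean(tcpo2[:n_rest])     # resting value: mean over the first 2 minutes
    drop = tcpo2 - tcpo2_rest                # decrease from rest over time
    return tcpo2_rest, float(drop.min())

# Illustrative series: stable rest, then a fall during treadmill exercise (1 Hz, 10 min).
t = np.arange(0, 600)
series = 60 - 20 * np.clip((t - 120) / 300, 0, 1) + np.random.default_rng(1).normal(0, 1, t.size)
print(drop_min(series))
```

Agreement between two sensor systems could then be assessed by computing DROPmin per probe pair and comparing the resulting test classifications.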


2007 ◽  
Author(s):  
Miranda Olff ◽  
Mirjam Nijdam ◽  
Kristin Samuelson ◽  
Julia Golier ◽  
Mariel Meewisse ◽  
...  

2010 ◽  
Author(s):  
Rebecca D. Stinson ◽  
Zachary Sussman ◽  
Megan Foley Nicpon ◽  
Allison L. Allmon ◽  
Courtney Cornick ◽  
...  
