Multi-Context Generation in Virtual Reality Environments Using Deep Reinforcement Learning

Author(s):  
James Cunningham ◽  
Christian Lopez ◽  
Omar Ashour ◽  
Conrad S. Tucker

Abstract In this work, a Deep Reinforcement Learning (RL) approach is proposed for Procedural Content Generation (PCG) that seeks to automate the generation of multiple related virtual reality (VR) environments for enhanced personalized learning. This allows the user to be exposed to multiple virtual scenarios that demonstrate a consistent theme, which is especially valuable in an educational context. RL approaches to PCG offer the advantage of not requiring training data, unlike PCG approaches built on supervised learning. This work advances the state of the art in RL-based PCG by demonstrating the ability to generate a diversity of contexts that teach the same underlying concept. A case study demonstrates the feasibility of the proposed RL-based PCG method using examples of probability distributions in both manufacturing facility and grocery store virtual environments. The method demonstrated in this paper has the potential to enable the automatic generation of a variety of virtual environments connected by a common concept or theme.
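The core idea — treating content generation as a sequential decision process — can be illustrated with a minimal tabular RL sketch (not the authors' implementation; the item types, episode length, and reward are illustrative assumptions). An agent places items one at a time, and the terminal reward measures how closely the empirical distribution of placed items matches a target probability distribution, the shared "concept" that could be re-skinned as either a grocery store or a manufacturing facility:

```python
import random
from collections import defaultdict

# Hypothetical setup: 3 item types (e.g., product or machine categories)
# and a target distribution encoding the concept to teach.
N_TYPES, N_PLACEMENTS = 3, 6
TARGET = [0.5, 0.3, 0.2]  # assumed target probability distribution

def reward(counts):
    # Terminal reward: negative L1 distance between the empirical
    # distribution of placed items and the target distribution.
    total = sum(counts)
    return -sum(abs(c / total - t) for c, t in zip(counts, TARGET))

Q = defaultdict(float)  # Q[(state, action)], state = tuple of counts
alpha, eps = 0.1, 0.2

for _ in range(5000):
    counts = (0,) * N_TYPES
    trajectory = []
    for _ in range(N_PLACEMENTS):
        if random.random() < eps:  # epsilon-greedy exploration
            a = random.randrange(N_TYPES)
        else:
            a = max(range(N_TYPES), key=lambda x: Q[(counts, x)])
        trajectory.append((counts, a))
        counts = tuple(c + (i == a) for i, c in enumerate(counts))
    r = reward(counts)
    # Monte Carlo update: with reward only at the end and no discounting,
    # every step's return equals the terminal reward.
    for s, a in trajectory:
        Q[(s, a)] += alpha * (r - Q[(s, a)])

# Greedy rollout of the learned policy generates a themed layout.
counts = (0,) * N_TYPES
for _ in range(N_PLACEMENTS):
    a = max(range(N_TYPES), key=lambda x: Q[(counts, x)])
    counts = tuple(c + (i == a) for i, c in enumerate(counts))
print("greedy layout counts:", counts)  # should approximate TARGET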

2019 ◽  
Vol 141 (12) ◽  
Author(s):  
Gary M. Stump ◽  
Simon W. Miller ◽  
Michael A. Yukish ◽  
Timothy W. Simpson ◽  
Conrad Tucker

Abstract A novel method has been developed to optimize both the form and behavior of complex systems. The method uses spatial grammars embodied in character-recurrent neural networks (char-RNNs) to define the system, including actuator numbers and degrees of freedom; reinforcement learning to optimize actuator behavior; and physics-based simulation systems to determine performance and provide (re)training data for the char-RNN. Compared to parametric design optimization with fixed numbers of inputs, using grammars and char-RNNs allows for a more complex, combinatorially infinite design space. In the proposed method, the char-RNN is first trained to learn a spatial grammar that defines the assembly layout, component geometries, material properties, and arbitrary numbers and degrees of freedom of actuators. Next, generated designs are evaluated using a physics-based environment, with an inner optimization loop using reinforcement learning to determine the best control policy for the actuators. The resulting design is thus optimized for both form and behavior, generated by a char-RNN embodying a high-performing grammar. Two evaluative case studies are presented using the design of a modular sailing craft. The first case study optimizes the design without actuated surfaces, allowing the char-RNN to learn the semantics of high-performing designs. The second case study extends the first by incorporating controllable actuators that require an inner-loop behavioral optimization. The implications of the results are discussed along with ongoing and future work.
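The paper's two-level structure can be sketched in miniature (every function below is an illustrative stand-in, not the authors' system: a categorical distribution replaces the char-RNN grammar model, random search replaces the inner RL loop, and a toy quadratic replaces the physics simulation). The outer loop proposes designs from a generative model, the inner loop optimizes each design's control behavior before scoring it, and high performers re-fit the generator:

```python
import random

def performance(n_actuators, gain):
    # Toy stand-in for the physics-based simulation: performance peaks
    # at a design-dependent gain, with diminishing returns on actuators.
    return -(gain - 0.5 * n_actuators) ** 2 + n_actuators - 0.3 * n_actuators ** 2

def inner_loop(n_actuators, iters=50):
    # Inner behavioral optimization (random search standing in for RL):
    # find the best control gain for this fixed design.
    best_gain, best_perf = None, float("-inf")
    for _ in range(iters):
        g = random.uniform(0.0, 3.0)
        p = performance(n_actuators, g)
        if p > best_perf:
            best_gain, best_perf = g, p
    return best_gain, best_perf

# Outer loop: a categorical distribution over designs stands in for the
# char-RNN; it is re-weighted toward high performers each generation.
designs = [1, 2, 3, 4]                  # candidate numbers of actuators
weights = [1.0] * len(designs)
for generation in range(10):
    samples = random.choices(designs, weights=weights, k=20)
    scored = sorted(((inner_loop(n)[1], n) for n in samples), reverse=True)
    elites = [n for _, n in scored[:5]]  # top designs "retrain" the model
    weights = [1 + 5 * elites.count(d) for d in designs]

print("most-favored design:", max(zip(weights, designs))[1])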


Author(s):  
Imthias Ahamed T.P. ◽  
Nagendra Rao P.S. ◽  
Sastry P.S.

This paper presents the design and implementation of a learning controller for Automatic Generation Control (AGC) in power systems based on a reinforcement learning (RL) framework. In contrast to our recent RL scheme for AGC, the present method permits handling power system variables such as Area Control Error (ACE) and deviations from scheduled frequency and tie-line flows as continuous variables. (In the earlier scheme, these variables had to be quantized into finitely many levels.) The optimal control law is arrived at in the RL framework using a Q-learning strategy. Since the state variables are continuous, we propose the use of Radial Basis Function (RBF) neural networks to compute the Q-values for a given input state. Because this application cannot provide training data appropriate for the standard supervised learning framework, a reinforcement learning algorithm is employed to train the RBF network. We also employ a novel exploration strategy, based on a Learning Automata algorithm, for generating training samples during Q-learning. The proposed scheme, in addition to being simple to implement, inherits all the attractive features of an RL scheme, such as model-independent design, flexibility in control objective specification, and robustness. Two implementations of the proposed approach are presented, and simulation studies demonstrate the attractiveness of this approach.
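A minimal numpy sketch of the core mechanism (the single-area dynamics, gains, and discrete action set are illustrative assumptions, and this omits the Learning Automata exploration strategy in favor of plain epsilon-greedy): Q-values over a continuous ACE signal are represented by an RBF network whose output weights are trained with a Q-learning update.

```python
import numpy as np

rng = np.random.default_rng(0)

# RBF centers tile the continuous ACE range; one weight vector per action.
centers = np.linspace(-2.0, 2.0, 11)
sigma = 0.4
actions = np.array([-0.1, 0.0, 0.1])        # discrete generation adjustments
W = np.zeros((len(actions), len(centers)))  # output weights of the RBF net

def phi(ace):
    # RBF feature vector for a continuous state.
    return np.exp(-(ace - centers) ** 2 / (2 * sigma ** 2))

alpha, gamma, eps = 0.05, 0.95, 0.1
ace = 0.0
for t in range(20000):
    f = phi(ace)
    q = W @ f
    a = rng.integers(len(actions)) if rng.random() < eps else int(np.argmax(q))
    # Toy single-area dynamics: ACE drifts with a random load disturbance
    # and is corrected by the chosen generation adjustment.
    ace_next = np.clip(0.95 * ace + rng.normal(0, 0.05) + actions[a], -2, 2)
    r = -ace_next ** 2                       # penalize area control error
    # Q-learning update applied to the RBF output weights.
    target = r + gamma * np.max(W @ phi(ace_next))
    W[a] += alpha * (target - q[a]) * f
    ace = ace_next

# A positive ACE should elicit a negative generation adjustment.
print("learned action at ACE=+0.5:", actions[int(np.argmax(W @ phi(0.5)))])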


Author(s):  
Susan Turner

This chapter considers the role of sound, and more specifically listening, in creating a sense of presence (of "being there") in "places" recreated by virtual reality technologies. We first briefly review the treatment of sound in place and presence research, giving particular attention to the role of sound in inducing a sense of presence in virtual environments that immerse their users in representations of particular places. We then consider the phenomenology of listening, the nature of different types of listening, and their application: listening is active, directed, intentional hearing, and it is not merely egocentric but body-centric. A classification of modes of listening that draws on work in film studies, virtual reality, and audiology is then proposed as a means of supporting the design of place-centric virtual environments that provide an effective aural experience. Finally, we apply this to a case study of listening in real and simulated soundscapes, and suggest directions for further applications of this work.


2021 ◽  
Author(s):  
Jarrad Kowlessar ◽  
James Keal ◽  
Daryl Wesley ◽  
Ian Moffat ◽  
Dudley Lawrence ◽  
...  

In recent years, machine learning approaches have been used to classify and extract style from media and to reinforce known chronologies from classical art history. In this work we present the first machine learning analysis of Australian rock art, using a data-efficient transfer learning approach to identify features suitable for distinguishing styles of rock art. These features are evaluated in a one-shot learning setting. Results demonstrate that known Arnhem Land rock art styles can be resolved without knowledge of prior groupings. We then analyse the activation space of learned features, report on the relationships between styles, and arrange these classes into a stylistic chronology based on distance within the activation space. This stylistic chronology shows that the model is sensitive to both temporal and spatial patterns in the distribution of rock art in the Arnhem Land Plateau region. More broadly, this approach is ideally suited to evaluating style within any material culture assemblage and overcomes the common constraint of small training data sets in archaeological machine learning studies.
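The pipeline's two stages can be sketched as follows, assuming a torchvision backbone as the transfer-learning feature extractor (the backbone choice, the random stand-in images, and the greedy ordering are assumptions, not the authors' exact setup): extract activations with a pretrained CNN, average them into per-style centroids, and order styles by proximity in activation space.

```python
import torch
from torchvision import models

# Pretrained backbone as a fixed feature extractor (transfer learning):
# drop the final classification layer, keep the pooled activations.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
extractor = torch.nn.Sequential(*list(backbone.children())[:-1]).eval()

# Stand-in batch: 12 "rock art images" from 4 styles (3 images each).
# In practice these would be preprocessed photographs.
images = torch.rand(12, 3, 224, 224)
labels = torch.arange(4).repeat_interleave(3)

with torch.no_grad():
    feats = extractor(images).flatten(1)   # (12, 512) activation vectors

# Per-style centroids in activation space, then pairwise style distances.
centroids = torch.stack([feats[labels == s].mean(0) for s in range(4)])
dists = torch.cdist(centroids, centroids)

# Greedy nearest-neighbor chain: a crude stylistic "chronology" that
# orders styles by distance within the activation space.
order, remaining = [0], set(range(1, 4))
while remaining:
    nxt = min(remaining, key=lambda s: dists[order[-1], s].item())
    order.append(nxt)
    remaining.remove(nxt)
print("stylistic ordering:", order)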


2019 ◽  
Vol 141 (11) ◽  
Author(s):  
Xian Yeow Lee ◽  
Aditya Balu ◽  
Daniel Stoecklein ◽  
Baskar Ganapathysubramanian ◽  
Soumik Sarkar

Abstract Efficient exploration of design spaces is highly sought after in engineering applications. A spectrum of tools has been proposed to deal with the computational difficulties associated with such problems. In the context of our case study, these tools can be broadly classified into optimization and supervised learning approaches. Optimization approaches, while successful, are inherently data inefficient, with evolutionary optimization-based methods being a good example. This inefficiency stems from data not being reused from previous design explorations. Alternatively, supervised learning-based design paradigms are data efficient. However, the quality of the ensuing solutions depends heavily on the quality of the available data. Furthermore, it is difficult to incorporate physics models and domain knowledge of design exploration into pure learning-based methods. In this work, we formulate a reinforcement learning (RL)-based design framework that mitigates the disadvantages of both approaches. Our framework simultaneously finds solutions that are more efficient than those from supervised learning approaches while using data more efficiently than genetic algorithm (GA)-based optimization approaches. We illustrate our framework on a problem of microfluidic device design for flow sculpting, and our results show that a single generic RL agent is capable of exploring the solution space to achieve multiple design objectives. Additionally, we demonstrate that the RL agent can be used to solve more complex problems using a targeted refinement step. Thus, we address the data efficiency limitation of optimization-based methods and the limited-data problem of supervised learning-based methods. The versatility of our framework is illustrated by utilizing it to gain domain insights and to incorporate domain knowledge. We envision such RL frameworks to have an impact on design science.
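Framing design exploration as sequential decision-making can be sketched with a tiny policy-gradient (REINFORCE) agent; the "pillar sequence" encoding, the toy flow model, and the reward below are illustrative assumptions standing in for the flow-sculpting simulator. The agent picks one of a few pillar types at each step, and the reward scores how well the resulting sequence achieves a target outcome:

```python
import numpy as np

rng = np.random.default_rng(1)

N_STEPS, N_PILLARS = 4, 5
EFFECT = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])  # toy per-pillar flow effect
TARGET = 3.0                                    # desired sculpted outcome

# One softmax policy per sequence position (a tabular policy "network").
logits = np.zeros((N_STEPS, N_PILLARS))

def rollout():
    seq, steps = [], []
    for t in range(N_STEPS):
        p = np.exp(logits[t] - logits[t].max())
        p /= p.sum()
        a = rng.choice(N_PILLARS, p=p)
        seq.append(a)
        steps.append((t, a, p))
    # Reward: negative error between achieved and target flow outcome.
    return steps, -abs(EFFECT[seq].sum() - TARGET)

lr, baseline = 0.1, 0.0
for _ in range(3000):
    steps, r = rollout()
    baseline += 0.01 * (r - baseline)           # running reward baseline
    for t, a, p in steps:
        grad = -p                               # d log pi / d logits ...
        grad[a] += 1.0                          # ... equals e_a - p
        logits[t] += lr * (r - baseline) * grad

best = [int(np.argmax(logits[t])) for t in range(N_STEPS)]
print("design:", best, "outcome:", EFFECT[best].sum())  # should be ~3.0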


Processes ◽  
2020 ◽  
Vol 8 (11) ◽  
pp. 1497
Author(s):  
Titus Quah ◽  
Derek Machalek ◽  
Kody M. Powell

One popular method for optimizing systems, referred to as ANN-PSO, uses an artificial neural network (ANN) to approximate the system and an optimization method such as particle swarm optimization (PSO) to select inputs. However, with recent developments in reinforcement learning, it is important to compare ANN-PSO to newer algorithms, such as Proximal Policy Optimization (PPO). To investigate the performance and applicability of ANN-PSO and PPO, we compare their methodologies, apply them to the steady-state economic optimization of a chemical process, and compare their results to conventional first-principles modeling with nonlinear programming (FP-NLP). Our results show that ANN-PSO and PPO achieve profits nearly as high as FP-NLP, with PPO achieving slightly higher profits than ANN-PSO. We also find that PPO has the fastest computational times, 10 and 10,000 times faster than FP-NLP and ANN-PSO, respectively. However, PPO requires more training data than ANN-PSO to converge to an optimal policy. This case study suggests that PPO has better performance, as it achieves higher profits and faster online computational times, while ANN-PSO shows better applicability through its capability to train on historical operational data and its higher training efficiency.
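The ANN-PSO pattern itself is compact enough to sketch (the toy "process profit" function, the random-feature network, and the PSO hyperparameters are illustrative assumptions): fit an ANN surrogate to historical input/profit samples, then run PSO over the surrogate instead of the real process.

```python
import numpy as np

rng = np.random.default_rng(2)

def process_profit(x):
    # Toy stand-in for the chemical process economics: profit peaks
    # at an interior operating point of a 2-D input space.
    return -(x[..., 0] - 1.5) ** 2 - 2 * (x[..., 1] - 0.5) ** 2 + 4.0

# 1) Fit an ANN surrogate from sampled operating data. A random-feature
# network (fixed random hidden layer, least-squares output layer) keeps
# the sketch short while remaining a genuine one-hidden-layer ANN.
X = rng.uniform(0, 3, size=(200, 2))
y = process_profit(X) + rng.normal(0, 0.05, 200)  # noisy "historical" data
W_h = rng.normal(size=(2, 50))
b_h = rng.normal(size=50)
w_out, *_ = np.linalg.lstsq(np.tanh(X @ W_h + b_h), y, rcond=None)

def surrogate(x):
    return np.tanh(x @ W_h + b_h) @ w_out

# 2) PSO over the surrogate: particles track personal and global bests.
n, w, c1, c2 = 30, 0.7, 1.5, 1.5
pos = rng.uniform(0, 3, size=(n, 2))
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), surrogate(pos)
g = pbest[np.argmax(pbest_val)]
for _ in range(100):
    r1, r2 = rng.random((n, 1)), rng.random((n, 1))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
    pos = np.clip(pos + vel, 0, 3)
    val = surrogate(pos)
    improved = val > pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], val[improved]
    g = pbest[np.argmax(pbest_val)]

print("surrogate optimum:", g, "true profit there:", process_profit(g))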


Author(s):  
Christian E. López ◽  
James Cunningham ◽  
Omar Ashour ◽  
Conrad S. Tucker

Abstract This work presents a deep reinforcement learning (DRL) approach for procedural content generation (PCG) to automatically generate three-dimensional (3D) virtual environments that users can interact with. The primary objective of PCG methods is to algorithmically generate new content in order to improve user experience. Researchers have started exploring the use of machine learning (ML) methods to generate content. However, these approaches frequently implement supervised ML algorithms that require initial datasets to train their generative models. In contrast, RL algorithms do not require training data to be collected a priori since they take advantage of simulation to train their models. Considering the advantages of RL algorithms, this work presents a method that generates new 3D virtual environments by training an RL agent using a 3D simulation platform. This work extends the authors’ previous work and presents the results of a case study that supports the capability of the proposed method to generate new 3D virtual environments. The ability to automatically generate new content has the potential to maintain users’ engagement in a wide variety of applications such as virtual reality applications for education and training, and engineering conceptual design.
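Complementing the learning-side sketch given earlier in this listing, the interface between a generative agent and the simulation platform can be sketched as a standard reset/step environment (the slot layout, object types, and reward are illustrative assumptions; the authors use a full 3D simulation platform). Each action places an object, and the reward can encode any target property of the generated content:

```python
import random

class LayoutEnv:
    """Minimal stand-in environment: the agent fills a row of slots with
    object types; the reward encourages variety (no adjacent duplicates)."""

    N_SLOTS, N_OBJECTS = 8, 4

    def reset(self):
        self.layout = []
        return tuple(self.layout)

    def step(self, action):
        # Shaping reward for not repeating the previously placed object.
        reward = 0.0 if self.layout and self.layout[-1] == action else 1.0
        self.layout.append(action)
        done = len(self.layout) == self.N_SLOTS
        return tuple(self.layout), reward, done

# Random-policy rollout; any RL algorithm can be dropped in its place.
env = LayoutEnv()
state, done, total = env.reset(), False, 0.0
while not done:
    state, r, done = env.step(random.randrange(env.N_OBJECTS))
    total += r
print("generated layout:", state, "reward:", total)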


1999 ◽  

Abstract This volume covers papers presented at the NIST-ASME Industrial Virtual Reality Symposium, held at the University of Illinois at Chicago, November 1-2, and the Symposium on Virtual Environments (VE) for Manufacturing, part of the International Mechanical Engineering Congress and Exposition, held in Nashville, November 14-18. A collection of research and case study papers from both symposia is included.


2006 ◽  
Author(s):  
Georgina Cardenas-Lopez ◽  
Sandra Munoz ◽  
Maribel Gonzalez ◽  
Carmen Ramos

2019 ◽  
Author(s):  
Niclas Ståhl ◽  
Göran Falkman ◽  
Alexander Karlsson ◽  
Gunnar Mathiason ◽  
Jonas Boström

In medicinal chemistry programs it is key to design and make compounds that are efficacious and safe. This is a long, complex, and difficult multi-parameter optimization process, often including several properties with orthogonal trends. New methods for the automated design of compounds against profiles of multiple properties are thus of great value. Here we present a fragment-based reinforcement learning approach based on an actor-critic model for the generation of novel molecules with optimal properties. The actor and the critic are both modelled with bidirectional long short-term memory (LSTM) networks. The AI method learns how to generate new compounds with desired properties by starting from an initial set of lead molecules and then improving them by replacing some of their fragments. A balanced binary tree based on the similarity of fragments is used in the generative process to bias the output towards structurally similar molecules. The method is demonstrated by a case study showing that 93% of the generated molecules are chemically valid and a third satisfy the targeted objectives, while none of the molecules in the initial set did.
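The similarity-biased fragment replacement can be sketched with a sorted sequence in place of the balanced binary tree (the one-dimensional "descriptor" and the fragment names are illustrative assumptions; the paper orders fragments by structural similarity): nearby positions in the ordering correspond to similar fragments, so sampling within a small window biases mutations toward structurally similar molecules.

```python
import bisect
import random

# Hypothetical fragment library keyed by a 1-D similarity descriptor
# (e.g., size or lipophilicity); in the paper, a balanced binary tree
# over fragment similarity plays this role.
fragments = sorted(
    [("F", 0.1), ("OH", 0.2), ("NH2", 0.3), ("CH3", 0.4),
     ("OCH3", 0.5), ("C6H5", 0.8), ("naphthyl", 0.95)],
    key=lambda f: f[1],
)
keys = [f[1] for f in fragments]

def similar_replacement(fragment, window=2):
    # Locate the current fragment in the ordering, then sample a
    # replacement from a small neighborhood around it.
    i = bisect.bisect_left(keys, fragment[1])
    lo, hi = max(0, i - window), min(len(fragments), i + window + 1)
    candidates = [f for f in fragments[lo:hi] if f != fragment]
    return random.choice(candidates)

lead = ("OH", 0.2)
print("biased mutation of", lead[0], "->", similar_replacement(lead)[0])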

