Emergent Tangled Program Graphs in Partially Observable Recursive Forecasting and ViZDoom Navigation Tasks

2021 ◽  
Vol 1 (3) ◽  
pp. 1-41
Author(s):  
Stephen Kelly ◽  
Robert J. Smith ◽  
Malcolm I. Heywood ◽  
Wolfgang Banzhaf

Modularity represents a recurring theme in attempts to scale evolution to the design of complex systems, yet it rarely forms the central theme of an artificial approach to evolution. In this work, we report on progress with the recently proposed Tangled Program Graph (TPG) framework, in which programs are modules. The combination of the TPG representation and its variation operators enables both teams of programs and graphs of teams of programs to appear through an emergent process. The original development of TPG was, for the most part, limited to tasks with complete information. This work details two recent approaches for scaling TPG to tasks dominated by partially observable sources of information, using different formulations of indexed memory. One formulation emphasizes the incremental construction of memory, again as an emergent process, resulting in a distributed view of state. The second assumes a single global instance of memory and develops it as a communication medium, thus providing a single global view of state. The resulting empirical evaluation demonstrates that TPG equipped with memory is able to solve multi-task recursive time-series forecasting problems and visual navigation tasks expressed in two levels of a commercial first-person shooter environment.
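The team-with-global-memory variant can be illustrated with a minimal sketch. All class and parameter names below are our own for illustration, not the TPG implementation: each program in a team bids on the current observation together with a shared indexed memory, the highest bidder's action is taken, and that winner writes back into the global memory vector, giving the "single global view of state" described above.

```python
import random

class Program:
    """A tiny linear 'program': a weighted sum over observation and memory slots."""
    def __init__(self, n_obs, n_mem, rng):
        self.w_obs = [rng.uniform(-1, 1) for _ in range(n_obs)]
        self.w_mem = [rng.uniform(-1, 1) for _ in range(n_mem)]
        self.action = rng.randrange(3)         # hypothetical 3-action task
        self.write_slot = rng.randrange(n_mem) # memory index this program writes

    def bid(self, obs, mem):
        return (sum(w * x for w, x in zip(self.w_obs, obs)) +
                sum(w * m for w, m in zip(self.w_mem, mem)))

class Team:
    """Select the action of the highest-bidding program; let the winner
    write its bid into a single, globally shared indexed memory."""
    def __init__(self, n_programs, n_obs, n_mem, seed=0):
        rng = random.Random(seed)
        self.programs = [Program(n_obs, n_mem, rng) for _ in range(n_programs)]
        self.memory = [0.0] * n_mem            # single global view of state

    def act(self, obs):
        winner = max(self.programs, key=lambda p: p.bid(obs, self.memory))
        self.memory[winner.write_slot] = winner.bid(obs, self.memory)
        return winner.action

team = Team(n_programs=4, n_obs=3, n_mem=2)
actions = [team.act([0.1 * t, 1.0, -0.5]) for t in range(5)]
```

Because memory persists across calls to `act`, the same observation can yield different winners over time, which is what lets such a sketch cope with partially observable state.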

Entropy ◽  
2021 ◽  
Vol 23 (3) ◽  
pp. 380
Author(s):  
Emanuele Cavenaghi ◽  
Gabriele Sottocornola ◽  
Fabio Stella ◽  
Markus Zanker

The Multi-Armed Bandit (MAB) problem has been extensively studied as a way to address real-world challenges in sequential decision making. In this setting, an agent selects the best action to perform at time-step t based on the past rewards received from the environment. This formulation implicitly assumes that the expected payoff of each action is kept stationary by the environment over time. In many real-world applications, however, this assumption does not hold, and the agent faces a non-stationary environment, that is, one with a changing reward distribution. We therefore present a new MAB algorithm, named f-Discounted-Sliding-Window Thompson Sampling (f-dsw TS), for non-stationary environments, that is, for data streams affected by concept drift. The f-dsw TS algorithm is based on Thompson Sampling (TS) and exploits a discount factor on the reward history together with an arm-related sliding window to counteract concept drift. We investigate how to combine these two sources of information, namely the discount factor and the sliding window, by means of an aggregation function f(.). In particular, we propose a pessimistic (f=min), an optimistic (f=max), and an averaged (f=mean) version of the f-dsw TS algorithm. A rich set of numerical experiments evaluates f-dsw TS against both stationary and non-stationary state-of-the-art TS baselines. We use synthetic environments (both randomly generated and controlled) to test the MAB algorithms under different types of drift: sudden/abrupt, incremental, gradual, and increasing/decreasing. Furthermore, we adapt four real-world active learning tasks to our framework: a prediction task on crimes in the city of Baltimore, a classification task on insect species, a recommendation task on local web-news, and a time-series analysis of microbial organisms in the tropical air ecosystem.
The f-dsw TS approach emerges as the best-performing MAB algorithm: at least one version of f-dsw TS outperforms the baselines in the synthetic environments, demonstrating its robustness under different concept drift types, and the pessimistic version (f=min) proves the most effective across all real-world tasks.
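The two sources of information named in the abstract can be sketched for Bernoulli rewards as follows. This is a minimal illustration, not the authors' implementation: each arm keeps a gamma-discounted Beta posterior over its whole history plus a posterior over only the last `window` pulls of that arm, and the aggregation function `f` combines one sample from each.

```python
import random
from collections import deque

class FDSWThompson:
    """Sketch of f-dsw Thompson Sampling for Bernoulli arms: a discounted
    view and a per-arm sliding-window view of the rewards, combined by f."""
    def __init__(self, n_arms, gamma=0.95, window=50, f=min, seed=0):
        self.rng = random.Random(seed)
        self.gamma, self.f = gamma, f
        self.succ = [0.0] * n_arms             # discounted success counts
        self.fail = [0.0] * n_arms             # discounted failure counts
        self.windows = [deque(maxlen=window) for _ in range(n_arms)]

    def select(self):
        scores = []
        for a in range(len(self.succ)):
            # Beta(1, 1) prior added at sampling time keeps parameters > 0.
            disc = self.rng.betavariate(1 + self.succ[a], 1 + self.fail[a])
            s, n = sum(self.windows[a]), len(self.windows[a])
            win = self.rng.betavariate(1 + s, 1 + n - s)
            scores.append(self.f(disc, win))
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        for a in range(len(self.succ)):        # decay every arm's history
            self.succ[a] *= self.gamma
            self.fail[a] *= self.gamma
        self.succ[arm] += reward
        self.fail[arm] += 1 - reward
        self.windows[arm].append(reward)

bandit = FDSWThompson(n_arms=2, f=min, seed=1)
picks = []
for t in range(400):                           # abrupt drift at t = 200
    probs = [0.9, 0.1] if t < 200 else [0.1, 0.9]
    arm = bandit.select()
    picks.append(arm)
    bandit.update(arm, 1.0 if bandit.rng.random() < probs[arm] else 0.0)
```

Passing `f=min` gives the pessimistic variant reported as the strongest on the real-world tasks; `f=max` and a mean lambda give the other two.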


2020 ◽  
Vol 117 (15) ◽  
pp. 8391-8397 ◽  
Author(s):  
Maija Honig ◽  
Wei Ji Ma ◽  
Daryl Fougnie

Working memory (WM) plays an important role in action planning and decision making; however, both the informational content of memory and how that information is used in decisions remain poorly understood. To investigate this, we used a color WM task in which subjects viewed colored stimuli and reported both an estimate of a stimulus color and a measure of memory uncertainty, obtained through a rewarded decision. Reported memory uncertainty correlates with memory error, showing that people incorporate their trial-to-trial memory quality into rewarded decisions. Moreover, memory uncertainty can be combined with other sources of information: after inducing expectations (prior beliefs) about stimulus probabilities, we found that estimates shifted toward expected colors, with the shift increasing with reported uncertainty. The data are best fit by models in which people combine their trial-to-trial memory uncertainty with potential rewards and prior beliefs. Our results suggest that WM represents uncertainty information and that this information can be combined with prior beliefs. This highlights the potential complexity of WM representations and shows that rewarded decisions can be a powerful tool for examining WM and for informing and constraining theoretical, computational, and neurobiological models of memory.
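The qualitative pattern reported here, estimates shifting toward the expected color more strongly when memory is uncertain, is what standard precision-weighted Gaussian cue combination predicts. The sketch below is that generic textbook model, not the paper's fitted model; all variable names are illustrative.

```python
def combined_estimate(memory_sample, memory_var, prior_mean, prior_var):
    """Precision-weighted fusion of a noisy memory sample with a prior belief.
    The weight on the memory sample shrinks as memory variance grows, so the
    estimate is pulled further toward the prior when memory is uncertain."""
    w = prior_var / (prior_var + memory_var)   # weight on the memory sample
    return w * memory_sample + (1 - w) * prior_mean

# Same remembered value (10.0), same prior (mean 0.0, variance 4.0):
low  = combined_estimate(10.0, memory_var=1.0, prior_mean=0.0, prior_var=4.0)
high = combined_estimate(10.0, memory_var=9.0, prior_mean=0.0, prior_var=4.0)
# `high` lies closer to the prior mean than `low` does.
```

A subject whose reported uncertainty tracks `memory_var` would show exactly the uncertainty-dependent shift the abstract describes.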


2020 ◽  
Vol 9 (1) ◽  
pp. 102-118
Author(s):  
Amy Melissa McKay ◽  
Antal Wozniak

Abstract The government of the UK is reputed to be among the world's most transparent governments. Yet in comparison with many other countries, its 5-year-old register of lobbyists provides little information about the lobbying activity directed at the British state. Further, its published lists of meetings with government ministers are vague, delayed, and scattered across numerous online locations. Our analysis of more than 72,000 reported ministerial meetings and nearly 1000 lobbying clients and consultants reveals major discrepancies between these two sources of information about lobbying in the UK. Over the same four quarters, we find that only about 29% of clients listed in the lobby register appear in the published record of ministerial meetings with outside groups, and fewer than 4% of groups disclosed in ministerial meeting records appear in the lobby register. This wide variation between the two sets of data, along with other evidence, contributes to our conclusion that the Government could have made, and still should make, the lobby register more robust.


2007 ◽  
Vol 30 ◽  
pp. 1-50 ◽  
Author(s):  
M. J. Carman ◽  
C. A. Knoblock

The Internet contains a very large number of information sources providing many types of data, from weather forecasts to travel deals and financial information. These sources can be accessed via Web forms, Web Services, RSS feeds, and so on. In order to make automated use of these sources, we need to model them semantically, but writing semantic descriptions for Web Services is both tedious and error-prone. In this paper we investigate the problem of automatically generating such models. We introduce a framework for learning Datalog definitions of Web sources. In order to learn these definitions, our system actively invokes the sources and compares the data they produce with that of known sources of information. It then performs an inductive logic search through the space of plausible source definitions in order to learn the best possible semantic model for each new source. We perform an empirical evaluation of the system using real-world Web sources. The evaluation demonstrates the effectiveness of the approach, showing that we can automatically learn complex models for real sources in reasonable time. We also compare our system with a complex schema matching system, showing that our approach can handle the kinds of problems tackled by the latter.
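The invoke-and-compare step can be sketched in miniature. Everything below is hypothetical (the sources, the candidate "definitions", and the scoring rule are ours, and real candidates would be Datalog clauses searched inductively rather than a hand-written dictionary): candidate compositions of known sources are invoked on sample inputs and scored by how often they reproduce the new source's output.

```python
# Known, already-modelled sources over a toy dataset (hypothetical):
def centigrade_to_fahrenheit(c):
    return c * 9 / 5 + 32

def city_temperature_c(city):
    return {"Rome": 20, "Oslo": 5}[city]

# A new source we can only invoke, not inspect:
def mystery_source(city):
    return {"Rome": 68.0, "Oslo": 41.0}[city]

# Candidate definitions built by composing known sources
# (stand-ins for Datalog clauses in the search space):
candidates = {
    "temp_c(X)":    lambda x: city_temperature_c(x),
    "f(temp_c(X))": lambda x: centigrade_to_fahrenheit(city_temperature_c(x)),
}

def score(defn, inputs):
    """Fraction of sample invocations where the candidate reproduces the source."""
    return sum(defn(x) == mystery_source(x) for x in inputs) / len(inputs)

inputs = ["Rome", "Oslo"]
best = max(candidates, key=lambda name: score(candidates[name], inputs))
```

In this toy case the composed definition matches every invocation, so the search would select it as the semantic model of the new source.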


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Miguel Rodriguez ◽  
Antoinette A. Danvers ◽  
Carolina Sanabia ◽  
Siobhan M. Dolan

Abstract Background The objective of the study was to understand how pregnant women learned about Zika infection and to identify what sources of information were likely to influence them during their pregnancy. Methods We conducted 13 semi-structed interviews in English and Spanish with women receiving prenatal care who were tested for Zika virus infection. We analyzed the qualitative data using descriptive approach. Results Pregnant women in the Bronx learned about Zika from family, television, the internet and their doctor. Informational sources played different roles. Television, specifically Spanish language networks, was often the initial source of information. Women searched the internet for additional information about Zika. Later, they engaged in further discussions with their healthcare providers. Conclusions Television played an important role in providing awareness about Zika to pregnant women in the Bronx, but that information was incomplete. The internet and healthcare providers were sources of more complete information and are likely the most influential. Efforts to educate pregnant women about emerging infectious diseases will benefit from using a variety of approaches including television messages that promote public awareness followed up by reliable information via the internet and healthcare providers.


Author(s):  
Sebastian Junges ◽  
Nils Jansen ◽  
Sanjit A. Seshia

Abstract Partially Observable Markov Decision Processes (POMDPs) are a well-known stochastic model for sequential decision making under limited information. We consider the EXPTIME-hard problem of synthesising policies that almost-surely reach some goal state without ever visiting a bad state. In particular, we are interested in computing the winning region, that is, the set of system configurations from which a policy exists that satisfies the reachability specification. A direct application of such a winning region is the safe exploration of POMDPs, for instance by restricting the behavior of a reinforcement learning agent to the region. We present two algorithms: a novel SAT-based iterative approach and a decision-diagram-based alternative. The empirical evaluation demonstrates the feasibility and efficacy of both approaches.
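The shape of an almost-sure winning-region computation can be sketched in the fully observable special case. This is a deliberate simplification: the paper operates on POMDP belief supports via SAT solving or decision diagrams, whereas the sketch below runs a plain set-based fixed point on an MDP with nondeterministic successor sets.

```python
def almost_sure_winning(states, succ, goal, bad):
    """succ[s][a] = set of possible successors of state s under action a.
    Returns the states from which some policy reaches `goal` with
    probability 1 while never entering `bad` (fully observable sketch)."""
    W = set(states) - set(bad)
    while True:
        # States that can reach the goal using only actions whose
        # possible successors ALL stay inside the candidate region W.
        reach = set(goal) & W
        grew = True
        while grew:
            new = {s for s in W - reach
                   if any(succ[s][a] <= W and succ[s][a] & reach
                          for a in succ[s])}
            grew = bool(new)
            reach |= new
        if reach == W:          # fixed point: W is the winning region
            return W
        W = reach               # otherwise shrink and iterate

# Toy model: 3 is bad, 4 is the goal; action "b" from 0 risks hitting 3,
# and state 5 has no action that avoids the bad state with certainty.
succ = {
    0: {"a": {1}, "b": {2, 3}},
    1: {"a": {4}},
    2: {"a": {2, 4}},
    3: {"a": {3}},
    4: {"a": {4}},
    5: {"a": {3, 4}},
}
win = almost_sure_winning(range(6), succ, goal={4}, bad={3})
```

State 5 is pruned even though it can reach the goal, because no action guarantees avoiding the bad state; this is exactly the distinction between almost-sure and merely possible reachability.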


2006 ◽  
Vol 54 (1) ◽  
pp. 84-94 ◽  
Author(s):  
A. Montesanto ◽  
G. Tascini ◽  
P. Puliti ◽  
P. Baldassarri

Author(s):  
Taciana Novo Kudo ◽  
Renato De Freitas Bulcão Neto ◽  
Auri Marcelo Rizzo Vincenzi ◽  
Alessandra Alaniz Macedo

In the past few years, the literature has shown that the practice of reuse through requirement patterns is an effective alternative for addressing specification quality issues, with the additional benefit of time savings. Due to the interactions between requirements engineering and other phases of the software development life cycle (SDLC), these benefits may extend to the entire development process. This paper describes a revisited systematic literature mapping (SLM) that identifies and analyzes research demonstrating those benefits from the use of requirement patterns for software design, construction, testing, and maintenance. In this extended version, the SLM protocol includes automatic search over two additional sources of information and the application of the snowballing technique, resulting in ten primary studies for analysis and synthesis. Despite this revised SLM protocol, the results still point to a small number of studies on requirement patterns in the SDLC phases beyond requirements engineering. The results indicate that there remains an open field for research that demonstrates, through empirical evaluation and usage in practice, the pertinence of requirement patterns to software design, construction, testing, and maintenance.


2019 ◽  
Vol 65 ◽  
pp. 209-269
Author(s):  
Sarah Keren ◽  
Avigdor Gal ◽  
Erez Karpas

Goal recognition design (GRD) facilitates understanding the goals of acting agents through the analysis and redesign of goal recognition models, thus offering a solution for assessing and minimizing the maximal progress of any agent in the model before goal recognition is guaranteed. In a nutshell, given a model of a domain and a set of possible goals, a solution to a GRD problem determines (1) the extent to which actions performed by an agent within the model reveal the agent's objective; and (2) how best to modify the model so that the objective of an agent can be detected as early as possible. This approach is relevant to any domain in which rapid goal recognition is essential and the model design can be controlled. Applications include intrusion detection, assisted cognition, computer games, and human-robot collaboration. A GRD problem has two components: the analyzed goal recognition setting, and a design model specifying the possible ways the environment in which agents act can be modified so as to facilitate recognition. This work formulates a general framework for GRD in deterministic and partially observable environments, and offers a toolbox of solutions for evaluating and optimizing model quality for various settings. For the purpose of evaluation we suggest the worst case distinctiveness (WCD) measure, which represents the maximal cost of a path an agent may follow before its goal can be inferred by a goal recognition system. We offer novel compilations to classical planning for calculating WCD in settings where agents are bounded-suboptimal. We then suggest methods for minimizing WCD by searching for an optimal redesign strategy within the space of possible modifications, and using pruning to increase efficiency. We support our approach with an empirical evaluation that measures WCD in a variety of GRD settings and tests the efficiency of our compilation-based methods for computing it. We also examine the effectiveness of reducing WCD via redesign and the performance gain brought about by our proposed pruning strategy.
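For an optimal agent on a deterministic graph, WCD reduces to the longest prefix shared by optimal paths toward two different goals. The sketch below is that special case only (the paper's compilations handle costs and bounded-suboptimal agents); all function names are ours.

```python
from itertools import combinations

def shortest_paths(succ, start, goal):
    """All shortest action sequences from start to goal (BFS by depth)."""
    layer = [(start, ())]
    for _ in range(20):                        # small depth cap for the sketch
        hits = [p for s, p in layer if s == goal]
        if hits:
            return [list(p) for p in hits]
        layer = [(t, p + (a,))
                 for s, p in layer if s != goal
                 for a, t in succ[s].items()]
    return []

def wcd(succ, start, goals):
    """Worst case distinctiveness: the longest prefix shared by optimal
    paths toward two DIFFERENT goals (unit-cost, optimal-agent case)."""
    def shared(p, q):
        n = 0
        while n < min(len(p), len(q)) and p[n] == q[n]:
            n += 1
        return n
    best = 0
    for g1, g2 in combinations(goals, 2):
        for p in shortest_paths(succ, start, g1):
            for q in shortest_paths(succ, start, g2):
                best = max(best, shared(p, q))
    return best

# A corridor that forks only at its end: goals 3 and 4 stay ambiguous
# for the first two steps, so WCD = 2.
succ = {0: {"a": 1}, 1: {"a": 2}, 2: {"l": 3, "r": 4}, 3: {}, 4: {}}
wcd_value = wcd(succ, 0, [3, 4])
```

Redesign in this setting means editing `succ` (e.g. removing or adding actions) so that paths toward distinct goals diverge earlier, driving `wcd` down.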

