Transfer Learning for Operator Selection: A Reinforcement Learning Approach

Author(s):  
Rafet Durgut ◽  
Mehmet Emin Aydin ◽  
Abdur Rakib

In the past two decades, metaheuristic optimization algorithms (MOAs) have become increasingly popular, particularly for logistics, science, and engineering problems. A fundamental characteristic of such algorithms is that their performance depends on a parameter or a strategy. Online and offline strategies are employed to obtain optimal configurations of the algorithms. Adaptive operator selection is one of them: it determines whether or not to update a strategy from the strategy pool during the search process. In the field of machine learning, Reinforcement Learning (RL) refers to goal-oriented algorithms that learn from the environment how to achieve a goal. In MOAs, reinforcement learning has been utilised to control the operator selection process. Existing research, however, fails to show that learned information may be transferred from one problem-solving procedure to another. The primary goal of the proposed research is to determine the impact of transfer learning on RL and MOAs. As a test problem, a set union knapsack problem with 30 separate benchmark problem instances is used. The results are statistically compared in depth. According to the findings, the learning process improved the convergence speed while significantly reducing the CPU time.
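The idea of learning which operator to apply, and then carrying that knowledge to a new problem instance, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the class name `OperatorSelector`, the epsilon-greedy rule, the stateless value estimates, and the treatment of transfer as a warm start from previously learned values.

```python
import random


class OperatorSelector:
    """Hypothetical epsilon-greedy selector over a pool of search operators."""

    def __init__(self, n_operators, epsilon=0.1, alpha=0.2, q_init=None, seed=0):
        self.epsilon = epsilon  # probability of exploring a random operator
        self.alpha = alpha      # step size for the value update
        self.rng = random.Random(seed)
        # Assumed form of transfer: start from values learned on another
        # problem instance instead of from zeros.
        self.q = list(q_init) if q_init is not None else [0.0] * n_operators

    def select(self):
        """Return the index of the operator to apply next."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q))
        return max(range(len(self.q)), key=self.q.__getitem__)

    def update(self, op, reward):
        """Move the estimate for operator `op` toward the observed reward."""
        self.q[op] += self.alpha * (reward - self.q[op])


# Learn on a source instance, then warm-start a selector for a target instance.
source = OperatorSelector(n_operators=3, epsilon=0.0)
source.update(1, reward=1.0)  # operator 1 improved the solution
target = OperatorSelector(n_operators=3, epsilon=0.0, q_init=source.q)
print(target.select())
```

In a real metaheuristic the reward would come from the change in solution quality after applying the chosen operator; how that reward is defined is left open here.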

Author(s):  
Sidik Wibowo Akhmad

The purpose of this study was to describe student management aimed at improving character and achievement at MAN 2 Banjarnegara, including: (1) the enrollment process for new students, (2) the guidance of students through discipline, noble character building, and academic and non-academic achievement, and (3) the impact of character building and achievement on students at MAN 2 Banjarnegara. This research used a descriptive qualitative approach. The data collection techniques were in-depth interviews, observation, and documentation study. The validity of the data was assessed against three criteria: credibility, dependability, and confirmability. The findings were as follows. First, the enrollment process for new students introduced a breakthrough in the form of registration for academic and non-academic achievement scholarships; selection was based on official learning report grades, championship or achievement certificates, academic and non-academic potential tests, and a skill test. Students who passed the selection were required to sign an achievement contract for their studies at MAN 2 Banjarnegara. Second, character building was carried out through habituation and activity programs integrated into curricular and extracurricular activities. Third, students who joined the academic and non-academic achievement programs at MAN 2 Banjarnegara had strong motivation and a competitive spirit to achieve more, focused more on self-development, and used their spare time for positive activities.


2019 ◽  
Author(s):  
Jennifer R Sadler ◽  
Grace Elisabeth Shearrer ◽  
Nichollette Acosta ◽  
Kyle Stanley Burger

BACKGROUND: Dietary restraint represents an individual’s intent to limit their food intake and has been associated with impaired passive food reinforcement learning. However, the impact of dietary restraint on active, response-dependent learning is poorly understood. In this study, we tested the relationship between dietary restraint and food reinforcement learning using an active, instrumental conditioning task. METHODS: A sample of ninety adults completed a response-dependent instrumental conditioning task with reward and punishment using sweet and bitter tastes. Brain response was measured via functional MRI during the task. Participants also completed anthropometric measures, reward/motivation-related questionnaires, and a working memory task. Dietary restraint was assessed via the Dutch Restrained Eating Scale. RESULTS: Two groups were selected from the sample: high restraint (n=29, score >2.5) and low restraint (n=30, score <1.85). High restraint was associated with significantly higher BMI (p=0.003) and lower N-back accuracy (p=0.045). The high restraint group was also marginally better at the instrumental conditioning task (p=0.066, r=0.37). High restraint was also associated with significantly greater brain response in the intracalcarine cortex (MNI: 15, -69, 12; k=35, pFWE < 0.05) to bitter taste, compared to neutral taste. CONCLUSIONS: High restraint was associated with improved performance on an instrumental task testing how individuals learn from reward and punishment. This may be mediated by greater brain response in the primary visual cortex, which has been associated with mental representation. Results suggest that dietary restraint does not impair response-dependent reinforcement learning.


2021 ◽  
Vol 13 (12) ◽  
pp. 6581
Author(s):  
Jooyoung Hwang ◽  
Anita Eves ◽  
Jason L. Stienmetz

Travellers have high standards and regard restaurants as important travel attributes. In the tourism and hospitality industry, the use of developed tools (e.g., smartphones and location-based tablets) has been popularised as a way for travellers to easily search for information and to book venues. This study adopted qualitative research, using semi-structured face-to-face interviews, to examine how consumers carry out their restaurant selection processes using social media on smartphones. The interviews were then examined by thematic analysis. The findings show that the adoption of social media on smartphones is positively related to consumers’ gratification. More specifically, when consumers consider that process, content, and social gratifications are satisfied, their intention to adopt social media is fulfilled. The study suggests that consumers’ restaurant decision-making process needs to be understood as a whole, as the stages of the process are not independent; all the stages of the restaurant selection process are organically connected and influence one another.


Biomimetics ◽  
2021 ◽  
Vol 6 (1) ◽  
pp. 13
Author(s):  
Adam Bignold ◽  
Francisco Cruz ◽  
Richard Dazeley ◽  
Peter Vamplew ◽  
Cameron Foale

Interactive reinforcement learning methods utilise an external information source to evaluate decisions and accelerate learning. Previous work has shown that human advice could significantly improve learning agents’ performance. When evaluating reinforcement learning algorithms, it is common to repeat experiments as parameters are altered or to gain a sufficient sample size. In this regard, to require human interaction every time an experiment is restarted is undesirable, particularly when the expense in doing so can be considerable. Additionally, reusing the same people for the experiment introduces bias, as they will learn the behaviour of the agent and the dynamics of the environment. This paper presents a methodology for evaluating interactive reinforcement learning agents by employing simulated users. Simulated users allow human knowledge, bias, and interaction to be simulated. The use of simulated users allows the development and testing of reinforcement learning agents, and can provide indicative results of agent performance under defined human constraints. While simulated users are no replacement for actual humans, they do offer an affordable and fast alternative for evaluative assisted agents. We introduce a method for performing a preliminary evaluation utilising simulated users to show how performance changes depending on the type of user assisting the agent. Moreover, we describe how human interaction may be simulated, and present an experiment illustrating the applicability of simulating users in evaluating agent performance when assisted by different types of trainers. Experimental results show that the use of this methodology allows for greater insight into the performance of interactive reinforcement learning agents when advised by different users. The use of simulated users with varying characteristics allows for evaluation of the impact of those characteristics on the behaviour of the learning agent.


Minerals ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 587
Author(s):  
Joao Pedro de Carvalho ◽  
Roussos Dimitrakopoulos

This paper presents a new truck dispatching policy approach that is adaptive given different mining complex configurations in order to deliver supply material extracted by the shovels to the processors. The method aims to improve adherence to the operational plan and fleet utilization in a mining complex context. Several sources of operational uncertainty arising from the loading, hauling and dumping activities can influence the dispatching strategy. Given a fixed sequence of extraction of the mining blocks provided by the short-term plan, a discrete event simulator model emulates the interaction arising from these mining operations. The continuous repetition of this simulator and a reward function, associating a score value to each dispatching decision, generate sample experiences to train a deep Q-learning reinforcement learning model. The model learns from past dispatching experience, such that when a new task is required, a well-informed decision can be quickly taken. The approach is tested at a copper–gold mining complex, characterized by uncertainties in equipment performance and geological attributes, and the results show improvements in terms of production targets, metal production, and fleet management.
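The abstract above describes a deep Q-learning model trained on simulated dispatching experience. As a rough illustration of the underlying value update only, the following sketch uses a tabular Q-function in place of a neural network, which is a deliberate simplification. The state and action labels, the function name `q_update`, and the default step size and discount factor are all hypothetical.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new estimate.

    `q` maps each state to a dict of action values. A `next_state` that is
    missing from `q` is treated as terminal, with future value 0.
    """
    next_values = q.get(next_state)
    best_next = max(next_values.values()) if next_values else 0.0
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


# Hypothetical example: a truck at a shovel chooses between two destinations.
q = {
    "at_shovel": {"to_crusher": 0.0, "to_stockpile": 0.0},
    "at_crusher": {"to_shovel": 1.0, "wait": 0.5},
}
q_update(q, "at_shovel", "to_crusher", reward=1.0, next_state="at_crusher")
print(q["at_shovel"]["to_crusher"])
```

A deep Q-learning model replaces the table with a function approximator and typically samples stored experiences in batches, but the target being regressed toward has the same form as `td_target`.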


2021 ◽  
Vol 11 (4) ◽  
pp. 1514 ◽  
Author(s):  
Quang-Duy Tran ◽  
Sang-Hoon Bae

To reduce the impact of congestion, it is necessary to improve our overall understanding of the influence of the autonomous vehicle. Recently, deep reinforcement learning has become an effective means of solving complex control tasks. Accordingly, we present an advanced deep reinforcement learning approach that investigates how leading autonomous vehicles affect the urban network in a mixed-traffic environment. We also suggest a set of hyperparameters for achieving better performance. Firstly, we feed a set of hyperparameters into our deep reinforcement learning agents. Secondly, we investigate the leading autonomous vehicle experiment in the urban network with different autonomous vehicle penetration rates. Thirdly, the advantage of leading autonomous vehicles is evaluated using entire manual vehicle and leading manual vehicle experiments. Finally, proximal policy optimization with a clipped objective is compared to proximal policy optimization with an adaptive Kullback–Leibler penalty to verify the superiority of the proposed hyperparameters. We demonstrate that full-automation traffic increased the average speed by a factor of 1.27 compared with the entire manual vehicle experiment. Our proposed method becomes significantly more effective at a higher autonomous vehicle penetration rate. Furthermore, the leading autonomous vehicles could help to mitigate traffic congestion.
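For readers unfamiliar with the two proximal policy optimization variants compared above, their surrogate objectives are commonly written as follows. This is the standard textbook form, not a formula taken from the abstract, and the symbols (probability ratio $r_t(\theta)$, advantage estimate $\hat{A}_t$, clip range $\epsilon$, penalty coefficient $\beta$) are assumed notation.

```latex
% Probability ratio between the new and the old policy
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)}

% Clipped surrogate objective
L^{\text{CLIP}}(\theta) = \hat{\mathbb{E}}_t\!\left[
  \min\!\left( r_t(\theta)\,\hat{A}_t,\;
  \operatorname{clip}\!\left(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\right)\hat{A}_t \right)
\right]

% KL-penalized surrogate objective; \beta is adapted between policy updates
L^{\text{KLPEN}}(\theta) = \hat{\mathbb{E}}_t\!\left[
  r_t(\theta)\,\hat{A}_t
  - \beta\, \mathrm{KL}\!\left[ \pi_{\theta_{\text{old}}}(\cdot \mid s_t),\; \pi_\theta(\cdot \mid s_t) \right]
\right]
```

The clipped form limits the policy change by bounding the ratio directly, while the penalized form does so through a KL term whose weight $\beta$ is raised or lowered depending on how far the measured divergence is from a target value.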


2021 ◽  
Vol 7 (3) ◽  
pp. 59
Author(s):  
Yohanna Rodriguez-Ortega ◽  
Dora M. Ballesteros ◽  
Diego Renza

With the exponential growth of high-quality fake images in social networks and media, it is necessary to develop recognition algorithms for this type of content. One of the most common types of image and video editing consists of duplicating areas of the image, known as the copy-move technique. Traditional image processing approaches manually look for patterns related to the duplicated content, limiting their use in mass data classification. In contrast, approaches based on deep learning have shown better performance and promising results, but they present generalization problems, with a high dependence on training data and the need for appropriate selection of hyperparameters. To overcome this, we propose two deep learning approaches: a model with a custom architecture and a model based on transfer learning. In each case, the impact of the depth of the network is analyzed in terms of precision (P), recall (R) and F1 score. Additionally, the problem of generalization is addressed with images from eight different open access datasets. Finally, the models are compared in terms of evaluation metrics, and training and inference times. The transfer learning model based on VGG-16 achieves metrics about 10% higher than the custom-architecture model; however, it requires approximately twice as much inference time as the latter.
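The three metrics named in the abstract above follow directly from the counts of true positives, false positives, and false negatives. The sketch below shows the standard definitions; the function name `precision_recall_f1` and the example counts are hypothetical and do not come from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from confusion counts.

    A metric whose denominator is zero is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for a forged-vs-authentic image classifier.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"P={p:.2f} R={r:.2f} F1={f1:.2f}")
```

Because F1 is the harmonic mean of precision and recall, it is pulled toward the lower of the two, which makes it a stricter summary than a simple average when a model trades one for the other.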


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Abu Quwsar Ohi ◽  
M. F. Mridha ◽  
Muhammad Mostafa Monowar ◽  
Md. Abdul Hamid

A pandemic is the global outbreak of a disease with a high transmission rate. The impact of a pandemic can be lessened by restricting the movement of the population. However, one of the concomitant consequences is an economic crisis. In this article, we demonstrate what actions an agent (trained using reinforcement learning) may take in different possible scenarios of a pandemic, depending on the spread of disease and economic factors. To train the agent, we design a virtual pandemic scenario closely related to the present COVID-19 crisis. Then, we apply reinforcement learning, a branch of artificial intelligence that deals with how an individual (human/machine) should interact with an environment (real/virtual) to achieve a desired goal. Finally, we demonstrate what optimal actions the agent performs to reduce the spread of disease while considering the economic factors. In our experiment, we let the agent find an optimal solution without providing any prior knowledge. After training, we observed that the agent imposes a long lockdown to reduce the first surge of a disease. Furthermore, the agent imposes a combination of cyclic lockdowns and short lockdowns to halt the resurgence of the disease. Analyzing the agent’s actions, we discover that the agent decides on movement restrictions based not only on the size of the infectious population but also on the reproduction rate of the disease. The estimation and policy of the agent may improve human strategies for imposing lockdowns, so that an economic crisis may be avoided while mitigating an infectious disease.


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Alina Trifan ◽  
José Luis Oliveira

With the continuous increase in the use of social networks, social mining is steadily becoming a powerful component of digital phenotyping. In this paper we explore social mining for the classification of self-diagnosed depressed users of the social network Reddit. We conduct a cross-evaluation study based on two public datasets in order to understand the impact of transfer learning when the data source is virtually the same. We further complement these results with a transfer learning experiment in post-partum depression classification, using a corpus we collected for this purpose. Our findings show that transfer learning in social mining might still be at an early stage in computational research, and we thoroughly discuss its implications.

