Adaptive Human–Machine Evaluation Framework Using Stochastic Gradient Descent-Based Reinforcement Learning for Dynamic Competing Network

2020 ◽  
Vol 10 (7) ◽  
pp. 2558 ◽  
Author(s):  
Jinbae Kim ◽  
Hyunsoo Lee

Complex problems require considerable work, extensive computation, and effective solution methods. Recently, hardware- and software-based technologies have been used to support computational problem solving, yet such problem solving often still depends on human expertise and guidance. In these cases, accurate human evaluations and diagnoses must be communicated to the system as real-valued signals, whereas previous studies have used only binary feedback for this purpose. To address this, this paper proposes a new method for learning complex network topologies that coexist and compete in the same environment and interfere with one another's learning objectives. For this setting, in which reinforcement learning must proceed while multiple network topologies coexist, we propose a policy that computes and updates the rewards derived from quantitative human evaluation and combines them with the rewards computed by the system. The human-evaluation rewards are designed to be updated quickly, easily, and adaptively. The new framework was applied to a basketball game for validation and proved more effective than existing methods.
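To make the reward combination concrete, here is a minimal sketch of blending a real-valued human evaluation with the system reward; the mixing rule, function names, and adaptive update are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def combined_reward(system_reward, human_score, alpha):
    """Blend the system-computed reward with a real-valued human
    evaluation (the paper's point: feedback need not be binary)."""
    return (1.0 - alpha) * system_reward + alpha * human_score

def adapt_alpha(alpha, human_score, baseline, lr=0.1):
    """Illustrative adaptive update of the mixing weight: trust the human
    signal more when it deviates from the running reward baseline."""
    return float(np.clip(alpha + lr * (human_score - baseline), 0.0, 1.0))
```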


2020 ◽  
Vol 10 (17) ◽  
pp. 5828 ◽  
Author(s):  
Jinbae Kim ◽  
Hyunsoo Lee

In recent years, reinforcement learning problems have grown increasingly complex, and their computational demands have increased accordingly, prompting various methods for effective learning. With human help, the learning agent can learn more accurately and quickly to maximize its reward. However, the rewards calculated by the system and those derived from human intervention (which together make up the learning environment) differ and must be used accordingly. In this paper, we propose a framework for learning competitive network topologies, in which the environment changes dynamically with the agents, by computing rewards both via the system and via human evaluation. The proposed method is adaptively updated with the rewards calculated from human evaluation, making it more stable and reducing the penalty incurred during learning. It also preserves learning accuracy for the rewards generated by a complex network topology consisting of multiple agents, and it contributes to a fast training process through multi-agent cooperation. By implementing these methods in software, this study performs a numerical analysis to demonstrate the effectiveness of the proposed adaptive evaluation framework on the competitive network problem with dynamically changing environmental topology. The numerical experiments show that the greater the human intervention, the better the learning performance under the proposed framework.
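One way such a blended reward might enter a learning update is sketched below; the tabular Q-learning form and the additive weighting are assumptions for illustration only, not the paper's exact rule.

```python
def q_update(Q, state, action, next_state, r_system, r_human, w_h,
             gamma=0.99, lr=0.1):
    """One tabular Q-learning step with a blended system/human reward.
    The additive blend and the weight w_h are illustrative assumptions."""
    reward = r_system + w_h * r_human
    td_target = reward + gamma * max(Q[next_state].values())
    Q[state][action] += lr * (td_target - Q[state][action])
    return Q

# Toy usage: 3 states, 2 actions.
Q = {s: {a: 0.0 for a in range(2)} for s in range(3)}
Q = q_update(Q, state=0, action=1, next_state=2,
             r_system=0.2, r_human=0.8, w_h=0.5)
```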



2014 ◽  
Vol 14 (03) ◽  
pp. 1450014 ◽  
Author(s):  
Jian Lin ◽  
Bo Peng ◽  
Tianrui Li

Image segmentation is a fundamental task in automatic image analysis. However, there is still no generally accepted effectiveness measure suitable for evaluating segmentation quality in every application. In this paper, we propose an evaluation framework that benefits from multiple stand-alone measures. To this end, different segmentation evaluation measures are chosen to evaluate segmentations separately, and their results are effectively combined using machine learning methods. We train and evaluate this framework on our new segmentation dataset, which contains images of varied content with segmentation ground truth, as well as on the Weizmann segmentation database (WSD). In addition, we provide human evaluations of image segmentation pairs to benchmark the evaluation results of the measures. Experimental results show better performance than the stand-alone methods.
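As a rough illustration of the combination step, one might feed the stand-alone measure scores into a learned regressor that predicts the human evaluation; the choice of RandomForestRegressor and the toy numbers below are assumptions, not the paper's exact model.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Each row holds stand-alone measure scores (e.g., region-based,
# boundary-based, information-theoretic) for one segmentation; the
# target is the human evaluation score the framework learns to predict.
measure_scores = np.array([[0.82, 0.64, 0.71],
                           [0.40, 0.55, 0.38]])   # hypothetical values
human_scores = np.array([0.78, 0.45])             # hypothetical values

combiner = RandomForestRegressor(n_estimators=100, random_state=0)
combiner.fit(measure_scores, human_scores)
quality = combiner.predict(measure_scores)        # combined quality estimate
```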



2020 ◽  
Vol 34 (05) ◽  
pp. 7969-7976 ◽  
Author(s):  
Junjie Hu ◽  
Yu Cheng ◽  
Zhe Gan ◽  
Jingjing Liu ◽  
Jianfeng Gao ◽  
...  

Previous storytelling approaches have mostly focused on optimizing traditional metrics such as BLEU, ROUGE, and CIDEr. In this paper, we re-examine this problem from a different angle, looking deep into what defines a natural and topically-coherent story. To this end, we propose three assessment criteria: relevance, coherence, and expressiveness, which we observe through empirical analysis could constitute a “high-quality” story to the human eye. We further propose a reinforcement learning framework, ReCo-RL, with reward functions designed to capture the essence of these quality criteria. Experiments on the Visual Storytelling Dataset (VIST) with both automatic and human evaluation demonstrate that our ReCo-RL model achieves better performance than state-of-the-art baselines on both traditional metrics and the proposed new criteria.
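The three criteria translate naturally into a composite reward. Below is a minimal sketch of that shape; the placeholder scoring functions and weights are illustrative assumptions, not the paper's actual reward definitions.

```python
def coherence_score(sentences):
    """Placeholder: mean Jaccard word overlap between consecutive sentences."""
    if len(sentences) < 2:
        return 0.0
    overlaps = []
    for a, b in zip(sentences, sentences[1:]):
        wa, wb = set(a.lower().split()), set(b.lower().split())
        overlaps.append(len(wa & wb) / max(len(wa | wb), 1))
    return sum(overlaps) / len(overlaps)

def expressiveness_score(sentences):
    """Placeholder: type-token ratio as a crude lexical-diversity proxy."""
    tokens = " ".join(sentences).lower().split()
    return len(set(tokens)) / max(len(tokens), 1)

def story_reward(sentences, relevance, w=(1.0, 1.0, 1.0)):
    """Weighted sum over the three criteria; `relevance` stands in for an
    image-text grounding score that ReCo-RL would compute from the photos."""
    return (w[0] * relevance
            + w[1] * coherence_score(sentences)
            + w[2] * expressiveness_score(sentences))
```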



2019 ◽  
Vol 5 (1) ◽  
Author(s):  
Xiao-Ming Zhang ◽  
Zezhu Wei ◽  
Raza Asad ◽  
Xu-Chen Yang ◽  
Xin Wang

Reinforcement learning has been widely used in many problems, including quantum control of qubits. However, such problems can also be solved by traditional, non-machine-learning methods, such as stochastic gradient descent and Krotov algorithms, and it remains unclear which is most suitable when the control has specific constraints. In this work, we perform a comparative study on the efficacy of three reinforcement learning algorithms: tabular Q-learning, deep Q-learning, and policy gradient, as well as two non-machine-learning methods: stochastic gradient descent and Krotov algorithms, in the problem of preparing a desired quantum state. We found that, overall, the deep Q-learning and policy gradient algorithms outperform the others when the problem is discretized, e.g., when only discrete control values are allowed, and when the problem scales up. The reinforcement learning algorithms can also adaptively reduce the complexity of the control sequences, shortening the operation time and improving the fidelity. Our comparison provides insights into the suitability of reinforcement learning in quantum control problems.
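A minimal sketch of the state-preparation setting being compared: at each step a discrete control value is chosen, the qubit evolves, and the reward is the fidelity with the target state. The Hamiltonian, time step, and control set below are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)   # Pauli X
SZ = np.array([[1, 0], [0, -1]], dtype=complex)  # Pauli Z

def evolve(psi, u, dt=0.1):
    """Apply exp(-i * H * dt) to psi for H = u * sigma_z + sigma_x."""
    H = u * SZ + SX
    vals, vecs = np.linalg.eigh(H)
    U = vecs @ np.diag(np.exp(-1j * vals * dt)) @ vecs.conj().T
    return U @ psi

def fidelity(psi, target):
    """|<target|psi>|^2, the reward signal for the learning agent."""
    return float(abs(np.vdot(target, psi)) ** 2)

psi = np.array([1, 0], dtype=complex)                  # start in |0>
target = np.array([1, 1], dtype=complex) / np.sqrt(2)  # target |+>
for u in [4, -4, 0, 4]:                                # a discrete control sequence
    psi = evolve(psi, u)
print(fidelity(psi, target))
```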





2008 ◽  
Vol 38 (6) ◽  
pp. 1357-1365 ◽  
Author(s):  
Anna V. Tikina ◽  
John L. Innes

With increasing concerns about the costs of forest management, there is a need to rigorously evaluate any management activities that add to costs. Certification has been widely adopted at considerable financial cost to those managing forests. Although there have been many studies of the impacts of certification, there is no comprehensive framework for assessing whether or not certification has been effective in achieving its goals. To do this, certification needs to be viewed as a part of an international environmental regime. Using established methodologies, this paper applies an evaluation framework and examines forest certification effectiveness in a number of categories: problem solving, goal attainment, behavioural effectiveness, process effectiveness, constitutive effectiveness, and evaluative effectiveness. It is too early to assess its effectiveness in problem solving and goal attainment. However, forest certification has been quite successful at process and constitutive effectiveness and is now widely recognized by a range of institutions. Its effectiveness in changing behaviours is less clear, and its evaluative effectiveness remains to be determined.



2020 ◽  
Vol 34 (05) ◽  
pp. 9306-9313 ◽  
Author(s):  
Liqiang Xiao ◽  
Lu Wang ◽  
Hao He ◽  
Yaohui Jin

Jointly using extractive and abstractive summarization methods can combine their complementary advantages, generating summaries that are both informative and concise. Existing methods that adopt an extract-then-abstract strategy have achieved impressive results, yet they suffer from information loss in the abstraction step because they compress all the selected sentences without distinction. Especially when a whole sentence is summary-worthy, salient content can be lost by compression. To address this problem, we propose HySum, a hybrid framework for summarization that can flexibly switch between copying a sentence and rewriting it according to its degree of redundancy. In this way, our approach effectively combines the advantages of the two branches of summarization, balancing informativeness and conciseness. Moreover, based on hierarchical reinforcement learning, we propose an end-to-end reinforcement method that bridges the extraction module and the rewriting module, enhancing the cooperation between them. Automatic evaluation shows that our approach significantly outperforms the state of the art on the CNN/DailyMail corpus. Human evaluation also demonstrates that our generated summaries are more informative and concise than those of popular models.
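The copy-or-rewrite switch can be sketched as follows; the fixed threshold here is an assumption for illustration, whereas HySum learns this decision end-to-end with hierarchical reinforcement learning.

```python
def summarize_sentence(sentence, redundancy, rewrite_fn, threshold=0.3):
    """Illustrative copy-or-rewrite switch in the spirit of HySum.

    redundancy: estimated fraction of non-salient content in the sentence
                (how it is estimated is model-specific; a scalar here)
    rewrite_fn: abstractive rewriter (e.g., a seq2seq model), applied only
                when compression is worthwhile
    threshold:  assumed cut-off; HySum learns this decision rather than
                using a fixed rule
    """
    if redundancy < threshold:
        return sentence            # summary-worthy as-is: copy verbatim
    return rewrite_fn(sentence)    # compress: rewrite abstractively
```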



2020 ◽  
Vol 34 (03) ◽  
pp. 2693-2700 ◽  
Author(s):  
Paul Hongsuck Seo ◽  
Piyush Sharma ◽  
Tomer Levinboim ◽  
Bohyung Han ◽  
Radu Soricut

Human ratings are currently the most accurate way to assess the quality of an image captioning model, yet most often the only outcome used from an expensive human rating evaluation is a few overall statistics over the evaluation dataset. In this paper, we show that the signal from instance-level human caption ratings can be leveraged to improve captioning models, even when the number of caption ratings is several orders of magnitude smaller than the caption training data. We employ a policy gradient method to maximize the human ratings as rewards in an off-policy reinforcement learning setting, where policy gradients are estimated from samples drawn from a distribution that focuses on the captions in a caption-ratings dataset. Our empirical evidence indicates that the proposed method learns to generalize the human raters' judgments to a previously unseen set of images, as judged by a different set of human judges, and additionally on a different, multi-dimensional side-by-side human evaluation procedure.
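A minimal sketch of an off-policy policy-gradient objective with human ratings as rewards, using importance weights to correct for sampling from the ratings dataset rather than the current policy; the clipping constant is an assumed stabilizer, not the paper's exact estimator.

```python
import numpy as np

def off_policy_pg_loss(log_p_model, log_p_behavior, ratings, clip=5.0):
    """Sketch of an off-policy policy-gradient loss.

    log_p_model:    log pi_theta(caption | image) under the current policy
    log_p_behavior: log probability under the distribution that produced
                    the rated captions (the behavior policy)
    ratings:        instance-level human ratings used as rewards
    In an autodiff framework the importance weights would be treated as
    constants (detached from the gradient).
    """
    iw = np.clip(np.exp(log_p_model - log_p_behavior), 0.0, clip)
    return float(-np.mean(iw * ratings * log_p_model))
```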


