A Decision-Making Model for Self-Driving Vehicles Based on Overtaking Frequency

2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Mengyuan Huang ◽  
Shiwu Li ◽  
Mengzhu Guo ◽  
Lihong Han

The driving state of a self-driving vehicle is an important input to its decision system. To ensure safe and efficient driving, this state needs to be evaluated quantitatively. In this paper, a driving state assessment method for the decision system of self-driving vehicles is proposed. First, the self-driving vehicle and the surrounding vehicles are compared in terms of overtaking frequency (OTF), and an OTF-based driving state evaluation algorithm that accounts for future driving efficiency is proposed. Next, a decision model based on the deep deterministic policy gradient (DDPG) algorithm and the proposed method is designed, and the driving state assessment is integrated with the existing time-to-collision (TTC) and minimum-safe-distance criteria. In addition, the reward function and multiple driving scenarios are designed so that the most efficient driving strategy at the current moment can be found by optimal search while safety is ensured. Finally, the proposed decision model is verified by simulations in four three-lane highway scenarios. The simulation results show that the decision model integrating the driving state assessment method helps self-driving vehicles drive safely while maintaining good maneuverability.
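To make the combination of efficiency and safety terms more concrete, the following is a minimal sketch of a reward of the kind such a DDPG decision model might use, mixing an OTF comparison against surrounding traffic with TTC and minimum-safe-distance penalties. The function name, weights, and thresholds (reward, w_eff, w_ttc, w_gap, ttc_safe, d_min) are illustrative assumptions, not the paper's parameters.

# Sketch of a reward combining driving efficiency (via overtaking frequency,
# OTF) with safety terms (TTC and minimum safe distance). All weights and
# thresholds below are illustrative assumptions.

def reward(otf_ego, otf_mean, ttc, gap, ttc_safe=3.0, d_min=10.0,
           w_eff=1.0, w_ttc=2.0, w_gap=2.0):
    # Efficiency: positive when the ego vehicle overtakes more often than
    # the surrounding traffic (i.e., it is not being held up).
    r_eff = w_eff * (otf_ego - otf_mean)
    # Safety: penalize time-to-collision below a safe threshold...
    r_ttc = -w_ttc * max(0.0, ttc_safe - ttc)
    # ...and headway gaps below the minimum safe distance.
    r_gap = -w_gap * max(0.0, d_min - gap)
    return r_eff + r_ttc + r_gap

# Example: slightly faster than surrounding traffic, but following too closely.
print(reward(otf_ego=0.4, otf_mean=0.3, ttc=2.0, gap=8.0))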

Electronics ◽  
2021 ◽  
Vol 10 (7) ◽  
pp. 870
Author(s):  
Yangyang Hou ◽  
Huajie Hong ◽  
Zhaomei Sun ◽  
Dasheng Xu ◽  
Zhe Zeng

As a research hotspot in the field of artificial intelligence, deep reinforcement learning can help a manipulator learn its motion ability without a kinematic model. To suppress the overestimation bias of value estimates in Deep Deterministic Policy Gradient (DDPG) networks, the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm was proposed. This paper further suppresses this overestimation bias for multi-degree-of-freedom (DOF) manipulator learning based on deep reinforcement learning, proposing the Twin Delayed Deep Deterministic Policy Gradient with Rebirth Mechanism (RTD3). The experimental results show that RTD3 applied to multi-DOF manipulators improves learning ability by 29.15% over TD3. A step-by-step reward function is proposed specifically for learning the motion ability of a multi-DOF manipulator. The task is treated as a continuous decision-making process to guide the learning of the manipulator, and learning efficiency is improved by optimizing experience replay. To measure the point-to-point positioning motion ability of a manipulator, a new evaluation index based on the characteristics of the continuous decision process, the energy efficiency distance, is presented, which evaluates the learning quality of the manipulator's motion ability with a more comprehensive and fair evaluation algorithm.
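For reference, the overestimation suppression that RTD3 inherits from TD3 comes from the clipped double-Q target, sketched below in PyTorch. This shows only the standard TD3 target computation; the rebirth mechanism and step-by-step reward are specific to the paper and are not reproduced. The network objects (target_actor, target_q1, target_q2) are assumed to be ordinary PyTorch modules.

import torch

# Standard TD3 target: evaluate two target critics and keep the smaller
# estimate, which suppresses value overestimation.

def td3_target(reward, next_state, done, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, max_action=1.0):
    with torch.no_grad():
        # Target policy smoothing: add clipped noise to the target action.
        next_action = target_actor(next_state)
        noise = (torch.randn_like(next_action) * noise_std).clamp(-noise_clip, noise_clip)
        next_action = (next_action + noise).clamp(-max_action, max_action)
        # Clipped double-Q learning: take the minimum of the twin critics.
        q_next = torch.min(target_q1(next_state, next_action),
                           target_q2(next_state, next_action))
        return reward + gamma * (1.0 - done) * q_next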


2019 ◽  
pp. 125-133
Author(s):  
Duong Truong Thi Thuy ◽  
Anh Pham Thi Hoang

Banking has always played an important role because of its effects on individuals as well as on the economy as a whole. In the process of renovation and modernization of the country, the system of commercial banks has changed dramatically: business models and services have become more diversified. Therefore, the performance of commercial banks continually attracts the attention of managers, supervisors, banks and customers. Bank ranking can be viewed as a multi-criteria decision model. This article uses the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) to rank some commercial banks in Vietnam.
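Since TOPSIS is a well-defined procedure, a compact sketch is given below: normalize and weight the decision matrix, find the ideal and anti-ideal alternatives, and rank by the closeness coefficient. The decision matrix, weights, and criterion directions are illustrative values, not the article's data.

import numpy as np

# Minimal TOPSIS sketch for ranking alternatives (e.g., banks) on several
# criteria. Inputs below are illustrative.

def topsis(matrix, weights, benefit):
    X = np.asarray(matrix, dtype=float)
    # 1. Vector-normalize each criterion column and apply the weights.
    V = weights * X / np.linalg.norm(X, axis=0)
    # 2. Ideal and anti-ideal solutions (max for benefit, min for cost criteria).
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
    anti = np.where(benefit, V.min(axis=0), V.max(axis=0))
    # 3. Euclidean distances to both reference points.
    d_plus = np.linalg.norm(V - ideal, axis=1)
    d_minus = np.linalg.norm(V - anti, axis=1)
    # 4. Closeness coefficient: higher means closer to the ideal solution.
    return d_minus / (d_plus + d_minus)

scores = topsis(matrix=[[0.12, 0.9, 35.0],
                        [0.10, 1.2, 40.0],
                        [0.15, 0.8, 30.0]],
                weights=np.array([0.5, 0.3, 0.2]),
                benefit=np.array([True, False, True]))
print(np.argsort(-scores))  # alternative indices ranked best to worst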


Author(s):  
Feng Pan ◽  
Hong Bao

This paper proposes a new approach of using reinforcement learning (RL) to train an agent to perform the task of vehicle following with human driving characteristics. We draw on the idea of inverse reinforcement learning to design the reward function of the RL model. The factors that need to be weighed in vehicle following were vectorized into reward vectors, and the reward function was defined as the inner product of the reward vector and a weight vector. Driving data of human drivers were collected and analyzed to obtain the true reward function. The RL model was trained with the deterministic policy gradient algorithm because the state and action spaces are continuous. We adjusted the weight vector of the reward function so that the value vector of the RL model continuously approached that of a human driver. After dozens of rounds of training, we selected the policy whose value vector was nearest to that of a human driver and tested it in the PanoSim simulation environment. The results showed the desired performance: the agent followed the preceding vehicle safely and smoothly.
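The inner-product reward structure is easy to illustrate. Below is a minimal sketch with hypothetical car-following features (headway error, relative-speed error, comfort); the feature choices, the desired headway of 25 m, and the weights are assumptions for illustration, not the paper's values.

import numpy as np

# Reward as the inner product of a feature (reward) vector and a weight
# vector, the structure described above. Features and weights are illustrative.

def following_features(gap, rel_speed, accel):
    desired_gap = 25.0                      # assumed desired headway in metres
    return np.array([-abs(gap - desired_gap),   # headway error
                     -abs(rel_speed),           # relative-speed error
                     -abs(accel)])              # comfort (penalize harsh accel)

w = np.array([0.5, 0.3, 0.2])               # weights adjusted during training

def reward(gap, rel_speed, accel):
    return float(np.dot(w, following_features(gap, rel_speed, accel)))

print(reward(gap=20.0, rel_speed=1.5, accel=0.8))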


2021 ◽  
pp. 1-10
Author(s):  
Wei Zhou ◽  
Xing Jiang ◽  
Bingli Guo (Member, IEEE) ◽  
Lingyu Meng

Currently, Quality-of-Service (QoS)-aware routing is one of the crucial challenges in Software Defined Networking (SDN). QoS metrics, e.g., latency, packet loss ratio and throughput, must be optimized to improve network performance. Traditional static routing algorithms based on Open Shortest Path First (OSPF) cannot adapt to traffic fluctuation, which may cause severe network congestion and service degradation. The central intelligence of the SDN controller and recent breakthroughs in Deep Reinforcement Learning (DRL) offer a promising solution to this challenge. Thus, we propose an on-policy DRL mechanism, namely the PPO-based (Proximal Policy Optimization) QoS-aware Routing Optimization Mechanism (PQROM), to achieve general and re-customizable routing optimization. PQROM can dynamically update the routing calculation by adjusting the reward function according to different optimization objectives, and it is independent of any specific network pattern. Additionally, as a black-box one-step optimization, PQROM handles both continuous and discrete action spaces with high-dimensional input and output. The OMNeT++ simulation results show that PQROM not only converges well but also offers better stability than OSPF, shorter training time and simpler hyper-parameter tuning than Deep Deterministic Policy Gradient (DDPG), and lower hardware consumption than Asynchronous Advantage Actor-Critic (A3C).
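The "re-customizable reward" idea can be illustrated with a small sketch: a weighted combination of the measured QoS metrics, where switching the optimization objective only means changing the weights. The function name, weights, and normalization constants below are assumptions for illustration, not the values used by PQROM.

# Sketch of an adjustable QoS reward: lower latency and loss are better,
# higher throughput is better. Weights and normalizers are illustrative.

def qos_reward(latency_ms, loss_ratio, throughput_mbps,
               weights=(0.4, 0.3, 0.3),
               latency_ref=100.0, throughput_ref=1000.0):
    w_lat, w_loss, w_thr = weights
    return (-w_lat * latency_ms / latency_ref
            - w_loss * loss_ratio
            + w_thr * throughput_mbps / throughput_ref)

# Re-customizing the objective only means changing the weight tuple,
# e.g. (0.8, 0.1, 0.1) for a latency-sensitive service.
print(qos_reward(latency_ms=35.0, loss_ratio=0.01, throughput_mbps=420.0))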


2016 ◽  
Vol 7 (4) ◽  
pp. 16-26
Author(s):  
Uk Jung ◽  
Seongmin Yim ◽  
Sunguk Lim ◽  
Chongman Kim

AHP and the Kano model are such prevalent TQM tools that it may be surprising that a true hybrid decision-making model has so far eluded researchers. The quest for a hybrid approach is complicated by the differing output perspective of each model, namely discrete ranking (AHP) versus a multi-dimensional picture (Kano). This paper presents a hybrid of the AHP and Kano models, the so-called two-dimension AHP (2D-AHP). It first compares the two approaches and justifies a hybrid model based on a simple conceit drawn from the Kano perspective: given a decision hierarchy, child and parent elements can exhibit multi-dimensional relationships under different circumstances. Based on this premise, the authors construct a hybrid two-dimension AHP model whereby a functional-dysfunctional question-pair technique is incorporated into a traditional AHP framework. Using the proposed hybrid model, the paper provides a practical test case of its implementation. The 2D-AHP approach revealed important evaluation variances obscured by AHP, while a survey study confirmed that the 2D-AHP approach is both feasible and preferred in some respects by respondents. Although there have been rich research efforts to combine AHP and the Kano model, most of them simply use each methodology in turn. The type of hybridization between AHP and the Kano model in this paper is unique in its two-dimensional perspective. The model provides a general approach with application possibilities far beyond the scope of the test case and its problem structure, and so calls for application and validation in new cases.
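For context, the AHP step that 2D-AHP builds on is the derivation of a priority vector from a pairwise comparison matrix; a common geometric-mean approximation is sketched below. The functional-dysfunctional question-pair hybridization itself is specific to the paper and is not reproduced here; the 3-criterion matrix is an illustrative example.

import numpy as np

# Priority vector from an AHP pairwise comparison matrix via the
# geometric-mean (row-mean) approximation.

def ahp_priorities(pairwise):
    A = np.asarray(pairwise, dtype=float)
    geo_mean = np.prod(A, axis=1) ** (1.0 / A.shape[1])
    return geo_mean / geo_mean.sum()

A = [[1.0, 3.0, 5.0],
     [1/3., 1.0, 2.0],
     [1/5., 1/2., 1.0]]
print(ahp_priorities(A))   # priority weights for the three criteria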


Author(s):  
M. Frelih ◽  
A. Fedorova

The article is devoted to the study of factors that have a negative impact on the well-being of employees in the workplace. Special attention is paid to the problem of presenteeism, illustrated by the case of a large metallurgical enterprise. A review of foreign and domestic publications leads to the conclusion that specialists still lack reliable and valid tools for studying the presenteeism phenomenon in organizations. The purpose of the research presented in the paper is to examine the influence of factors of the intra-organizational environment on personnel well-being and to assess the level of presenteeism at the enterprise. Empirical data were obtained by conducting a sociological survey of various categories of workers and by assessing the impact of presenteeism on the economic indicators of the studied enterprise. For the subsequent in-depth study of health problems in the workplace, the authors have developed a research tool based on a digital service that monitors how employees feel through self-assessment, determining the level of physical and psychosocial well-being of staff.


2021 ◽  
Vol 13 (4) ◽  
pp. 189-202
Author(s):  
M.M. Dmitrieva ◽  
S.V. Umnov ◽  
D.A. Podolsky

The existing tools for assessing the effectiveness of educational programs differ in their effectiveness, applicability, and cost. One of them is the self-assessment method, in which students of educational programs rate how strongly they express various qualities before and after completing the program. Assessing one's own competencies has significant limitations associated with social desirability. Nevertheless, the article analyzes the possibilities of using this method to assess the effectiveness of corporate educational programs. The results of using the self-assessment method to assess the effectiveness of corporate educational programs conducted in large organizations are presented. The possibilities and limitations of various criteria by which trainees evaluate themselves are substantiated. It is concluded that the self-assessment method can be used when educational programs focus on developing managers' meta-competencies. The article provides data on the assessment of learning outcomes, discusses the potential reasons for the differences obtained across projects, as well as the possibilities and limitations of using the self-assessment method to assess the educational effect of corporate training programs.


2021 ◽  
Author(s):  
Yijun Liu ◽  
Daopin Chen ◽  
Muxin Diao ◽  
Guangyu Xiao ◽  
Jing Yan ◽  
...  

Author(s):  
Jose Leao E Silva Filho ◽  
Danielle Costa Morais

This paper presents a group decision-making model using a distance aggregator based on the Ordered Weighted Distance (OWD), which offers a solution that can reduce disagreement between decision makers (DMs). The paper discusses decision rules and sets out measures to evaluate compensatory effects that have a bearing on DMs' opinions. The model uses distance formulations to reveal differences in opinion among DMs and discusses the meaning of each distance and the information provided by each DM. Finally, a case study of a logistics problem illustrates how the model is applied.
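An OWD-style aggregation is simple to sketch: the per-criterion distances between two DMs' evaluations are sorted and combined with an ordered weight vector, so the weights control how much the largest disagreements dominate. The ratings, weights, and parameter p below are illustrative assumptions, not the paper's model.

import numpy as np

# Ordered Weighted Distance sketch: sort per-criterion distances in
# descending order, then take a weighted sum. Inputs are illustrative.

def owd(x, y, weights, p=1):
    d = np.abs(np.asarray(x, float) - np.asarray(y, float)) ** p
    d_sorted = np.sort(d)[::-1]          # largest disagreements first
    return float(np.dot(weights, d_sorted)) ** (1.0 / p)

dm1 = [7, 5, 9, 6]                       # DM 1's ratings on four criteria
dm2 = [6, 8, 9, 4]                       # DM 2's ratings
w = np.array([0.4, 0.3, 0.2, 0.1])       # emphasizes the largest disagreement
print(owd(dm1, dm2, w))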

