cross entropy method

Sensors ◽  
2021 ◽  
Vol 21 (22) ◽  
pp. 7611
Author(s):  
Appasamy C. Sumathi ◽  
Muthuramalingam Akila ◽  
Rocío Pérez de Prado ◽  
Marcin Wozniak ◽  
Parameshachari Bidare Divakarachari

Smart home and smart building systems based on the Internet of Things (IoT) in smart cities currently suffer from security issues. In particular, data trustworthiness and efficiency are two major concerns in IoT-based Wireless Sensor Networks (WSNs). Various approaches, such as routing methods, intrusion detection, and path selection, have been applied to improve the security and efficiency of real-time networks. Path selection and malicious node discovery provide better solutions in terms of security and efficiency. This study proposes the Dynamic Bargaining Game (DBG) method for node selection and data transfer to increase data trustworthiness and efficiency. Data trustworthiness and efficiency are considered in the Pareto optimal solution used to select the node, and the bargaining method assigns a disagreement measure to the nodes to eliminate malicious nodes from the node selection. The DBG method performs the search process in a distributed manner, which helps to find an effective solution for dynamic networks. In this study, data trustworthiness was measured based on the node used for data transmission, and throughput was measured to analyze efficiency. An SF attack was simulated in the network and the packet delivery ratio was measured to test the resilience of the DBG and existing methods. The packet delivery ratio results showed that the DBG method has higher resilience than the existing methods in a dynamic network. Moreover, for 100 nodes, the DBG method achieves a data trustworthiness of 98% and a throughput of 398 Mbps, whereas the existing fuzzy cross entropy method achieves a data trustworthiness of 94% and a throughput of 334 Mbps.


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Jian Liu ◽  
Liming Feng

Reinforcement learning algorithms based on policy gradients may become trapped in local optima because of vanishing gradients during the update process, which in turn limits the exploration ability of the reinforcement learning agent. To solve this problem, this paper combines the cross-entropy method (CEM) from evolutionary policy search, maximum mean discrepancy (MMD), and the twin delayed deep deterministic policy gradient algorithm (TD3) to propose a diversity evolutionary policy deep reinforcement learning (DEPRL) algorithm. By using the maximum mean discrepancy as a measure of the distance between different policies, some of the policies in the population maximize their distance from the previous generation of policies while maximizing the cumulative return during the gradient update. Furthermore, combining the cumulative returns and the distance between policies as the fitness of the population encourages more diversity in the offspring policies, which in turn reduces the risk of falling into local optima because of vanishing gradients. The results in the MuJoCo test environment show that DEPRL achieves excellent performance on continuous control tasks; in particular, in the Ant-v2 environment, the return of DEPRL ultimately improved by nearly 20% compared to TD3.
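The distance measure named in this abstract can be illustrated with a minimal sketch of a squared-MMD estimate between two sample sets. The Gaussian kernel, the `bandwidth` parameter, and the function names are assumptions made for illustration; the abstract does not say which kernel or estimator DEPRL uses.

```python
import math


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors (assumed kernel choice)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between sample sets xs and ys."""

    def mean_kernel(ps, qs):
        # Average kernel value over all pairs (p, q).
        total = sum(gaussian_kernel(p, q, bandwidth) for p in ps for q in qs)
        return total / (len(ps) * len(qs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical action samples from two policies evaluated on the same states.
actions_a = [(0.0, 0.0), (0.1, -0.1)]
actions_b = [(2.0, 2.0), (2.1, 1.9)]
print(mmd_squared(actions_a, actions_b))
```

A larger value indicates that the two sets of samples are further apart under the chosen kernel, which is the kind of signal a diversity term could reward.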


Author(s):  
Haobo Jiang ◽  
Jianjun Qian ◽  
Jin Xie ◽  
Jian Yang

Point cloud registration is a fundamental problem in 3D computer vision. In this paper, we cast point cloud registration as a planning problem in reinforcement learning, which seeks the transformation between the source and target point clouds through trial and error. By modeling the point cloud registration process as a Markov decision process (MDP), we develop a latent dynamic model of point clouds, consisting of a transformation network and an evaluation network. The transformation network aims to predict the new transformed feature of the point cloud after a rigid transformation (i.e., an action) is applied to it, while the evaluation network aims to predict the alignment precision between the transformed source point cloud and the target point cloud as the reward signal. Once the dynamic model of the point cloud is trained, we employ the cross-entropy method (CEM) to iteratively update the planning policy by maximizing the rewards in the point cloud registration process. Thus, the optimal policy, i.e., the transformation between the source and target point clouds, can be obtained by gradually narrowing the search space of the transformation. Experimental results on the ModelNet40 and 7Scene benchmark datasets demonstrate that our method can yield good registration performance in an unsupervised manner.
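The "gradually narrowing the search space" behavior of CEM can be sketched generically as follows. This is a toy illustration, not the authors' planner: the `score` callback stands in for their learned evaluation network, and the diagonal Gaussian sampler, `elite_frac`, and all other names and defaults are assumptions.

```python
import random
import statistics


def cem_maximize(score, dim, iterations=30, population=64, elite_frac=0.125, seed=0):
    """Maximize score(x) over R^dim with a diagonal-Gaussian cross-entropy method."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [1.0] * dim
    n_elite = max(2, int(population * elite_frac))
    for _ in range(iterations):
        # Sample candidate solutions from the current search distribution.
        samples = [
            [rng.gauss(m, s) for m, s in zip(mean, std)]
            for _ in range(population)
        ]
        # Keep the highest-scoring candidates (the elite set).
        samples.sort(key=score, reverse=True)
        elites = samples[:n_elite]
        # Refit the distribution to the elites; the std shrinks as they agree.
        columns = list(zip(*elites))
        mean = [statistics.fmean(c) for c in columns]
        std = [statistics.pstdev(c) + 1e-6 for c in columns]
    return mean


# Hypothetical reward: a quadratic bowl peaking at (0.5, -0.5).
best = cem_maximize(lambda x: -((x[0] - 0.5) ** 2 + (x[1] + 0.5) ** 2), dim=2)
print(best)
```

In a planning setting, each sample would encode a candidate action (here, a rigid transformation), and the score would be the predicted reward for that action.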


2021 ◽  
Vol 13 (10) ◽  
pp. 5386
Author(s):  
Qun Niu ◽  
Ming You ◽  
Zhile Yang ◽  
Yang Zhang

Conventional electrical power system economic dispatch (ED) often pursues only immediate economic benefits and neglects the harmful environmental impacts of gas emissions from thermal power plants. To address this shortfall, economic emission dispatch (EED) has drawn a lot of attention in recent years. With the increasing penetration of renewable generation, the intermittency and uncertainty of renewable energy sources such as solar and wind power increase the difficulty of power system scheduling. To enhance dispatch performance under significant penetration of renewable energy, a modified multi-objective cross entropy algorithm (MMOCE) is proposed in this paper. To solve multi-objective optimization problems, a crowding-distance calculation technique and a novel external archive mechanism are introduced into the conventional cross entropy method. Additionally, the population updating process is simplified by introducing a self-adaptive parameter operator that substitutes for the smoothing parameters, while the solution diversity and the adaptability to large-scale systems are improved by introducing a crossover operator. Finally, a two-stage evolutionary mechanism further enhances the diversity and the rate of convergence. To verify the efficacy of the proposed MMOCE, eight benchmark functions and three different test systems with different mixes of renewable energy sources are employed. The dispatch results of the proposed MMOCE are compared with those of other multi-objective cross entropy algorithms and published heuristic methods, confirming the superiority of the proposed MMOCE over the other methods in all test systems.
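For readers unfamiliar with crowding distance, a minimal sketch of the widely used NSGA-II-style calculation is given below. The abstract does not specify which variant MMOCE uses, so this formulation and the function name `crowding_distance` are assumptions for illustration only.

```python
def crowding_distance(points):
    """NSGA-II-style crowding distance for a list of objective vectors.

    Boundary points of each objective receive infinite distance; interior
    points accumulate the normalized gap between their two neighbors.
    """
    n = len(points)
    if n == 0:
        return []
    distance = [0.0] * n
    for m in range(len(points[0])):
        # Indices of the points sorted by the m-th objective.
        order = sorted(range(n), key=lambda i: points[i][m])
        low, high = points[order[0]][m], points[order[-1]][m]
        distance[order[0]] = distance[order[-1]] = float("inf")
        if high == low:
            continue  # All values equal: this objective adds no spread.
        for k in range(1, n - 1):
            gap = points[order[k + 1]][m] - points[order[k - 1]][m]
            distance[order[k]] += gap / (high - low)
    return distance


# Hypothetical (cost, emission) pairs on a non-dominated front.
front = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
print(crowding_distance(front))
```

Solutions with a larger crowding distance lie in sparser regions of the front, so preferring them when pruning an archive tends to preserve diversity.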

