Large-Scale Interactive Recommendation with Tree-Structured Policy Gradient

Author(s):  
Haokun Chen ◽  
Xinyi Dai ◽  
Han Cai ◽  
Weinan Zhang ◽  
Xuejian Wang ◽  
...  

Reinforcement learning (RL) has recently been introduced to interactive recommender systems (IRS) because it learns from dynamic interactions and plans for long-run performance. Because an IRS typically has thousands of items to recommend (i.e., thousands of actions), most existing RL-based methods fail to handle such a large discrete action space and thus become inefficient. Existing work that addresses the large discrete action space with the deep deterministic policy gradient framework suffers from an inconsistency between the continuous action representation (the output of the actor network) and the real discrete action. To avoid this inconsistency and achieve high efficiency and recommendation effectiveness, we propose a Tree-structured Policy Gradient Recommendation (TPGR) framework, in which a balanced hierarchical clustering tree is built over the items and picking an item is formulated as seeking a path from the root to a certain leaf of the tree. Extensive experiments on carefully designed environments based on two real-world datasets demonstrate that our model provides superior recommendation performance and significant efficiency improvement over state-of-the-art methods.
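
The path-seeking formulation can be illustrated with a minimal sketch. It assumes a balanced tree with `branching` children per node and `depth` levels, so the leaves index `branching ** depth` items, and it assumes each non-leaf node carries its own distribution over children. The class `TreePolicy`, the method `sample_item`, and the random placeholder logits are hypothetical stand-ins for trained policy networks and are not taken from the paper.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TreePolicy:
    """Balanced tree whose leaves are item indices 0 .. branching**depth - 1.

    Each internal node is keyed by the path (tuple of child indices) that
    reaches it and holds placeholder logits over its children.
    """

    def __init__(self, branching, depth, seed=0):
        self.branching = branching
        self.depth = depth
        self._init_rng = random.Random(seed)
        self._logits = {}

    def _node_logits(self, path):
        # Lazily create placeholder logits (assumption: stands in for a network).
        if path not in self._logits:
            self._logits[path] = [
                self._init_rng.gauss(0.0, 1.0) for _ in range(self.branching)
            ]
        return self._logits[path]

    def sample_item(self, rng):
        """Walk from the root to a leaf, sampling one child per level."""
        path = ()
        log_prob = 0.0
        for _ in range(self.depth):
            probs = softmax(self._node_logits(path))
            child = rng.choices(range(self.branching), weights=probs)[0]
            log_prob += math.log(probs[child])
            path += (child,)
        # Read the path as a base-`branching` number to get the leaf (item) index.
        item = 0
        for child in path:
            item = item * self.branching + child
        return item, path, log_prob
```

With `branching=10` and `depth=3`, the sketch covers 1000 items while each recommendation scores only 3 x 10 = 30 child options instead of 1000 actions at once. The summed `log_prob` is the quantity a policy-gradient update would weight by the return.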

2018 ◽  
Vol 141 (2) ◽  
Author(s):  
Philip Odonkor ◽  
Kemper Lewis

The control of shared energy assets within building clusters has traditionally been confined to a discrete action space, owing in part to a computationally intractable decision space. In this work, we leverage the current state of the art in reinforcement learning (RL) for continuous control tasks, the deep deterministic policy gradient (DDPG) algorithm, to address this limitation. The goals of this paper are twofold: (i) to design an efficient charge/discharge dispatch policy for a shared battery system within a building cluster and (ii) to address the continuous-domain task of determining how much energy should be charged or discharged at each decision cycle. Experimentally, our results demonstrate an ability to exploit factors such as energy arbitrage, along with the continuous action space, to minimize demand peaks. This approach is shown to be computationally tractable, achieving efficient results after only 5 h of simulation. Additionally, the agent showed an ability to adapt to different building clusters, designing unique control strategies to address the energy demands of the clusters studied.
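
As a rough illustration of the continuous-domain task, the sketch below maps a continuous action to an amount of energy charged or discharged in one decision cycle. The function name `apply_battery_action`, the action range [-1, 1], the units, and the lossless battery model are all assumptions made for illustration; the abstract does not specify how the authors encode actions or model the battery.

```python
def apply_battery_action(action, soc_kwh, capacity_kwh, max_rate_kwh):
    """Turn a continuous action into a feasible energy transfer.

    action:       value in [-1, 1]; positive charges, negative discharges
                  (hypothetical encoding, clipped if out of range)
    soc_kwh:      energy currently stored in the battery
    capacity_kwh: maximum energy the battery can store
    max_rate_kwh: largest transfer allowed in one decision cycle

    Returns (energy_kwh, new_soc_kwh). Efficiency losses are ignored.
    """
    action = max(-1.0, min(1.0, action))
    requested = action * max_rate_kwh
    # Cannot discharge below empty or charge beyond capacity.
    energy = max(-soc_kwh, min(capacity_kwh - soc_kwh, requested))
    return energy, soc_kwh + energy
```

A discrete formulation would have to pick from a fixed menu of charge levels, whereas here any amount between the feasibility bounds can be chosen.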


Energies ◽  
2021 ◽  
Vol 14 (3) ◽  
pp. 584
Author(s):  
Luqin Fan ◽  
Jing Zhang ◽  
Yu He ◽  
Ying Liu ◽  
Tao Hu ◽  
...  

Microgrids have a flexible composition, a complex operation mechanism, and generate a large amount of data while operating. However, current optimization methods for microgrid scheduling do not effectively accumulate and utilize scheduling knowledge. This paper puts forward a microgrid optimal scheduling method based on Deep Deterministic Policy Gradient (DDPG) and Transfer Learning (TL). The method uses Reinforcement Learning (RL) to learn the scheduling strategy and accumulate the corresponding scheduling knowledge. Meanwhile, the DDPG model is introduced to extend the microgrid scheduling action from a discrete action space to a continuous action space. On this basis, a TL algorithm for microgrid optimal scheduling, based on the similarity of actual supply and demand, is proposed to make effective use of existing scheduling knowledge. The simulation results indicate that the proposed method can provide optimal scheduling strategies for microgrids with complex operation mechanisms flexibly and efficiently, by accumulating scheduling knowledge and reusing it through TL.


2021 ◽  
Vol 11 (19) ◽  
pp. 8823
Author(s):  
Shicheng Zhou ◽  
Jingju Liu ◽  
Dongdong Hou ◽  
Xiaofeng Zhong ◽  
Yue Zhang

Penetration testing is an effective way to test and evaluate cybersecurity by simulating a cyberattack. However, traditional methods rely heavily on domain expert knowledge, which incurs prohibitive labor and time costs. Autonomous penetration testing is a more efficient and intelligent way to solve this problem. In this paper, we model penetration testing as a Markov decision process and use reinforcement learning for autonomous penetration testing in large-scale networks. We propose an improved deep Q-network (DQN) named NDSPI-DQN to address the sparse reward problem and the large action space problem in large-scale scenarios. First, we integrate five extensions to DQN (noisy nets, soft Q-learning, dueling architectures, prioritized experience replay, and an intrinsic curiosity model) to improve exploration efficiency. Second, we decouple the action and split the estimators of the neural network to calculate the two elements of the action separately, so as to shrink the action space. Finally, the performance of the algorithms is investigated in a range of scenarios. The experimental results demonstrate that our methods have better convergence and scaling performance.
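
The effect of decoupling the action can be sketched with simple arithmetic. The abstract does not say what the two action elements are; the sketch below assumes, purely for illustration, that they are a target host and an action type, and that each has its own estimator producing Q-values. All function names are hypothetical.

```python
def joint_output_count(num_targets, num_action_types):
    """Outputs needed when every (target, type) pair is one action."""
    return num_targets * num_action_types


def decoupled_output_count(num_targets, num_action_types):
    """Outputs needed when each action element has its own estimator."""
    return num_targets + num_action_types


def select_decoupled_action(target_q, type_q):
    """Greedily pick each action element from its own list of Q-values."""
    target = max(range(len(target_q)), key=target_q.__getitem__)
    action_type = max(range(len(type_q)), key=type_q.__getitem__)
    return target, action_type
```

For 100 targets and 20 action types, the joint formulation needs 2000 outputs while the decoupled one needs 120, which is the kind of reduction the split estimators aim for.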


2018 ◽  
Author(s):  
Matthias May ◽  
Kira Rehfeld

Greenhouse gas emissions must be cut to limit global warming to 1.5-2 °C above preindustrial levels. Yet the rate of decarbonisation is currently too low to achieve this. Policy-relevant scenarios therefore rely on the permanent removal of CO₂ from the atmosphere. However, none of the envisaged technologies has demonstrated scalability to the decarbonization targets for the year 2050. In this analysis, we show that artificial photosynthesis for CO₂ reduction may deliver an efficient large-scale carbon sink. This technology is mainly developed towards solar fuels and its potential for negative emissions has been largely overlooked. With high efficiency and low sensitivity to high temperature and illumination conditions, it could, if developed towards a mature technology, present a viable approach to fill the gap in the negative emissions budget.


2020 ◽  
Vol 60 (1) ◽  
pp. 159-168
Author(s):  
V. V. Antonenko ◽  
A. V. Zubkov ◽  
S. N. Kruchina

Data were obtained from research carried out at the educational and experimental farm of the Timiryazev State Agrarian University in Moscow during 2018-2019. The surveys established the most dangerous diseases and pests of pome crops on this farm and identified the apple and pear varieties most resistant to the major diseases. The peculiarities of the development of alternariosis on pear are described, and the harmfulness of the disease on pear and apple seedlings is noted. A possible role of garden-protective plantations and weed vegetation in the transfer of alternariosis infection to fruit trees was noted, and a possible role of dicotyledonous weeds in the transfer of septoriosis and powdery mildew infection was established. The peculiarities of the spread of infection under the influence of wind direction are noted. The results and peculiarities of applying various bird-scaring methods in the orchard are presented. Route surveys identified the most harmful weeds. The possibility of using herbicides with different mechanisms of action for weed control in fruit orchards was studied. High efficiency and relative safety of contact herbicides are shown in nursery fields, in operational orchards, and for control of root suckers on fruit trees. Recommendations are given for the use of soil-applied and systemic herbicides in seedling beds, in the first and second fields of the nursery, in the production of large-scale planting material, and in operational orchards of fruit crops. The herbicides in question were found to be safe when used in accordance with the recommended methods of use.


2020 ◽  
Vol 18 (1) ◽  
pp. 287-294
Author(s):  
Harsasi Setyawati ◽  
Handoko Darmokoesoemo ◽  
Irmina Kris Murwani ◽  
Ahmadi Jaya Permana ◽  
Faidur Rochman

Meeting the demand for ecofriendly technologies that produce a reliable, large-scale supply of renewable energy remains a challenge. A solar cell based on DSSC (Dye-Sensitized Solar Cell) technology is environmentally friendly and holds the promise of high efficiency in converting sunlight into electricity. This manuscript describes the development of a light harvester system as a main part of a DSSC. Congo red dye has been functionalized with metals (Fe, Co, Ni), forming a series of complexes that serve as novel light harvesters in the solar cell. The metal-congo red complexes have been characterized by UV-Vis and FTIR spectroscopy and by elemental analyses. The performance of the metal complexes in capturing photons from sunlight has been investigated in a solar cell device. The incorporation of metals into congo red successfully improved the efficiency of congo red: Fe(II)-congo red, Co(II)-congo red and Ni(II)-congo red had efficiencies of 8.17%, 6.13% and 2.65%, respectively. This research also discusses the effect of metal ions on the ability of congo red to capture energy from sunlight.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Jing Zhao ◽  
Alan Blayney ◽  
Xiaorong Liu ◽  
Lauren Gandy ◽  
Weihua Jin ◽  
...  

Epigallocatechin gallate (EGCG) from green tea can induce apoptosis in cancerous cells, but the underlying molecular mechanisms remain poorly understood. Using SPR and NMR, here we report a direct, μM interaction between EGCG and the tumor suppressor p53 (KD = 1.6 ± 1.4 μM), with the disordered N-terminal domain (NTD) identified as the major binding site (KD = 4 ± 2 μM). Large-scale atomistic simulations (>100 μs), SAXS and AUC demonstrate that the EGCG-NTD interaction is dynamic and that EGCG causes the emergence of a subpopulation of compact bound conformations. The EGCG-p53 interaction disrupts p53 interaction with its regulatory E3 ligase MDM2 and inhibits ubiquitination of p53 by MDM2 in an in vitro ubiquitination assay, likely stabilizing p53 for anti-tumor activity. Our work provides insights into the mechanisms for EGCG’s anticancer activity and identifies p53 NTD as a target for cancer drug discovery through dynamic interactions with small molecules.


Author(s):  
Yuan-Ho Chen ◽  
Chieh-Yang Liu

In this paper, a very-large-scale integration (VLSI) design that supports the high-efficiency video coding inverse discrete cosine transform (IDCT) for multiple transform sizes is proposed. The proposed two-dimensional (2-D) IDCT is implemented in a small area by using a single one-dimensional (1-D) IDCT core with a transpose memory. The proposed 1-D IDCT core decomposes a 32-point transform into 16-, 8-, and 4-point matrix products according to the symmetric property of the transform coefficients. Moreover, we use a shift-and-add unit to share hardware resources among the matrix products of the different transform sizes. The 1-D IDCT core can simultaneously calculate the first- and second-dimensional data. The results indicate that the proposed 2-D IDCT core has a throughput rate of 250 MP/s with a gate count of only 110 K when implemented in Taiwan Semiconductor Manufacturing Company (TSMC) 90-nm complementary metal-oxide-semiconductor (CMOS) technology. The results show that the proposed circuit has the smallest area among designs supporting the multiple transform sizes.
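
The idea of computing a 2-D IDCT with a single 1-D core and a transpose step can be sketched in software. The code below is a textbook floating-point, orthonormal inverse DCT, not the integer transform or the shift-and-add hardware described in the abstract; the function names `idct_1d`, `transpose`, and `idct_2d` are chosen for illustration only.

```python
import math


def idct_1d(coeffs):
    """Orthonormal inverse DCT (DCT-III) of a 1-D list of coefficients."""
    n = len(coeffs)
    out = []
    for x in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (2 * x + 1) * k / (2 * n))
        out.append(total)
    return out


def transpose(block):
    """Swap rows and columns (the role played by the transpose memory)."""
    return [list(row) for row in zip(*block)]


def idct_2d(block):
    """Row-column 2-D IDCT that reuses the same 1-D routine for both passes."""
    first_pass = [idct_1d(row) for row in block]
    # After transposing, the columns of the first pass become rows.
    second_pass = [idct_1d(row) for row in transpose(first_pass)]
    return transpose(second_pass)
```

Because the 2-D transform is separable, the same 1-D routine serves both dimensions, which is why one core plus a transpose buffer suffices in the hardware design.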

