Cooperatively Improving Data Center Energy Efficiency Based on Multi-Agent Deep Reinforcement Learning

High power consumption in data centers is an increasingly prominent problem. Cooperatively optimizing the energy use of IT systems and cooling systems has become an effective way to improve data center energy efficiency. In this paper, a model-free deep reinforcement learning (DRL)-based joint optimization method, MAD3C, is developed to overcome the high-dimensional state and action space problems of data center energy optimization. A hybrid AC-DDPG cooperative multi-agent framework is devised to improve cooperation between the IT and cooling systems and thereby further improve energy efficiency. Within the framework, a scheduling baseline comparison method is presented to enhance stability. In addition, an adaptive score is designed for the architecture that accounts for multi-dimensional resources and improves resource utilization. Experiments show that the proposed approach can effectively reduce data center energy consumption through cooperative optimization while guaranteeing training stability and improving resource utilization.

Jointly Optimizing the IT and Cooling Systems for Data Center Energy Efficiency based on Multi-Agent Deep Reinforcement Learning

Proceedings of the Eleventh ACM International Conference on Future Energy Systems ◽

10.1145/3396851.3402658 ◽

2020 ◽

Cited By ~ 1

Author(s):

Ce Chi ◽

Kaixuan Ji ◽

Avinab Marahatta ◽

Penglei Song ◽

Fa Zhang ◽

...

Keyword(s):

Energy Efficiency ◽

Reinforcement Learning ◽

Data Center ◽

Cooling Systems ◽

Multi Agent

Development of an energy evaluation and design tool for dedicated cooling systems of data centers: Sensing data center cooling energy efficiency

Energy and Buildings ◽

10.1016/j.enbuild.2015.03.040 ◽

2015 ◽

Vol 96 ◽

pp. 357-372 ◽

Cited By ~ 7

Author(s):

Jinkyun Cho ◽

Joonyoung Yang ◽

Changkeun Lee ◽

Jinyoung Lee

Keyword(s):

Energy Efficiency ◽

Data Center ◽

Data Centers ◽

Design Tool ◽

Cooling Systems ◽

Sensing Data ◽

Energy Evaluation ◽

Data Center Cooling

A novel optimal bipartite consensus control scheme for unknown multi-agent systems via model-free reinforcement learning

Applied Mathematics and Computation ◽

10.1016/j.amc.2019.124821 ◽

2020 ◽

Vol 369 ◽

pp. 124821 ◽

Cited By ~ 10

Author(s):

Zhinan Peng ◽

Jiangping Hu ◽

Kaibo Shi ◽

Rui Luo ◽

Rui Huang ◽

...

Keyword(s):

Reinforcement Learning ◽

Multi Agent Systems ◽

Consensus Control ◽

Agent Systems ◽

Model Free ◽

Control Scheme ◽

Multi Agent ◽

Bipartite Consensus

Perimeter Cooling Unit and Localized Row-Based Cooling Unit Transient Air Flow Effects Modeling and Characterization in Data Centers

Volume 8B: Heat Transfer and Thermal Engineering ◽

10.1115/imece2015-51012 ◽

2015 ◽

Author(s):

Tianyi Gao ◽

James Geer ◽

Bahgat G. Sammakia ◽

Russell Tipton ◽

Mark Seymour

Keyword(s):

Thermal Management ◽

Data Center ◽

Data Centers ◽

Cooling System ◽

Air Flow ◽

Cooling Systems ◽

Dynamic Thermal Management ◽

Hybrid Cooling ◽

Cooling Unit ◽

Cooling Air

Cooling power constitutes a large portion of the total electrical power consumption in data centers. Approximately 25% to 40% of the electricity used within a production data center is consumed by the cooling system. Improving cooling energy efficiency has attracted a great deal of research attention, and many strategies have been proposed for cutting data center energy costs. One effective strategy for increasing cooling efficiency is dynamic thermal management. Another is placing cooling devices (heat exchangers) closer to the source of heat, which is the basic design principle of many hybrid cooling systems and liquid cooling systems for data centers. Dynamic thermal management of data centers is a major challenge because data centers operate under complex dynamic conditions, even during normal operation. In addition, hybrid cooling systems introduce additional localized cooling devices, such as in-row cooling units and overhead coolers, which significantly increase the complexity of dynamic thermal management. It is therefore important to characterize the dynamic responses of data centers to variations in different cooling units, such as variations in cooling air flow rate. In this study, a detailed computational analysis of an in-row cooler (IRC) based hybrid cooled data center is conducted using a commercially available computational fluid dynamics (CFD) code. A representative CFD model of a raised-floor data center with a cold aisle-hot aisle arrangement is developed. The hybrid cooling system is designed using perimeter CRAH units and localized in-row cooling units. The CRAH unit supplies centralized cooling air to the under-floor plenum, and the cooling air enters the cold aisle through perforated tiles. The in-row cooling unit is located on the raised floor between the server racks. It supplies cooling air directly to the cold aisle and draws in hot air from the back of the racks (hot aisle). Two different cooling air sources are therefore supplied to the cold aisle, but they are delivered in different ways. Several modeling cases are designed to study the transient effects of variations in the flow rates of the two cooling air sources. Scenarios combining server power and cooling air flow variations are also modeled and studied. The detailed impacts of each modeling case on the rack inlet air temperature and cold aisle air flow distribution are studied. The results presented in this work provide an understanding of the effects of air flow variations on the thermal performance of data centers. The results and the corresponding analysis are used to improve the running efficiency of this type of raised-floor hybrid data center using CRAH and IRC units.

A Switch Based Resource Management Method for Energy Optimization in Cloud Data Center

International Journal of Computing ◽

10.47839/ijc.20.1.2103 ◽

2021 ◽

pp. 85-91

Author(s):

Shally Vats ◽

Sanjay Kumar Sharma ◽

Sunil Kumar

Keyword(s):

Data Center ◽

Data Centers ◽

Energy Requirement ◽

Energy Optimization ◽

Cloud Service ◽

Optimization Method ◽

Level Energy ◽

Cloud Data Center ◽

VM Migration ◽

Cloud Data

The proliferation of cloud users has driven an exponential increase in the number and size of data centers. These data centers are energy hungry and burden cloud service providers with high electricity bills. There is an environmental concern too, due to their large carbon footprint. A lot of work has been done on reducing the energy requirement of data centers through optimal use of CPUs. Virtualization has been used as the core technology for optimal use of computing resources through VM migration. However, networking devices also contribute significantly to energy dissipation. We propose a two-level energy optimization method for the data center that reduces energy consumption while maintaining the SLA. VM migration is performed for optimal use of physical machines as well as of the switches that connect physical machines in the data center. Results of experiments conducted in CloudSim on PlanetLab data confirm the superiority of the proposed method over existing methods that use only single-level optimization.

Multi-Agent Reinforcement Learning for Optimizing Traffic Signal Timing

10.5121/csit.2021.110102 ◽

2021 ◽

Author(s):

Areej Salaymeh ◽

Loren Schwiebert ◽

Stephen Remias

Keyword(s):

Reinforcement Learning ◽

Traffic Signals ◽

Transportation Systems ◽

Traffic Signal ◽

Urban Traffic ◽

Signal Timing ◽

Model Free ◽

Proposed Model ◽

Multi Agent ◽

Traffic Signal Timing

Designing efficient transportation systems is crucial to saving time and money for drivers and for the economy as a whole. Traffic signals are among the most important components of traffic systems. Currently, most traffic signal systems are configured using fixed timing plans, which are based on limited vehicle count data. Past research has introduced and designed intelligent traffic signals; however, machine learning and deep learning have only recently been used in systems that aim to optimize traffic signal timing in order to reduce travel time. Reinforcement learning (RL), a very promising field in artificial intelligence, is a data-driven method that has shown promising results in optimizing traffic signal timing plans to reduce traffic congestion. However, model-based and centralized methods are impractical here because of the high-dimensional state-action space of complex urban traffic networks. In this paper, a model-free approach is used to optimize signal timing for multiple complex four-phase signalized intersections. We propose a multi-agent deep reinforcement learning framework that optimizes traffic flow using data from within each signalized intersection and data from other intersections, a setting known as multi-agent reinforcement learning (MARL). The proposed model combines state-of-the-art techniques such as Double Deep Q-Network and Hindsight Experience Replay (HER). HER allows the framework to learn quickly in sparse-reward settings. We tested and evaluated the proposed model in the Simulation of Urban MObility (SUMO) simulator. Our results show that the proposed method is effective in reducing congestion in both peak and off-peak times.
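The abstract names Double Deep Q-Network as one component but gives no details. The following is a minimal sketch of the standard Double DQN bootstrap target only, not the authors' implementation: the online network picks the greedy next action and the target network scores it. The function name, argument names, and the use of plain Python lists in place of network outputs are all assumptions made for illustration.

```python
def double_dqn_targets(rewards, next_q_online, next_q_target, dones, gamma=0.99):
    """Compute Double DQN targets for a batch of transitions.

    next_q_online[i] and next_q_target[i] are lists of Q-values (one per
    action) for the next state of transition i, as produced by the online
    and target networks respectively (hypothetical inputs).
    """
    targets = []
    for r, q_on, q_tg, done in zip(rewards, next_q_online, next_q_target, dones):
        # Action selection uses the online network's estimates.
        best_action = max(range(len(q_on)), key=lambda a: q_on[a])
        # Action evaluation uses the target network; terminal states do not bootstrap.
        bootstrap = 0.0 if done else gamma * q_tg[best_action]
        targets.append(r + bootstrap)
    return targets
```

Separating selection from evaluation is what distinguishes this target from the plain DQN target, which takes the maximum over the target network's own estimates.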

Minimization of Energy Using Heuristic Resource Allocation and Migration for Cloud Computing

International Journal of Knowledge and Systems Science ◽

10.4018/ijkss.2021010106 ◽

2021 ◽

Vol 12 (1) ◽

pp. 74-83

Author(s):

Manjunatha S. ◽

Suresh L.

Keyword(s):

Cloud Computing ◽

Data Center ◽

Large Scale ◽

Data Centers ◽

Service Providers ◽

Cost Effective ◽

Cooling Systems ◽

Operational Costs ◽

Cloud Computing Service ◽

And Migration

A data center is a cost-effective infrastructure for storing large volumes of data and hosting large-scale service applications. Cloud computing service providers are rapidly deploying data centers across the world with huge numbers of servers and switches. These data centers consume significant amounts of energy, contributing to high operational costs. Thus, optimizing the energy consumption of servers and networks in data centers can reduce operational costs. In a data center, power is consumed mainly by servers, networking devices, and cooling systems. An effective energy-saving strategy is to consolidate computation and communication onto a smaller number of servers and network devices and then power off as many unneeded servers and network devices as possible.
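The consolidation idea can be illustrated with a small sketch. The abstract does not specify the heuristic used, so the example below swaps in first-fit decreasing bin packing as a stand-in; the function name, the single-resource load model, and the integer capacity units are assumptions for illustration.

```python
def consolidate(loads, capacity):
    """Place workload demands on servers using first-fit decreasing.

    loads: demand of each workload, in the same units as capacity.
    Returns a list of servers, each a list of the loads placed on it.
    Servers that never appear in the result could be powered off.
    """
    servers = []
    for load in sorted(loads, reverse=True):
        if load > capacity:
            raise ValueError("a single load exceeds server capacity")
        for placed in servers:
            # Put the load on the first server that still has room.
            if sum(placed) + load <= capacity:
                placed.append(load)
                break
        else:
            # No existing server fits, so one more server stays powered on.
            servers.append([load])
    return servers
```

For example, five workloads using 50, 70, 20, 30, and 30 percent of a server fit onto two servers under this heuristic, so any remaining servers could be switched off.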

Effect of Cooling Systems on the Energy Efficiency of Data Centers: Machine Learning Optimisation

2020 International Conference on Computational Performance Evaluation (ComPE) ◽

10.1109/compe49325.2020.9200088 ◽

2020 ◽

Author(s):

Rajendra Kumar ◽

Sunil Kumar Khatri ◽

Mario Jose Divan

Keyword(s):

Machine Learning ◽

Energy Efficiency ◽

Data Centers ◽

Cooling Systems

Energy Optimization of Solar Micro-Grid Using Multi Agent Reinforcement Learning

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.787.843 ◽

2015 ◽

Vol 787 ◽

pp. 843-847

Author(s):

Leo Raju ◽

R.S. Milton ◽

S. Sakthiyanandan

Keyword(s):

Reinforcement Learning ◽

Energy Savings ◽

Learning Method ◽

Solar PV ◽

Q Learning ◽

PV Systems ◽

Model Free ◽

Individual Unit ◽

Multi Agent ◽

Micro Grid

In this paper, two solar photovoltaic (PV) systems are considered: one in the department with a capacity of 100 kW and the other in the hostel with a capacity of 200 kW. Each has a battery and a load. The capital cost and energy savings of conventional methods are compared, and it is shown that dependency on grid energy is reduced in the solar micro-grid element operating in a distributed environment. In the smart grid framework, grid energy consumption is further reduced by optimal scheduling of the battery using reinforcement learning. Individual unit optimization is done by a model-free reinforcement learning method called Q-Learning, and it is compared with distributed operation of the solar micro-grid using a multi-agent reinforcement learning method called Joint Q-Learning. The energy planning is designed according to the predicted solar PV energy production and the observed load patterns of the department and the hostel. A simulation model was developed in Python.
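For readers unfamiliar with Q-Learning, the single-agent update it relies on can be sketched as follows. This is the textbook tabular update, not the authors' simulation model; the state names ("low", "high" battery level), action names ("charge", "discharge"), and the nested-dictionary Q-table are hypothetical choices for illustration.

```python
def q_learning_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    q is a nested dict: q[state][action] -> estimated return.
    """
    # Greedy estimate of the value of the next state.
    best_next = max(q[next_state].values())
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


# Hypothetical battery-scheduling table with two states and two actions.
q_table = {
    "low": {"charge": 0.0, "discharge": 0.0},
    "high": {"charge": 0.0, "discharge": 2.0},
}
```

In a scheduling setting of the kind described, the reward would typically reflect grid energy avoided, but the abstract does not state the reward definition.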

Introducing Energy Efficiency Metrics for Server and Data Centers

Volume 4: Energy Systems Analysis, Thermodynamics and Sustainability; Combustion Science and Engineering; Nanoengineering for Energy, Parts A and B ◽

10.1115/imece2011-64704 ◽

2011 ◽

Author(s):

Abdlmonem H. Beitelmal ◽

Drazen Fabris

Keyword(s):

Energy Efficiency ◽

Data Center ◽

Data Centers ◽

Cooling Efficiency ◽

Idle Power ◽

Relevant Variables ◽

The Cost ◽

Evaluation Of Data

New server and data center metrics are introduced to facilitate proper evaluation of data center power and cooling efficiency. These metrics will be used to help reduce the cost of operation and to provision data center cooling resources. The most relevant variables for these metrics are identified as the total facility power, the servers’ idle power, the average server utilization, the cooling resources power, and the total IT equipment power. These metrics can be used to characterize and classify server and data center performance and energy efficiency regardless of size and location.
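The abstract does not define the new metrics themselves. As a point of comparison only, two of the listed variables (total facility power and total IT equipment power) are the inputs to the widely used power usage effectiveness (PUE) ratio, sketched below. PUE is an existing industry metric and is not claimed to be one of the metrics the paper introduces; the function and parameter names are assumptions.

```python
def power_usage_effectiveness(total_facility_kw, it_equipment_kw):
    """Return PUE: total facility power divided by IT equipment power.

    A value of 1.0 would mean all facility power reaches IT equipment;
    higher values indicate more overhead such as cooling.
    """
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw
```

For example, a facility drawing 1500 kW in total with 1000 kW going to IT equipment has a PUE of 1.5.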
