Influence of Experimental Uncertainty on Prediction of Holistic Multi-Scale Data Center Energy Efficiency

Author(s):  
Thomas J. Breen
Ed J. Walsh
Jeff Punch
Amip J. Shah
Niru Kumari
...  

As the energy footprint of data centers continues to increase, models that allow for “what-if” simulations of different data center design and management paradigms will be important. Prior work by the authors has described a multi-scale energy efficiency model that allows for evaluating the coefficient of performance of the data center ensemble (COPGrand), and demonstrated the utility of such a model for choosing operational set-points and evaluating design trade-offs. However, experimental validation of these models poses a challenge because of the complexity involved in tailoring such a model to legacy data centers, with shared infrastructure and limited control over IT workload. Further, test facilities with dummy heat loads or artificial racks in lieu of IT equipment generally have limited utility in validating end-to-end models, owing to the inability of such loads to mimic phenomena such as fan scalability. In this work, we describe the experimental analysis conducted in a special test chamber and a data center facility. The chamber, focusing on system-level effects, is loaded with an actual IT rack, and a compressor delivers chilled air to the chamber at a preset temperature. By varying the load in the IT rack as well as the air delivery parameters — such as flow rate and supply temperature — a setup that simulates the system level of a data center is created. Experimental tests within a live data center facility are also conducted, where the operating conditions of the cooling infrastructure — such as fluid temperatures and flow rates — are monitored and can be analyzed to determine effects such as air flow recirculation and heat exchanger performance. Using the experimental data, a multi-scale model configuration emulating the data center can be defined.
We compare the results from such experimental analysis to a multi-scale energy efficiency model of the data center, and discuss the accuracies as well as inaccuracies within such a model. Difficulties encountered in the experimental work are discussed. The paper concludes by discussing areas for improvement in such modeling and experimental evaluation. Further validation of the complete multi-scale data center energy model is planned.
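As context for COPGrand, a minimal sketch of an ensemble coefficient of performance follows; the scale breakdown, function name, and numbers are illustrative assumptions, not taken from the paper:

```python
# Hypothetical sketch: COP of the data center ensemble, computed as IT power
# delivered divided by the cooling power expended across all scales
# (chip, system, rack, room, facility). Values are illustrative only.

def cop_grand(it_power_w, cooling_power_by_scale_w):
    """Ensemble coefficient of performance: IT power / total cooling power."""
    return it_power_w / sum(cooling_power_by_scale_w.values())

cooling_w = {"chip": 2_000.0, "system": 5_000.0, "rack": 8_000.0,
             "room": 30_000.0, "facility": 15_000.0}
print(round(cop_grand(100_000.0, cooling_w), 2))  # 1.67: 100 kW IT per 60 kW cooling
```

A higher value means more compute power delivered per watt spent on cooling, which is why set-point and design trade-offs can be evaluated against this single figure.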

2020, Vol 142 (2)
Author(s):  
Yogesh Fulpagare
Atul Bhargav
Yogendra Joshi

Abstract With the explosion in digital traffic, the number of data centers, as well as the demands on each data center, continues to increase. Concomitantly, the cost (and environmental impact) of energy expended in the thermal management of these data centers is of concern to operators in particular, and society in general. In the absence of physics-based control algorithms, computer room air conditioning (CRAC) units are typically operated through conservatively predetermined set points, resulting in suboptimal energy consumption. For a more optimal control algorithm, predictive capabilities are needed. In this paper, we develop a data-informed, experimentally validated, and computationally inexpensive system-level predictive tool that can forecast data center behavior for a broad range of operating conditions. We have tested this model on experiments as well as on (experimentally) validated transient computational fluid dynamics (CFD) simulations for two different data center design configurations. The validated model can accurately forecast temperatures and air flows in a data center (including the rack air temperatures) for 10–15 min into the future. Once integrated with control aspects, we expect that this model can form an important building block in future intelligent, increasingly automated data center environment management systems.


2008
Author(s):  
Amip J. Shah
Anita Rogacs
Chandrakant D. Patel

With the proliferation of multiple cores on a single processor and the integration of numerous stacked die in chip packages, the resulting non-uniformity in power dissipation necessitates spatially and temporally localized control of temperature in the package. Thermoelectric (TE) devices can potentially provide one mechanism to achieve such control. Unfortunately, at typical junction-to-ambient temperatures, the coefficient-of-performance (COP) of existing bulk TE devices tends to be quite low. As a result, for many high-power systems, the additional power input required to operate a TE cooling module can lead to an increase in the overall system cost-of-ownership, causing TE cooling solutions to be excluded from cost-sensitive thermal management solutions in high-volume computer systems. However, recent trends of compaction and consolidation of computer servers in high-density data centers have resulted in a dramatic increase in the burdened cost-of-ownership of mission-critical IT facilities. For example, the energy consumption of the cooling infrastructure for many high-density data centers can equal or exceed the power consumption of the compute equipment. Thus, for the growing enterprise thermal management segment, the appropriate metric is no longer the COP of the thermal solution but rather the COP of the cooling ensemble, which takes into account the energy efficiency of cooling solutions in the chip, system, rack, data center and facility. To examine the effects of chip-level COP on the ensemble-level COP, this paper explores a case study comparing two ensemble solutions. In one case, local hotspots on a chip in a three-dimensional package are mitigated by increasing fan power at the system level (and subsequently, in the computer room and the rest of the ensemble). In another case, local hotspots at the chip are mitigated through spot-cooling via TE cooling modules. For each of these cases, the COP of the ensemble is evaluated. 
The model suggests that while feasible, the benefit of using TEs at current performance levels is limited. However, ongoing research that may improve TE performance in the future has the potential to enhance infrastructure energy efficiency.


Author(s):  
Abdlmonem H. Beitelmal
Drazen Fabris

New server and data center metrics are introduced to facilitate proper evaluation of data center power and cooling efficiency. These metrics will be used to help reduce the cost of operation and to provision data center cooling resources. The most relevant variables for these metrics are identified: the total facility power, the servers’ idle power, the average server utilization, the cooling resources power, and the total IT equipment power. These metrics can be used to characterize and classify server and data center performance and energy efficiency regardless of size and location.
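As one hedged illustration of how the variables listed above might combine, the sketch below computes the familiar PUE ratio plus a utilization-aware fraction; the definitions and numbers are assumptions for illustration, not the metrics proposed in the paper:

```python
# Illustrative combinations of the listed variables (total facility power,
# idle power, average utilization, total IT power). Definitions are assumed.

def pue(total_facility_power_w, total_it_power_w):
    """Power Usage Effectiveness: facility power per unit of IT power."""
    return total_facility_power_w / total_it_power_w

def useful_it_fraction(total_it_power_w, idle_power_w, avg_utilization):
    """Fraction of IT power doing useful work, under the hypothetical
    assumption that idle power is pure overhead and the remainder scales
    linearly with average server utilization."""
    dynamic_w = total_it_power_w - idle_power_w
    return (dynamic_w * avg_utilization) / total_it_power_w

print(pue(1_500_000, 1_000_000))                          # 1.5
print(round(useful_it_fraction(1_000_000, 400_000, 0.5), 2))  # 0.3
```

The second metric shows why idle power and utilization matter: two facilities with identical PUE can deliver very different amounts of useful work per watt.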


Energies
2020, Vol 13 (22), pp. 6147
Author(s):  
Jinkyun Cho
Jesang Woo
Beungyong Park
Taesub Lim

Removing heat from high-density information technology (IT) equipment is essential for data centers, and maintaining the proper operating environment for IT equipment can be expensive. Rising energy costs and consumption have prompted data centers to consider hot aisle and cold aisle containment strategies, which can improve energy efficiency and maintain the recommended inlet air temperature to IT equipment. Containment can also resolve hot spots in traditional uncontained data centers to some degree. This study analyzes the IT environment of the hot aisle containment (HAC) system, which has been considered an essential solution for high-density data centers. The thermal performance was analyzed for an IT server room with HAC in a reference data center. Computational fluid dynamics analysis was conducted to compare the operating performance of the cooling air distribution systems applied to raised and hard floors and to examine the difference in the IT environment between the server rooms. Regarding operating conditions, thermal performance under normal cooling operation was compared with that after the failure of one cooling unit. The thermal performance of each alternative was evaluated by comparing the temperature distribution, airflow distribution, inlet air temperatures of the server racks, and the recirculation ratio from outlet to inlet. In conclusion, the HAC system with a raised floor has higher cooling efficiency than that with a hard floor: choosing a raised floor over a hard floor can improve air distribution efficiency by 28%, corresponding to a 40% reduction in the recirculation ratio from the more than 20% observed under normal cooling conditions.
The main contribution of this paper is that it puts the existing theoretical comparison of HAC systems into practice, by developing an accurate numerical model of a data center serving a high-density fifth-generation (5G) environment and applying realistic operating conditions.
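One common temperature-based estimate of per-rack recirculation (assumed here for illustration; the paper's exact definition of the outlet-to-inlet recirculation ratio may differ) treats each rack's inlet air as a mix of cold supply air and hot outlet air:

```python
# Hypothetical estimate: the fraction of a rack's inlet air that recirculated
# from the hot outlet rather than arriving as cold supply air, inferred from
# three temperatures. Numbers below are illustrative.

def recirculation_ratio(t_inlet_c, t_supply_c, t_outlet_c):
    """0.0 = inlet is pure supply air; 1.0 = inlet is pure recirculated air."""
    return (t_inlet_c - t_supply_c) / (t_outlet_c - t_supply_c)

# A rack drawing 22 °C air when supply is 20 °C and outlet is 36 °C:
print(recirculation_ratio(t_inlet_c=22.0, t_supply_c=20.0, t_outlet_c=36.0))  # 0.125
```

Under this reading, a containment change that lowers rack inlet temperatures toward the supply temperature directly lowers the ratio.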


Author(s):  
Chandrakant Patel
Ratnesh Sharma
Cullen Bash
Sven Graupner

Computing will be pervasive, and the enablers of pervasive computing will be data centers housing computing, networking, and storage hardware. The data center of tomorrow is envisaged as one containing thousands of single-board computing systems deployed in racks. A data center with 1,000 racks, occupying over 30,000 square feet, would require 10 MW of power for the computing infrastructure alone. At this power dissipation, an additional 5 MW would be needed by the cooling resources to remove the dissipated heat. At $100/MWh, the cooling alone would cost about $4 million per annum for such a data center. The concept of the Computing Grid, based on coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations, is emerging as the new paradigm in distributed and pervasive computing for scientific as well as commercial applications. We envision a global network of data centers housing an aggregation of computing, networking, and storage hardware. The increased compaction of such devices in data centers has created thermal and energy management issues that inhibit the sustainability of such a global infrastructure. In this paper, we propose the framework of the Energy Aware Grid, which will provide a global utility infrastructure explicitly incorporating energy efficiency and thermal management among data centers. Designed around an energy-aware co-allocator, workload placement decisions will be made across the Grid based on data center energy efficiency coefficients. The coefficient, evaluated by the data center’s resource allocation manager, is a complex function of the data center thermal management infrastructure and of seasonal and diurnal variations. A detailed procedure for implementation of a test case is provided, with an estimate of energy savings to justify the economics. An example workload deployment shown in the paper seeks the most energy-efficient data center in the global network of data centers.
Locality-based energy efficiency in a data center is shown to arise from the use of ground-coupled loops in cold climates to lower the ambient temperature for heat rejection, e.g., computing in and rejecting heat from a data center at a nighttime ambient of 20°C in New Delhi, India, while Phoenix, USA, is at 45°C. The efficiency of the cooling system in the New Delhi data center derives from the lower lift from evaporator to condenser. Besides the obvious advantage of the external ambient, the paper also incorporates techniques that rate the efficiency arising from the internal thermo-fluid behavior of a data center in workload placement decisions.
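The annual cooling cost cited in this abstract can be checked with simple arithmetic, assuming continuous year-round operation:

```python
# Rough check of the figure above: 5 MW of cooling power running
# continuously for a year at $100/MWh.
cooling_mw = 5.0
hours_per_year = 8760
cost_per_mwh = 100.0
annual_cost_usd = cooling_mw * hours_per_year * cost_per_mwh
print(f"${annual_cost_usd / 1e6:.2f}M per year")  # $4.38M, i.e. roughly $4 million
```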


Author(s):  
Thomas J. Breen
Ed J. Walsh
Jeff Punch
Amip J. Shah
Cullen E. Bash
...  

The power consumption of a chip package is known to vary with operating temperature, independently of the workload processing power. This variation is commonly known as chip leakage power, typically accounting for ∼10% of total chip power consumption. The influence of operating temperature on leakage power is a major design-optimization concern for the IT industry: system power densities are steadily increasing, and as package sizes shrink, leakage is expected to account for up to ∼50% of chip power in the near future. Much attention has been placed on developing models of chip leakage power as a function of package temperature, ranging from simple linear models to complex super-linear models. This knowledge is crucial for IT system designers to improve chip-level energy efficiency and minimize heat dissipation. However, this work has been focused on the component level, with little thought given to the impact of chip leakage power on overall data center efficiency. Studies on data center power consumption quote IT system heat dissipation as a constant value, without accounting for the variation of chip power with operating temperature due to leakage. Previous modeling techniques have also omitted this temperature-dependent relationship. In this paper we discuss the need for chip leakage power to be included in the analysis of holistic data center performance. A chip leakage power model is defined and its implementation into an existing multi-scale data center energy model is discussed. Parametric studies are conducted over a range of system and environment operating conditions to evaluate the impact of varying degrees of chip leakage power. Possible strategies for mitigating the impact of leakage power are also illustrated.
This work illustrates that when chip leakage power is included in the data center model, a trade-off emerges between raising operating temperatures to improve cooling infrastructure efficiency and the resulting increase in heat load due to leakage power.
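The simplest of the models mentioned above is a linear dependence of leakage on temperature; the sketch below uses that form with invented coefficients (the paper's actual model and parameters are not reproduced here):

```python
# A minimal linear chip-leakage model of the common form
#   P_leak(T) = P_leak_ref * (1 + k * (T - T_ref))
# Coefficients and powers below are illustrative assumptions.

def leakage_power_w(t_chip_c, p_leak_ref_w=10.0, k_per_c=0.02, t_ref_c=60.0):
    """Leakage power grows with junction temperature."""
    return p_leak_ref_w * (1.0 + k_per_c * (t_chip_c - t_ref_c))

def total_chip_power_w(p_dynamic_w, t_chip_c):
    """Total heat load = workload (dynamic) power + temperature-dependent leakage."""
    return p_dynamic_w + leakage_power_w(t_chip_c)

# Raising the operating temperature can improve cooling COP, but it also
# raises the heat load the cooling system must remove:
print(total_chip_power_w(90.0, 60.0))  # 100.0 W at the reference temperature
print(total_chip_power_w(90.0, 80.0))  # 104.0 W: +4 W of leakage at +20 °C
```

This is exactly the compromise the abstract describes: the cooling-efficiency gain from a warmer set-point must exceed the extra leakage heat for the change to pay off.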


Author(s):  
Michael K. Patterson
Michael Meakins
Dennis Nasont
Prasad Pusuluri
William Tschudi
...  

Increasing energy-efficient performance built into today’s servers has created significant opportunities for expanded Information and Communications Technology (ICT) capabilities. Unfortunately, the power densities of these systems now challenge data center cooling systems and have outpaced the ability of many data centers to support them. One of the persistent problems yet to be overcome in the data center space has been the separate worlds of ICT and Facilities design and operations. This paper covers the implementation of a demonstration project in which the integration of these two management systems is used to gain significant energy savings while improving the operations staff’s visibility into the full data center, both ICT and facilities. The majority of servers have a host of platform information available to the ICT management network. This demonstration project takes the front-panel temperature sensor data from the servers and provides that information to the facilities management system to control the cooling system in the data center. The majority of data centers still use the cooling system return air temperature as the primary control variable to adjust supply air temperature, significantly limiting energy efficiency. Current best practices use a cold aisle temperature sensor to drive the cooling system, but even then the sensor is still only a proxy for what really matters: the inlet temperature to the servers. The paper presents a novel control scheme in which the control of the cooling system is split into two control loops to maximize efficiency. The first control loop governs the cooling fluid, driven by the temperature of the physically lowest server to ensure the correct supply air temperature. The second control loop governs the airflow in the cooling system: a variable-speed drive is controlled by the temperature differential between the lowest server and the server at the top of the rack.
Controlling to this differential temperature will minimize the amount of air moved (and the energy to do so) while ensuring no recirculation from the hot aisle. Controlling both of these facility parameters with the servers’ data will allow optimization of the energy used in the cooling system. Challenges with the integration of the ICT management data into the facilities control system are discussed. This integration is expected to be the most fruitful area for improving data center efficiency over the next several years.
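The two loops described above might be sketched as simple proportional controllers; the gains, set-points, and function names below are hypothetical, as the paper does not specify them:

```python
# Hypothetical sketch of the two-loop scheme: loop 1 sets the cooling-fluid
# supply temperature from the lowest server's inlet; loop 2 sets fan speed
# from the bottom-to-top inlet temperature differential. All gains and
# set-points are illustrative assumptions.

def supply_temp_setpoint_c(bottom_inlet_c, target_inlet_c=25.0,
                           current_setpoint_c=18.0, gain=0.5):
    """Loop 1: nudge the supply temperature so the physically lowest
    server's inlet tracks its target."""
    return current_setpoint_c - gain * (bottom_inlet_c - target_inlet_c)

def fan_speed_fraction(bottom_inlet_c, top_inlet_c, target_dt_c=2.0,
                       current_speed=0.6, gain=0.1):
    """Loop 2: drive the variable-speed fan from the bottom-to-top inlet
    rise; a large rise suggests hot-aisle recirculation, so speed up."""
    dt = top_inlet_c - bottom_inlet_c
    speed = current_speed + gain * (dt - target_dt_c)
    return min(1.0, max(0.0, speed))  # clamp to physical limits

print(supply_temp_setpoint_c(26.0))              # 17.5: inlet too warm, cool the supply
print(round(fan_speed_fraction(25.0, 29.0), 2))  # 0.8: a 4 °C rise calls for more airflow
```

Splitting the loops this way lets supply temperature and airflow each be driven by the measurement that actually constrains it, rather than by a single return-air proxy.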


Author(s):  
Jimil M. Shah
Roshan Anand
Satyam Saini
Rawhan Cyriac
Dereje Agonafer
...  

Abstract A remarkable amount of data center energy is consumed in eliminating the heat generated by the IT equipment to maintain safe operating conditions and optimum performance. The installation of airside economizers, while very energy efficient, bears the risk of particulate contamination in data centers, thereby deteriorating the reliability of IT equipment. When the relative humidity (RH) in a data center exceeds the deliquescent relative humidity (DRH) of salts in the accumulated particulate matter, the matter absorbs moisture, becomes wet, and subsequently leads to electrical short-circuiting through degraded surface insulation resistance between closely spaced features on printed circuit boards (PCBs). Another concern with this type of failure is the absence of physical evidence, which hinders evaluation and rectification. It is therefore imperative to develop a practical test method to determine the DRH value of the particulate matter accumulated on PCBs. This research is a first attempt to develop an experimental technique to measure the DRH of dust particles by logging the leakage current versus RH for particulate matter dispensed on an interdigitated comb coupon. To validate this methodology, the DRH of pure salts such as MgCl2, NH4NO3, and NaCl is determined, and the results are compared with published values. The methodology was then applied to establish the limiting value, or effective relative humidity envelope, to be maintained at a real-world data center facility in the Dallas industrial area for continuous and reliable operation.
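A hedged sketch of the post-processing this method implies: take the DRH as the RH at which the logged leakage current first rises sharply. The threshold and data below are illustrative, not from the study; NaCl's published DRH is about 75% RH:

```python
# Hypothetical DRH extraction from a leakage-current-vs-RH log: deliquescence
# makes the deposited salt conductive, so the current jumps by orders of
# magnitude at the DRH. Threshold and data are illustrative assumptions.

def estimate_drh(rh_percent, leakage_ua, threshold_ua=1.0):
    """Return the first RH at which leakage current exceeds the threshold,
    indicating the deposited matter has deliquesced; None if it never does."""
    for rh, current in zip(rh_percent, leakage_ua):
        if current > threshold_ua:
            return rh
    return None

# An illustrative sweep for an NaCl-dosed comb coupon:
rh_log = [50, 55, 60, 65, 70, 75, 80]
current_log_ua = [0.01, 0.01, 0.02, 0.02, 0.05, 5.0, 80.0]
print(estimate_drh(rh_log, current_log_ua))  # 75
```

Running such a sweep on real dust samples would yield the site-specific RH ceiling the abstract refers to.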


Author(s):  
Cullen Bash
George Forman

Data center costs for computer power and cooling have been steadily increasing over the past decade. Much work has been done in recent years on understanding how to improve the delivery of cooling resources to IT equipment in data centers, but little attention has been paid to optimizing heat production through the placement of application workload. Because certain physical locations inside the data center are more efficient to cool than others, allocating heavy computational workloads onto servers in the more efficient locations might bring substantial savings. This paper explores this issue by introducing a workload placement metric that considers the cooling efficiency of the environment. Additionally, results from a set of experiments that utilize this metric in a thermally isolated portion of a real data center are described. The results show that the potential savings are substantial and that further work in this area is needed to exploit the savings opportunity.
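A minimal sketch of placement guided by such a metric follows; the greedy policy, function name, and efficiency values are illustrative assumptions rather than the paper's method:

```python
# Hypothetical policy: fill the most efficiently cooled racks first.
# The efficiency scale (higher = cheaper to cool) is invented for illustration.

def place_workload(demand_units, rack_capacity, cooling_efficiency):
    """Greedily assign workload units to racks in descending order of
    cooling efficiency, respecting per-rack capacity."""
    placement = {}
    for rack in sorted(rack_capacity, key=cooling_efficiency.get, reverse=True):
        if demand_units <= 0:
            break
        take = min(demand_units, rack_capacity[rack])
        placement[rack] = take
        demand_units -= take
    return placement

capacity = {"A": 10, "B": 10, "C": 10}
efficiency = {"A": 0.9, "B": 0.5, "C": 0.7}  # rack A is cheapest to cool
print(place_workload(15, capacity, efficiency))  # {'A': 10, 'C': 5}
```

The least efficiently cooled rack (B) is left idle, which is the intuition behind cooling-aware placement: heat is produced where it is cheapest to remove.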


Author(s):  
Amip Shah
Cullen Bash
Ratnesh Sharma
Tom Christian
Brian J. Watson
...  

Numerous evaluation metrics and standards are being proposed across industry and government to measure and monitor the energy efficiency of data centers. However, the energy use of data centers is just one aspect of the environmental impact. In this paper, we explore the overall environmental footprint of data centers beyond just energy efficiency. Building upon established procedures from the environmental sciences, we create an end-to-end life-cycle model of the environmental footprint of data centers across a diverse range of impacts. We test this model in the case study of a hypothetical 2.2-MW data center. Our analysis suggests the need for evaluation metrics that go beyond just operational energy use in order to achieve sustainable data centers.

