heSRPT

2021 ◽  
Vol 48 (3) ◽  
pp. 35-36
Author(s):  
Benjamin Berg ◽  
Rein Vesilo ◽  
Mor Harchol-Balter

Modern data centers serve workloads which can exploit parallelism. When a job parallelizes across multiple servers it completes more quickly. However, it is unclear how to share a limited number of servers between many parallelizable jobs. In this paper we consider a typical scenario where a data center composed of N servers will be tasked with completing a set of M parallelizable jobs. Typically, M is much smaller than N. In our scenario, each job consists of some amount of inherent work which we refer to as a job's size. We assume that job sizes are known up front to the system, and each job can utilize any number of servers at any moment in time. These assumptions are reasonable for many parallelizable workloads such as training neural networks using TensorFlow [2]. Our goal in this paper is to allocate servers to jobs so as to minimize the mean slowdown across all jobs, where the slowdown of a job is the job's completion time divided by its running time if given exclusive access to all N servers. Slowdown measures how a job was interfered with by other jobs in the system, and is often the metric of interest in the theoretical parallel scheduling literature (where it is also called stretch), as well as the HPC community (where it is called expansion factor).
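The slowdown metric the authors minimize has a direct arithmetic form; a minimal sketch with made-up job numbers (not from the paper):

```python
def mean_slowdown(completion_times, solo_times):
    """Mean slowdown (stretch): each job's completion time divided by its
    running time when given exclusive access to all N servers."""
    assert len(completion_times) == len(solo_times)
    return sum(c / s for c, s in zip(completion_times, solo_times)) / len(solo_times)

# A job that took 10s but would take 2s alone has slowdown 5;
# a job that ran unimpeded has slowdown 1.
print(mean_slowdown([10.0, 4.0], [2.0, 4.0]))
```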

2014 ◽  
Vol 602-605 ◽  
pp. 928-932
Author(s):  
Min Li ◽  
Yun Wang ◽  
Zheng Qian Feng ◽  
Wang Li

By studying energy-saving technologies for air-conditioning systems in data centers, we designed an intelligent air-conditioning system, improved the cooling efficiency of the air-conditioning system through a reasonable arrangement of hot and cold aisles, reduced the running time of the HVAC equipment by using an intelligent heat-exchange system, and provided a reference for energy-saving research on data center air-conditioning systems.



Author(s):  
Long Phan ◽  
Cheng-Xian Lin ◽  
Mackenson Telusma

Energy consumption and thermal management have become key challenges in the design of large-scale data centers, where perforated tiles are used together with cold and hot aisles configuration to improve thermal management. Although full-field simulations using computational fluid dynamics and heat transfer (CFD/HT) tools can be applied to predict the flow and temperature fields inside data centers, their running time remains the biggest challenge to most modelers. In this paper, response surface methodology based on radial basis functions is used to drastically reduce the running time while preserving the accuracy of the model. The response surface method with data interpolation makes the study of many design parameters of the data center model more feasible and economical in terms of modeling time. Three scenarios of response surface construction are investigated (5%, 10%, and 20%). The method shows very good agreement with the simulation results obtained from the CFD/HT model, as in the case where 20% of the original CFD data points are used for response surface training. Error analysis is carried out to quantify the error associated with each scenario. Case 20% shows superb accuracy compared to the others. With a mean relative error of only 2.12 × 10⁻⁴ and R² = 0.970, this case can capture most aspects of the original CFD model.
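As an illustration of the general technique, a radial-basis-function response surface can be fit by solving a linear system on the training points; a self-contained sketch using a Gaussian basis (the kernel choice and shape parameter are illustrative assumptions, not the paper's exact setup):

```python
import numpy as np

def rbf_fit(X, y, eps=1.0):
    """Fit Gaussian-RBF interpolation weights on training points X (n, d)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    Phi = np.exp(-(eps * d) ** 2)   # Gaussian radial basis matrix
    return np.linalg.solve(Phi, y)  # weights so the surface passes through y

def rbf_predict(X_train, w, X_new, eps=1.0):
    """Evaluate the fitted response surface at new design points."""
    d = np.linalg.norm(X_new[:, None, :] - X_train[None, :, :], axis=-1)
    return np.exp(-(eps * d) ** 2) @ w

# Train on a few (made-up) CFD samples, then query the cheap surrogate.
X = np.array([[0.0], [1.0], [2.0]])
y = np.array([1.0, 2.0, 0.0])
w = rbf_fit(X, y)
```

By construction the surface interpolates the training data exactly, which is why only a fraction (e.g. 20%) of the CFD points is needed to train it.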


2015 ◽  
Vol 2015 ◽  
pp. 1-15 ◽  
Author(s):  
Wei-Hua Bai ◽  
Jian-Qing Xi ◽  
Jia-Xian Zhu ◽  
Shao-Wei Huang

Performance evaluation of modern cloud data centers has attracted considerable research attention among both cloud providers and cloud customers. In this paper, we investigate the heterogeneity of modern data centers and the service process used in these heterogeneous data centers. Using queuing theory, we construct a complex queuing model composed of two concatenated queuing systems and present this as an analytical model for evaluating the performance of heterogeneous data centers. Based on this complex queuing model, we analyze the mean response time, the mean waiting time, and other important performance indicators. We also conduct simulation experiments to confirm the validity of the complex queuing model. We further conduct numerical experiments to demonstrate that the traffic intensity (or utilization) of each execution server, as well as the configuration of server clusters, in a heterogeneous data center will impact the performance of the system. Our results indicate that our analytical model is effective in accurately estimating the performance of the heterogeneous data center.
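The concatenated-queue idea can be illustrated with the textbook Jackson-network result for M/M/1 stations in tandem; a sketch of the standard formula (not the paper's full heterogeneous model):

```python
def tandem_mm1_response_time(lam, mus):
    """Mean response time of M/M/1 stations in tandem (Jackson network):
    each station i contributes 1 / (mu_i - lam); requires lam < mu_i at
    every station for stability."""
    for mu in mus:
        if lam >= mu:
            raise ValueError("unstable: arrival rate must be below every service rate")
    return sum(1.0 / (mu - lam) for mu in mus)

# Two concatenated stations with service rates 2 and 3, arrivals at rate 1.
print(tandem_mm1_response_time(1.0, [2.0, 3.0]))
```

The abstract's observation that per-server traffic intensity drives performance shows up directly here: as `lam` approaches any `mu_i`, that station's term dominates the total response time.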


Author(s):  
Benjamin Berg ◽  
Mor Harchol-Balter

Large data centers composed of many servers provide the opportunity to improve performance by parallelizing jobs. However, effectively exploiting parallelism is non-trivial. For each arriving job, one must decide the number of servers on which the job is run. The goal is to determine the optimal allocation of servers to jobs that minimizes the mean response time across jobs – the average time from when a job arrives until it completes. Parallelizing a job across multiple servers reduces the response time of that individual job. However, jobs receive diminishing returns from being allocated additional servers, so allocating too many servers to a single job leads to low system efficiency. The authors consider the case where the remaining sizes of jobs are unknown to the system at every moment in time. They prove that, if all jobs follow the same speedup function, the optimal policy is EQUI, which divides servers equally among jobs. When jobs follow different speedup functions, EQUI is no longer optimal and they provide an alternate policy, GREEDY*, which performs within 1% of optimal in simulation.
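EQUI itself is easy to sketch: with m active jobs, each gets N/m servers and runs at the common speedup. A toy simulation assuming a power-law speedup s(k) = k**p (the exponent is a hypothetical choice used to illustrate diminishing returns, not from the paper):

```python
def equi_finish_times(sizes, n_servers, p=0.5):
    """Simulate EQUI on jobs sharing a common sublinear speedup s(k) = k**p.
    Servers are re-divided equally each time a job departs; returns the
    finish time of each job in completion order."""
    remaining = sorted(sizes)          # remaining work per job, ascending
    t, finish = 0.0, []
    while remaining:
        m = len(remaining)
        rate = (n_servers / m) ** p    # per-job service rate under equal split
        dt = remaining[0] / rate       # time until the smallest job completes
        t += dt
        finish.append(t)
        remaining = [r - rate * dt for r in remaining[1:]]
    return finish
```

With equal sizes, all jobs finish together, since no reallocation point arrives before the simultaneous departure.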


Author(s):  
Long Phan ◽  
Cheng-Xian Lin

Energy consumption and thermal management have become key challenges in the design of large-scale data centers, where perforated tiles are used together with cold and hot aisles configuration to improve thermal management. Although full-field simulations using computational fluid dynamics and heat transfer (CFD/HT) tools can be applied to predict the flow and temperature fields inside data centers, their running time remains the biggest challenge to most modelers. In this paper, response surface methodology based on radial basis functions is used to significantly reduce the running time of generating a large set of generations during a two-objective minimization process that uses the genetic algorithm as its main engine. Three design parameters, including mass flow inlet, inlet temperature, and server heat load, are investigated for a two-objective optimization. The goal is to minimize both the temperature difference and the maximum temperature inside the data center and to search for a range of design parameters that satisfy both objectives. Numerous radial basis function models are studied and compared, and a preferred scheme for response surface construction is discussed. Finally, a Pareto front graph is generated showing the set of optimal designs in the objective space, and Pareto design validation is also performed.
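Extracting the Pareto front from a set of candidate designs in the two-objective space reduces to a non-dominated filter; a minimal sketch (the objective values are made up):

```python
def pareto_front(points):
    """Non-dominated points for two-objective minimization: q dominates p
    if q <= p in both objectives and q != p. Assumes distinct points."""
    return [p for p in points
            if not any(q[0] <= p[0] and q[1] <= p[1] and q != p
                       for q in points)]

# Hypothetical (delta_T, T_max) pairs for five candidate designs.
designs = [(1, 3), (2, 2), (3, 1), (2, 3), (3, 3)]
print(pareto_front(designs))
```

In the actual study this filter would run over the genetic algorithm's final population, with objective values supplied by the RBF response surface rather than full CFD runs.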


Author(s):  
Chris Muller ◽  
Chuck Arent ◽  
Henry Yu

Abstract Lead-free manufacturing regulations, reduction in circuit board feature sizes and the miniaturization of components to improve hardware performance have combined to make data center IT equipment more prone to attack by corrosive contaminants. Manufacturers are under pressure to control contamination in the data center environment and maintaining acceptable limits is now critical to the continued reliable operation of datacom and IT equipment. This paper will discuss ongoing reliability issues with electronic equipment in data centers and will present updates on ongoing contamination concerns, standards activities, and case studies from several different locations illustrating the successful application of contamination assessment, control, and monitoring programs to eliminate electronic equipment failures.


Electronics ◽  
2021 ◽  
Vol 10 (15) ◽  
pp. 1774
Author(s):  
Ming-Chin Chuang ◽  
Chia-Cheng Yen ◽  
Chia-Jui Hung

Recently, with the increase in network bandwidth, various cloud computing applications have become popular. A large number of network data packets will be generated in such a network. However, most existing network architectures cannot effectively handle big data, thereby necessitating an efficient mechanism to reduce task completion time when large amounts of data are processed in data center networks. Unfortunately, achieving the minimum task completion time in the Hadoop system is an NP-complete problem. Although many studies have proposed schemes for improving network performance, they have shortcomings that degrade their performance. For this reason, in this study, we propose a centralized solution, called the bandwidth-aware rescheduling (BARE) mechanism, for software-defined network (SDN)-based data center networks. BARE improves network performance by employing a prefetching mechanism and a centralized network monitor to collect global information, sorting tasks by data locality, splitting tasks, and executing a rescheduling mechanism with a scheduler to reduce task completion time. Finally, we used simulations to demonstrate our scheme's effectiveness. Simulation results show that our scheme outperforms other existing schemes in terms of task completion time and the ratio of data locality.
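The "ratio of data locality" metric used to compare schemes can be computed directly; a toy sketch with hypothetical task and node names (not BARE's internals):

```python
def data_locality_ratio(assignments, block_location):
    """Fraction of tasks scheduled on the node that already stores their
    input block. `assignments` maps task -> node chosen by the scheduler;
    `block_location` maps task -> node holding that task's data."""
    local = sum(1 for task, node in assignments.items()
                if block_location.get(task) == node)
    return local / len(assignments)

# Two of three tasks land on the node holding their data.
placed = {"t1": "n1", "t2": "n2", "t3": "n1"}
data = {"t1": "n1", "t2": "n1", "t3": "n1"}
print(data_locality_ratio(placed, data))
```

A rescheduler like BARE aims to push this ratio toward 1 while also balancing load, since remote reads consume the cross-rack bandwidth that drives up task completion time.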


2017 ◽  
Vol 19 (1) ◽  
pp. 4-10 ◽  
Author(s):  
Maria Anna Jankowska ◽  
Piotr Jankowski

The article presents the Idaho Geospatial Data Center (IGDC), a digital library of public-domain geographic data for the state of Idaho. The design and implementation of IGDC are introduced as part of the larger context of a geolibrary model. The article presents the methodology and tools used to build IGDC, with a focus on the geolibrary map browser. The use of IGDC is evaluated from the perspective of access and demand for geographic data. Finally, the article offers recommendations for the future development of geospatial data centers.


Author(s):  
Tianyi Gao ◽  
James Geer ◽  
Bahgat G. Sammakia ◽  
Russell Tipton ◽  
Mark Seymour

Cooling power constitutes a large portion of the total electrical power consumption in data centers. Approximately 25%∼40% of the electricity used within a production data center is consumed by the cooling system. Improving the cooling energy efficiency has attracted a great deal of research attention, and many strategies have been proposed for cutting data center energy costs. One effective strategy for increasing cooling efficiency is dynamic thermal management. Another is placing cooling devices (heat exchangers) closer to the source of heat; this is the basic design principle of many hybrid cooling systems and liquid cooling systems for data centers. Dynamic thermal management of data centers is a huge challenge, because data centers operate under complex dynamic conditions, even during normal operation. In addition, hybrid cooling systems introduce additional localized cooling devices, such as in-row cooling units and overhead coolers, which significantly increase the complexity of dynamic thermal management. Therefore, it is of paramount importance to characterize the dynamic responses of data centers under variations from different cooling units, such as cooling air flow rate variations. In this study, a detailed computational analysis of an in-row-cooler-based hybrid-cooled data center is conducted using a commercially available computational fluid dynamics (CFD) code. A representative CFD model is developed for a raised-floor data center with a cold aisle/hot aisle arrangement. The hybrid cooling system is designed using perimeter CRAH units and localized in-row cooling units. The CRAH unit supplies centralized cooling air to the underfloor plenum, and the cooling air enters the cold aisle through perforated tiles. The in-row cooling unit is located on the raised floor between the server racks. It supplies cooling air directly to the cold aisle and draws in hot air from the back of the racks (hot aisle). Therefore, two different cooling air sources are supplied to the cold aisle, but they are delivered to the cold aisle in different ways. Several modeling cases are designed to study the transient effects of variations in the flow rates of the two cooling air sources. Combined scenarios of server power and cooling air flow variation are also modeled and studied. The detailed impact of each modeling case on the rack inlet air temperature and cold aisle air flow distribution is examined. The results presented in this work provide an understanding of the effects of air flow variations on the thermal performance of data centers, and the results and corresponding analysis are used to improve the running efficiency of this type of raised-floor hybrid data center using CRAH and IRC units.
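The combined effect of the two cooling air sources feeding one cold aisle can be illustrated with a simple mass-flow-weighted energy balance (an idealized mixing sketch assuming equal specific heats, not the paper's CFD model):

```python
def mixed_supply_temperature(m_crah, t_crah, m_irc, t_irc):
    """Mass-flow-weighted temperature of two cooling air streams mixing in
    the cold aisle: perimeter CRAH air via perforated tiles plus in-row
    cooler air. Flow rates in kg/s, temperatures in deg C; equal specific
    heats assumed."""
    return (m_crah * t_crah + m_irc * t_irc) / (m_crah + m_irc)

# Hypothetical flows: 2 kg/s of 15 C tile air mixed with 1 kg/s of 18 C IRC air.
print(mixed_supply_temperature(2.0, 15.0, 1.0, 18.0))
```

This zero-dimensional balance shows why varying either source's flow rate shifts the rack inlet temperature, which is the transient effect the CFD cases quantify in detail.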

