Reducing the energy consumption of large-scale computing systems through combined shutdown policies with multiple constraints

Author(s): Anne Benoit, Laurent Lefèvre, Anne-Cécile Orgerie, Issam Raïs

Large-scale distributed systems (high-performance computing centers, networks, data centers) are expected to consume enormous amounts of energy. To address this issue, shutdown policies constitute an appealing approach: they dynamically adapt the set of powered-on resources to the actual workload. However, multiple constraints have to be taken into account for such policies to be applied on real infrastructures: the time and energy cost of switching resources on and off, the bounds on power and energy consumption imposed by the electricity grid or the cooling system, and the availability of renewable energy. In this article, we propose models translating these various constraints into different shutdown policies that can be combined to satisfy multiple constraints simultaneously. Our models and their combinations are validated through simulations on a real workload trace.
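
As an illustration only (these are not the authors' models), the following sketch shows the classical break-even rule that underlies most shutdown policies: a node is switched off only when the energy saved during the predicted idle period outweighs the cost of the off/on cycle. All numbers are hypothetical.

```python
def should_shutdown(predicted_idle_s, p_idle_w, e_off_on_j, t_off_on_s):
    """Classical break-even rule: switch a node off only if the energy saved
    while idle exceeds the energy spent on the off/on cycle, and the idle
    period is long enough to complete that cycle.

    predicted_idle_s : expected idle duration (s)
    p_idle_w         : idle power draw of the node (W)
    e_off_on_j       : energy cost of a full shutdown + boot cycle (J)
    t_off_on_s       : time cost of that cycle (s)
    """
    # Idle time beyond which shutting down pays off energetically.
    t_break_even = e_off_on_j / p_idle_w
    return predicted_idle_s > max(t_break_even, t_off_on_s)

# Hypothetical node: 30 min predicted idle, 80 W idle power, 20 kJ / 150 s per cycle.
print(should_shutdown(1800, 80.0, 20_000, 150))  # True: 1800 s > 250 s break-even
```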

Author(s): Tyng-Yeu Liang, Fu-Chun Lu, Jun-Yao Chiu

QoS and energy consumption are two important issues for Cloud computing. In this paper, the authors propose a hybrid resource reservation method to address these two issues for scientific workflows in high-performance computing Clouds built on hybrid CPU/GPU architectures. As its name suggests, this method reserves the appropriate CPU or GPU for each job in a workflow, based on profiles of the execution time and energy consumption of each resource-to-program pair. The authors have implemented the proposed method on a real service-oriented system. The experimental results show that it effectively maintains the QoS of workflows while minimizing the energy consumed in executing them.
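
The paper does not spell out its reservation algorithm here; as a hedged sketch of the general idea, the code below picks a resource per job from hypothetical (time, energy) profiles, starting from the most energy-efficient assignment and upgrading jobs until a workflow deadline (the QoS target) is met. The job names, profile numbers, serialized-workflow assumption, and greedy rule are all illustrative, not the authors' method.

```python
# Hypothetical profiles: (time_s, energy_j) for each (job, resource) pair.
profiles = {
    ("fft",    "cpu"): (120.0, 9_000), ("fft",    "gpu"): (25.0, 4_500),
    ("filter", "cpu"): ( 40.0, 2_800), ("filter", "gpu"): (15.0, 3_600),
}

def reserve(jobs, deadline_s):
    """Greedy sketch: start from the most energy-efficient choice per job,
    then upgrade the job with the best time-saved-per-extra-joule ratio
    until the (serialized) workflow meets its deadline."""
    choice = {j: min(("cpu", "gpu"), key=lambda r: profiles[(j, r)][1])
              for j in jobs}
    def total_time():
        return sum(profiles[(j, choice[j])][0] for j in jobs)
    while total_time() > deadline_s:
        candidates = []
        for j in jobs:
            other = "gpu" if choice[j] == "cpu" else "cpu"
            dt = profiles[(j, choice[j])][0] - profiles[(j, other)][0]
            de = profiles[(j, other)][1] - profiles[(j, choice[j])][1]
            if dt > 0:  # switching actually saves time
                candidates.append((dt / max(de, 1e-9), j, other))
        if not candidates:
            raise RuntimeError("deadline unreachable with these profiles")
        _, j, other = max(candidates)
        choice[j] = other
    return choice

print(reserve(["fft", "filter"], deadline_s=60.0))  # {'fft': 'gpu', 'filter': 'gpu'}
```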


2020, Vol 23 (1-4)
Author(s): Ruth Schöbel, Robert Speck

Abstract: To extend prevailing scaling limits when solving time-dependent partial differential equations, the parallel full approximation scheme in space and time (PFASST) has been shown to be a promising parallel-in-time integrator. Similar to space–time multigrid, PFASST computes multiple time-steps simultaneously and is therefore particularly suitable for large-scale applications on high-performance computing systems. In this work we couple PFASST with a parallel spectral deferred correction (SDC) method, forming an unprecedented doubly time-parallel integrator. While PFASST provides global, large-scale “parallelization across the step”, the inner parallel SDC method integrates each individual time-step “parallel across the method” using a diagonalized local quasi-Newton solver. This new method, which we call “PFASST with Enhanced concuRrency” (PFASST-ER), therefore exposes even more temporal concurrency. For two challenging nonlinear reaction-diffusion problems, we show that PFASST-ER is more efficient than the classical variants of PFASST and can use more processors than time-steps.
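
PFASST and parallel SDC themselves are too involved to reproduce here; as a minimal sketch of the “parallel across the method” idea only, the code below runs plain Picard iteration on collocation nodes: each node update depends solely on the previous iterate, so all M node updates within a time-step could execute concurrently. The node choice and the test problem are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def collocation_weights(nodes):
    """Q[m, j] = integral of the j-th Lagrange basis polynomial from 0 to nodes[m]."""
    M = len(nodes)
    Q = np.zeros((M, M))
    for j in range(M):
        e = np.zeros(M); e[j] = 1.0
        c = np.polyfit(nodes, e, M - 1)   # j-th Lagrange basis in coefficient form
        ci = np.polyint(c)                # its antiderivative
        Q[:, j] = np.polyval(ci, nodes) - np.polyval(ci, 0.0)
    return Q

def picard_step(f, u0, dt, nodes, sweeps):
    """One time-step of Picard iteration on collocation nodes:
    u_m <- u0 + dt * sum_j Q[m, j] * f(u_j).
    Every node update reads only the previous iterate, so all updates
    can run concurrently ("parallel across the method")."""
    Q = collocation_weights(nodes)
    u = np.full(len(nodes), u0, dtype=float)
    for _ in range(sweeps):
        u = u0 + dt * Q @ f(u)            # all node updates at once
    return u[-1]                          # value at the right endpoint

# Dahlquist test problem u' = -u, exact solution exp(-t).
nodes = np.array([0.2, 0.5, 0.8, 1.0])    # normalized nodes in (0, 1]
u1 = picard_step(lambda u: -u, 1.0, 0.1, nodes, sweeps=8)
print(u1, np.exp(-0.1))                   # ~0.904837 in both cases
```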


2012, Vol 63 (2), pp. 365-377
Author(s): Yulai Yuan, Yongwei Wu, Qiuping Wang, Guangwen Yang, Weimin Zheng

2021, Vol 11 (3), pp. 1169
Author(s): Erol Gelenbe, Miltiadis Siavvas

Long-running software may operate on hardware platforms with limited energy resources, such as batteries or photovoltaic cells, or on high-performance platforms that consume large amounts of energy. Since such systems may be subject to hardware failures, checkpointing is often used to ensure the reliability of the application. Because checkpointing itself adds computation time and energy consumption, we study how checkpoint intervals should be selected so as to minimize a cost function that includes both execution time and energy. Expressions for the program’s energy consumption and execution time are derived as functions of the failure probability per instruction. A first-principles analysis yields the checkpoint interval that minimizes a linear combination of the average energy consumption and execution time of the program, expressed in terms of the classical Lambert W function. The sensitivity of the optimal checkpoint interval to the importance attributed to energy consumption is also derived. The results are illustrated with numerical examples for programs of various lengths, showing the relation between the checkpoint interval that minimizes energy consumption and execution time and the one that minimizes a weighted sum of the two. In addition, our results are applied to a popular software benchmark and posted on a publicly accessible web site, together with the optimization software that we have developed.
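
The paper's closed-form interval involves the Lambert W function; as an illustration only, the sketch below uses the classical first-order checkpoint model (Young's approximation) and recovers its optimum numerically. The cost model and all parameter values here are assumptions for the sketch, not the authors' expressions, and one could extend the cost to a weighted sum of time and energy as the paper does.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def waste(T, C, lam):
    """Classical first-order overhead model: with checkpoint cost C (s) and
    failure rate lam (1/s), interval T loses C/T to checkpointing and, on
    average, lam * T/2 to recomputation after failures."""
    return C / T + lam * T / 2.0

C, lam = 60.0, 1e-4   # hypothetical: 60 s checkpoints, one failure per ~2.8 h
res = minimize_scalar(waste, bounds=(1.0, 1e6), method="bounded", args=(C, lam))
print(res.x)                   # numeric optimum
print(np.sqrt(2 * C / lam))    # Young's closed form sqrt(2C/lambda) ~ 1095 s
```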


Author(s): Juan P. Silva, Ernesto Dufrechou, Pablo Ezzatti, Enrique S. Quintana-Ortí, Alfredo Remón, ...

The high-performance computing community has traditionally focused solely on reducing execution time, but in recent years the optimization of energy consumption has become a major concern. Reducing energy usage without degrading performance requires energy-efficient hardware platforms accompanied by energy-aware algorithms and computational kernels. The solution of linear systems is a key operation in many scientific and engineering problems; its relevance has motivated a substantial body of work, and consequently high-performance solvers exist for a wide variety of hardware platforms. In this work, we develop a high-performance, energy-efficient linear system solver. In particular, we develop two solvers for a low-power CPU-GPU platform, the NVIDIA Jetson TK1. These solvers implement the Gauss-Huard algorithm, yielding efficient usage of the target hardware as well as efficient memory access. The experimental evaluation shows that the new solvers deliver significant savings in both time and energy consumption when compared with the state-of-the-art solvers for this platform.
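
For reference, here is a minimal NumPy sketch of Gauss-Huard elimination on the augmented system: a column-by-column Gauss-Jordan variant that matches the flop count of Gaussian elimination. Pivoting is omitted for clarity, and this is of course far simpler than the paper's CPU-GPU implementation.

```python
import numpy as np

def gauss_huard_solve(A, b):
    """Solve A x = b by Gauss-Huard elimination on the augmented matrix
    [A | b]. At step k only rows 0..k are touched; afterwards the
    augmented column holds the solution. No pivoting (sketch only)."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), b.reshape(-1, 1).astype(float)])
    for k in range(n):
        M[k, k:] -= M[k, :k] @ M[:k, k:]   # eliminate left of diagonal in row k
        M[k, k + 1:] /= M[k, k]            # scale row k by its pivot
        M[:k, k + 1:] -= np.outer(M[:k, k], M[k, k + 1:])  # annihilate column k above
    return M[:, n]                          # augmented column now holds x

A = np.array([[2.0, 1.0], [4.0, 3.0]])
b = np.array([3.0, 7.0])
print(gauss_huard_solve(A, b))             # [1. 1.]
```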


Author(s): Liangxiu Han

This chapter identifies challenges and requirements for resource sharing to support high-performance distributed Service-Oriented Computing (SOC) systems. The chapter draws attention to two popular and important design paradigms, Grid and Peer-to-Peer (P2P) computing systems, which are evolving as two practical solutions for supporting wide-area resource sharing over the Internet. As a fundamental task of resource sharing, efficient resource discovery plays an important role in the SOC setting. The chapter presents resource discovery in Grid and P2P environments through an overview of related systems, both historical and emerging. It then discusses how both technologies can be exploited to facilitate resource discovery within large-scale distributed computing systems in a flexible, scalable, fault-tolerant, interoperable, and secure fashion.


2020, Vol 17 (9), pp. 4411-4418
Author(s): S. Jagannatha, B. N. Tulasimala

In the world of information and communication technology (ICT), the term Cloud Computing has been the buzzword, yet its definition keeps shifting with the way technologists use it in different environments. As a definition it remains contentious: it tends to be stated relative to a particular application, with no unanimous formulation, which makes it altogether elusive. In spite of this, it is this technology that is revolutionizing the traditional use of computer hardware, software, data storage media, and processing mechanisms, with substantial benefits for stakeholders. In the past, autonomous computers and interconnected nodes forming computer networks with shared software resources reduced hardware costs and, to some extent, software costs. Evolutionary changes in computing technology over the past few decades have thus transformed machine architectures, operating systems, network connectivity, and application workloads, making the commercial use of technology predominant. Instead of centralized systems, parallel and distributed systems are now preferred for solving computational problems in the business domain; such hardware is well suited to solving large-scale problems over the Internet. This computing model is data-intensive and network-centric. Many organizations using ICT have found storing huge volumes of data, maintaining and processing them, and automating the entire process through Internet communication to be a challenge. In this paper we explore the growth of Cloud Computing technology over the years: how high-performance computing and high-throughput computing systems enhance computational performance, and how, according to various experts, the scientific community, and service providers, Cloud Computing will prove more cost-effective across different dimensions of the business.


Author(s): D. E. Keyes, H. Ltaief, G. Turkiyyah

A traditional goal of algorithmic optimality, squeezing out flops, has been superseded by evolution in architecture. Flops no longer serve as a reasonable proxy for all aspects of complexity. Instead, algorithms must now squeeze memory, data transfers, and synchronizations, while extra flops on locally cached data represent only small costs in time and energy. Hierarchically low-rank matrices realize a rarely achieved combination of optimal storage complexity and high computational intensity for a wide class of formally dense linear operators that arise in applications for which exascale computers are being constructed. They may be regarded as algebraic generalizations of the fast multipole method. Methods based on these hierarchical data structures, and their simpler cousins, tile low-rank matrices, are well proportioned for early exascale computer architectures, which are provisioned with high processing power relative to memory capacity and memory bandwidth. They are ushering in a renaissance of computational linear algebra. A challenge is that emerging hardware architectures possess hierarchies of their own that do not generally align with those of the algorithm. We describe modules of a software toolkit, hierarchical computations on manycore architectures (HiCMA), that illustrate these features and are intended as building blocks of applications, such as matrix-free higher-order methods in optimization and large-scale spatial statistics. Some modules of this open-source project have been adopted in the software libraries of major vendors. This article is part of the discussion meeting issue ‘Numerical algorithms for high-performance computational science’.
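
As a small illustration of the tile low-rank idea mentioned above, the sketch below compresses one dense tile of a smooth kernel with a truncated SVD; the kernel, cluster geometry, and tolerance are hypothetical, not drawn from the toolkit.

```python
import numpy as np

def compress_tile(T, eps):
    """Tile low-rank compression: replace a dense tile T with factors U, V
    such that ||T - U @ V.T||_2 <= eps, keeping only the leading singular
    triplets. Storage drops from m*n to k*(m + n) when the rank k is small."""
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    k = max(1, int(np.sum(s > eps)))
    return U[:, :k] * s[:k], Vt[:k].T      # (m, k) and (n, k) factors

# Hypothetical smooth kernel: tiles of 1/(1 + |x - y|) between well-separated
# point clusters are numerically low rank.
x = np.linspace(0.0, 1.0, 256)
y = np.linspace(4.0, 5.0, 256)
T = 1.0 / (1.0 + np.abs(x[:, None] - y[None, :]))
U, V = compress_tile(T, eps=1e-8)
print(U.shape[1], np.linalg.norm(T - U @ V.T, 2))  # small rank, error <= eps
```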

