Resource-Aware Load Balancing of Parallel Applications

Handbook of Research on Grid Technologies and Utility Computing ◽

10.4018/978-1-60566-184-1.ch002 ◽

2009 ◽

pp. 12-21 ◽

Cited By ~ 5

Author(s):

Eric Aubanel

Keyword(s):

Load Balancing ◽

Dynamic Network ◽

Computational Grid ◽

Parallel Applications ◽

Computational Grids ◽

Concurrent Execution ◽

Processor Performance ◽

Wide Range ◽

Tightly Coupled ◽

Resource Aware

The problem of load balancing parallel applications is particularly challenging on computational grids, since the characteristics of both the application and the platform must be taken into account. This chapter reviews the wide range of solutions that have been proposed. It considers tightly coupled parallel applications that can be described by an undirected graph representing concurrent execution of tasks and communication between tasks, executing on computational grids with static and dynamic network and processor performance. While a rich set of solution techniques has been proposed, there have not yet been any performance comparisons between them. Such comparisons will require parallel benchmarks and computational grid emulators and simulators.

Dynamic Network Optimization for Effective Qos Support in Large Grid Infrastructures

Quantitative Quality of Service for Grid Computing ◽

10.4018/978-1-60566-370-8.ch002 ◽

2011 ◽

pp. 28-45

Author(s):

Francesco Palmieri ◽

Ugo Fiore

Keyword(s):

Communication Networks ◽

Electrical Power ◽

Dynamic Network ◽

Service Level ◽

Computational Grid ◽

Computational Grids ◽

Collaborative Computing ◽

Software Infrastructure ◽

Remarkable Change ◽

Computing Power

In the past decade there has been a remarkable change from mainframe-based centralized computing to a distributed client/server approach. In the coming decade this trend is likely to continue, with further shifts towards network-centric collaborative computing. At present, the key technology in collaborative computing is the computational grid paradigm. Like an electrical power grid, the computational Grid aims to provide a steady, reliable source of computing power. More precisely, the term grid now designates a common computational and/or data processing infrastructure built on distributed resources that are highly heterogeneous (in their role, computing power and architecture) and interconnected by heterogeneous communication networks. These resources communicate through basic services realized by a middleware stratum that offers a reliable, simple, uniform and often transparent interface, so that an unaware user can submit jobs to the Grid as if facing a large virtual supercomputer. Large computing endeavors, consisting of one or more related jobs or tasks, are then transparently distributed over the network onto the available computing resources. Such a workload distribution strategy, that is, balancing the tasks across idle computers on the underlying networks, is the most important functionality in computational Grids and is usually provided at the service level of the grid software infrastructure.

Online Thread and Data Mapping Using a Sharing-Aware Memory Management Unit

ACM Transactions on Modeling and Performance Evaluation of Computing Systems ◽

10.1145/3433687 ◽

2021 ◽

Vol 5 (4) ◽

pp. 1-28

Author(s):

Eduardo H. M. Cruz ◽

Matthias Diener ◽

Laércio L. Pilla ◽

Philippe O. A. Navaux

Keyword(s):

Energy Efficiency ◽

Memory Management ◽

Substantial Reduction ◽

Management Unit ◽

Memory Access ◽

Parallel Applications ◽

Data Mapping ◽

Wide Range ◽

Memory Accesses ◽

Level Parallelism

Current and future architectures rely on thread-level parallelism to sustain performance growth. These architectures have introduced a complex memory hierarchy, consisting of several cores organized hierarchically with multiple cache levels and NUMA nodes. These memory hierarchies can have an impact on the performance and energy efficiency of parallel applications as the importance of memory access locality is increased. In order to improve locality, the analysis of the memory access behavior of parallel applications is critical for mapping threads and data. Nevertheless, most previous work relies on indirect information about the memory accesses, or does not combine thread and data mapping, resulting in less accurate mappings. In this paper, we propose the Sharing-Aware Memory Management Unit (SAMMU), an extension to the memory management unit that allows it to detect the memory access behavior in hardware. With this information, the operating system can perform online mapping without any previous knowledge about the behavior of the application. In the evaluation with a wide range of parallel applications (NAS Parallel Benchmarks and PARSEC Benchmark Suite), performance was improved by up to 35.7% (10.0% on average) and energy efficiency was improved by up to 11.9% (4.1% on average). These improvements happened due to a substantial reduction of cache misses and interconnection traffic.

Periodic hierarchical load balancing for large supercomputers

The International Journal of High Performance Computing Applications ◽

10.1177/1094342010394383 ◽

2011 ◽

Vol 25 (4) ◽

pp. 371-385 ◽

Cited By ~ 34

Author(s):

Gengbin Zheng ◽

Abhinav Bhatelé ◽

Esteban Meneses ◽

Laxmikant V. Kalé

Keyword(s):

Load Balancing ◽

Large Scale ◽

Parallel Machines ◽

National Laboratory ◽

Argonne National Laboratory ◽

Parallel Applications ◽

Scientific Application ◽

Computing Center ◽

Blue Gene ◽

Advanced Computing

Large parallel machines with hundreds of thousands of processors are becoming more prevalent. Ensuring good load balance is critical for scaling certain classes of parallel applications on even thousands of processors. Centralized load balancing algorithms suffer from scalability problems, especially on machines with a relatively small amount of memory. Fully distributed load balancing algorithms, on the other hand, tend to take longer to arrive at good solutions. In this paper, we present an automatic dynamic hierarchical load balancing method that overcomes the scalability challenges of centralized schemes and longer running times of traditional distributed schemes. Our solution overcomes these issues by creating multiple levels of load balancing domains which form a tree. This hierarchical method is demonstrated within a measurement-based load balancing framework in Charm++. We discuss techniques to deal with scalability challenges of load balancing at very large scale. We present performance data of the hierarchical load balancing method on up to 16,384 cores of Ranger (at the Texas Advanced Computing Center) and 65,536 cores of Intrepid (the Blue Gene/P at Argonne National Laboratory) for a synthetic benchmark. We also demonstrate the successful deployment of the method in a scientific application, NAMD, with results on Intrepid.
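The idea of load balancing domains arranged in a tree can be illustrated with a small two-level sketch. This is not the Charm++ implementation described in the abstract: the function names (`greedy_assign`, `hierarchical_balance`), the fixed `group_size`, and the greedy strategies used at both levels are assumptions chosen for illustration only. The root first evens out aggregate load between groups of processors, then each group balances its own tasks independently.

```python
from typing import Dict, List


def greedy_assign(procs: List[str], tasks: List[float]) -> Dict[str, List[float]]:
    """Leaf-level balancer (assumed strategy): place tasks, largest first,
    on the currently least loaded processor of the group."""
    assignment: Dict[str, List[float]] = {p: [] for p in procs}
    for task in sorted(tasks, reverse=True):
        target = min(procs, key=lambda p: sum(assignment[p]))
        assignment[target].append(task)
    return assignment


def hierarchical_balance(loads: Dict[str, List[float]],
                         group_size: int) -> Dict[str, List[float]]:
    """Two-level sketch: the root balances load between groups,
    then every group balances its own processors."""
    procs = list(loads)
    groups = [procs[i:i + group_size] for i in range(0, len(procs), group_size)]
    # Each group reports only its pooled tasks to the root.
    pools = [[t for p in g for t in loads[p]] for g in groups]
    total = sum(sum(pool) for pool in pools)
    # Target load per group is proportional to its processor count.
    targets = [total * len(g) / len(procs) for g in groups]

    # Root level: overloaded groups give up their smallest tasks
    # as long as they stay at or above their target.
    surplus: List[float] = []
    for pool, target in zip(pools, targets):
        pool.sort()
        while pool and sum(pool) - pool[0] >= target:
            surplus.append(pool.pop(0))
    # Surplus tasks go, largest first, to the group with the biggest deficit.
    for task in sorted(surplus, reverse=True):
        i = max(range(len(pools)), key=lambda k: targets[k] - sum(pools[k]))
        pools[i].append(task)

    # Leaf level: each group is balanced without looking at other groups.
    result: Dict[str, List[float]] = {}
    for group, pool in zip(groups, pools):
        result.update(greedy_assign(group, pool))
    return result


if __name__ == "__main__":
    loads = {"p0": [4, 3, 3, 2], "p1": [], "p2": [1], "p3": [1]}
    balanced = hierarchical_balance(loads, group_size=2)
    for proc, tasks in balanced.items():
        print(proc, tasks, sum(tasks))
```

In this sketch the root only ever handles per-group pools, which hints at why a hierarchy can reduce the memory and time pressure on any single balancer; a real system would add more tree levels and account for migration cost.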

How Computational Grid Refinement in Three Dimensions Affects CFD-DEM Results for Psuedo-2D Fluidized Gas-Solid Beds

Volume 1B, Symposia: Fluid Measurement and Instrumentation; Fluid Dynamics of Wind Energy; Renewable and Sustainable Energy Conversion; Energy and Process Engineering; Microfluidics and Nanofluidics; Development and Applications in Computational Fluid Dynamics; DNS/LES and Hybrid RANS/LES Methods ◽

10.1115/fedsm2017-69222 ◽

2017 ◽

Cited By ~ 2

Author(s):

Annette Volk ◽

Urmila Ghia

Keyword(s):

Three Dimensional ◽

Computational Grid ◽

Three Dimensions ◽

Computational Grids ◽

Grid Refinement ◽

2D Simulation ◽

Computational Fluid Dynamics Cfd ◽

Bed Thickness ◽

Simulation Results ◽

Accuracy Of Results

Computational Fluid Dynamics (CFD)-Discrete Element Method (DEM) simulations are designed to model a pseudo-two-dimensional fluidized bed. Bed behavior and accuracy of results are shown to change as the simulations are conducted on increasingly refined computational grids. Trends of the results with grid refinement are reported for both three-dimensional, uniform refinement, and for grid refinement in only the direction of bed thickness. Pseudo-2D simulation results are examined against previously published experimental data to assess relative accuracy compared to fully 3D simulation results. Two drag laws are employed in the simulations, resulting in different trends of results with computational grid refinement. From these results, we present suggestions for accurate model design.

Multi-Softcore Architecture on FPGA

International Journal of Reconfigurable Computing ◽

10.1155/2014/979327 ◽

2014 ◽

Vol 2014 ◽

pp. 1-13 ◽

Cited By ~ 4

Author(s):

Mouna Baklouti ◽

Mohamed Abid

Keyword(s):

High Performance ◽

Design Methodology ◽

Matrix Multiplication ◽

Rapid Prototype ◽

General Purpose ◽

Parallel Applications ◽

Multicore Systems ◽

Processor Core ◽

Nios Ii ◽

Wide Range

To meet the high performance demands of embedded multimedia applications, embedded systems are integrating multiple processing units. However, they are mostly based on custom-logic design methodology. Designing parallel multicore systems from available standard intellectual property (IP) blocks while maintaining high performance is also a challenging issue. Softcore processors and field programmable gate arrays (FPGAs) are a cheap and fast option to develop and test such systems. This paper describes an FPGA-based design methodology to implement a rapid prototype of parametric multicore systems. A study of the viability of building the SoC using the NIOS II soft-processor core from Altera is also presented. The NIOS II features a general-purpose RISC CPU architecture designed to address a wide range of applications. The performance of the implemented architecture is discussed, and some parallel applications are used to test the speedup and efficiency of the system. Experimental results demonstrate the performance of the proposed multicore system, which achieves better speedup than the GPU (29.5% faster for the FIR filter and 23.6% faster for the matrix-matrix multiplication).

A Heuristic Algorithm for Mapping Parallel Applications on Computational Grids

Advances in Grid Computing - EGC 2005 - Lecture Notes in Computer Science ◽

10.1007/11508380_111 ◽

2005 ◽

pp. 1086-1096 ◽

Cited By ~ 4

Author(s):

Panu Phinjaroenphan ◽

Savitri Bevinakoppa ◽

Panlop Zeephongsekul

Keyword(s):

Heuristic Algorithm ◽

Parallel Applications ◽

Computational Grids

Reliability Based Scheduling Model (RSM) for Computational Grids

Development of Distributed Systems from Design to Application and Maintenance ◽

10.4018/978-1-4666-2647-8.ch003 ◽

2012 ◽

pp. 37-54

Author(s):

Zahid Raza ◽

Deo P. Vidyarthi

Keyword(s):

Data Exchange ◽

Large Scale ◽

A Priori ◽

Dynamic Environment ◽

Computational Grid ◽

Computational Grids ◽

Scheduling Model ◽

Reliability Computation ◽

Grid Resources ◽

Maximum Reliability

The Computational Grid, characterized by distributed load sharing, has evolved as a platform for large-scale problem solving. A grid is a collection of heterogeneous resources, offering services of varying natures, in which jobs are submitted to any of the participating nodes. Scheduling these jobs in such a complex and dynamic environment poses many challenges. Reliability analysis of the grid gains paramount importance because a grid involves a large number of resources that may fail at any time, making it unreliable. These failures waste both computational power and money on scarce grid resources. It is normally desired that a job be scheduled in an environment that ensures maximum reliability of its execution. This work presents a reliability-based scheduling model for jobs on the computational grid. The model considers the failure rates of both the software and hardware grid constituents, such as the application demanding execution, the nodes executing the job, and the network links supporting data exchange between the nodes. Job allocation using the proposed scheme becomes trusted, as it schedules the job based on a priori reliability computation.

Using Genetic Algorithm for Scheduling Tasks of Computational Grids in the Gridsim Simulator

Advanced Research and Trends in New Technologies, Software, Human-Computer Interaction, and Communicability - Advances in Human and Social Aspects of Technology ◽

10.4018/978-1-4666-4490-8.ch047 ◽

2014 ◽

pp. 521-535

Author(s):

João Phellipe ◽

Carla Katarina ◽

Francisco das Chagas ◽

Dario Aloise

Keyword(s):

Genetic Algorithm ◽

Genetic Algorithms ◽

Distributed System ◽

Task Scheduling ◽

Computational Grid ◽

Computer Processing ◽

Computational Grids ◽

Processing Power

Computer processing power has evolved considerably in recent years. However, there are problems that still require many machines to perform a large amount of processing in a parallel and distributed way. In this context, many algorithms exist for task scheduling in a distributed system. In this chapter, the authors present a scheduler based on genetic algorithms in order to distribute tasks more efficiently in a computational grid; it has been implemented in GRIDSIM, a computational grid simulator with the features and attributes of a real grid.
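A genetic-algorithm scheduler of this general kind can be sketched in a few lines. The example below is a standalone illustration, not the authors' scheduler, and it does not use the GridSim API: the chromosome encoding (one machine index per task), the makespan fitness, one-point crossover, elitist selection, and all parameter values are assumptions made for the sketch.

```python
import random
from typing import List, Tuple


def makespan(assign: List[int], task_len: List[float], speeds: List[float]) -> float:
    """Finish time of the busiest machine for a given task-to-machine assignment."""
    finish = [0.0] * len(speeds)
    for task, machine in enumerate(assign):
        finish[machine] += task_len[task] / speeds[machine]
    return max(finish)


def ga_schedule(task_len: List[float], speeds: List[float], pop_size: int = 30,
                generations: int = 50, mutation_rate: float = 0.1,
                seed: int = 0) -> Tuple[List[int], List[float]]:
    """Evolve assignments; returns the best one found and the best makespan
    recorded at the start of each generation. Needs at least two tasks."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    n_tasks, n_machines = len(task_len), len(speeds)

    def fitness(assign: List[int]) -> float:
        return makespan(assign, task_len, speeds)

    # Chromosome: position = task index, value = machine index.
    pop = [[rng.randrange(n_machines) for _ in range(n_tasks)]
           for _ in range(pop_size)]
    history: List[float] = []
    for _ in range(generations):
        pop.sort(key=fitness)
        history.append(fitness(pop[0]))
        elite = pop[: pop_size // 2]  # elitism: the better half survives unchanged
        children: List[List[int]] = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_tasks)  # one-point crossover
            child = a[:cut] + b[cut:]
            for i in range(n_tasks):  # mutation: reassign a task at random
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(n_machines)
            children.append(child)
        pop = elite + children
    return min(pop, key=fitness), history


if __name__ == "__main__":
    tasks = [8.0, 5.0, 4.0, 3.0, 3.0, 2.0, 2.0, 1.0]
    speeds = [1.0, 1.0, 2.0]
    best, history = ga_schedule(tasks, speeds)
    print("assignment:", best)
    print("makespan:", makespan(best, tasks, speeds))
```

Because the better half of each generation is carried over unchanged, the best makespan never gets worse from one generation to the next; the quality of the final schedule still depends on the chosen parameters and seed.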

Computation of Pressure Fields around a Two-Dimensional Circular Cylinder Using the Vortex-In-Cell and Penalization Methods

Modelling and Simulation in Engineering ◽

10.1155/2014/708372 ◽

2014 ◽

Vol 2014 ◽

pp. 1-13 ◽

Cited By ~ 4

Author(s):

Seung-Jae Lee ◽

Jun-Hyeok Lee ◽

Jung-Chun Suh

Keyword(s):

Circular Cylinder ◽

Solid Body ◽

Pressure Field ◽

Stokes Equations ◽

Computational Cost ◽

Fixed Time ◽

Computational Grids ◽

Penalty Term ◽

Pressure Fields ◽

Wide Range

The vorticity-velocity formulation of the Navier-Stokes equations allows purely kinematical problems to be decoupled from the pressure term, since the pressure is eliminated by applying the curl operator. The Vortex-In-Cell (VIC) method, which is based on the vorticity-velocity formulation, offers particle-mesh algorithms to numerically simulate flows past a solid body. The penalization method is used to enforce boundary conditions at a body surface with a decoupling between body boundaries and computational grids. Its main advantage is a highly efficient implementation for solid boundaries of arbitrary complexity on Cartesian grids. We present an efficient algorithm to numerically implement the vorticity-velocity-pressure formulation including a penalty term to simulate the pressure fields around a solid body. In vorticity-based methods, the pressure field can be computed independently of the solution procedure for vorticity. This clearly simplifies the implementation and reduces the computational cost. Obtaining the pressure field at any fixed time represents the most challenging goal of this study. We validate the implementation by numerical simulations of an incompressible viscous flow around an impulsively started circular cylinder over a wide range of Reynolds numbers: Re=40, 550, 3000, and 9500.

Renewable Energy-Aware Sustainable Cellular Networks with Load Balancing and Energy-Sharing Technique

Sustainability ◽

10.3390/su12229340 ◽

2020 ◽

Vol 12 (22) ◽

pp. 9340

Author(s):

Md. Sanwar Hossain ◽

Khondoker Ziaul Islam ◽

Abu Jahid ◽

Khondokar Mizanur Rahman ◽

Sarwar Ahmed ◽

...

Keyword(s):

Energy Efficiency ◽

Renewable Energy ◽

Load Balancing ◽

Outage Probability ◽

Cellular Networks ◽

Renewable Energy Sources ◽

Green Energy ◽

Solar Pv ◽

Wide Range ◽

Energy Sharing

With the proliferation of cellular networks, the ubiquitous availability of new-generation multimedia devices, and their wide-ranging data applications, telecom network operators are deploying an increasing number of cellular base stations (BSs) to deal with unprecedented service demand. The rapid and radical deployment of cellular networks significantly increases energy consumption and the carbon footprint. The ultimate objective of this work is to develop a sustainable and environmentally friendly cellular infrastructure through effective utilization of the locally available renewable energy sources (RES), namely solar photovoltaic (PV), wind turbine (WT), and biomass generator (BG). This article addresses the key challenges of envisioning hybrid solar PV/WT/BG powered macro BSs in Bangladesh, considering the dynamic profile of the RES and traffic intensity in the tempo-spatial domain. The optimal system architecture and technical criteria of the proposed system are critically evaluated with the help of HOMER optimization software for both on-grid and off-grid conditions to reduce the electricity generation cost and waste outflows while ensuring the desired quality of experience (QoE) over a 20-year duration. In addition, the green energy-sharing mechanism under the off-grid and grid-tied conditions has been critically analyzed for optimal use of green energy. Moreover, a heuristic load balancing algorithm among collocated BSs has been incorporated to raise throughput and energy efficiency (EE). The spectral efficiency (SE), energy efficiency, and outage probability performance of the contemplated wireless network are examined using Matlab-based Monte Carlo simulation under a wide range of network configurations. Simulation results reveal that the proper load balancing technique achieves zero outage probability with the expected system performance, whereas the energy cooperation policy offers an attractive solution for developing green mobile communications through better utilization of renewable energy under the proposed hybrid solar PV/WT/BG scheme.