Intra-Tile Parallelization for Two-Level Perfectly Nested Loops With Non-Uniform Dependences

2020
Author(s): Zahra Abdi Reyhan, Shahriar Lotfi, Ayaz Isazadeh, Jaber Karimpour

Abstract: Most important scientific and engineering applications involve complex computations or large data sets, and a huge share of their running time is spent in nested loops. Loops are therefore the main target when parallelizing scientific and engineering programs. Many parallelizing compilers focus on nested loops with uniform dependences, while the parallelization of nested loops with non-uniform dependences has not been investigated as extensively. This paper addresses the problem of parallelizing two-level nested loops with non-uniform dependences. The aim is to minimize execution time by improving load balancing and reducing inter-processor communication. We propose a new tiling algorithm, k-StepIntraTiling, which uses the bin-packing problem to minimize execution time. We demonstrate the effectiveness of the proposed method in several experiments. Simulation and experimental results show that the algorithm reduces the total execution time of several benchmarks compared with other tiling methods.
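
The abstract does not spell out the steps of k-StepIntraTiling, so the following Python sketch only illustrates the general idea it builds on: treat the tiles of a two-level loop nest as items whose weights must be balanced across processors. The tile sizes, the iteration-count cost model, and the greedy longest-processing-time packing are assumptions made for this example, not the authors' method.

```python
# Illustrative sketch only; not the k-StepIntraTiling algorithm from the paper.
# Tiles of a two-level iteration space are treated as weighted items and
# distributed over processors with a greedy longest-processing-time packing,
# so that per-processor work (and hence load imbalance) stays small.
# The tile sizes and the cost model are hypothetical.

from typing import List, Tuple

Tile = Tuple[int, int, int, int]  # (lo1, hi1, lo2, hi2)

def tile_iteration_space(n1: int, n2: int, t1: int, t2: int) -> List[Tile]:
    """Split an n1 x n2 iteration space into t1 x t2 rectangular tiles."""
    tiles = []
    for i in range(0, n1, t1):
        for j in range(0, n2, t2):
            tiles.append((i, min(i + t1, n1), j, min(j + t2, n2)))
    return tiles

def workload(tile: Tile) -> int:
    """Hypothetical cost model: number of iterations inside the tile."""
    lo1, hi1, lo2, hi2 = tile
    return (hi1 - lo1) * (hi2 - lo2)

def pack_tiles(tiles: List[Tile], num_procs: int):
    """Greedy longest-processing-time packing: the heaviest remaining tile
    always goes to the currently least-loaded processor."""
    bins = [{"load": 0, "tiles": []} for _ in range(num_procs)]
    for tile in sorted(tiles, key=workload, reverse=True):
        target = min(bins, key=lambda b: b["load"])
        target["tiles"].append(tile)
        target["load"] += workload(tile)
    return bins

if __name__ == "__main__":
    tiles = tile_iteration_space(100, 100, 16, 16)
    for p, b in enumerate(pack_tiles(tiles, 4)):
        print(f"processor {p}: {len(b['tiles'])} tiles, load {b['load']}")
```

With uniform tiles the loads come out trivially equal; the packing step matters when non-uniform dependences force tiles of different sizes or costs, which is the situation the paper targets.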

2021 · Vol 11 (3) · pp. 72-91
Author(s): Priyanka H., Mary Cherian

Cloud computing has become increasingly prominent and is widely deployed in large data centers, where the efficient distribution of resources (bandwidth, CPU, and memory) is a major problem. A genetically enhanced shuffling frog leaping algorithm (GESFLA) framework is proposed to select optimal virtual machines (VMs) for scheduling tasks and to place them on physical machines (PMs). The proposed GESFLA-based resource allocation technique minimizes wasted resources and reduces the power consumption of the data center. GESFLA is compared with task-based particle swarm optimization (TBPSO) for efficiency. The experimental results show the superiority of GESFLA over TBPSO in terms of resource usage ratio, migration time, and total execution time. For PlanetLab workload traces, the proposed framework reduces the energy consumption of the data center by up to 79%, reduces migration time by 67%, and improves CPU utilization by 9%. For a random workload, execution time is reduced by 71%, transfer time is reduced by up to 99%, and CPU utilization is improved by 17% compared with TBPSO.
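
As an illustration only (not the authors' GESFLA implementation, which the abstract does not detail), the Python sketch below shows the shuffled frog leaping structure (rank the population, partition it into memeplexes, improve each memeplex's worst member toward its best) combined with a genetic-style crossover and mutation step, applied to VM-to-PM placement. The PM capacities, VM demands, and fitness weights are all hypothetical.

```python
# Minimal sketch, not the authors' GESFLA. It combines the shuffled frog
# leaping structure (ranked population split into memeplexes, worst frog
# improved toward the memeplex leader) with a genetic-style crossover and
# mutation step. A "frog" is a VM-to-PM placement; capacities, demands, and
# fitness weights are hypothetical.

import random

PM_CPU, PM_MEM = 100, 100                       # hypothetical PM capacity
VM_DEMANDS = [(random.randint(5, 30), random.randint(5, 30)) for _ in range(20)]
NUM_PMS, POP_SIZE, NUM_MEMEPLEXES, ITERATIONS = 6, 30, 3, 50

def fitness(placement):
    """Lower is better: penalize overloaded PMs, count active PMs as a power proxy."""
    cpu, mem = [0] * NUM_PMS, [0] * NUM_PMS
    for vm, pm in enumerate(placement):
        cpu[pm] += VM_DEMANDS[vm][0]
        mem[pm] += VM_DEMANDS[vm][1]
    overload = sum(max(0, c - PM_CPU) + max(0, m - PM_MEM) for c, m in zip(cpu, mem))
    active_pms = sum(1 for c in cpu if c > 0)
    return 10 * overload + active_pms

def random_frog():
    return [random.randrange(NUM_PMS) for _ in VM_DEMANDS]

population = [random_frog() for _ in range(POP_SIZE)]
for _ in range(ITERATIONS):
    population.sort(key=fitness)                            # rank frogs, best first
    memeplexes = [population[i::NUM_MEMEPLEXES] for i in range(NUM_MEMEPLEXES)]
    for plex in memeplexes:
        leader, worst = plex[0], plex[-1]
        # Genetic-style move: uniform crossover with the leader, small mutation.
        child = [l if random.random() < 0.5 else w for l, w in zip(leader, worst)]
        if random.random() < 0.2:
            child[random.randrange(len(child))] = random.randrange(NUM_PMS)
        if fitness(child) < fitness(worst):                 # keep only improvements
            plex[-1] = child
    population = [frog for plex in memeplexes for frog in plex]   # shuffle back

best = min(population, key=fitness)
print("best fitness:", fitness(best))
```

A real framework would also have to reflect the metrics reported above (migration time, energy, CPU utilization) in its fitness function; here they are collapsed into a crude overload-plus-active-PM score for brevity.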


2008 · Vol 45 (4) · pp. 922-939
Author(s): Devavrat Shah, John N. Tsitsiklis

We study the best achievable performance (in terms of the average queue size and delay) in a stochastic and dynamic version of the bin-packing problem. Items arrive to a queue according to a Poisson process with rate 2ρ, where ρ ∈ (0, 1). The item sizes are independent and identically distributed (i.i.d.) with a uniform distribution in [0, 1]. At each time unit, a single unit-size bin is available and can receive any of the queued items, as long as their total size does not exceed 1. Coffman and Stolyar (1999) and Gamarnik (2004) have established that there exist packing policies under which the average queue size is finite for every ρ ∈ (0, 1). In this paper we study the precise scaling of the average queue size, as a function of ρ, with emphasis on the critical regime where ρ approaches 1. Standard results on the probabilistic (but static) bin-packing problem can be readily applied to produce policies under which the queue size scales as O(h^2), where h = 1 / (1 - ρ), which raises the question of whether this is the best possible. We establish that the average queue size scales as Ω(h log h) under any policy. Furthermore, we provide an easily implementable policy, which packs at most two items per bin. Under that policy, the average queue size scales as O(h log^{3/2} h), which is nearly optimal. On the other hand, if we impose the additional requirement that any two items packed together must have near-complementary sizes (in a sense to be made precise), we show that the average queue size must scale as Θ(h^2).
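
A small discrete-time simulation helps make the model concrete. The Python sketch below generates Poisson(2ρ) arrivals with Uniform(0, 1) sizes in each time unit and fills the one available bin with at most two items using a simple "largest item plus best-fitting partner" rule. It mirrors the flavour of the two-item policies discussed above, but it is not the paper's nearly optimal construction, and the printed averages are only simulation estimates.

```python
# Illustrative simulation of the queueing model, not the paper's analysis.
# Each time unit, Poisson(2*rho) items with Uniform(0, 1) sizes join the queue
# and one unit-size bin departs, filled with at most two items: the largest
# queued item plus the largest remaining item that still fits with it. This is
# a simple stand-in policy, not the nearly optimal one from the paper.

import math
import random

def poisson(rng: random.Random, lam: float) -> int:
    """Knuth's method for sampling a Poisson(lam) random variable."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def average_queue_size(rho: float, horizon: int = 50_000, seed: int = 0) -> float:
    rng = random.Random(seed)
    queue = []                                  # sizes of waiting items
    total = 0
    for _ in range(horizon):
        for _ in range(poisson(rng, 2 * rho)):  # arrivals in this time unit
            queue.append(rng.random())
        if queue:                               # pack the single bin for this time unit
            queue.sort(reverse=True)
            first = queue.pop(0)                # largest queued item
            for i, size in enumerate(queue):
                if first + size <= 1.0:         # largest partner that still fits
                    queue.pop(i)
                    break
        total += len(queue)
    return total / horizon

if __name__ == "__main__":
    for rho in (0.8, 0.9, 0.95):
        print(f"rho = {rho}: average queue size ~ {average_queue_size(rho):.1f}")
```

Comparing the measured averages against h = 1 / (1 - ρ) for a few values of ρ gives a rough feel for how far such a simple rule sits from the Ω(h log h) lower bound.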


1988 · Vol 11 (1) · pp. 1-19
Author(s): Andrzej Rowicki

The purpose of the paper is to present an algorithm for preemptive scheduling on two-processor systems with identical processors. Computations submitted to the system consist of dependent tasks with arbitrary execution times; the task graph contains no loops and has exactly one output. We assume that preemption times are completely unconstrained and that preemptions consume no time. The algorithm also determines the total execution time of the computation. It has been proved that the algorithm is optimal, that is, the total execution time of the computation (the schedule length) is minimized.
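
The abstract does not describe the algorithm itself, so the sketch below is only a discretized version of the classical level rule (in the style of Muntz and Coffman) for two identical processors: at each small time step, run the two ready tasks whose remaining longest path to the single output task is largest. The example task graph and execution times are hypothetical, and the fixed time step only approximates the processor sharing that the exact rule uses.

```python
# Hedged sketch: a discretized level-scheduling rule for two identical
# processors, not the algorithm from the paper. At every step of length dt,
# the two ready tasks with the largest remaining longest path to the single
# output task are executed. The DAG and execution times are hypothetical.

from collections import defaultdict

EXEC_TIME = {"a": 3.0, "b": 2.0, "c": 2.5, "d": 1.0, "out": 2.0}
SUCC = {"a": ["c", "d"], "b": ["d"], "c": ["out"], "d": ["out"], "out": []}
PRED = defaultdict(list)
for task, successors in SUCC.items():
    for s in successors:
        PRED[s].append(task)

def level(task, remaining):
    """Remaining work on the longest path from `task` to the output, inclusive."""
    tail = max((level(s, remaining) for s in SUCC[task]), default=0.0)
    return remaining[task] + tail

def schedule_length(dt: float = 0.01) -> float:
    remaining = dict(EXEC_TIME)
    t = 0.0
    while any(r > 1e-9 for r in remaining.values()):
        ready = [u for u, r in remaining.items()
                 if r > 1e-9 and all(remaining[p] <= 1e-9 for p in PRED[u])]
        ready.sort(key=lambda u: level(u, remaining), reverse=True)
        for u in ready[:2]:                       # two identical processors
            remaining[u] = max(0.0, remaining[u] - dt)
        t += dt
    return t

print("approximate schedule length:", round(schedule_length(), 2))
```

For this small graph the rule attains the critical-path lower bound of 7.5 time units, but in general the discretization is only an approximation, and the optimality claim above belongs to the paper's exact algorithm, not to this sketch.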

