Increasing the Efficiency of the DaCS Programming Model for Heterogeneous Systems

Author(s):  
Maciej Cytowski ◽  
Marek Niezgódka
2009 ◽  
Vol 17 (1-2) ◽  
pp. 59-76 ◽  
Author(s):  
Alejandro Rico ◽  
Alex Ramirez ◽  
Mateo Valero

There is a clear industrial trend towards chip multiprocessors (CMP) as the most power-efficient way of further increasing performance. Heterogeneous CMP architectures take one more step along this power-efficiency trend by using multiple types of processors, tailored to the workloads they execute. Programming these CMP architectures has been identified as one of the main challenges in the near future, and programming heterogeneous systems is even more challenging. High-level programming models that allow the programmer to identify parallel tasks, and whose runtimes manage the inter-task dependencies, have been identified as a suitable model for programming such heterogeneous CMP architectures. In this paper we analyze the performance of Cell Superscalar, a task-based programming model for the Cell Broadband Engine Architecture, in terms of its scalability to a higher number of on-chip processors. Our results show that the low performance of the PPE component limits the scalability of some applications to fewer than 16 processors. Since the PPE has been identified as the limiting element, we perform a set of simulation studies evaluating the impact of out-of-order execution, branch prediction and larger caches on the task management overhead. We conclude that out-of-order execution is a very desirable feature, since it increases task management performance by 50%. We also identify memory latency as a fundamental performance factor, even though the working set is not that large. We expect a significant performance improvement if task management were to run using a fast private memory to store the task dependency graph instead of relying on the cache hierarchy.
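Cell Superscalar's own annotations are not reproduced here; purely as a hedged illustration of the style the abstract describes (programmer-identified tasks with runtime-managed inter-task dependencies), the sketch below expresses the same idea using standard OpenMP 4.0 task depend clauses. The functions produce, scale and reduce are invented placeholders.

    #include <cstdio>

    static void produce(float *x, int n) { for (int i = 0; i < n; ++i) x[i] = static_cast<float>(i); }
    static void scale(const float *x, float *y, int n) { for (int i = 0; i < n; ++i) y[i] = 2.0f * x[i]; }
    static void reduce(const float *y, float *sum, int n) { *sum = 0.0f; for (int i = 0; i < n; ++i) *sum += y[i]; }

    int main() {
        const int N = 1024;
        float x[N], y[N];
        float sum = 0.0f;

        #pragma omp parallel
        #pragma omp single
        {
            // The runtime derives a dependency graph from the in/out annotations,
            // analogous to the task graph maintained by the Cell Superscalar runtime.
            #pragma omp task depend(out: x) shared(x)
            produce(x, N);

            #pragma omp task depend(in: x) depend(out: y) shared(x, y)
            scale(x, y, N);

            #pragma omp task depend(in: y) shared(y, sum)
            reduce(y, &sum, N);
        }
        std::printf("sum = %f\n", sum);  // all tasks have completed at the end of the parallel region
        return 0;
    }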


2015 ◽  
Vol 2015 ◽  
pp. 1-15 ◽  
Author(s):  
Rengan Xu ◽  
Xiaonan Tian ◽  
Sunita Chandrasekaran ◽  
Barbara Chapman

Existing studies show that using a single GPU can lead to significant performance gains. Further speedup should be achievable by using more than one GPU. Heterogeneous processors consisting of multiple CPUs and GPUs offer immense potential and are often considered a leading candidate for porting complex scientific applications. Unfortunately, programming heterogeneous systems requires more effort than programming traditional multicore systems. Directive-based programming approaches are being widely adopted since they make it easy to use, port, and maintain application code. OpenMP and OpenACC are two popular models used to port applications to accelerators. However, neither of the models provides support for multiple GPUs. A plausible solution is to combine OpenMP and OpenACC into a hybrid model; however, building this model has its own limitations due to the lack of necessary compiler support. Moreover, the model also lacks support for direct device-to-device communication. To overcome these limitations, an alternate strategy is to extend OpenACC by proposing and developing extensions that follow a task-based implementation for supporting multiple GPUs. We critically analyze the applicability of the hybrid-model approach, evaluate the proposed strategy using several case studies, and demonstrate its effectiveness.
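As a hedged sketch of the hybrid OpenMP + OpenACC model the abstract critiques (one host thread per GPU, each driving an OpenACC region on its own device), the example below assumes an OpenACC-capable compiler such as NVHPC and uses the vendor-specific device type acc_device_nvidia; the block-wise work partitioning is a simple illustrative choice, not the authors' scheme.

    #include <openacc.h>
    #include <omp.h>
    #include <cstdio>

    int main() {
        const int N = 1 << 20;
        static float a[N], b[N];
        for (int i = 0; i < N; ++i) a[i] = static_cast<float>(i);

        int ngpus = acc_get_num_devices(acc_device_nvidia);   // vendor-specific device type
        if (ngpus < 1) ngpus = 1;                              // fall back gracefully if no GPU is found
        const int chunk = (N + ngpus - 1) / ngpus;

        #pragma omp parallel num_threads(ngpus)
        {
            const int tid   = omp_get_thread_num();
            const int begin = tid * chunk;
            const int end   = (begin + chunk < N) ? begin + chunk : N;

            if (begin < end) {
                acc_set_device_num(tid, acc_device_nvidia);    // bind this host thread to GPU number tid

                #pragma acc parallel loop copyin(a[begin:end-begin]) copyout(b[begin:end-begin])
                for (int i = begin; i < end; ++i)
                    b[i] = 2.0f * a[i];
            }
        }

        std::printf("b[N-1] = %f\n", b[N - 1]);
        return 0;
    }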


2012 ◽  
Vol 2012 ◽  
pp. 1-11 ◽  
Author(s):  
Ghislain Roquier ◽  
Endri Bezati ◽  
Marco Mattavelli

The new generation of multicore processors and reconfigurable hardware platforms provides a dramatic increase in the available parallelism and processing capabilities. However, one obstacle to exploiting the full promise of such platforms is deeply rooted in sequential thinking. The sequential programming model does not naturally expose the potential parallelism needed to build parallel applications that can be efficiently mapped onto different kinds of platforms. A paradigm shift is necessary at all levels of application development to yield portable and scalable implementations on the widest range of heterogeneous platforms. This paper presents a design flow for the hardware and software synthesis of heterogeneous systems: from a single high-level, dataflow-based description of the application, it automatically generates hardware and software components, as well as the appropriate interfaces, for heterogeneous architectures composed of reconfigurable hardware units and multicore processors. Experimental results based on the implementation of several video coding algorithms on heterogeneous platforms are also provided to show the effectiveness of the approach, both in terms of portability and scalability.
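The paper's own flow starts from a dataflow-language description of the application; the C++ sketch below is only an invented, minimal illustration of that programming style, in which independent actors communicate solely through FIFO channels, so the same network could in principle be mapped to software threads or synthesized to hardware. All class names are hypothetical.

    #include <cstdio>
    #include <queue>

    using Fifo = std::queue<int>;

    struct Actor {
        virtual bool fire() = 0;   // consume/produce tokens if enough are available
        virtual ~Actor() = default;
    };

    struct Source : Actor {
        Fifo &out; int next = 0, limit;
        Source(Fifo &o, int n) : out(o), limit(n) {}
        bool fire() override {
            if (next >= limit) return false;
            out.push(next++);                     // produce one token
            return true;
        }
    };

    struct Scale : Actor {
        Fifo &in, &out;
        Scale(Fifo &i, Fifo &o) : in(i), out(o) {}
        bool fire() override {
            if (in.empty()) return false;
            out.push(2 * in.front()); in.pop();   // one token in, one token out
            return true;
        }
    };

    struct Sink : Actor {
        Fifo &in;
        explicit Sink(Fifo &i) : in(i) {}
        bool fire() override {
            if (in.empty()) return false;
            std::printf("%d\n", in.front()); in.pop();
            return true;
        }
    };

    int main() {
        Fifo q1, q2;
        Source src(q1, 5); Scale sc(q1, q2); Sink snk(q2);
        Actor *network[] = { &src, &sc, &snk };
        bool progress = true;
        while (progress) {                        // naive round-robin scheduler
            progress = false;
            for (Actor *a : network) progress |= a->fire();
        }
        return 0;
    }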


Author(s):  
Vinoth Krishnan Elangovan ◽  
Rosa M. Badia ◽  
Eduard Ayguade Parra

2021 ◽  
Vol 18 (4) ◽  
pp. 1-25
Author(s):  
Paul Metzger ◽  
Volker Seeker ◽  
Christian Fensch ◽  
Murray Cole

Existing OS techniques for homogeneous many-core systems make it simple for single- and multithreaded applications to migrate between cores. Heterogeneous systems do not benefit so fully from this flexibility, and applications that cannot migrate in mid-execution may lose potential performance. The situation is particularly challenging when a switch of language runtime would be desirable in conjunction with a migration. We present a case study in making heterogeneous CPU + GPU systems more flexible in this respect. Our technique for fine-grained application migration allows switches between OpenMP, OpenCL, and CUDA execution, in conjunction with migrations from GPU to CPU and from CPU to GPU. To achieve this, we subdivide iteration spaces into slices and consider migration on a slice-by-slice basis. We show that slice sizes can be learned offline by machine learning models. To further improve performance, memory transfers are made migration-aware. The complexity of the migration capability is hidden from programmers behind a high-level programming model. We present a detailed evaluation of our mid-kernel migration mechanism with the First Come, First Served scheduling policy. We compare our technique in a focused evaluation scenario against idealized kernel-by-kernel scheduling, which is typical for current systems and makes perfect kernel-to-device scheduling decisions, but cannot migrate kernels mid-execution. Models show that up to a 1.33× speedup can be achieved over these systems by adding fine-grained migration. Our experimental results with all nine applicable SHOC and Rodinia benchmarks achieve speedups of up to 1.30× (1.08× on average) over an implementation of a perfect but migration-incapable scheduler when kernels are migrated to a faster device. Our mechanism and slice size choices introduce an average slowdown of only 2.44% if kernels never migrate. Lastly, our programming model reduces the code size by at least 88% compared to manual implementations of migratable kernels.
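A minimal sketch of the slice-by-slice idea, not the authors' implementation: the iteration space is cut into slices, and between slices a scheduler may move the remaining work to another device. The functions run_on_cpu, run_on_gpu and device_hint are invented stand-ins for the real OpenMP/OpenCL/CUDA kernels and for the learned scheduling policy.

    #include <cstdio>
    #include <vector>

    enum class Device { CPU, GPU };

    // Stand-ins for the real kernels, each operating on the half-open slice [begin, end).
    static void run_on_cpu(std::vector<float> &v, size_t begin, size_t end) {
        for (size_t i = begin; i < end; ++i) v[i] = 2.0f * static_cast<float>(i);
    }
    static void run_on_gpu(std::vector<float> &v, size_t begin, size_t end) {
        // In the real system this would launch a CUDA/OpenCL kernel on the slice.
        for (size_t i = begin; i < end; ++i) v[i] = 2.0f * static_cast<float>(i);
    }

    // Hypothetical scheduler decision; here we simply pretend a migration to the
    // CPU happens after the fourth slice.
    static Device device_hint(size_t slice_index) {
        return slice_index < 4 ? Device::GPU : Device::CPU;
    }

    int main() {
        const size_t n = 1 << 20, slice = 1 << 17;   // slice size would be chosen by the offline model
        std::vector<float> data(n);

        for (size_t begin = 0, s = 0; begin < n; begin += slice, ++s) {
            const size_t end = (begin + slice < n) ? begin + slice : n;
            if (device_hint(s) == Device::GPU)
                run_on_gpu(data, begin, end);        // before the migration point
            else
                run_on_cpu(data, begin, end);        // remaining slices after mid-kernel migration
        }
        std::printf("data[n-1] = %f\n", data[n - 1]);
        return 0;
    }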


Electronics ◽  
2021 ◽  
Vol 10 (19) ◽  
pp. 2386
Author(s):  
Raúl Nozal ◽  
Jose Luis Bosque

Heterogeneous systems are the core architecture of most computing systems, from high-performance computing nodes to embedded devices, due to their excellent performance and energy efficiency. Efficiently programming these systems has become a major challenge due to the complexity of their architectures and the effort required to provide them with co-execution capabilities that applications can fully exploit. There are many proposals to simplify the programming and management of acceleration devices and multi-core CPUs. However, in many cases, portability and ease of use compromise the efficiency of the different devices, even more so when co-executing. Intel oneAPI, a new and powerful standards-based unified programming model built on top of SYCL, addresses these issues. In this paper, oneAPI is provided with co-execution strategies to run the same kernel across different devices, enabling the exploitation of static and dynamic policies. This work evaluates the performance and energy efficiency of a well-known set of regular and irregular HPC benchmarks, using two heterogeneous systems composed of an integrated GPU and a CPU. Static and dynamic load balancers are integrated and evaluated, highlighting single and co-execution strategies and the most significant key points of this promising technology. Experimental results show that co-execution is worthwhile when using dynamic algorithms, and that efficiency improves even further when using unified shared memory.
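As a hedged sketch of what a static co-execution split might look like in oneAPI (not the paper's load balancers), the SYCL code below divides one kernel's range between a CPU queue and a GPU queue. It assumes the SYCL runtime exposes both devices, and the 30/70 split ratio is an arbitrary placeholder for what a static policy might choose.

    #include <sycl/sycl.hpp>
    #include <cstdio>
    #include <vector>

    int main() {
        const size_t n = 1 << 20;
        std::vector<float> out(n, 0.0f);
        const size_t cpu_share = n * 3 / 10;      // placeholder static split: 30% CPU, 70% GPU

        sycl::queue cpu_q{sycl::cpu_selector_v};
        sycl::queue gpu_q{sycl::gpu_selector_v};

        {   // each buffer wraps its part of `out` and writes back on destruction
            sycl::buffer<float> cpu_buf(out.data(), sycl::range<1>(cpu_share));
            sycl::buffer<float> gpu_buf(out.data() + cpu_share, sycl::range<1>(n - cpu_share));

            cpu_q.submit([&](sycl::handler &cgh) {
                sycl::accessor acc{cpu_buf, cgh, sycl::write_only};
                cgh.parallel_for(sycl::range<1>(cpu_share), [=](sycl::id<1> i) {
                    acc[i] = 2.0f * static_cast<float>(i[0]);
                });
            });
            gpu_q.submit([&](sycl::handler &cgh) {
                sycl::accessor acc{gpu_buf, cgh, sycl::write_only};
                const size_t offset = cpu_share;
                cgh.parallel_for(sycl::range<1>(n - cpu_share), [=](sycl::id<1> i) {
                    acc[i] = 2.0f * static_cast<float>(i[0] + offset);
                });
            });
        }   // both queues are synchronized here through buffer destruction

        std::printf("out[0]=%f out[n-1]=%f\n", out[0], out[n - 1]);
        return 0;
    }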


1998 ◽  
Vol 37 (04/05) ◽  
pp. 518-526 ◽  
Author(s):  
D. Sauquet ◽  
M.-C. Jaulent ◽  
E. Zapletal ◽  
M. Lavril ◽  
P. Degoulet

Rapid development of community health information networks raises the issue of semantic interoperability between distributed and heterogeneous systems. Indeed, operational health information systems originate from heterogeneous teams of independent developers and have to cooperate in order to exchange data and services. Good cooperation is based on a good understanding of the messages exchanged between the systems. The main issue of semantic interoperability is to ensure that the exchange is not only possible but also meaningful. The main objective of this paper is to analyze semantic interoperability from a software engineering point of view. It describes the principles for the design of a semantic mediator (SM) in the framework of a distributed object manager (DOM). The mediator is itself a component that should allow the exchange of messages independently of languages and platforms. The functional architecture of such an SM is detailed. These principles have been partly applied in the context of the HELIOS object-oriented software engineering environment. The resulting service components are presented with their current state of achievement.
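The paper gives no code; purely as an invented illustration of the mediator idea, the sketch below models a semantic mediator as a component that maps message codes from a sender's coding scheme to a receiver's scheme, so that an exchange is rejected when no meaningful translation exists. All type, method and code names are hypothetical.

    #include <iostream>
    #include <map>
    #include <optional>
    #include <string>
    #include <tuple>

    // A message as exchanged between systems: a code, its coding scheme, and a payload.
    struct Message {
        std::string scheme;
        std::string code;
        std::string payload;
    };

    // The mediator translates codes between schemes so the exchange stays meaningful.
    class SemanticMediator {
      public:
        void addMapping(const std::string &fromScheme, const std::string &fromCode,
                        const std::string &toScheme, const std::string &toCode) {
            table_[{fromScheme, fromCode, toScheme}] = toCode;
        }

        std::optional<Message> translate(const Message &m, const std::string &targetScheme) const {
            auto it = table_.find({m.scheme, m.code, targetScheme});
            if (it == table_.end()) return std::nullopt;   // no meaningful translation known
            return Message{targetScheme, it->second, m.payload};
        }

      private:
        std::map<std::tuple<std::string, std::string, std::string>, std::string> table_;
    };

    int main() {
        SemanticMediator sm;
        sm.addMapping("SYSTEM-A", "GLUCOSE", "SYSTEM-B", "GLU-SER");   // invented codes
        Message in{"SYSTEM-A", "GLUCOSE", "5.4 mmol/L"};
        if (auto out = sm.translate(in, "SYSTEM-B"))
            std::cout << out->scheme << ":" << out->code << " " << out->payload << "\n";
        return 0;
    }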


2020 ◽  
Vol 6 (1) ◽  
Author(s):  
S Mohd Baki ◽  
Jack Kie Cheng

Production planning is often challenging for small and medium enterprises (SMEs). Most SMEs have difficulty determining the optimal level of production output, which can affect their business performance. Product mix optimization is one of the main keys to production planning. Many companies have used linear programming models to determine the optimal combination of products to produce in order to maximize profit. Thus, this study aims at profit maximization for an SME company in Malaysia using a linear programming model. The purposes of this study are to identify the current process in the production line and to formulate a linear programming model that suggests a viable product mix to ensure optimum profitability for the company. ABC Sdn Bhd is selected as the case study company for the product mix profit maximization study. Some conclusive observations have been drawn and recommendations have been suggested. This study will provide the company and other companies, particularly in Malaysia, with exposure to the linear programming method for making decisions that determine the maximum profit for different product mixes.
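For readers unfamiliar with the method, a generic product-mix linear program can be written as follows (the symbols are generic placeholders, not the case study's data):

\[
\begin{aligned}
\max \quad & Z = \sum_{j=1}^{n} p_j x_j \\
\text{s.t.} \quad & \sum_{j=1}^{n} a_{ij} x_j \le b_i, \qquad i = 1,\dots,m, \\
& x_j \ge 0, \qquad j = 1,\dots,n,
\end{aligned}
\]

where x_j is the quantity of product j to produce, p_j its unit profit, a_{ij} the amount of resource i consumed per unit of product j, and b_i the available capacity of resource i.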

