Straightforward Heterogeneous Computing with the oneAPI Coexecutor Runtime

Raúl Nozal; Jose Luis Bosque

doi:10.3390/electronics10192386

Electronics ◽

10.3390/electronics10192386 ◽

2021 ◽

Vol 10 (19) ◽

pp. 2386

Author(s):

Raúl Nozal ◽

Jose Luis Bosque

Keyword(s):

Energy Efficiency ◽

High Performance ◽

Heterogeneous Computing ◽

Programming Model ◽

Heterogeneous Systems ◽

Ease Of Use ◽

Embedded Devices ◽

Computing Systems ◽

Key Points ◽

Integrated Gpu

Heterogeneous systems are the core architecture of most computing systems, from high-performance computing nodes to embedded devices, due to their excellent performance and energy efficiency. Efficiently programming these systems has become a major challenge due to the complexity of their architectures and the effort required to provide them with co-execution capabilities that applications can fully exploit. There are many proposals to simplify the programming and management of acceleration devices and multi-core CPUs. However, in many cases, portability and ease of use compromise the efficiency of different devices, even more so when co-executing. Intel oneAPI, a new and powerful standards-based unified programming model built on top of SYCL, addresses these issues. In this paper, oneAPI is provided with co-execution strategies to run the same kernel across different devices, enabling the exploitation of static and dynamic policies. This work evaluates the performance and energy efficiency for a well-known set of regular and irregular HPC benchmarks, using two heterogeneous systems composed of an integrated GPU and CPU. Static and dynamic load balancers are integrated and evaluated, highlighting single and co-execution strategies and the most significant key points of this promising technology. Experimental results show that co-execution is worthwhile when using dynamic algorithms and improves the efficiency even further when using unified shared memory.
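The difference between the static and dynamic policies mentioned in this abstract can be illustrated with a small simulation. This is a minimal sketch in Python, not the paper's runtime or the oneAPI/SYCL API: the function names, the device names, and the per-device speeds are all hypothetical, and device speed is assumed constant.

```python
import heapq

def static_split(total_items, cpu_share):
    """Static policy: split a 1-D index range once, by a fixed proportion."""
    cpu_items = int(total_items * cpu_share)
    return {"cpu": (0, cpu_items), "gpu": (cpu_items, total_items)}

def dynamic_split(total_items, chunk_size, speeds):
    """Dynamic policy: give the next chunk to whichever device is idle first.

    speeds maps a device name to items processed per time unit (assumed).
    Returns (assignments, makespan).
    """
    # Priority queue of (time at which the device becomes free, device name).
    ready = [(0.0, name) for name in sorted(speeds)]
    heapq.heapify(ready)
    assignments = {name: [] for name in speeds}
    start = 0
    finish = 0.0
    while start < total_items:
        free_at, name = heapq.heappop(ready)
        end = min(start + chunk_size, total_items)
        assignments[name].append((start, end))
        done_at = free_at + (end - start) / speeds[name]
        finish = max(finish, done_at)
        heapq.heappush(ready, (done_at, name))
        start = end
    return assignments, finish

# Hypothetical speeds: the GPU is assumed to be four times faster than the CPU.
speeds = {"cpu": 1.0, "gpu": 4.0}
static_plan = static_split(1000, 0.25)
dynamic_plan, dynamic_time = dynamic_split(1000, 100, speeds)
```

With these assumed speeds, the dynamic policy ends up giving the CPU 200 items and the GPU 800, and both finish at 200 time units. A static split chosen without knowing the speeds, such as 50/50, would leave the CPU working for 500 time units while the GPU sits idle, which is the imbalance a dynamic balancer is meant to avoid.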

Mobile Platform Challenges in Interactive Computer Vision

Advances in Computational Intelligence and Robotics - Multi-Core Computer Vision and Image Processing for Intelligent Applications ◽

10.4018/978-1-5225-0889-2.ch002 ◽

2017 ◽

pp. 47-73

Author(s):

Miguel Bordallo López

Keyword(s):

Computer Vision ◽

Energy Efficiency ◽

User Interfaces ◽

High Performance ◽

Heterogeneous Computing ◽

Low Power Consumption ◽

Mobile Environment ◽

Trade Off ◽

And Performance ◽

Important Design

Computer vision can be used to increase the interactivity of existing and new camera-based applications. It can be used to build novel interaction methods and user interfaces. The computing and sensing needs of this kind of application require a careful balance between quality and performance, a practical trade-off. This chapter shows the importance of using all the available resources to hide application latency and maximize computational throughput. The experience gained during the development of interactive applications is used to characterize the constraints imposed by the mobile environment, discussing the most important design goals: high performance and low power consumption. In addition, this chapter discusses the use of heterogeneous computing via asymmetric multiprocessing to improve the throughput and energy efficiency of interactive vision-based applications.

Apache Nemo: A Framework for Optimizing Distributed Data Processing

ACM Transactions on Computer Systems ◽

10.1145/3468144 ◽

2020 ◽

Vol 38 (3-4) ◽

pp. 1-31

Author(s):

Won Wook Song ◽

Youngseok Yang ◽

Jeongyoon Eo ◽

Jangho Seo ◽

Joo Yeon Kim ◽

...

Keyword(s):

Data Processing ◽

High Performance ◽

Programming Model ◽

Compiler Optimization ◽

Ease Of Use ◽

Distributed Data ◽

Performance Improvements ◽

Distributed Data Processing ◽

Fine Control ◽

High Level

Optimizing scheduling and communication of distributed data processing for resource and data characteristics is crucial for achieving high performance. Existing approaches to such optimizations largely fall into two categories. First, distributed runtimes provide low-level policy interfaces to apply the optimizations, but do not ensure the maintenance of correct application semantics and thus often require significant effort to use. Second, policy interfaces that extend a high-level application programming model ensure correctness, but do not provide sufficient fine control. We describe Apache Nemo, an optimization framework for distributed dataflow processing that provides fine control for high performance and also ensures correctness for ease of use. We combine several techniques to achieve this, including an intermediate representation of dataflow, compiler optimization passes, and runtime extensions. Our evaluation results show that Nemo enables composable and reusable optimizations that bring performance improvements on par with existing specialized runtimes tailored for a specific deployment scenario. Apache Nemo is open-sourced at https://nemo.apache.org as an Apache incubator project.

Artificial Intelligence: An Energy Efficiency Tool for Enhanced High performance computing

Symmetry ◽

10.3390/sym12061029 ◽

2020 ◽

Vol 12 (6) ◽

pp. 1029

Author(s):

Anabi Hilary Kelechi ◽

Mohammed H. Alsharif ◽

Okpe Jonah Bameyi ◽

Paul Joan Ezra ◽

Iorshase Kator Joseph ◽

...

Keyword(s):

Artificial Intelligence ◽

Energy Efficiency ◽

High Performance Computing ◽

High Performance ◽

Large Data ◽

Computing Systems ◽

Computing Power ◽

Product Delivery ◽

High Level ◽

Performance Computing

Power-consuming entities such as high performance computing (HPC) sites and large data centers are growing with the advance of information technology. In business, HPC is used to shorten product delivery time, reduce production cost, and decrease the time it takes to develop a new product. Today's high level of computing power from supercomputers comes at the expense of consuming large amounts of electric power. To minimize the energy used by HPC entities, it is necessary to reduce both the energy required by the computing systems and the resources needed to operate them. System energy efficiency could be improved by sampling the power consumption of all components at regular intervals and storing the samples in a database. The information stored in the database will serve as input data for energy-efficiency optimization. In addition, device workload information and various usage metrics are stored in the database. There has been strong momentum in the area of artificial intelligence (AI) as a tool for optimization and process automation that leverages existing information. This paper discusses ideas for improving energy efficiency for HPC using AI.

MAGMA templates for scalable linear algebra on emerging architectures

The International Journal of High Performance Computing Applications ◽

10.1177/1094342020938421 ◽

2020 ◽

Vol 34 (6) ◽

pp. 645-658

Author(s):

Mohammed Al Farhan ◽

Ahmad Abdelfattah ◽

Stanimire Tomov ◽

Mark Gates ◽

Dalal Sukkari ◽

...

Keyword(s):

Linear Algebra ◽

High Performance ◽

Programming Model ◽

State Of The Art ◽

Ease Of Use ◽

Science And Engineering ◽

Engineering Applications ◽

Memory Hierarchies ◽

Data Movement ◽

Heterogeneous Node

With the acquisition and widespread use of more resources that rely on accelerator/wide vector–based computing, there has been a strong demand for science and engineering applications to take advantage of these latest assets. This, however, has been extremely challenging due to the diversity of systems to support their extreme concurrency, complex memory hierarchies, costly data movement, and heterogeneous node architectures. To address these challenges, we design a programming model and describe its ease of use in the development of a new MAGMA Templates library that delivers high-performance scalable linear algebra portable on current and emerging architectures. MAGMA Templates derives its performance and portability by (1) building on existing state-of-the-art linear algebra libraries, like MAGMA, SLATE, Trilinos, and vendor-optimized math libraries, and (2) providing access (seamlessly to the users) to the latest algorithms and architecture-specific optimizations through a single, easy-to-use C++-based API.

Characterizing Power and Energy Efficiency of Legion Data-Centric Runtime and Applications on Heterogeneous High-Performance Computing Systems

High Performance Parallel Computing ◽

10.5772/intechopen.81124 ◽

2019 ◽

Author(s):

Song Huang ◽

Song Fu ◽

Scott Pakin ◽

Michael Lang

Keyword(s):

Energy Efficiency ◽

High Performance Computing ◽

High Performance ◽

Computing Systems ◽

Power And Energy ◽

Performance Computing

A Composable Monitoring System for Heterogeneous Embedded Platforms

ACM Transactions on Embedded Computing Systems ◽

10.1145/3461647 ◽

2021 ◽

Vol 20 (5) ◽

pp. 1-34

Author(s):

Giacomo Valente ◽

Tiziana Fanni ◽

Carlo Sau ◽

Tania Di Mascio ◽

Luigi Pomante ◽

...

Keyword(s):

Monitoring System ◽

High Performance ◽

Heterogeneous Computing ◽

Monitoring Systems ◽

Embedded Devices ◽

Computing Platform ◽

System On Programmable Chip ◽

Xilinx Fpga ◽

Heterogeneous Computing Platform ◽

Hardware Monitoring

Advanced computations on embedded devices are nowadays a must in any application field. Often, to cope with such a need, embedded systems designers leverage complex heterogeneous reconfigurable platforms that offer high performance, thanks to the possibility of specializing/customizing some computing elements on board, and are usually flexible enough to be optimized at runtime. In this context, monitoring the system has gained increasing interest. Ideally, monitoring systems should be non-intrusive, serve several purposes, and provide aggregated information about the behavior of the different system components. However, the current literature is far from this ideal: for example, existing monitoring systems are not applicable to modern heterogeneous platforms. This work presents a hardware monitoring system that is intended to be minimally invasive on system performance and resources, composable, and capable of providing the user with homogeneous observability and transparent access to the different components of a heterogeneous computing platform, so that system metrics can be easily computed by aggregating the collected information. Building on previous work, this article focuses primarily on extending an existing hardware monitoring system to also cover specialized coprocessing units, and the assessment is done on a Xilinx FPGA-based System on Programmable Chip. Different explorations are presented to explain the level of customizability of the proposed hardware monitoring system, the tradeoffs available to the user, and the benefits with respect to the de facto standard monitoring support made available by the targeted FPGA vendor.

Scalable Energy Efficiency with Resilience for High Performance Computing Systems

ACM Transactions on Architecture and Code Optimization ◽

10.1145/2822893 ◽

2016 ◽

Vol 12 (4) ◽

pp. 1-27 ◽

Cited By ~ 4

Author(s):

Li Tan ◽

Zizhong Chen ◽

Shuaiwen Leon Song

Keyword(s):

Energy Efficiency ◽

High Performance Computing ◽

High Performance ◽

Computing Systems ◽

Performance Computing

DYNAMIC JOB SCHEDULING IN GRID COMPUTING

International Journal of Computer and Communication Technology ◽

10.47893/ijcct.2016.1364 ◽

2016 ◽

pp. 186-189

Author(s):

JANI KUNTESH KETAN ◽

ARPITA SHAH

Keyword(s):

Grid Computing ◽

High Performance ◽

Large Scale ◽

Heterogeneous Computing ◽

Ant Colony Algorithm ◽

Job Scheduling ◽

Heuristic Algorithms ◽

Optimal Schedule ◽

Heterogeneous Systems ◽

Sequential Method

Grid computing is growing rapidly in distributed heterogeneous systems as a way to use and share large-scale resources to solve complex scientific problems. Scheduling is an active research topic for achieving high performance in grid environments. It aims to find a suitable allocation of resources for each job. A typical problem that arises during this task is the scheduling decision itself: how to use processors effectively so as to minimize the tardiness of a job when it is scheduled. Scheduling jobs to resources in grid computing is complicated by the distributed and heterogeneous nature of the resources. The efficient scheduling of independent jobs in a heterogeneous computing environment is an important problem in domains such as grid computing. In general, finding an optimal schedule for such an environment using the traditional sequential method is an NP-hard problem, whereas heuristic approaches provide near-optimal solutions for complex problems. The ant colony algorithm, one of these heuristic algorithms, is well suited to the grid scheduling environment because it uses stigmergic communication.
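The idea of ant colony scheduling with stigmergic (pheromone-mediated) communication can be sketched as follows. This is a simplified illustration in Python, not the algorithm from the article: the function names, the parameter values, the makespan objective, and the assumption that every job's run time on every machine is known in advance are all choices made for the example.

```python
import random

def makespan(assignment, times):
    """Finish time of the busiest machine for a job -> machine assignment."""
    loads = [0.0] * len(times[0])
    for job, machine in enumerate(assignment):
        loads[machine] += times[job][machine]
    return max(loads)

def aco_schedule(times, ants=20, iterations=50, evaporation=0.1, seed=0):
    """times[j][m] is the (assumed known, positive) run time of job j on machine m."""
    rng = random.Random(seed)
    n_jobs, n_machines = len(times), len(times[0])
    # Pheromone trail per (job, machine) pair: the ants' shared memory.
    pheromone = [[1.0] * n_machines for _ in range(n_jobs)]
    best, best_span = None, float("inf")
    for _ in range(iterations):
        for _ in range(ants):
            # Each ant picks a machine per job, biased by pheromone and speed.
            assignment = []
            for j in range(n_jobs):
                weights = [pheromone[j][m] / times[j][m] for m in range(n_machines)]
                assignment.append(rng.choices(range(n_machines), weights=weights)[0])
            span = makespan(assignment, times)
            if span < best_span:
                best, best_span = assignment, span
        # Evaporate all trails, then reinforce the best schedule found so far.
        for j in range(n_jobs):
            for m in range(n_machines):
                pheromone[j][m] *= 1.0 - evaporation
            pheromone[j][best[j]] += 1.0 / best_span
    return best, best_span

# Hypothetical run times of 4 jobs on 2 heterogeneous machines.
times = [[2, 4], [3, 1], [5, 5], [1, 6]]
best, best_span = aco_schedule(times)
```

The ants never exchange schedules directly. They only read and update the shared pheromone table, so good job-to-machine choices become more likely in later iterations. The result is a heuristic schedule, with no guarantee of optimality.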

ENERGY CERTIFICATION AND ENERGY AUDIT OF HIGHER EDUCATION AS A METHODOLOGICAL TOOL TO IMPROVE THE ENERGY EFFICIENCY OF THE UNIVERSITY

Management ◽

10.30857/2415-3206.2021.2.2 ◽

2022 ◽

Vol 34 (2) ◽

pp. 18-25

Author(s):

Liudmyla Hanushchak-Yefimenko

Keyword(s):

Energy Efficiency ◽

High Performance ◽

Energy Performance ◽

Building Envelope ◽

Ease Of Use ◽

Smooth Transition ◽

Certification Program ◽

Energy Audit ◽

Local Climate ◽

University Buildings

BACKGROUND AND OBJECTIVES. Improving the energy performance of buildings is one of the least expensive ways to reduce energy consumption and greenhouse gas emissions. Building energy performance certification increases public knowledge about energy conservation and allows consumers and other decision makers to compare buildings based on their lifetime performance. In addition, energy performance certifications are an incentive for owners to improve the efficiency of existing buildings. METHODS. It is proposed that the energy certification and energy audit of university buildings include the collection and evaluation of basic information (including the local climate, method of use, thermal conductivity coefficient, building envelope area, and orientation) to determine the energy efficiency level of the building on a generally accepted scale. The energy efficiency certificate should take into account the calculated results of the assessment of the building's energy performance. FINDINGS. It is suggested that the results of the energy certification of university buildings be presented in a simple, clear form to ensure clarity, ease of use, and comparability. For the energy certification of university buildings, a comparative labeling from A to G is proposed. The scale, on which the current national building standard is at "C," provides ample room for improving the rating of both new and existing buildings. If necessary, the scale should be expanded with labels such as A1, A2, or A+, A++ for high-performance buildings. CONCLUSION. Accurate and reliable energy performance certification is a necessary foundation that will help ensure consumer confidence and the success of the certification program. The certification program must be clearly coordinated to ensure a smooth transition of the construction industry to the new rules.

Exploiting Heterogeneous Computing Systems for Energy Efficiency

Handbook of Energy-Aware and Green Computing, Volume 2 ◽

10.1201/b11640-15 ◽

2013 ◽

pp. 215-232

Keyword(s):

Energy Efficiency ◽

Heterogeneous Computing ◽

Computing Systems ◽

Heterogeneous Computing Systems
