HipaccVX: wedding of OpenVX and DSL-based code generation

Writing high-performance programs for heterogeneous platforms is hard because the code must be tuned at a low level with architecture-specific optimizations, which are often based on fundamentally different programming paradigms and languages. OpenVX promises to solve this issue for computer vision applications with a royalty-free industry standard based on a graph-execution model. Yet OpenVX's algorithm space is constrained to a small set of vision functions, which hinders accelerating computations that are not included in the standard. In this paper, we analyze OpenVX vision functions to find an orthogonal set of computational abstractions. Based on these abstractions, we couple an existing domain-specific language (DSL) back end to the OpenVX environment and provide language constructs to the programmer for the definition of user-defined nodes. In this way, we enable optimizations that OpenVX graph implementations cannot detect when only the standard computer vision functions are used. These optimizations can double the throughput on an Nvidia GTX GPU and decrease the resource usage of a Xilinx Zynq FPGA by 50% for our benchmarks. Finally, we show that our proposed compiler framework, called HipaccVX, can achieve better results than the state-of-the-art approaches Nvidia VisionWorks and Halide-HLS.

Application of the CHARM++ software model as a target platform for a domain-specific language compiler for the analysis of static graphs

Numerical Methods and Programming (Vychislitel'nye Metody i Programmirovanie) ◽

10.26089/nummet.v18r208 ◽

2017 ◽

pp. 103-114

Author(s):

А.С. Фролов

Keyword(s):

Code Generation ◽

Programming Model ◽

Graph Representation ◽

Connected Components ◽

Domain Specific Language ◽

Specific Language ◽

Domain Specific ◽

Software Model ◽

Execution Model ◽

Target Platform

The implementation of a code generation module targeting the Charm++ framework in the compiler of the domain-specific language (DSL) Green-Marl is presented. Green-Marl is designed for developing parallel algorithms for static graph analysis and adopts an imperative shared-memory programming model, whereas Charm++ implements a message-driven execution model. The graph representation in the generated Charm++ code and the translation of the basic Green-Marl constructs to parallel Charm++ code are described. An evaluation using typical graph problems, namely Single-Source Shortest Path (SSSP), Connected Components (CC), and vertex ranking with the PageRank algorithm, shows that the performance of Green-Marl programs translated to Charm++ is on par with hand-written Charm++ implementations.
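
The difference between the two models named in this abstract can be illustrated with a minimal Python sketch of Connected Components written in a message-driven style. This is neither Green-Marl compiler output nor Charm++ code: the function name `connected_components`, the single `mailbox` queue, and the label-propagation scheme are assumptions chosen only to show the idea that a vertex reacts to incoming messages instead of reading shared state.

```python
from collections import deque


def connected_components(num_vertices, edges):
    """Label every vertex with the smallest vertex id in its component.

    Hypothetical illustration: vertices never read each other's state
    directly; they only react to (target, proposed_label) messages.
    """
    neighbors = [[] for _ in range(num_vertices)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    # Each vertex starts in its own component.
    label = list(range(num_vertices))

    # Every vertex first announces its own label to its neighbors.
    mailbox = deque(
        (v, label[u]) for u in range(num_vertices) for v in neighbors[u]
    )

    # Message handler loop: adopt a smaller label and forward it.
    while mailbox:
        target, proposed = mailbox.popleft()
        if proposed < label[target]:
            label[target] = proposed
            for v in neighbors[target]:
                mailbox.append((v, proposed))
    return label
```

For example, `connected_components(5, [(0, 1), (1, 2), (3, 4)])` returns `[0, 0, 0, 3, 3]`. In a real message-driven runtime the mailbox would be distributed across processing elements; the single queue here only stands in for that.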

Domain‐specific virtual processors as a portable programming and execution model for parallel computational workloads on modern heterogeneous high‐performance computing architectures

International Journal of Quantum Chemistry ◽

10.1002/qua.25926 ◽

2019 ◽

Vol 119 (12) ◽

pp. e25926 ◽

Cited By ~ 2

Author(s):

Dmitry I. Lyakh

Keyword(s):

High Performance Computing ◽

High Performance ◽

Domain Specific ◽

Execution Model ◽

Performance Computing

Modeling and Execution of Floating Point Parallel Processing Operation for RISC Processor

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.c6203.029320 ◽

2020 ◽

Vol 9 (3) ◽

pp. 3783-3789

Keyword(s):

Video Processing ◽

High Speed ◽

High Performance ◽

Floating Point ◽

Double Precision ◽

Risc Processor ◽

Small Set ◽

Reduced Instruction Set Computer ◽

Definition Of ◽

Instruction Format

Various suggestions have been made regarding a precise definition of RISC, but the general concept is that such a computer has a small set of simple, general instructions instead of a large set of complex, specialized instructions. This project proposes the design of a high-speed 64-bit RISC processor. The processor is designed to consume less power and to operate at high speed. It comprises three sections: Instruction Fetch, Instruction Decode, and Execution. The ALU within the Execution section includes a double-precision floating-point multiplier designed with a parallel architecture, which improves the speed and accuracy of execution. All sections are designed in Verilog. A uniform instruction format, identical general-purpose registers, and simple addressing modes are its other features. RISC stands for Reduced Instruction Set Computer and is considered the basis for designing high-performance processors. A RISC processor has a reduced number of instructions, a fixed instruction length, more general-purpose registers organized in a register file, a load-store architecture, and simplified addressing modes, which make individual instructions execute faster and yield a net gain in performance. The main aim of this paper is therefore to improve accuracy while consuming less power and area with minimal delay, by replacing the single-precision section of the floating-point ALU with a double-precision section. Video processing, telecommunications, and image processing are the high-end applications targeted by the architecture.

Towards High-Performance Code Generation for Multi-GPU Clusters Based on a Domain-Specific Language for Algorithmic Skeletons

International Journal of Parallel Programming ◽

10.1007/s10766-020-00659-x ◽

2020 ◽

Vol 48 (4) ◽

pp. 713-728

Author(s):

Fabian Wrede ◽

Herbert Kuchen

Keyword(s):

Code Generation ◽

High Performance ◽

Domain Specific Language ◽

Algorithmic Skeletons ◽

Specific Language ◽

Domain Specific ◽

Gpu Clusters

Automatic Task-Based Code Generation for High Performance Domain Specific Embedded Language

International Journal of Parallel Programming ◽

10.1007/s10766-015-0354-9 ◽

2015 ◽

Vol 44 (3) ◽

pp. 449-465 ◽

Cited By ~ 5

Author(s):

Antoine Tran Tan ◽

Joel Falcou ◽

Daniel Etiemble ◽

Hartmut Kaiser

Keyword(s):

Code Generation ◽

High Performance ◽

Domain Specific

Brian 2: an intuitive and efficient neural simulator

10.1101/595710 ◽

2019 ◽

Cited By ~ 1

Author(s):

Marcel Stimberg ◽

Romain Brette ◽

Dan F. M. Goodman

Keyword(s):

Code Generation ◽

High Performance ◽

Computational Experiment ◽

Neural Models ◽

Low Level ◽

Domain Specific ◽

Pyloric Network ◽

Neuroscience Research ◽

High Level ◽

Limited Scope

To be maximally useful for neuroscience research, neural simulators must make it possible to define original models. This is especially important because a computational experiment might not only need descriptions of neurons and synapses, but also models of interactions with the environment (e.g. muscles), or the environment itself. To preserve high performance when defining new models, current simulators offer two options: low-level programming, or mark-up languages (and other domain-specific languages). The first option requires time and expertise, is prone to errors, and contributes to problems with reproducibility and replicability. The second option has limited scope, since it can only describe the range of neural models covered by the ontology. Other aspects of a computational experiment, such as the stimulation protocol, cannot be expressed within this framework. Brian 2 is a complete rewrite of Brian that addresses this issue by using runtime code generation with a procedural equation-oriented approach. Brian 2 enables scientists to write code that is particularly simple and concise, closely matching the way they conceptualise their models, while the technique of runtime code generation automatically transforms high-level descriptions of models into efficient low-level code tailored to different hardware (e.g. CPU or GPU). We illustrate it with several challenging examples: a plastic model of the pyloric network of crustaceans, a closed-loop sensorimotor model, programmatic exploration of a neuron model, and an auditory model with real-time input from a microphone.
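
The idea of runtime code generation from an equation-oriented description can be sketched in a few lines of Python. This is not Brian 2's API or its actual generator: the helper `make_euler_stepper`, its arguments, and the choice of forward Euler integration are assumptions made purely for illustration. The sketch turns the right-hand side of a differential equation, given as text, into Python source and compiles it while the program runs.

```python
import math


def make_euler_stepper(rhs_expr, state_var, params):
    """Generate and compile a forward-Euler update for
    d(state_var)/dt = rhs_expr.

    Hypothetical helper, not part of Brian 2. Returns the compiled
    function together with the generated source text.
    """
    # Signature of the generated function: state, time step, parameters.
    arg_list = ", ".join([state_var, "dt"] + sorted(params))
    source = (
        f"def step({arg_list}):\n"
        f"    return {state_var} + dt * ({rhs_expr})\n"
    )
    # Names the generated code is allowed to use (assumption: only exp).
    namespace = {"exp": math.exp}
    # exec runs arbitrary code, so rhs_expr must come from a trusted source.
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["step"], source


# Leaky decay dv/dt = -v / tau, written the way a modeller would state it.
step, generated_source = make_euler_stepper("-v / tau", "v", ["tau"])
v = 1.0
for _ in range(3):
    v = step(v, 0.1, 10.0)  # dt = 0.1, tau = 10.0
```

A real system would emit low-level code for the target hardware instead of Python; the sketch only shows the separation between the model description (a string) and the generated update code.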

Touch: A Textual Programming Language for Developing APPs of Insect Intelligent Building

Scientific Programming ◽

10.1155/2020/8887588 ◽

2020 ◽

Vol 2020 ◽

pp. 1-26

Author(s):

Wenjie Chen ◽

Qiliang Yang ◽

Ziyan Jiang ◽

Jianchun Xing ◽

Qianchuan Zhao ◽

...

Keyword(s):

Parallel Computing ◽

Code Generation ◽

Control Strategies ◽

Structural Features ◽

Intelligent Building ◽

Development Environment ◽

Domain Specific ◽

Complex Structural ◽

Definition Of ◽

High Level

Insect intelligent building (I2B) is a novel decentralized, flat-structured intelligent building platform with excellent flexibility and scalability. I2B allows users to develop applications that include control strategies for efficiently managing and controlling buildings. However, developing I2B APPs (applications) is considered a challenging and complex task due to the complex structural features and parallel computing models of the I2B platform. Existing studies have been shown to encounter difficulty in supporting a high degree of abstraction and in allowing users to define control scenarios in a concise and comprehensible way. This paper aims to facilitate the development of such applications and to reduce the programming difficulty. We propose Touch, a textual domain-specific language (DSL) that provides a high-level abstraction of I2B APPs. Specifically, we first establish the conceptual programming architecture of the I2B APP, making the application more intuitive by abstracting different levels of physical entities in I2B. Then, we present special language elements to effectively support the parallel computing model of the I2B platform and provide a formal definition of the concrete Touch syntax. We also implement supporting tools for Touch, including a development environment as well as target code generation. Finally, we present experimental results to demonstrate the effectiveness and efficiency of Touch.

Coding Productivity in MapReduce Applications for Distributed and Shared Memory Architectures

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194015710096 ◽

2015 ◽

Vol 25 (09n10) ◽

pp. 1739-1741

Author(s):

Daniel Adornes ◽

Dalvan Griebler ◽

Cleverson Ledur ◽

Luiz Gustavo Fernandes

Keyword(s):

Shared Memory ◽

Code Generation ◽

High Performance ◽

Domain Specific ◽

Significant Performance ◽

Memory Architectures ◽

Programming Interfaces ◽

Shared Memory Architectures ◽

Performance Computing

MapReduce was originally proposed as a suitable and efficient approach for analyzing and processing large amounts of data. Since then, many researchers have contributed MapReduce implementations for distributed and shared memory architectures. Nevertheless, different architectural levels require different optimization strategies in order to achieve high-performance computing. These strategies have in turn led to very different MapReduce programming interfaces across implementations. This paper presents some research notes on coding productivity when developing MapReduce applications for distributed and shared memory architectures. As a case study, we introduce our current research on a unified MapReduce domain-specific language with code generation for Hadoop and Phoenix++, which has achieved a coding productivity increase ranging from 41.84% to 94.71% without significant performance losses (below 3%) compared to those frameworks.
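
The programming model that such a unified interface has to capture can be shown with a minimal single-process Python sketch. It is not the DSL described in the abstract and uses neither Hadoop nor Phoenix++: the names `map_reduce`, `word_count_mapper`, and `word_count_reducer` are assumptions for illustration. The user supplies only a mapper and a reducer; the grouping step in between is what a framework or code generator would implement differently per architecture.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run a MapReduce job in a single process (illustrative only)."""
    # Map phase: each record yields zero or more (key, value) pairs,
    # which are grouped by key (the "shuffle" step).
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    # Reduce phase: fold the values collected for each key.
    return {key: reducer(key, values) for key, values in groups.items()}


def word_count_mapper(line):
    """Emit (word, 1) for every whitespace-separated word."""
    for word in line.lower().split():
        yield word, 1


def word_count_reducer(word, counts):
    """Sum the occurrences emitted for one word."""
    return sum(counts)


counts = map_reduce(["a b a", "b c"], word_count_mapper, word_count_reducer)
```

Here `counts` is `{"a": 2, "b": 2, "c": 1}`. Only the two user functions would need to be written once; the body of `map_reduce` is the part that differs between a distributed and a shared memory back end.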

APA Book Talk: The Psychology of High Performance: Developing Human Potential Into Domain-Specific Talent

PsycEXTRA Dataset ◽

10.1037/e501472020-001 ◽

2020 ◽

Author(s):

Jamie Buck ◽

Rena Subotnik ◽

Frank Worrell ◽

Paula Olszewski-Kubilius ◽

Chi Wang

Keyword(s):

High Performance ◽

Human Potential ◽

Domain Specific ◽

Book Talk

Towards Open Graphical Tool-Building Framework

Scientific Journal of Riga Technical University Computer Sciences ◽

10.2478/v10143-011-0011-8 ◽

2011 ◽

Vol 43 (1) ◽

pp. 80-87 ◽

Cited By ~ 4

Author(s):

Edgars Rencis ◽

Janis Barzdins ◽

Sergejs Kozlovics

Keyword(s):

Deep Understanding ◽

Easy Task ◽

Domain Specific ◽

Activity Diagrams ◽

Graphical Tool ◽

One Step ◽

Uml Activity Diagrams ◽

Definition Of ◽

Tool Building

Nowadays, there are many frameworks for developing domain-specific tools. However, creating a truly sophisticated tool with specific functionality requirements is not always an easy task. Although tool-building platforms offer some means for extending the tool functionality and accessing it from external applications, doing so usually requires a deep understanding of various technical implementation details. In this paper we try to go one step closer to a truly open graphical tool-building framework that would make it easy both to change the behavior of the tool and to access the tool from the outside. We start by defining a specialization of metamodels, which is a powerful facility in itself. Then we show how this can be applied in the field of graphical domain-specific tool building. The approach is demonstrated on an example of a subset of UML activity diagrams. The benefits of the approach are also indicated. These include a natural and intuitive definition of tools, a strict logic/presentation separation, and openness to extensions as well as to external applications.
