USING METAPROGRAMMING TO PARALLELIZE FUNCTIONAL SPECIFICATIONS

2002 ◽  
Vol 12 (02) ◽  
pp. 193-210 ◽  
Author(s):  
CHRISTOPH A. HERRMANN ◽  
CHRISTIAN LENGAUER

Metaprogramming is a paradigm for enhancing a general-purpose programming language with features catering for a special-purpose application domain, without the need to reimplement the language. In a staged compilation, the special-purpose features are translated and optimised by a domain-specific preprocessor, which hands over to the general-purpose compiler for translation of the domain-independent part of the program. The domain we work in is high-performance parallel computing. We use metaprogramming to enhance the functional language Haskell with features for the efficient, parallel implementation of certain computational patterns, called skeletons.
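
To make the skeleton idea concrete, here is a minimal Haskell sketch of a parallel map skeleton, written against the standard parallel package rather than the authors' metaprogramming framework; the name farm is ours (compile with GHC's -threaded flag):

    import Control.Parallel.Strategies (parMap, rseq)

    -- A "farm" skeleton: apply a worker function to all tasks in parallel.
    -- The parallel pattern is written once, in the skeleton; callers supply
    -- only the sequential worker.
    farm :: (a -> b) -> [a] -> [b]
    farm worker = parMap rseq worker

    main :: IO ()
    main = print (sum (farm (\x -> x * x) [1 .. 10000 :: Int]))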

2014 ◽  
Vol 24 (03) ◽  
pp. 1441003 ◽  
Author(s):  
Marcel Köster ◽  
Roland Leißa ◽  
Sebastian Hack ◽  
Richard Membarth ◽  
Philipp Slusallek

A straightforward implementation of an algorithm in a general-purpose programming language usually does not deliver peak performance: compilers often fail to automatically tune the code for hardware peculiarities such as the memory hierarchy or vector execution units. Manually tuning the code is error-prone and time-consuming, and it taints the code by exposing those peculiarities in the implementation. A popular way to avoid these problems is to implement the algorithm in a Domain-Specific Language (DSL); a DSL compiler can then automatically tune the code for the target platform. In this article we show how to embed a DSL for stencil codes in another language. In contrast to prior approaches, we use only a single language for this task, one which offers explicit control over code refinement. We use this control to specialize stencils for particular scenarios. Our results show that our specialized programs achieve performance competitive with hand-tuned CUDA programs while maintaining a convenient coding experience.
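
For readers unfamiliar with stencil codes: each output element is computed from a fixed neighbourhood of input elements. A toy sequential illustration of the pattern, in Haskell (our own sketch; the article's embedding uses staged code refinement in a single systems language):

    -- 3-point smoothing stencil over a 1-D array; for brevity the two
    -- boundary cells are dropped rather than handled specially.
    blur3 :: [Double] -> [Double]
    blur3 xs = zipWith3 weight xs (drop 1 xs) (drop 2 xs)
      where weight l c r = (l + 2 * c + r) / 4

    main :: IO ()
    main = print (blur3 [0, 0, 4, 4, 4, 0, 0])  -- [1.0,3.0,4.0,3.0,1.0]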


1999 ◽  
Vol 9 (5) ◽  
pp. 483-525 ◽  
Author(s):  
PETER THIEMANN

We present a general method to transform a compositional specification of a specializer for a functional programming language into a set of combinators that can be used to perform the same specialization more efficiently. The main transformation steps are the transition to higher-order abstract syntax and untagging. All transformation steps are proved correct. The resulting combinators can be implemented in any functional language, typed or untyped, pure or impure. They may also be viewed as a domain-specific language for metaprogramming. We demonstrate the generality of the method by applying it to several specializers of increasing strength, and its efficiency by comparison with a traditional specialization system based on self-application.
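
The first transformation step, the transition to higher-order abstract syntax (HOAS), can be illustrated with a small Haskell sketch (ours, not the paper's formulation): object-language binders are represented by metalanguage functions, so the evaluator needs no variable names, environments, or substitution machinery of its own.

    -- Object language in HOAS style: Lam carries a Haskell function.
    data Exp
      = Lit Int
      | Lam (Exp -> Exp)
      | App Exp Exp

    -- Beta reduction is inherited from Haskell's own function application.
    eval :: Exp -> Exp
    eval (App f a) = case eval f of
                       Lam body -> eval (body (eval a))
                       g        -> App g (eval a)
    eval e         = e

    main :: IO ()
    main = case eval (App (Lam (\x -> x)) (Lit 42)) of
             Lit n -> print n          -- prints 42
             _     -> putStrLn "stuck term"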


Author(s):  
Richard Schumi ◽  
Jun Sun

Abstract Compilers are error-prone due to their high complexity. They are relevant not only for general-purpose programming languages but also for many domain-specific languages, and bugs in compilers can put all compiled programs at risk. It is thus crucial that compilers are systematically tested, if not verified. Recently, a number of efforts have been made to formalise and standardise programming language semantics, which can be applied to verify the correctness of the respective compilers. In this work, we present a novel specification-based testing method named SpecTest to better utilise these semantics for testing. By applying an executable semantics as test oracle, SpecTest can discover deep semantic errors in compilers. Compared to existing approaches, SpecTest is built upon a novel test coverage criterion called semantic coverage, which brings together mutation testing and fuzzing to specifically target less-tested language features. We apply SpecTest to systematically test two compilers, namely the Java compiler and the Solidity compiler. SpecTest improves the semantic coverage of both compilers considerably and reveals multiple previously unknown bugs.
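
The oracle at the heart of such an approach can be sketched abstractly in Haskell (hypothetical types and names, ours; SpecTest's actual interfaces may differ): each generated test program is run both through the compiler under test and through the executable semantics, and any disagreement is flagged as a potential compiler bug.

    type Program = String
    type Output  = String

    -- Differential check of one test program: the executable semantics
    -- serves as the oracle for the compiler under test.
    checkWithOracle :: (Program -> IO Output)  -- compile and execute (under test)
                    -> (Program -> Output)     -- executable semantics (oracle)
                    -> Program
                    -> IO Bool
    checkWithOracle compileAndRun semantics prog = do
      actual <- compileAndRun prog
      pure (actual == semantics prog)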


2018 ◽  
Author(s):  
Neel Gala ◽  
Madhusudan G. S. ◽  
Paul George ◽  
Anmol Sahoo ◽  
Arjun Menon ◽  
...  

Processors have become ubiquitous in the appliances and machines we use, in both consumer and industrial settings. They range from extremely small, low-power microcontrollers (used in motor controls, home robots and appliances) to high-performance multi-core processors (used in servers and supercomputers). However, the growth of modern AI/ML environments (like Caffe [Jia et al. 2014] and TensorFlow [Abadi et al. 2016]) and the need for features like enhanced security have forced the industry to look beyond general-purpose solutions and towards domain-specific customizations. While a large number of companies today can develop custom ASICs (Application-Specific Integrated Circuits) and license specific silicon blocks from chip vendors to build customized SoCs (Systems on Chip), at the heart of every design is the processor and its associated hardware. To serve modern workloads better, these processors also need to be suitably customized, upgraded, re-designed and augmented. This requires that vendors and consumers have access to appropriate processor variants, along with the flexibility to make modifications and ship them at an affordable cost.


Author(s):  
Harry H. Cheng

Abstract The CH programming language, a high-performance C, is designed to be a superset of ANSI C. CH bridges the gap between ANSI C and FORTRAN: the current implementation encompasses almost all the programming capabilities of FORTRAN 77 and incorporates features of many other programming languages and software packages. Unlike other general-purpose programming languages, CH is designed to be especially suitable for applications in mechanical systems engineering. Because of our research interests, many programming features in CH have been implemented for design automation, although they are useful in other applications as well. In this paper we describe these new programming features for design automation, as currently implemented in CH, in comparison with ANSI C and FORTRAN 77.


2012 ◽  
Vol 17 (4) ◽  
pp. 207-216 ◽  
Author(s):  
Magdalena Szymczyk ◽  
Piotr Szymczyk

Abstract MATLAB is a technical computing language used in a variety of fields, such as control systems, image and signal processing, visualization, and financial process simulation, in an easy-to-use environment. MATLAB offers "toolboxes", specialized libraries for a variety of scientific domains, as well as a simplified interface to high-performance libraries (LAPACK, BLAS and FFTW, among others). MATLAB has also been extended with parallel computing capabilities through the Parallel Computing Toolbox™ and MATLAB Distributed Computing Server™. In this article we present some of the key features of parallel MATLAB applications, focusing on the use of GPU processors for image processing.


2011 ◽  
Vol 28 (1) ◽  
pp. 1-14 ◽  
Author(s):  
W. van Straten ◽  
M. Bailes

Abstract dspsr is a high-performance, open-source, object-oriented, digital signal processing software library and application suite for use in radio pulsar astronomy. Written primarily in C++, the library implements an extensive range of modular algorithms that can optionally exploit both multiple-core processors and general-purpose graphics processing units. After over a decade of research and development, dspsr is now stable and in widespread use in the community. This paper presents a detailed description of its functionality, justification of major design decisions, analysis of phase-coherent dispersion removal algorithms, and demonstration of performance on some contemporary microprocessor architectures.
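
For background (standard pulsar-astronomy material, not a result of this paper): cold-plasma dispersion delays a signal observed at frequency f by approximately

    Δt ≈ 4.15 × 10³ s × DM × (f / MHz)⁻²

where DM is the dispersion measure in pc cm⁻³. Incoherent dedispersion corrects this delay channel by channel, while the phase-coherent algorithms analysed in the paper deconvolve the dispersion transfer function from the raw voltage signal in the Fourier domain, removing the smearing within channels as well.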


Author(s):  
Breno A. de Melo Menezes ◽  
Nina Herrmann ◽  
Herbert Kuchen ◽  
Fernando Buarque de Lima Neto

Abstract Parallel implementations of swarm intelligence algorithms such as ant colony optimization (ACO) have been widely used to shorten execution time when solving complex optimization problems. When targeting a GPU environment, developing efficient parallel versions of such algorithms in CUDA can be a difficult and error-prone task even for experienced programmers. To overcome this issue, the parallel programming model of algorithmic skeletons simplifies parallel programs by abstracting from low-level features: common programming patterns (e.g. map, fold and zip) are defined once and later converted to efficient parallel code. In this paper, we show how algorithmic skeletons formulated in the domain-specific language Musket support the development of a parallel implementation of ACO, and how the result compares to a low-level implementation. Our experimental results show that Musket is well suited to the development of ACO. Besides making it easier for the programmer to deal with parallelization aspects, Musket generates high-performance code with execution times similar to those of the low-level implementations.
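
To illustrate how ACO decomposes into the skeletons named above, here is a language-neutral Haskell sketch of ours (not Musket syntax): tour construction is a map over the ants, pheromone evaporation a map over the trails, and selecting the iteration's best tour a fold-style reduction.

    import Data.List (minimumBy)
    import Data.Ord (comparing)

    type Tour = [Int]

    -- "map" skeleton: every ant constructs its tour independently.
    constructTours :: (Int -> Tour) -> [Int] -> [Tour]
    constructTours buildTour ants = map buildTour ants

    -- "map" skeleton: multiplicative pheromone evaporation on each trail.
    evaporate :: Double -> [Double] -> [Double]
    evaporate rho = map (* (1 - rho))

    -- "fold"-style reduction: pick the shortest tour of this iteration.
    bestTour :: (Tour -> Double) -> [Tour] -> Tour
    bestTour tourLength = minimumBy (comparing tourLength)

    main :: IO ()
    main = print (evaporate 0.1 [1.0, 2.0, 3.0])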

