scholarly journals OORS: An object-oriented rewrite system

2007 ◽  
Vol 4 (2) ◽  
pp. 2-26
Author(s):  
Gernot Gebhard ◽  
Philipp Lucas

Retargeting a compiler?s back end to a new architecture is a time-consuming process. This becomes an evident problem in the area of programmable graphics hardware (graphics processing units, GPUs) or embedded processors, where architectural changes are faster than elsewhere. We propose the object-oriented rewrite system OORS to overcome this problem. Using the OORS language, a compiler developer can express the code generation and optimization phase in terms of cost-annotated rewrite rules supporting complex non-linearmatching and replacing patterns. Retargetability is achieved by organizing rules into profiles, one for each supported target architecture. Featuring a rule and profile inheritance mechanism, OORS makes the reuse of existing specifications possible. This is an improvement regarding traditional approaches. Altogether OORS increases the maintainability of the compiler?s back end and thus both decreases the complexity and reduces the effort of the retargeting process. To show the potential of this approach, we have implemented a code generation and a code optimization pattern matcher supporting different target architectures using the OORS language and introduced them in a compiler of a programming language for CPUs and GPUs.

2018 ◽  
Author(s):  
Marcel Stimberg ◽  
Dan F. M. Goodman ◽  
Thomas Nowotny

“Brian” is a popular Python-based simulator for spiking neural networks, commonly used in computational neuroscience. GeNN is a C++-based meta-compiler for accelerating spiking neural network simulations using consumer or high performance grade graphics processing units (GPUs). Here we introduce a new software package, Brian2GeNN, that connects the two systems so that users can make use of GeNN GPU acceleration when developing their models in Brian, without requiring any technical knowledge about GPUs, C++ or GeNN. The new Brian2GeNN software uses a pipeline of code generation to translate Brian scripts into C++ code that can be used as input to GeNN, and subsequently can be run on suitable NVIDIA GPU accelerators. From the user’s perspective, the entire pipeline is invoked by adding two simple lines to their Brian scripts. We have shown that using Brian2GeNN, typical models can run tens to hundreds of times faster than on CPU.


2013 ◽  
Vol 13 (3) ◽  
pp. 867-879 ◽  
Author(s):  
Stuart D. C. Walsh ◽  
Martin O. Saar

AbstractLattice-Boltzmann methods are versatile numerical modeling techniques capable of reproducing a wide variety of fluid-mechanical behavior. These methods are well suited to parallel implementation, particularly on the single-instruction multiple data (SIMD) parallel processing environments found in computer graphics processing units (GPUs).Although recent programming tools dramatically improve the ease with which GPUbased applications can be written, the programming environment still lacks the flexibility available to more traditional CPU programs. In particular, it may be difficult to develop modular and extensible programs that require variable on-device functionality with current GPU architectures.This paper describes a process of automatic code generation that overcomes these difficulties for lattice-Boltzmann simulations. It details the development of GPU-based modules for an extensible lattice-Boltzmann simulation package – LBHydra. The performance of the automatically generated code is compared to equivalent purposewritten codes for both single-phase,multiphase, andmulticomponent flows. The flexibility of the new method is demonstrated by simulating a rising, dissolving droplet moving through a porous medium with user generated lattice-Boltzmann models and subroutines.


Author(s):  
Vadim Bulavintsev ◽  
Dmitry Zhdanov

We propose a generalized method for adapting and optimizing algorithms for efficient execution on modern graphics processing units (GPU). The method consists of several steps. First, build a control flow graph (CFG) of the algorithm. Next, transform the CFG into a tree of loops and merge non-parallelizable loops into parallelizable ones. Finally, map the resulting loops tree to the tree of GPU computational units, unrolling the algorithm’s loops as necessary for the match. The mapping should be performed bottom-up, from the lowest GPU architecture levels to the highest ones, to minimize off-chip memory access and maximize register file usage. The method provides programmer with a convenient and robust mental framework and strategy for GPU code optimization. We demonstrate the method by adapting to a GPU the DPLL backtracking search algorithm for solving the Boolean satisfiability problem (SAT). The resulting GPU version of DPLL outperforms the CPU version in raw tree search performance sixfold for regular Boolean satisfiability problems and twofold for irregular ones.


2016 ◽  
Vol 52 (4) ◽  
pp. 1-9 ◽  
Author(s):  
Sidi Fu ◽  
Weilong Cui ◽  
Matthew Hu ◽  
Ruinan Chang ◽  
Michael J. Donahue ◽  
...  

Author(s):  
Steven J. Lind ◽  
Benedict D. Rogers ◽  
Peter K. Stansby

This paper presents a review of the progress of smoothed particle hydrodynamics (SPH) towards high-order converged simulations. As a mesh-free Lagrangian method suitable for complex flows with interfaces and multiple phases, SPH has developed considerably in the past decade. While original applications were in astrophysics, early engineering applications showed the versatility and robustness of the method without emphasis on accuracy and convergence. The early method was of weakly compressible form resulting in noisy pressures due to spurious pressure waves. This was effectively removed in the incompressible (divergence-free) form which followed; since then the weakly compressible form has been advanced, reducing pressure noise. Now numerical convergence studies are standard. While the method is computationally demanding on conventional processors, it is well suited to parallel processing on massively parallel computing and graphics processing units. Applications are diverse and encompass wave–structure interaction, geophysical flows due to landslides, nuclear sludge flows, welding, gearbox flows and many others. In the state of the art, convergence is typically between the first- and second-order theoretical limits. Recent advances are improving convergence to fourth order (and higher) and these will also be outlined. This can be necessary to resolve multi-scale aspects of turbulent flow.


Sign in / Sign up

Export Citation Format

Share Document