Energy-efficient algebra kernels in FPGA for High Performance Computing

2021 ◽  
Vol 21 (2) ◽  
pp. e09
Author(s):  
Federico Favaro ◽  
Ernesto Dufrechou ◽  
Pablo Ezzatti ◽  
Juan Pablo Oliver

The dissemination of multi-core architectures and the later irruption of massively parallel devices led to a revolution in High-Performance Computing (HPC) platforms over the last decades. As a result, Field-Programmable Gate Arrays (FPGAs) are re-emerging as a versatile and more energy-efficient alternative to other platforms. Traditional FPGA design implies using low-level Hardware Description Languages (HDL) such as VHDL or Verilog, which follow an entirely different programming model than standard software languages, and whose use requires specialized knowledge of the underlying hardware. In recent years, manufacturers have made significant efforts to provide High-Level Synthesis (HLS) tools in order to allow a greater adoption of FPGAs in the HPC community. Our work studies the use of multi-core hardware and different FPGAs to address Numerical Linear Algebra (NLA) kernels such as the general matrix multiplication (GEMM) and the sparse matrix-vector multiplication (SpMV). Specifically, we compare the behavior of fine-tuned kernels on a multi-core CPU with HLS implementations on FPGAs. We experimentally evaluate our implementations on a low-end and a cutting-edge FPGA platform in terms of runtime and energy consumption, and compare the results against the Intel MKL library on CPU.
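The abstract does not reproduce the authors' tuned kernels, but a minimal sketch can illustrate the HLS programming style it refers to. The CSR SpMV kernel below follows Xilinx Vitis HLS pragma conventions; the interface bundling and pipelining choices are assumptions made for illustration, not the paper's implementation.

```cpp
// Illustrative CSR sparse matrix-vector multiply (y = A*x) written for an
// HLS flow such as Xilinx Vitis HLS. A sketch of the programming style,
// not the authors' tuned kernel; pragma choices are assumptions.
extern "C" void spmv_csr(const int n_rows,
                         const int *row_ptr,   // CSR row offsets, size n_rows+1
                         const int *col_idx,   // CSR column indices
                         const float *values,  // CSR nonzero values
                         const float *x,       // dense input vector
                         float *y) {           // dense output vector
#pragma HLS INTERFACE m_axi port=row_ptr bundle=gmem0
#pragma HLS INTERFACE m_axi port=col_idx bundle=gmem1
#pragma HLS INTERFACE m_axi port=values  bundle=gmem1
#pragma HLS INTERFACE m_axi port=x       bundle=gmem2
#pragma HLS INTERFACE m_axi port=y       bundle=gmem0

rows:
    for (int i = 0; i < n_rows; ++i) {
        float acc = 0.0f;
    nnz:
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
// II=1 asks the tool to start one multiply-accumulate per cycle; whether
// the target is met depends on the memory interfaces of the device.
#pragma HLS PIPELINE II=1
            acc += values[k] * x[col_idx[k]];
        }
        y[i] = acc;
    }
}
```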

Author(s):  
Simon McIntosh-Smith ◽  
Rob Hunt ◽  
James Price ◽  
Alex Warwick Vesztrocy

High-performance computing systems continue to increase in size in the quest for ever higher performance. The resulting increase in electronic component count, coupled with the decreasing feature sizes of the silicon manufacturing processes used to build these components, may make future exascale systems more susceptible to soft errors caused by cosmic radiation than current high-performance computing systems. Through techniques such as hardware-based error-correcting codes and checkpoint-restart, many of these faults can be mitigated, but at the cost of increased hardware overhead, run-time, and energy consumption that can be as much as 10–20%. Some predictions expect these overheads to continue to grow over time. For extreme-scale systems, these overheads will represent megawatts of power consumption and millions of dollars of additional hardware costs, which could potentially be avoided with more sophisticated fault-tolerance techniques. In this paper we present new software-based fault-tolerance techniques that can be applied to one of the most important classes of software in high-performance computing: iterative sparse matrix solvers. Our new techniques enable us to exploit knowledge of the structure of sparse matrices in such a way as to improve the performance, energy efficiency, and fault tolerance of the overall solution.
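The abstract does not spell out the detection scheme, but one illustrative flavour of structure-exploiting fault tolerance is to fold cheap validity checks on the CSR index arrays into the SpMV itself: a soft error that flips a bit in `row_ptr` or `col_idx` almost always produces an index outside its guaranteed range. The sketch below is a generic example of this idea, not the authors' exact technique.

```cpp
#include <cstdio>
#include <vector>

// Illustrative structure-aware soft-error check folded into a CSR SpMV.
// A bit flip in col_idx or row_ptr usually yields an index outside the
// range the CSR structure guarantees, so a bounds test per element
// detects it at the cost of one comparison. Generic sketch only.
bool spmv_checked(int n_rows, int n_cols,
                  const std::vector<int>& row_ptr,
                  const std::vector<int>& col_idx,
                  const std::vector<double>& values,
                  const std::vector<double>& x,
                  std::vector<double>& y) {
    for (int i = 0; i < n_rows; ++i) {
        int begin = row_ptr[i], end = row_ptr[i + 1];
        // Row pointers must be non-decreasing and within the nonzero count.
        if (begin > end || end > (int)values.size()) {
            std::fprintf(stderr, "fault: corrupted row_ptr at row %d\n", i);
            return false;  // caller can reload the matrix or roll back
        }
        double acc = 0.0;
        for (int k = begin; k < end; ++k) {
            int j = col_idx[k];
            // Column indices must address a valid column of the matrix.
            if (j < 0 || j >= n_cols) {
                std::fprintf(stderr, "fault: corrupted col_idx at nnz %d\n", k);
                return false;
            }
            acc += values[k] * x[j];
        }
        y[i] = acc;
    }
    return true;  // no structural corruption observed
}
```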


2019 ◽  
Author(s):  
Andreas Müller ◽  
Willem Deconinck ◽  
Christian Kühnlein ◽  
Gianmarco Mengaldo ◽  
Michael Lange ◽  
...  

Abstract. In the simulation of complex multi-scale flow problems, such as those arising in weather and climate modelling, one of the biggest challenges is to satisfy operational requirements in terms of time-to-solution and energy-to-solution without compromising the accuracy and stability of the calculation. These competing factors require the development of state-of-the-art algorithms that can optimally exploit the targeted underlying hardware and efficiently deliver the extreme computational capabilities typically required in operational forecast production. These algorithms should (i) minimise the energy footprint along with the time required to produce a solution, (ii) maintain a satisfying level of accuracy, and (iii) be numerically stable and resilient in case of hardware or software failure. The European Centre for Medium-Range Weather Forecasts (ECMWF) is leading a project called ESCAPE (Energy-efficient SCalable Algorithms for weather Prediction on Exascale supercomputers), funded by Horizon 2020 (H2020) under the Future and Emerging Technologies in High Performance Computing (FET-HPC) initiative. The goal of the ESCAPE project is to develop a sustainable strategy for evolving weather and climate prediction models towards next-generation computing technologies. The project partners incorporate the expertise of leading European regional forecasting consortia, university research, experienced high-performance computing centres, and hardware vendors. This paper presents an overview of the results obtained in the ESCAPE project, in which weather prediction models have been broken down into smaller building blocks called dwarfs. The participating weather prediction models are: IFS (Integrated Forecasting System); ALARO, a combination of AROME (Application de la Recherche à l'Opérationnel a Meso-Echelle) and ALADIN (Aire Limitée Adaptation Dynamique Développement International); and COSMO-EULAG, a combination of COSMO (Consortium for Small-scale Modeling) and EULAG (Eulerian/semi-Lagrangian fluid solver). The dwarfs are analysed and optimised in terms of computing performance for different hardware architectures (mainly Intel Skylake CPUs, NVIDIA GPUs, and Intel Xeon Phi). The ESCAPE project includes the development of new algorithms that are specifically designed for better energy efficiency and improved portability through domain-specific languages. In addition, the modularity of the algorithmic framework naturally allows testing different existing numerical approaches and their interplay with the emerging heterogeneous hardware landscape. Throughout the paper, we compare different numerical techniques for solving the main building blocks that constitute weather models, in terms of energy efficiency and performance, on a variety of computing technologies.
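Since time-to-solution and energy-to-solution are the two metrics the dwarfs are measured against, a minimal harness shows how they relate. In this sketch, `read_energy_joules` is a hypothetical placeholder for whatever cumulative energy counter the platform exposes (e.g. RAPL on Intel CPUs or NVML on NVIDIA GPUs); it is not a real API.

```cpp
#include <chrono>
#include <functional>

// Hypothetical placeholder: on a real system this would read a hardware
// energy counter such as RAPL (CPU) or NVML (GPU). Returning 0 keeps the
// sketch compilable without platform-specific code.
double read_energy_joules() { return 0.0; }

struct DwarfMetrics {
    double time_to_solution_s;   // wall-clock runtime of the dwarf
    double energy_to_solution_j; // energy consumed over that runtime
};

// Run one building block ("dwarf") of a weather model, e.g. a spectral
// transform, and report both competing metrics.
DwarfMetrics run_dwarf(const std::function<void()>& dwarf) {
    double e0 = read_energy_joules();
    auto t0 = std::chrono::steady_clock::now();
    dwarf();
    auto t1 = std::chrono::steady_clock::now();
    double e1 = read_energy_joules();
    return { std::chrono::duration<double>(t1 - t0).count(), e1 - e0 };
}
```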


Author(s):  
ROBERT STEWART ◽  
PATRICK MAIER ◽  
PHIL TRINDER

Abstract. Reliability is set to become a major concern on emergent large-scale architectures. While there are many parallel languages, and indeed many parallel functional languages, very few address reliability. The notable exception is the widely emulated Erlang distributed actor model, which provides explicit supervision and recovery of actors with isolated state. We investigate scalable transparent fault-tolerant functional computation with automatic supervision and recovery of tasks. We do so by developing HdpH-RS, a variant of the Haskell distributed parallel Haskell (HdpH) DSL with Reliable Scheduling. Extending the distributed work-stealing protocol of HdpH for task supervision and recovery is challenging. To eliminate elusive concurrency bugs, we validate the HdpH-RS work-stealing protocol using the SPIN model checker. HdpH-RS differs from the actor model in that its principal entities are tasks, i.e. independent stateless computations, rather than isolated stateful actors. Thanks to statelessness, fault recovery can be performed automatically and entirely hidden in the HdpH-RS runtime system. Statelessness is also key to proving a crucial property of the semantics of HdpH-RS: fault recovery does not change the result of the program, akin to deterministic parallelism. HdpH-RS provides a simple distributed fork/join-style programming model, with minimal exposure of fault tolerance at the language level, and a library of higher-level abstractions such as algorithmic skeletons. In fact, the HdpH-RS DSL is exactly the same as the HdpH DSL, hence users can opt in or out of fault-tolerant execution without any refactoring. Computations in HdpH-RS are always as reliable as the root node, no matter how many nodes and cores are actually used. We benchmark HdpH-RS on conventional clusters and on a High Performance Computing platform: all benchmarks survive Chaos Monkey random fault injection; the system scales well, e.g. up to 1,400 cores on the High Performance Computing platform; and reliability and recovery overheads are consistently low even at scale.


2017 ◽  
Vol 8 ◽  
pp. 2689-2710 ◽  
Author(s):  
Igor I Soloviev ◽  
Nikolay V Klenov ◽  
Sergey V Bakurskiy ◽  
Mikhail Yu Kupriyanov ◽  
Alexander L Gudkov ◽  
...  

The predictions of Moore's law are considered by experts to be valid until 2020, giving rise to "post-Moore's" technologies afterwards. Energy efficiency is one of the major challenges in high-performance computing that must be addressed. Superconductor digital technology is a promising post-Moore's alternative for the development of supercomputers. In this paper, we consider the operation principles of energy-efficient superconductor logic and memory circuits, with a short retrospective review of their evolution. We analyze their shortcomings with respect to computer circuit design. Possible directions for further research are outlined.
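For a sense of why superconductor logic is attractive here: in single flux quantum (SFQ) style circuits, the energy per switching event is set by the junction's critical current and the magnetic flux quantum. The worked figure below uses an illustrative critical current of 0.1 mA (a typical order of magnitude, not a value from the paper):

```latex
% Switching energy of a single Josephson junction in SFQ-style logic
E_{\mathrm{sw}} \approx I_c \, \Phi_0,
\qquad \Phi_0 = \frac{h}{2e} \approx 2.07 \times 10^{-15}\,\mathrm{Wb}
% With an illustrative I_c = 0.1 mA:
%   E_sw ≈ (1.0 \times 10^{-4}\,\mathrm{A})(2.07 \times 10^{-15}\,\mathrm{Wb})
%        ≈ 2 \times 10^{-19}\,\mathrm{J} per switching event,
% orders of magnitude below the switching energy of a CMOS gate.
```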

