Generating high performance code for irregular data structures using dependent types

2021 ◽  
Author(s):  
Federico Pizzuti ◽  
Michel Steuwer ◽  
Christophe Dubach


2010 ◽  
Vol 2010 ◽  
pp. 1-16 ◽  
Author(s):  
James Coole ◽  
Greg Stitt

Field-programmable gate arrays (FPGAs) and other reconfigurable computing (RC) devices have been widely shown to deliver order-of-magnitude performance and power improvements over microprocessors for some applications. Unfortunately, FPGA usage has largely been limited to applications exhibiting sequential memory access patterns, thereby prohibiting acceleration of important applications with irregular patterns (e.g., pointer-based data structures). In this paper, we present a design pattern for RC application development that serializes irregular data structure traversals online into a traversal cache, which allows the corresponding data to be efficiently streamed to the FPGA. The paper presents a generalized framework that benefits applications with repeated traversals, which we show can achieve between 7x and 29x speedup over pointer-based software. For applications without strictly repeated traversals, we present application-specialized extensions that exploit the similarity of traversals to improve memory bandwidth and execute multiple traversals in parallel. We show that these extensions can achieve a speedup between 11x and 70x on a Virtex4 LX100 for Barnes-Hut n-body simulation.
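The traversal-cache pattern is straightforward to prototype in software. Below is a minimal Python sketch of the idea as described in the abstract: one pointer-chasing traversal is serialized into a flat buffer, and repeated traversals then consume that buffer as a sequential stream, the way an FPGA kernel would. All names and the toy per-node computation are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of the traversal-cache idea: serialize one pointer-chasing
# traversal into a flat buffer, then replay later traversals from the buffer
# as a sequential stream (as an FPGA would consume it).
# All names here are illustrative, not from the paper.

from dataclasses import dataclass
from typing import Optional, List

@dataclass
class Node:
    value: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def fill_cache(root: Optional[Node]) -> List[float]:
    """One irregular, pointer-based traversal; its visit order is recorded."""
    cache: List[float] = []
    stack = [root] if root else []
    while stack:
        node = stack.pop()
        cache.append(node.value)            # serialize payload in visit order
        if node.right: stack.append(node.right)
        if node.left: stack.append(node.left)
    return cache

def repeated_traversal(cache: List[float]) -> float:
    """Later traversals stream the cache sequentially -- no pointer chasing."""
    return sum(cache)                       # stand-in for the per-node work

tree = Node(1.0, Node(2.0), Node(3.0))
cache = fill_cache(tree)                    # slow path, done once
print(repeated_traversal(cache))            # fast path, repeated
```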


1997 ◽  
Vol 6 (1) ◽  
pp. 115-126 ◽  
Author(s):  
Rainer Koppler ◽  
Siegfried Grabner ◽  
Jens Volkert

This article motivates the use of graphics and visualization for the efficient utilization of High Performance Fortran's (HPF's) data distribution facilities. It proposes a graphical toolkit consisting of exploratory and estimation tools that allow the programmer to navigate through complex distributions and to obtain graphical ratings with respect to load distribution and communication. The toolkit has been implemented in a mapping design and visualization tool that is coupled with a compilation system for the HPF predecessor Vienna Fortran. Since this language covers a superset of HPF's facilities, the tool may also be used for visualization of HPF data structures.
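As a rough illustration of the kind of rating such an estimation tool produces, the hedged Python sketch below scores the load balance of HPF-style BLOCK and CYCLIC distributions when the actual work is concentrated in one part of the index space. The distribution rules follow standard HPF semantics; the imbalance metric and all names are our own illustrative choices, not the toolkit's.

```python
# Toy estimator in the spirit of the toolkit's "graphical ratings": given a
# 1-D array distributed over p processors, rate the load balance of BLOCK vs
# CYCLIC distributions when only part of the index space is actually worked on.
# Distribution rules follow HPF semantics; the rating metric is illustrative.

def block_owner(i, n, p):
    return i // -(-n // p)          # ceil(n/p)-sized contiguous blocks

def cyclic_owner(i, n, p):
    return i % p                    # round-robin assignment

def load_imbalance(owner, work_indices, n, p):
    loads = [0] * p
    for i in work_indices:
        loads[owner(i, n, p)] += 1
    return max(loads) / (sum(loads) / p)    # 1.0 = perfectly balanced

n, p = 1000, 4
hot = range(0, 250)                 # work concentrated in the first quarter
print("BLOCK :", load_imbalance(block_owner, hot, n, p))   # ~4.0 (poor)
print("CYCLIC:", load_imbalance(cyclic_owner, hot, n, p))  # ~1.0 (good)
```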


2021 ◽  
Vol 244 ◽  
pp. 07001 ◽  
Author(s):  
Anatoliy Nyrkov ◽  
Konstantin Ianiushkin ◽  
Andrey Nyrkov ◽  
Yulia Romanova ◽  
Vagiz Gaskarov

Recent achievements in high-performance computing significantly narrow the performance gap between single-node and multi-node computing and open up opportunities for systems with remote shared memory. The combination of in-memory storage, remote direct memory access, and remote calls requires rethinking how data is organized, protected, and queried in distributed systems. The models reviewed here let us implement new interpretations of distributed algorithms, allowing us to validate different approaches to avoiding race conditions and to reduce resource-acquisition and synchronization time. In this paper, we describe a data model for mixed memory access together with an analysis of optimized data structures. We also provide experimental results comparing the performance of data structures built on the different approaches, evaluate the limitations of these models, and show that the model does not always meet expectations. The purpose of this paper is to assist developers in designing data structures that help achieve these architectural benefits or improve the design of existing distributed systems.
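To make the race-condition discussion concrete, here is a hedged, pure-Python model of a "remote" memory word updated either by a naive read-modify-write or by an atomic compare-and-swap, the primitive that RDMA-capable NICs typically expose. This is a single-node stand-in for illustration only; the class and function names are assumptions, not the paper's API.

```python
# Simplified model of the mixed-memory-access idea: a "remote" word that
# threads update either by naive read-modify-write (racy) or via an atomic
# compare-and-swap. Pure-Python stand-in; names are illustrative.

import threading

class RemoteWord:
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()   # models NIC-side atomicity of CAS

    def read(self):
        return self._value

    def write(self, v):
        self._value = v

    def compare_and_swap(self, expected, new):
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False

def racy_increment(word, times):
    for _ in range(times):
        word.write(word.read() + 1)     # read-modify-write: lost updates

def cas_increment(word, times):
    for _ in range(times):
        while True:                     # retry loop: classic lock-free update
            old = word.read()
            if word.compare_and_swap(old, old + 1):
                break

for fn in (racy_increment, cas_increment):
    w = RemoteWord()
    ts = [threading.Thread(target=fn, args=(w, 50_000)) for _ in range(4)]
    for t in ts: t.start()
    for t in ts: t.join()
    print(fn.__name__, w.read())        # racy result is often < 200000
```

The CAS retry loop is the standard lock-free pattern; the racy variant typically loses updates under contention, which is exactly the class of error the reviewed models aim to expose and avoid.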


Author(s):  
A. Hahanova ◽  
V. Hahanov ◽  
S. Chumachenko ◽  
E. Litvinova ◽  
D. Rakhlis

Context. It is known that data structures are decisive for the creation of efficient parallel algorithms and high-performance computing devices. The development of mathematically perfect and technologically simple data structures therefore takes about 80 percent of the design time, while about 20 percent of time and material resources are spent on algorithms and their hardware-software coding. This leads to a search for data-structure primitives that significantly simplify the parallel high-performance algorithms operating on them. Models and methods for the testing and simulation of digital systems are proposed that carry certain advantages of quantum computing over into classical computational processes through the implementation of vector qubit data structures.

Objective. The goal of the work is the development of an innovative technology for qubit-vector synthesis and deductive analysis of verification tests, based on vector data structures that greatly simplify algorithms and can be embedded as BIST components in digital systems-on-chip.

Method. Deductive fault simulation is used to obtain analytical expressions for transporting fault lists through a functional or logic element, based on the XOR operation, which serves as a measure of similarity-difference between a test, a function, and faults, each specified in the same way in one of the formats: a table, a graph, or an equation. A binary vector is proposed as the most technologically advanced data-structure primitive for specifying logical functionality for the purpose of parallel synthesis and analysis of digital systems. The parallelism of solving combinatorial problems is a physical property of quantum computing; in classical computing it is provided, for parallel simulation and fault diagnosis, by unitarily coded data structures at the cost of extra memory.

Results. 1) A method for the analytical synthesis of deductive logic for functional elements at the gate and register-transfer levels has been developed. 2) A deductive processor for fault simulation based on transporting input fault lists or fault vectors to the external outputs of digital circuits is proposed. 3) The qubit-vector form of specifying logic and methods of qubit synthesis of deductive equations for fault simulation are described. 4) A qubit-vector method for test synthesis, using derivatives calculated over the vector coverage of the logic, has been developed. 5) The models and methods are verified on test examples in a software implementation of the structures and algorithms.

Conclusions. The scientific novelty lies in a new paradigm for the synthesis of deductive RTL logic based on a metric test equation. A vector form for describing structures is introduced, which makes it possible to apply well-known technologies for the synthesis and analysis of logic-circuit tests to effectively solve the problems of testing graph structures and state-machine models of digital devices. The practical significance is reflected in examples of the analytical synthesis of deductive logic for functional elements at the gate and register-transfer levels. A deductive processor for fault simulation is proposed, focused on implementation as a BIST tool for online testing, simulation, and fault diagnosis of digital systems-on-chip. A qubit-vector form of digital system description is proposed, which surpasses existing methods of computing-device development in terms of manufacturability, compactness, speed, and quality. A software application has been developed that implements the main testing, simulation, and diagnostics services; it is used in the educational process to study the advantages of qubit-vector data structures and algorithms. The computational complexity of the synthesis of deductive logic formulas and of their use in fault simulation is given.
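For readers unfamiliar with deductive fault simulation, the following Python sketch applies the textbook deductive rules to a two-input AND gate, holding fault lists as bitmasks in the spirit of the paper's binary-vector primitives. The tiny fault universe and all names are illustrative assumptions; the paper's own qubit-vector formulation is not reproduced here.

```python
# Hedged sketch of deductive fault simulation over a 2-input AND gate, with
# fault lists held as bitmasks (one bit per fault), echoing binary-vector
# data-structure primitives. Rules are the standard deductive ones; the
# fault universe and names are illustrative.

FAULTS = ["a/0", "a/1", "b/0", "b/1", "z/0", "z/1"]   # stuck-at faults
BIT = {f: 1 << i for i, f in enumerate(FAULTS)}

def and_gate_deductive(a, b, La, Lb):
    """Good values a, b plus input fault lists -> (good z, output fault list).
    A fault is in a line's list iff it flips that line (faulty XOR good = 1).
    """
    z = a & b
    if a == 1 and b == 1:
        Lz = La | Lb            # flipping either input flips z
    elif a == 0 and b == 0:
        Lz = La & Lb            # both inputs must flip to flip z
    elif a == 0:                # a controls: must flip a, must not flip b
        Lz = La & ~Lb
    else:                       # b controls: must flip b, must not flip a
        Lz = Lb & ~La
    Lz |= BIT["z/1"] if z == 0 else BIT["z/0"]   # z's own stuck-at fault
    return z, Lz

# Test vector a=1, b=0: which faults does it detect at output z?
z, Lz = and_gate_deductive(1, 0, BIT["a/0"], BIT["b/1"])
print("z =", z, "detected:", [f for f in FAULTS if Lz & BIT[f]])
# -> detects b/1 (and z/1): forcing b to 1 flips z from 0 to 1
```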

