Optimal Basic Block Instruction Scheduling for Multiple-Issue Processors Using Constraint Programming

Author(s):  
Abid M. Malik ◽  
Jim McInnes ◽  
Peter van Beek
2007 ◽  
Vol 14 (6) ◽  
pp. 549-569 ◽  
Author(s):  
Abid M. Malik ◽  
Tyrel Russell ◽  
Michael Chase ◽  
Peter van Beek

1993 ◽  
Vol 2 (3) ◽  
pp. 1-5
Author(s):  
Martin Charles Golumbic ◽  
Vladimir Rainish

Instruction scheduling algorithms are used in compilers to reduce run-time delays in the compiled code by reordering or transforming program statements, usually at the intermediate language or assembly code level. Considerable research has been carried out on scheduling code within the scope of basic blocks, i.e., straight-line sections of code, and very effective basic block schedulers are now included in most modern compilers, especially those for pipelined processors. In previous work (Golumbic and Rainish, IBM J. Res. Dev., Vol. 34, pp. 93–97, 1990), we presented code replication techniques for scheduling beyond the scope of basic blocks that provide reasonable improvements in the running time of the compiled code but still leave room for further improvement. In this article we present a new method for scheduling beyond basic blocks called SHACOOF. This new technique takes advantage of a conventional, high-quality basic block scheduler by first suppressing selected subsequences of instructions and then scheduling the modified sequence of instructions using the basic block scheduler. A candidate subsequence for suppression can be found by identifying a region of a program control flow graph, called an S-region, which has a unique entry and a unique exit and meets predetermined criteria. This enables scheduling of a sequence of instructions beyond basic block boundaries, with only minimal changes to an existing compiler, by identifying beneficial opportunities to cover delays that would otherwise have been beyond its scope.


2008 ◽  
Vol 17 (01) ◽  
pp. 37-54 ◽  
Author(s):  
Abid M. Malik ◽  
Jim McInnes ◽  
Peter van Beek

Instruction scheduling is one of the most important steps for improving the performance of object code produced by a compiler. A fundamental problem that arises in instruction scheduling is to find a minimum length schedule for a basic block — a straight-line sequence of code with a single entry point and a single exit point — subject to precedence, latency, and resource constraints. Solving the problem exactly is NP-complete, and heuristic approaches are currently used in most compilers. In contrast, we present a scheduler that finds provably optimal schedules for basic blocks using techniques from constraint programming. In developing our optimal scheduler, the key to scaling up to large, real problems was in the development of preprocessing techniques for improving the constraint model. We experimentally evaluated our optimal scheduler on the SPEC 2000 integer and floating point benchmarks. On this benchmark suite, the optimal scheduler was very robust — all but a handful of the hundreds of thousands of basic blocks in our benchmark suite were solved optimally within a reasonable time limit — and scaled to the largest basic blocks, including basic blocks with up to 2600 instructions. This compares favorably to the best previous exact approaches.
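The problem statement in this abstract (a minimum-length schedule subject to precedence, latency, and resource constraints) can be made concrete with a small sketch. The code below is not the authors' constraint programming model and includes none of their preprocessing techniques; it is a plain backtracking search that tries increasing schedule lengths, which is only practical for tiny blocks. The function name `optimal_schedule`, the `(pred, succ, latency)` edge encoding, and the single `issue_width` resource limit are assumptions made for illustration.

```python
from collections import defaultdict, deque


def optimal_schedule(n, edges, issue_width):
    """Return (length, cycles) for a minimum-length schedule of one basic block.

    n           -- number of instructions, numbered 0..n-1 (hypothetical encoding)
    edges       -- list of (pred, succ, latency): succ issues >= latency cycles after pred
    issue_width -- maximum number of instructions issued in any one cycle
    """
    if n == 0:
        return 0, []

    preds = defaultdict(list)
    succs = defaultdict(list)
    indeg = [0] * n
    for u, v, lat in edges:
        preds[v].append((u, lat))
        succs[u].append(v)
        indeg[v] += 1

    # Topological order (Kahn's algorithm) so predecessors are placed first.
    queue = deque(i for i in range(n) if indeg[i] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in succs[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if len(order) != n:
        raise ValueError("dependency graph has a cycle")

    def search(pos, cycles, load, horizon):
        """Try to place order[pos:] within cycles 0..horizon-1."""
        if pos == n:
            return True
        i = order[pos]
        # Earliest cycle allowed by already-placed predecessors and latencies.
        earliest = max((cycles[u] + lat for u, lat in preds[i]), default=0)
        for c in range(earliest, horizon):
            if load[c] < issue_width:  # resource constraint
                cycles[i] = c
                load[c] += 1
                if search(pos + 1, cycles, load, horizon):
                    return True
                load[c] -= 1
        return False

    # Start from the resource lower bound and lengthen until feasible;
    # the first feasible length is therefore optimal.
    horizon = -(-n // issue_width)
    while True:
        cycles = [0] * n
        load = [0] * horizon
        if search(0, cycles, load, horizon):
            return horizon, cycles
        horizon += 1
```

A call such as `optimal_schedule(4, [(0, 2, 3), (1, 3, 1)], 1)` schedules four instructions on a single-issue machine where instruction 2 must wait three cycles after instruction 0. Because the search is exhaustive at each candidate length, its cost grows exponentially with block size, which is the scaling problem the abstract says the preprocessing techniques address.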


2009 ◽  
Vol 31 (1) ◽  
pp. 127-132
Author(s):  
Zhi-Xiong ZHOU ◽  
Hu HE ◽  
Xu YANG ◽  
Yan-Jun ZHANG ◽  
Yi-He SUN

Author(s):  
Lei Cao ◽  
Guo-Ping Liu ◽  
Wenshan Hu ◽  
Jahan Zaib Bhatti

The Android-based networked control system laboratory (NCSLab) is a remote control laboratory that adopts an extensible architecture, mainly comprising Android mobile devices, MATLAB servers, controllers, and test rigs. To conduct simulations and experiments more effectively in NCSLab, the first key issue is to enable users to design their own control algorithms or functional blocks on the Android client, rather than only using the basic block libraries provided by the system. This paper therefore proposes and implements a scheme for Android-based compilation of C-MEX S-functions. With this new feature, users can design personalized algorithms according to their requirements in the form of S-functions, which can be called and executed after being compiled by the MATLAB server. Finally, experimental validation on a three-degree-of-freedom air bearing spacecraft platform shows that the Android-based C-MEX S-function method is reliable and efficient, and that the scheme enhances the functionality and mobility of the Android-based NCSLab.


1991 ◽  
Vol 26 (4) ◽  
pp. 122-131 ◽  
Author(s):  
David G. Bradlee ◽  
Susan J. Eggers ◽  
Robert R. Henry

2021 ◽  
Vol 31 (2) ◽  
pp. 1-28
Author(s):  
Gopinath Chennupati ◽  
Nandakishore Santhi ◽  
Phill Romero ◽  
Stephan Eidenbenz

Hardware architectures become increasingly complex as compute capabilities grow to exascale. We present the Analytical Memory Model with Pipelines (AMMP) of the Performance Prediction Toolkit (PPT). PPT-AMMP takes high-level source code and hardware architecture parameters as input and predicts the runtime of that code on the target hardware platform, which is defined in the input parameters. PPT-AMMP transforms the code to an (architecture-independent) intermediate representation, then (i) analyzes the basic block structure of the code, (ii) processes architecture-independent virtual memory access patterns that it uses to build memory reuse distance distribution models for each basic block, and (iii) runs detailed basic-block level simulations to determine hardware pipeline usage. PPT-AMMP uses machine learning and regression techniques to build the prediction models based on small instances of the input code, then integrates them into a higher-order discrete-event simulation model of PPT running on the Simian PDES engine. We validate PPT-AMMP on four standard computational physics benchmarks and present a use case of hardware parameter sensitivity analysis to identify bottleneck hardware resources on different code inputs. We further extend PPT-AMMP to predict the performance of a scientific application code, namely, the radiation transport mini-app SNAP. To this end, we analyze multi-variate regression models that accurately predict the reuse profiles and the basic block counts. We validate predicted SNAP runtimes against actual measured times.
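For readers unfamiliar with the reuse distance mentioned in step (ii), the sketch below computes it from its textbook definition: the number of distinct addresses touched between two consecutive accesses to the same address. It is not taken from PPT-AMMP, which builds distribution models per basic block; the function name `reuse_distance_histogram` and the flat list-of-addresses trace format are assumptions for illustration.

```python
from collections import Counter


def reuse_distance_histogram(trace):
    """Count how often each reuse distance occurs in an address trace.

    A first-time access has no previous use, so it is counted under the
    key None (often treated as an infinite distance).
    """
    lru_stack = []  # distinct addresses, least recently used first
    histogram = Counter()
    for addr in trace:
        if addr in lru_stack:
            idx = lru_stack.index(addr)
            # Addresses above idx were touched since the last use of addr.
            histogram[len(lru_stack) - 1 - idx] += 1
            lru_stack.pop(idx)
        else:
            histogram[None] += 1
        lru_stack.append(addr)  # addr is now the most recently used
    return histogram
```

This naive version scans the stack on every access, so it takes time proportional to the trace length times the number of distinct addresses; it is meant to show the definition, not to process large traces.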

