iterative compilation
Recently Published Documents



2022 ◽  
Vol 19 (1) ◽  
pp. 1-25
Author(s):  
Hongzhi Liu ◽  
Jie Luo ◽  
Ying Li ◽  
Zhonghai Wu

Pass selection and phase ordering are two critical compiler auto-tuning problems. Traditional heuristic methods cannot effectively address these NP-hard problems, especially given the increasing number of compiler passes and the diversity of hardware architectures. Recent research efforts have attempted to address these problems through machine learning. However, the large search space of candidate pass sequences, the large number of redundant and irrelevant features, and the lack of training program instances make it difficult to learn effective models. Several methods have tried to use expert knowledge to simplify the problems, such as using only the compiler passes or subsequences in the standard levels (e.g., -O1, -O2, and -O3) provided by compiler designers. However, these methods ignore other useful compiler passes that are not contained in the standard levels. Principal component analysis (PCA) and exploratory factor analysis (EFA) have been utilized to reduce the redundancy of feature data. However, these unsupervised methods retain information that is irrelevant to the performance of compilation optimization, which may mislead the subsequent model learning. To solve these problems, we propose a compiler pass selection and phase ordering approach called Iterative Compilation based on Metric learning and Collaborative filtering (ICMC). First, we propose a data-driven method to construct pass subsequences according to the observed collaborative interactions and dependencies among passes on a given program set. This lets us make use of all available compiler passes while pruning the search space. Then, a supervised metric learning method is utilized to retain useful feature information for compilation optimization while removing both the irrelevant and the redundant information. Based on the learned similarity metric, a neighborhood-based collaborative filtering method is employed to iteratively recommend a few superior compiler passes for each target program. Finally, an iterative data enhancement method is designed to alleviate the lack of training program instances and to enhance the performance of iterative pass recommendations. The experimental results using the LLVM compiler on all 32 cBench programs show the following: (1) ICMC significantly outperforms several state-of-the-art compiler phase ordering methods, (2) it performs as well as or better than the standard level -O3 on all the test programs, and (3) it reaches an average performance speedup of 1.20 (up to 1.46) compared with the standard level -O3.
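
The neighborhood-based recommendation step described in this abstract can be illustrated with a small sketch. This is not the authors' ICMC implementation: the diagonal feature weights merely stand in for a learned similarity metric, and the feature vectors, pass subsequences, and speedups are invented for illustration. The function names are hypothetical.

```python
from math import sqrt

# Illustrative training data (made up): program features and the speedup
# observed for a few pass subsequences on each program.
TRAINING = [
    ([1.0, 0.0], {"licm,gvn": 1.30, "inline,sroa": 1.05}),
    ([0.9, 0.1], {"licm,gvn": 1.25, "loop-unroll": 1.10}),
    ([0.0, 1.0], {"inline,sroa": 1.40}),
]

# Assumption: per-feature weights standing in for a learned metric.
WEIGHTS = [1.0, 0.5]


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def recommend_passes(target, training, weights, k=2, top_n=2):
    """Rank pass subsequences by similarity-weighted mean speedup
    over the k training programs nearest to the target."""
    neighbors = sorted(
        training, key=lambda item: weighted_distance(target, item[0], weights)
    )[:k]
    scores = {}
    for features, speedups in neighbors:
        # Closer neighbors get a larger say in the ranking.
        sim = 1.0 / (1.0 + weighted_distance(target, features, weights))
        for name, speedup in speedups.items():
            total, norm = scores.get(name, (0.0, 0.0))
            scores[name] = (total + sim * speedup, norm + sim)
    ranked = sorted(
        scores, key=lambda n: scores[n][0] / scores[n][1], reverse=True
    )
    return ranked[:top_n]


if __name__ == "__main__":
    print(recommend_passes([1.0, 0.05], TRAINING, WEIGHTS))
```

In an iterative setting, the recommended subsequences would be compiled and measured on the target program, and the measurements fed back as new training data before the next round of recommendations.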


2021 ◽  
Vol 40 (3) ◽  
pp. 543-574
Author(s):  
Hameeza Ahmed ◽  
Muhammad Ali Ismail

2020 ◽  
Author(s):  
Kyriakos Georgiou ◽  
Zbigniew Chamski ◽  
Andres Amaya Garcia ◽  
David May ◽  
Kerstin Eder

Existing iterative compilation and machine learning-based optimization techniques have proven very successful in achieving better optimizations than the standard optimization levels of a compiler. However, they were not engineered to support the tuning of a compiler's optimizer as part of the compiler's daily development cycle. In this paper, we first establish the properties that a technique must exhibit to enable such tuning. We then introduce an enhancement to the classic nightly routine testing of compilers, which exhibits all the required properties and is thus capable of driving the improvement and tuning of the compiler's common optimizer. This is achieved by leveraging resource usage and compilation information collected while systematically exploiting prefixes of the transformations applied at standard optimization levels. Experimental evaluation using the LLVM v6.0.1 compiler demonstrated that the new approach was able to reveal hidden cross-architecture and architecture-dependent potential optimizations on two popular processors: the Intel i5-6300U and the Arm Cortex-A53-based Broadcom BCM2837 used in the Raspberry Pi 3B+. As a case study, we demonstrate how the insights from our approach enabled us to identify and remove a significant shortcoming of the CFG simplification pass of the LLVM v6.0.1 compiler.
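
The idea of exploring prefixes of a standard pass sequence can be sketched as follows. This is only an illustration under stated assumptions, not the paper's tooling: the short pass list is an arbitrary example rather than an actual LLVM pipeline, and `toy_cost` is a hypothetical stand-in for compiling with a prefix and measuring resource usage.

```python
def prefixes(passes):
    """Return every prefix of the pass sequence, from empty to full."""
    return [passes[:i] for i in range(len(passes) + 1)]


def best_prefix(passes, measure):
    """Evaluate each prefix with `measure` and return (cost, prefix)
    for the cheapest one; ties go to the shortest prefix."""
    results = [(measure(prefix), prefix) for prefix in prefixes(passes)]
    return min(results, key=lambda r: r[0])


# Arbitrary example sequence; not a real optimization-level pipeline.
EXAMPLE_PASSES = ["simplifycfg", "sroa", "instcombine", "licm"]

# Made-up costs indexed by prefix length, standing in for real measurements.
TOY_COSTS = [100, 90, 85, 88, 87]


def toy_cost(prefix):
    """Hypothetical measurement: look up a placeholder cost by prefix length."""
    return TOY_COSTS[len(prefix)]


if __name__ == "__main__":
    cost, prefix = best_prefix(EXAMPLE_PASSES, toy_cost)
    print(cost, prefix)
```

A prefix that beats the full sequence, as in this toy data, is the kind of signal that would point to a later pass degrading the result on that target.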


2013 ◽  
Vol 433-435 ◽  
pp. 1410-1414
Author(s):  
Qi Shen Zhu

GCC is an auto-vectorizing compiler that exploits data parallelism across loop iterations. Tuning GCC optimization flags for auto-vectorization is a popular way to speed up programs. However, GCC offers many options, and selecting the best combination of them to improve program performance through vectorization is non-trivial because the search space is very large. In this work, we focus on the selection of compiler transformations to auto-vectorize loops with conditional statements. The selection is based on the correlation between program features and speed-up, on analysis of the generated code, and on a small number of iterative compilation passes. Our preliminary experimental results show that the proposed technique attains performance improvements of up to about 6x on loops from the TSVC benchmark suite on a state-of-the-art Intel Core i3 processor.
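
A budgeted iterative compilation loop over GCC flag combinations might look like the sketch below. It is not the method from the paper: the search here is a plain enumeration capped by a trial budget, and `toy_run_time` is a hypothetical stand-in, with made-up timings, for compiling a loop with the given flags and timing the result.

```python
from itertools import combinations

# Three GCC options relevant to vectorizing loops with conditionals.
FLAGS = ["-ftree-vectorize", "-ftree-loop-if-convert", "-funroll-loops"]


def flag_subsets(flags):
    """Yield every subset of the flags, smallest subsets first."""
    for size in range(len(flags) + 1):
        for subset in combinations(flags, size):
            yield list(subset)


def iterative_search(flags, run_time, budget):
    """Try at most `budget` flag subsets and keep the fastest one."""
    best_flags, best_time = None, float("inf")
    for trial, subset in enumerate(flag_subsets(flags)):
        if trial >= budget:
            break
        elapsed = run_time(subset)
        if elapsed < best_time:
            best_flags, best_time = subset, elapsed
    return best_flags, best_time


def toy_run_time(subset):
    """Hypothetical timing model with placeholder numbers: vectorization
    helps, and helps more when if-conversion is also enabled."""
    if "-ftree-vectorize" in subset and "-ftree-loop-if-convert" in subset:
        return 3.0
    if "-ftree-vectorize" in subset:
        return 7.0
    return 10.0


if __name__ == "__main__":
    print(iterative_search(FLAGS, toy_run_time, budget=8))
```

With a real compiler, `run_time` would build and time the benchmark for each subset, and the budget would bound how many compile-and-run iterations are spent per loop.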

