Profile inference revisited

2022 ◽  
Vol 6 (POPL) ◽  
pp. 1-24
Author(s):  
Wenlei He ◽  
Julián Mestre ◽  
Sergey Pupyrev ◽  
Lei Wang ◽  
Hongtao Yu

Profile-guided optimization (PGO) is an important component in modern compilers. By allowing the compiler to leverage the program's dynamic behavior, it can often generate substantially faster binaries. Sampling-based profiling is the state-of-the-art technique for collecting execution profiles in data-center environments. However, the reduced profile accuracy caused by sampling a fully optimized binary often diminishes the benefits of PGO; thus, an important problem is to overcome the inaccuracy in a profile after it is collected. In this paper we tackle this problem, which is also known as profile inference and profile rectification. We investigate the classical approach to profile inference, based on computing minimum-cost maximum flows in a control-flow graph, and develop an extended model capturing the desired properties of real-world profiles. Next we provide a solid theoretical foundation for the corresponding optimization problem by studying its algorithmic aspects. We then describe a new efficient algorithm for the problem along with its implementation in an open-source compiler. An extensive evaluation of the algorithm and existing profile inference techniques on a variety of applications, including Facebook production workloads and SPEC CPU benchmarks, indicates that the new method outperforms its competitors by significantly improving the accuracy of profile data and the performance of generated binaries.
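To make the classical formulation concrete, the following is a minimal sketch of profile inference as a flow-conservation problem, solved here as a linear program rather than an explicit min-cost flow (the toy CFG, observed counts, and variable names are illustrative assumptions, not the paper's algorithm):

```python
# A minimal sketch (not the paper's method): recover edge counts consistent
# with flow conservation while minimizing total deviation from noisy
# basic-block sample counts. The diamond CFG below is a toy assumption.
import numpy as np
from scipy.optimize import linprog

edges = [("entry", "a"), ("entry", "b"), ("a", "exit"), ("b", "exit")]
observed = {"entry": 100, "a": 65, "b": 30, "exit": 100}  # noisy samples
internal = ["a", "b"]  # nodes requiring inflow == outflow

E = len(edges)
obs_nodes = list(observed)
n_slack = 2 * len(obs_nodes)  # d+ / d- slacks encode |count - observed|
c = np.concatenate([np.zeros(E), np.ones(n_slack)])  # minimize deviations

A_eq, b_eq = [], []
# Flow conservation at internal nodes.
for v in internal:
    row = np.zeros(E + n_slack)
    for i, (u, w) in enumerate(edges):
        if w == v: row[i] += 1
        if u == v: row[i] -= 1
    A_eq.append(row); b_eq.append(0)
# inflow(v) + d+ - d- == observed(v); the entry node uses outflow instead.
for j, v in enumerate(obs_nodes):
    row = np.zeros(E + n_slack)
    for i, (u, w) in enumerate(edges):
        if w == v or (v == "entry" and u == v):
            row[i] += 1
    row[E + 2 * j] = 1; row[E + 2 * j + 1] = -1
    A_eq.append(row); b_eq.append(observed[v])

res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq), method="highs")
print(dict(zip(edges, np.round(res.x[:E]))))  # inferred edge counts
```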

2021 ◽  
Vol 7 ◽  
pp. e377
Author(s):  
Hamid Ali ◽  
Muhammad Zaid Rafique ◽  
Muhammad Shahzad Sarfraz ◽  
Muhammad Sheraz Arshad Malik ◽  
Mohammed A. Alqahtani ◽  
...  

Real-world optimization problems are becoming more and more complex due to the involvement of interdependencies. These complex problems need more advanced optimization techniques. The Traveling Thief Problem (TTP) is an optimization problem that combines two well-known NP-hard problems: the 0/1 knapsack problem and the traveling salesman problem. In TTP, a thief plans a tour to collect multiple items that fill his knapsack, aiming for maximum profit at minimum cost within a standard time interval of 600 s. This paper proposes an efficient technique to solve TTP by rearranging the steps of the knapsack procedure. Initially, the picking strategy starts randomly, and then a traversal plan is generated through the Lin-Kernighan heuristic. This traversal is then improved by eliminating the insignificant cities that contribute adversely to profit, applying a modified simulated annealing technique. The proposed technique shows promising results on different instances compared to other state-of-the-art algorithms: it outperforms them on small and medium-sized instances, and competitive results are obtained on relatively larger instances.
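A minimal sketch of the city-elimination annealing step described above (a generic simulated annealing skeleton with an assumed objective and neighborhood, not the authors' exact implementation):

```python
# Try dropping a city from the tour; accept worse tours with probability
# exp(delta / T), cooling T geometrically. Objective: higher is better.
import math, random

def anneal(tour, objective, t0=1.0, cooling=0.95, steps=1000):
    best = cur = list(tour)
    best_val = cur_val = objective(cur)
    t = t0
    for _ in range(steps):
        if len(cur) <= 2:
            break
        cand = list(cur)
        cand.pop(random.randrange(1, len(cand)))  # never drop the start city
        cand_val = objective(cand)
        delta = cand_val - cur_val
        if delta >= 0 or random.random() < math.exp(delta / t):
            cur, cur_val = cand, cand_val
            if cur_val > best_val:
                best, best_val = cur, cur_val
        t *= cooling
    return best, best_val

# Toy objective: profit of kept cities minus a per-city travel cost.
profit = {0: 0, 1: 30, 2: 5, 3: 40}
obj = lambda tour: sum(profit[c] for c in tour) - 10 * len(tour)
print(anneal([0, 1, 2, 3], obj))  # city 2 is worth less than visiting it costs
```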


2015 ◽  
Vol 3 (2) ◽  
Author(s):  
Nazimuddin Ahmed ◽  
Shobha Duttadeka

In this paper we consider a version of the maximum attainable flow (capacity) at minimum cost (distance) problem in a network with restricted constraints (Ahmed, Das, & Purusotham, 2012b). By restricted constraints we mean that links (cities, stations, or nodes) must be completed (or visited) in a prescribed order: a particular link is to be preceded by another link, though the precedence relation need not be immediate. The aim is to obtain an optimal route for the more realistic situation of scheduling such restricted constraints while sending a maximum flow at minimum cost from a source to a destination. The distance (cost) and arc capacity between any two stations are given.
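A minimal sketch of the underlying maximum-flow/minimum-cost computation using networkx (the precedence restrictions studied in the paper are not modeled here, and the toy network is an assumption):

```python
# Compute a maximum s-t flow of minimum total cost; 'weight' is the
# per-unit distance/cost and 'capacity' the arc capacity.
import networkx as nx

G = nx.DiGraph()
G.add_edge("s", "a", capacity=4, weight=2)
G.add_edge("s", "b", capacity=3, weight=3)
G.add_edge("a", "b", capacity=2, weight=1)
G.add_edge("a", "t", capacity=3, weight=1)
G.add_edge("b", "t", capacity=4, weight=2)

flow = nx.max_flow_min_cost(G, "s", "t")
print(flow, nx.cost_of_flow(G, flow))
```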


2021 ◽  
Vol 50 (1) ◽  
pp. 33-40
Author(s):  
Chenhao Ma ◽  
Yixiang Fang ◽  
Reynold Cheng ◽  
Laks V.S. Lakshmanan ◽  
Wenjie Zhang ◽  
...  

Given a directed graph G, the directed densest subgraph (DDS) problem refers to finding a subgraph of G whose density is the highest among all subgraphs of G. The DDS problem is fundamental to a wide range of applications, such as fraud detection, community mining, and graph compression. However, existing DDS solutions suffer from efficiency and scalability problems: on a three-thousand-edge graph, it takes three days for one of the best exact algorithms to complete. In this paper, we develop an efficient and scalable DDS solution. We introduce the notion of the [x, y]-core, a dense subgraph of G, and show that the densest subgraph can be accurately located through the [x, y]-core with theoretical guarantees. Based on the [x, y]-core, we develop both exact and approximation algorithms. We have performed an extensive evaluation of our approaches on eight real large datasets. The results show that our proposed solutions are up to six orders of magnitude faster than the state-of-the-art.
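For intuition, the following is a brute-force sketch of the DDS objective on a toy digraph, using the standard directed density |E(S, T)| / sqrt(|S|·|T|) from the DDS literature (an assumption, since the abstract does not spell out the measure; this enumeration is exponential and only illustrates the definition, not the paper's [x, y]-core algorithms):

```python
# Enumerate all vertex-set pairs (S, T) of a tiny digraph and report the
# pair maximizing |E(S, T)| / sqrt(|S| * |T|).
from itertools import combinations
from math import sqrt

edges = {(0, 1), (1, 2), (2, 0), (0, 2), (3, 0)}
nodes = sorted({u for e in edges for u in e})

def subsets(xs):
    for r in range(1, len(xs) + 1):
        yield from combinations(xs, r)

best = max(
    ((sum(1 for u, v in edges if u in S and v in T) / sqrt(len(S) * len(T)), S, T)
     for S in subsets(nodes) for T in subsets(nodes)),
    key=lambda x: x[0],
)
print("density %.3f, S=%s, T=%s" % best)
```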


2021 ◽  
Vol 11 (3) ◽  
pp. 1093
Author(s):  
Jeonghyun Lee ◽  
Sangkyun Lee

Convolutional neural networks (CNNs) have achieved tremendous success in solving complex classification problems. Motivated by this success, various compression methods have been proposed for downsizing CNNs to deploy them on resource-constrained embedded systems. However, a new type of vulnerability of compressed CNNs, known as adversarial examples, has recently been discovered, which is critical for security-sensitive systems because adversarial examples can cause malfunction of CNNs and can in many cases be crafted easily. In this paper, we propose a compression framework to produce compressed CNNs robust against such adversarial examples. To achieve this goal, our framework uses both pruning and knowledge distillation with adversarial training. We formulate our framework as an optimization problem and provide a solution algorithm based on the proximal gradient method, which is more memory-efficient than the popular ADMM-based compression approaches. In experiments, we show that our framework can improve the trade-off between adversarial robustness and compression rate compared to the existing state-of-the-art adversarial pruning approach.
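A minimal sketch of a proximal gradient update for pruning-style sparsity (illustrative only; the paper's full objective also includes the distillation and adversarial terms). With an L1 penalty, the proximal operator is soft-thresholding:

```python
# One proximal gradient step: gradient descent on the smooth loss, then
# the prox of the L1 penalty, which drives small weights exactly to zero.
import numpy as np

def prox_l1(w, thresh):
    """Soft-thresholding: proximal operator of thresh * ||w||_1."""
    return np.sign(w) * np.maximum(np.abs(w) - thresh, 0.0)

def proximal_gradient_step(w, grad, lr=0.1, lam=0.05):
    return prox_l1(w - lr * grad, lr * lam)

w = np.array([0.8, -0.02, 0.3, 0.001])
g = np.array([0.1, 0.0, -0.2, 0.0])
print(proximal_gradient_step(w, g))  # the tiny weight becomes exactly 0
```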


1994 ◽  
Vol 04 (03) ◽  
pp. 271-280 ◽  
Author(s):  
Florin Balasa ◽  
Frank H.M. Franssen ◽  
Francky V.M. Catthoor ◽  
Hugo J. De Man

For multi-dimensional (M-D) signal and data processing systems, transformation of algorithmic specifications is a major instrument both in code optimization and code generation for parallelizing compilers, and in control-flow optimization as a preprocessor for architecture synthesis. State-of-the-art transformation techniques are limited to affine index expressions. This is, however, not sufficient for many important applications in image, speech, and numerical processing. In this paper, a novel transformation method is introduced, oriented to the subclass of algorithm specifications that contain modulo expressions of affine functions to index M-D signals. The method makes extensive use of the Hermite normal form. The transformation can be carried out in polynomial time, using only integer arithmetic.
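A minimal sketch of the Hermite normal form computation that the method builds on, using SymPy (the transformation method itself is not reproduced here; the integer matrix is a toy stand-in for the linear part of an affine index expression, and a recent SymPy version is assumed):

```python
# Hermite normal form of an integer matrix over the integers.
from sympy import Matrix
from sympy.matrices.normalforms import hermite_normal_form

A = Matrix([[2, 4, 4],
            [-6, 6, 12],
            [10, 4, 16]])
print(hermite_normal_form(A))  # integer arithmetic only, polynomial time
```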


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Asmaa El Hannani ◽  
Rahhal Errattahi ◽  
Fatima Zahra Salmam ◽  
Thomas Hain ◽  
Hassan Ouahmane

Speech-based human-machine interaction and natural language understanding applications have seen rapid development and wide adoption over the last few decades. This has led to a proliferation of studies that investigate error detection and classification in Automatic Speech Recognition (ASR) systems. However, different data sets and evaluation protocols are used, making direct comparisons of the proposed approaches (e.g. features and models) difficult. In this paper we perform an extensive evaluation of the effectiveness and efficiency of state-of-the-art approaches in a unified framework for both error detection and error type classification. We make three primary contributions throughout this paper: (1) we compare our Variant Recurrent Neural Network (V-RNN) model with three other state-of-the-art neural models, and show that the V-RNN model is the most effective classifier for ASR error detection in terms of accuracy and speed; (2) we compare four feature settings, corresponding to different categories of predictor features, and show that the generic features are particularly suitable for real-time ASR error detection applications; and (3) we examine the generalization ability of our error detection framework and perform a detailed post-detection analysis to understand which recognition errors are difficult to detect.
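For orientation, a minimal sketch of a recurrent ASR error-detection tagger (a generic bidirectional GRU baseline, not the authors' V-RNN; the feature dimension and labels are assumptions):

```python
# Tag each recognized word as correct vs. error from per-word features
# (e.g. confidence scores plus lexical features).
import torch
import torch.nn as nn

class ErrorTagger(nn.Module):
    def __init__(self, feat_dim=32, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, 2)  # per-word binary logits

    def forward(self, x):          # x: (batch, words, feat_dim)
        h, _ = self.rnn(x)
        return self.out(h)

model = ErrorTagger()
feats = torch.randn(4, 10, 32)     # 4 utterances, 10 words each
print(model(feats).shape)          # torch.Size([4, 10, 2])
```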


2021 ◽  
Vol 14 (11) ◽  
pp. 2445-2458
Author(s):  
Valerio Cetorelli ◽  
Paolo Atzeni ◽  
Valter Crescenzi ◽  
Franco Milicchio

We introduce landmark grammars, a new family of context-free grammars aimed at describing the HTML source code of pages published by large, templated websites, and therefore at effectively tackling Web data extraction problems. Indeed, they address the inherent ambiguity of HTML, one of the main challenges of Web data extraction, which, despite over twenty years of research, has been largely neglected by the approaches presented in the literature. We then formalize the Smallest Extraction Problem (SEP), an optimization problem for finding the grammar of a family that best describes a set of pages while contextually extracting their data. Finally, we present an unsupervised learning algorithm to induce a landmark grammar from a set of pages sharing a common HTML template, and we present an automatic Web data extraction system. Experiments on consolidated benchmarks show that the approach can substantially improve on the state-of-the-art.
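To convey the template-versus-data intuition only, here is a naive token-diff sketch over pages sharing one HTML template (purely illustrative; landmark grammars are proper context-free grammars, and this stand-in assumes aligned, equal-length token streams):

```python
# Tokens identical across all pages act as template "landmarks";
# the remaining positions are treated as extracted data.
import re

def tokens(html):
    return re.findall(r"<[^>]+>|[^<\s][^<]*", html)

def extract(pages):
    rows = [tokens(p) for p in pages]                    # assumes equal lengths
    fixed = [len(set(col)) == 1 for col in zip(*rows)]   # landmark positions
    return [[tok for tok, f in zip(row, fixed) if not f] for row in rows]

pages = ["<li>Alice<b>30</b></li>", "<li>Bob<b>25</b></li>"]
print(extract(pages))  # [['Alice', '30'], ['Bob', '25']]
```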


2021 ◽  
Vol 12 (4) ◽  
pp. 98-116
Author(s):  
Noureddine Boukhari ◽  
Fatima Debbat ◽  
Nicolas Monmarché ◽  
Mohamed Slimane

Evolution strategies (ES) are a family of powerful stochastic methods for global optimization and have proved more capable of avoiding local optima than many other optimization methods. Many researchers have investigated different versions of the original evolution strategy, with good results on a variety of optimization problems. However, the convergence of the algorithm to the global optimum remains only asymptotic. In order to accelerate convergence, a hybrid approach is proposed using the nonlinear simplex method (Nelder-Mead) and an adaptive scheme to control the application of the local search, and the authors demonstrate that this combination yields significantly better convergence. The proposed method has been tested on 15 complex benchmark functions, applied to the bi-objective portfolio optimization problem, and compared with other state-of-the-art techniques. Experimental results show that this hybridization improves performance in terms of solution quality and convergence speed.
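A minimal sketch of the hybridization idea: a simple (1+λ)-ES whose incumbent is periodically refined by Nelder-Mead via scipy (the paper's adaptive control scheme is reduced here to a fixed refinement period, an assumption):

```python
# (1+lambda)-ES with Gaussian mutation; every `local_every` generations the
# incumbent is polished with the Nelder-Mead simplex method.
import numpy as np
from scipy.optimize import minimize

def hybrid_es(f, x0, sigma=0.5, lam=8, gens=100, local_every=10):
    x, fx = np.asarray(x0, float), f(x0)
    rng = np.random.default_rng(0)
    for g in range(1, gens + 1):
        kids = x + sigma * rng.standard_normal((lam, x.size))
        vals = np.apply_along_axis(f, 1, kids)
        i = vals.argmin()
        if vals[i] < fx:
            x, fx = kids[i], vals[i]
        if g % local_every == 0:                 # local refinement
            res = minimize(f, x, method="Nelder-Mead")
            if res.fun < fx:
                x, fx = res.x, res.fun
        sigma *= 0.99                            # simple step-size decay
    return x, fx

sphere = lambda v: float(np.sum(np.square(v)))
print(hybrid_es(sphere, [3.0, -2.0]))
```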


Author(s):  
Guiying Li ◽  
Chao Qian ◽  
Chunhui Jiang ◽  
Xiaofen Lu ◽  
Ke Tang

Layer-wise magnitude-based pruning (LMP) is a very popular method for deep neural network (DNN) compression. However, tuning the layer-specific thresholds is a difficult task, since the space of threshold candidates is exponentially large and evaluation is very expensive. Previous methods are mainly manual and require expertise. In this paper, we propose an automatic tuning approach based on optimization, named OLMP. The idea is to transform the threshold tuning problem into a constrained optimization problem (i.e., minimizing the size of the pruned model subject to a constraint on the accuracy loss), and then use powerful derivative-free optimization algorithms to solve it. To compress a trained DNN, OLMP is conducted within a new iterative pruning and adjusting pipeline. Empirical results show that OLMP can achieve the best pruning ratio on LeNet-style models (i.e., 114 times for LeNet-300-100 and 298 times for LeNet-5) compared with some state-of-the-art DNN pruning methods, and can reduce the size of an AlexNet-style network up to 82 times without accuracy loss.
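A minimal sketch of the layer-wise magnitude pruning step itself, i.e., the operation whose thresholds OLMP tunes (the threshold search is omitted, and the per-layer thresholds below are assumptions):

```python
# Zero out weights whose magnitude falls below each layer's threshold.
import numpy as np

def lmp(layers, thresholds):
    return [np.where(np.abs(w) < t, 0.0, w) for w, t in zip(layers, thresholds)]

rng = np.random.default_rng(0)
layers = [rng.standard_normal((4, 4)), rng.standard_normal((4, 2))]
pruned = lmp(layers, thresholds=[0.5, 0.8])
for w in pruned:
    print("sparsity: %.2f" % np.mean(w == 0))
```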


2020 ◽  
Vol 34 (06) ◽  
pp. 10044-10052 ◽  
Author(s):  
Syrine Belakaria ◽  
Aryan Deshwal ◽  
Nitthilan Kannappan Jayakodi ◽  
Janardhan Rao Doppa

We consider the problem of multi-objective (MO) black-box optimization using expensive function evaluations, where the goal is to approximate the true Pareto set of solutions while minimizing the number of function evaluations. For example, in hardware design optimization, we need to find designs that trade off performance, energy, and area overhead using expensive simulations. We propose a novel uncertainty-aware search framework, referred to as USeMO, to efficiently select the sequence of inputs for evaluation to solve this problem. The selection method of USeMO consists of solving a cheap MO optimization problem via surrogate models of the true functions to identify the most promising candidates, and picking the best candidate based on a measure of uncertainty. We also provide a theoretical analysis to characterize the efficacy of our approach. Our experiments on several synthetic and six diverse real-world benchmark problems show that USeMO consistently outperforms the state-of-the-art algorithms.
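A minimal sketch of the selection idea: fit a cheap surrogate to each objective and evaluate next the candidate with the largest predictive uncertainty (a simplification of USeMO; the MO solver over surrogates is replaced here by random candidates, and the toy objectives are assumptions):

```python
# Gaussian-process surrogates for two objectives; pick the candidate with
# the largest summed predictive standard deviation for the next evaluation.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, (10, 1))                  # designs evaluated so far
f1, f2 = np.sin(X).ravel(), (X ** 2).ravel()     # two "expensive" objectives
gps = [GaussianProcessRegressor().fit(X, y) for y in (f1, f2)]

cands = rng.uniform(-2, 2, (200, 1))             # stand-in for surrogate Pareto set
stds = sum(gp.predict(cands, return_std=True)[1] for gp in gps)
x_next = cands[np.argmax(stds)]                  # most uncertain candidate
print("next evaluation at", x_next)
```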

