Study of the vectorization efficiency of loop nests with an irregular number of iterations

2019 ◽  
Vol 10 (4) ◽  
pp. 77-96
Author(s):  
Алексей Анатольевич Рыбаков ◽  
Сергей Сергеевич Шумилин

Vectorization of computations is an important low-level optimization used to create highly efficient parallel code. The features of the AVX-512 instruction set make it possible to apply vectorization to complex program contexts, in particular to loop nests and to loops with heavily branched control flow. When vector instructions are used for a context whose execution profile is unknown, there is a risk that vectorization will be inefficient. This is especially pronounced when vectorizing loop nests in which the inner loop has an irregular number of iterations. The article considers a practical approach to vectorizing loop nests based on a predicated representation of the program. As an example, it presents an implementation of Shell sort, whose compact form consists of a loop nest in which the number of inner-loop iterations is irregular and depends on the iteration indices of the outer loops. Such a context is extremely inconvenient for vectorization. The theoretical and practical efficiency of vectorizing Shell sort are compared, the features of this program context are examined, and their negative impact on the performance of the vectorized code is explained. The results can be used by researchers and software developers to identify the causes of low vectorization efficiency in program code with similar features.
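To make the "irregular number of inner-loop iterations" concrete, here is a minimal scalar sketch of Shell sort that also records how many times the innermost loop runs for each element. It is not the authors' implementation: the function name `shell_sort_with_counts` and the gap sequence (halving from n/2) are assumptions chosen for illustration, and no AVX-512 code is shown.

```python
def shell_sort_with_counts(values):
    """Sort a copy of `values` with Shell sort.

    Returns the sorted list and the trip count of the innermost loop
    for every (gap, i) pair, to show how irregular those counts are.
    Assumption: gaps are n//2, n//4, ..., 1 (the abstract does not
    say which gap sequence the paper uses).
    """
    a = list(values)
    n = len(a)
    trip_counts = []
    gap = n // 2
    while gap > 0:                      # outer loop over gaps
        for i in range(gap, n):         # middle loop over elements
            item = a[i]
            j = i
            trips = 0
            # Innermost loop: its trip count depends on the data and
            # on the outer indices (gap, i), so it is not known ahead
            # of time.
            while j >= gap and a[j - gap] > item:
                a[j] = a[j - gap]
                j -= gap
                trips += 1
            a[j] = item
            trip_counts.append(trips)
        gap //= 2
    return a, trip_counts


sorted_values, counts = shell_sort_with_counts([5, 2, 9, 1, 5, 6])
```

If several iterations of the middle loop were packed into one vector, every lane would have to wait, masked off, until the lane with the largest trip count finished. That is one plausible reading of why irregular counts hurt vector utilization.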

2002 ◽  
Vol 14 (6) ◽  
pp. 1267-1281 ◽  
Author(s):  
Shuo-Peng Liao ◽  
Hsuan-Tien Lin ◽  
Chih-Jen Lin

The dual formulation of support vector regression involves two closely related sets of variables. When the decomposition method is used, many existing approaches use pairs of indices from these two sets as the working set. Basically, they select a base set first and then expand it so all indices are pairs. This makes the implementation different from that for support vector classification. In addition, a larger optimization subproblem has to be solved in each iteration. We provide theoretical proofs and conduct experiments to show that using the base set as the working set leads to similar convergence (number of iterations). Therefore, by using a smaller working set while keeping a similar number of iterations, the program can be simpler and more efficient.


Author(s):  
Kumudha Narasimhan ◽  
Aravind Acharya ◽  
Abhinav Baid ◽  
Uday Bondhugula

Entropy ◽  
2021 ◽  
Vol 23 (4) ◽  
pp. 465
Author(s):  
Agnieszka Prusińska ◽  
Krzysztof Szkatuła ◽  
Alexey Tret’yakov

This paper proposes a method for solving optimisation problems involving piecewise quadratic functions. The method provides a solution in a finite number of iterations, and its computational complexity is locally polynomial in the problem dimension, i.e., polynomial if the initial point belongs to a sufficiently small neighbourhood of the solution set. The proposed method could be applied to solving large systems of linear inequalities.


Mathematics ◽  
2021 ◽  
Vol 9 (11) ◽  
pp. 1306
Author(s):  
Elsayed Badr ◽  
Sultan Almotairi ◽  
Abdallah El Ghamry

In this paper, we propose a novel blended algorithm that has the advantages of the trisection method and the false position method. Numerical results indicate that the proposed algorithm outperforms the secant, the trisection, the Newton–Raphson, the bisection and the regula falsi methods, as well as the hybrid of the last two methods proposed by Sabharwal, with regard to the number of iterations and the average running time.
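The abstract does not say how the two methods are blended, so the following is only a generic sketch of one way to combine a trisection step with a false-position step inside a single bracketing loop. It should not be read as the authors' algorithm; the function name `blended_root`, the order of the two steps, and the stopping rule are all assumptions.

```python
def blended_root(f, a, b, tol=1e-10, max_iter=100):
    """Find a root of f in [a, b], where f(a) and f(b) differ in sign.

    Each iteration (hypothetical scheme, not taken from the paper):
      1. trisection: keep the third of [a, b] that contains a sign change;
      2. false position: split that third at the secant-line root.
    Returns (approximate root, iterations used).
    """
    fa, fb = f(a), f(b)
    if fa == 0.0:
        return a, 0
    if fb == 0.0:
        return b, 0
    if (fa < 0) == (fb < 0):
        raise ValueError("f(a) and f(b) must have opposite signs")

    x = a
    for iteration in range(1, max_iter + 1):
        # Trisection step: test the two interior points left to right.
        third = (b - a) / 3.0
        for x in (a + third, b - third):
            fx = f(x)
            if fx == 0.0:
                return x, iteration
            if (fa < 0) != (fx < 0):    # sign change lies in [a, x]
                b, fb = x, fx
                break
            a, fa = x, fx               # otherwise move the left end up

        # False-position step on the reduced bracket.
        x = (a * fb - b * fa) / (fb - fa)
        fx = f(x)
        if abs(fx) < tol or (b - a) < tol:
            return x, iteration
        if (fa < 0) != (fx < 0):
            b, fb = x, fx
        else:
            a, fa = x, fx
    return x, max_iter


root, iterations = blended_root(lambda x: x * x - 2.0, 0.0, 2.0)
```

Because the trisection step shrinks the bracket to at most one third of its width on every pass, the loop is guaranteed to terminate, while the false-position step can speed things up when the function is close to linear near the root.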


Signals ◽  
2021 ◽  
Vol 2 (2) ◽  
pp. 159-173
Author(s):  
Simone Fontana ◽  
Domenico Giorgio Sorrenti

Probabilistic Point Clouds Registration (PPCR) is an algorithm that, in its multi-iteration version, outperformed state-of-the-art algorithms for local point cloud registration. However, its performance has been tested using a fixed, high number of iterations. To be of practical use, we think the algorithm should decide by itself when to stop: on the one hand to avoid an excessive number of iterations and wasted computational time, and on the other to avoid a sub-optimal registration. In this work, we compare different termination criteria on several datasets and prove that the chosen one produces very good results, comparable to those obtained using a very large number of iterations, while saving computational time.


4OR ◽  
2020 ◽  
Author(s):  
Michele Conforti ◽  
Marianna De Santis ◽  
Marco Di Summa ◽  
Francesco Rinaldi

We consider the integer points in a unimodular cone K ordered by a lexicographic rule defined by a lattice basis. To each integer point x in K we associate a family of inequalities (lex-inequalities) that define the convex hull of the integer points in K that are not lexicographically smaller than x. The family of lex-inequalities contains the Chvátal–Gomory cuts, but does not contain and is not contained in the family of split cuts. This provides a finite cutting plane method to solve the integer program $\min \{cx : x \in S \cap \mathbb{Z}^n\}$, where $S \subset \mathbb{R}^n$ is a compact set and $c \in \mathbb{Z}^n$. We analyze the number of iterations of our algorithm.


2020 ◽  
Vol 11 (1) ◽  
pp. 177
Author(s):  
Pasi Fränti ◽  
Teemu Nenonen ◽  
Mingchuan Yuan

The travelling salesman problem (TSP) has been widely studied in its classical closed-loop variant, but less attention has been paid to the open-loop variant. An open-loop solution is also a spanning tree, although not necessarily the minimum spanning tree (MST). In this paper, we present a simple branch elimination algorithm that removes the branches from the MST by cutting one link and then reconnecting the resulting subtrees via selected leaf nodes. The number of iterations equals the number of branches (b) in the MST. Typically, b << n, where n is the number of nodes. With the O-Mopsi and Dots datasets, the algorithm reaches gaps of 1.69% and 0.61%, respectively. The algorithm is especially suitable for educational purposes because it shows the connection between MST and TSP, but it can also serve as a quick approximation for more complex metaheuristics whose efficiency relies on the quality of the initial solution.

