superlinear speedup
Recently Published Documents



2019 ◽  
Vol 9 (24) ◽  
pp. 5368 ◽  
Author(s):  
José Crispín Zavala-Díaz ◽  
Marco Antonio Cruz-Chávez ◽  
Jacqueline López-Calderón ◽  
José Alberto Hernández-Aguilar ◽  
Martha Elena Luna-Ortíz

This paper presents a process based on sets of parts, in which elements are fixed and removed to form different binary branch-and-bound (BB) trees; these trees are then used to build a parallel algorithm called “multi-BB”. Both the sequential and the parallel algorithm compute the exact solution of the 0–1 knapsack problem. The sequential algorithm solves instances published by other researchers (and those proposed by Pisinger) from the not-so-complex (uncorrelated) class and some problems of the medium-complex (weakly correlated) class. The parallel algorithm, run on a cluster of multicore processors, solves the weakly correlated problems that the sequential algorithm cannot solve. The multi-BB algorithms obtained parallel efficiencies of approximately 75%, and in some cases achieved a superlinear speedup.
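As a reading aid for the efficiency and speedup figures quoted above, the following minimal sketch uses the standard definitions: speedup S = T_1 / T_p and parallel efficiency E = S / p, with speedup called superlinear when S > p. The function names and the timings are hypothetical and chosen only for illustration; they are not measurements from the paper.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup S = T_1 / T_p for serial time T_1 and parallel time T_p."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, workers: int) -> float:
    """Parallel efficiency E = S / p for p workers."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return speedup(t_serial, t_parallel) / workers


def is_superlinear(t_serial: float, t_parallel: float, workers: int) -> bool:
    """Speedup is superlinear when S > p, i.e. when efficiency exceeds 1."""
    return speedup(t_serial, t_parallel) > workers


if __name__ == "__main__":
    # Hypothetical timings in seconds, not taken from the paper.
    print(efficiency(120.0, 20.0, workers=8))      # sublinear case
    print(is_superlinear(120.0, 12.0, workers=8))  # superlinear case
```

In branch-and-bound search, a superlinear result is usually attributed to one worker finding a good bound early, which lets all workers prune parts of the tree that the sequential run would have explored.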


2018 ◽  
Vol 35 (6) ◽  
pp. 2327-2348 ◽  
Author(s):  
Beichuan Yan ◽  
Richard Regueiro

Purpose: This paper compares the performance of O(n²) and O(n) neighbor search algorithms, studies their effects for different particle shape complexity and computational granularity (CG), and investigates their influence on the superlinear speedup of the 3D discrete element method (DEM) for complex-shaped particles. In particular, it aims to answer the question: which neighbor search algorithm, O(n²) or O(n), performs better in parallel 3D DEM computational practice?

Design/methodology/approach: The O(n²) and O(n) neighbor search algorithms are carefully implemented in the code paraEllip3d, which is executed on Department of Defense supercomputers across five orders of magnitude of simulation scale (2,500; 12,000; 150,000; 1 million and 10 million particles) to evaluate and compare the performance, using both strong and weak scaling measurements.

Findings: The more complex the particle shapes (from sphere to ellipsoid to poly-ellipsoid), the smaller the neighbor search fraction (NSF); and the lower the CG, the smaller the NSF. In both serial and parallel computing of complex-shaped 3D DEM, the O(n²) algorithm is inefficient at coarse CG; however, it executes faster than the O(n) algorithm at the fine CGs that are mostly used in computational practice to achieve the best performance. This means that the O(n²) algorithm generally outperforms the O(n) algorithm in parallel 3D DEM.

Practical implications: Taking for granted that O(n) unconditionally outperforms O(n²) in complex-shaped 3D DEM is a misconception commonly encountered in the computational engineering and science literature.

Originality/value: The paper clarifies that the performance of O(n²) and O(n) neighbor search algorithms for complex-shaped 3D DEM is affected by particle shape complexity and CG. In particular, the O(n²) algorithm generally outperforms the O(n) algorithm in large-scale parallel 3D DEM simulations, even though this result is counterintuitive.
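To make the two algorithm classes concrete, here is a minimal sketch contrasting an all-pairs O(n²) neighbor search with a cell-list search that is roughly O(n) at uniform density. It is not the paraEllip3d implementation: it assumes point-like particles with a single cutoff distance, whereas the paper treats complex-shaped particles, and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import product
import random

Point = tuple[float, float, float]


def _dist_sq(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def neighbors_brute_force(points: list[Point], cutoff: float) -> set[tuple[int, int]]:
    """O(n^2) search: test every pair of particles once."""
    cutoff_sq = cutoff * cutoff
    pairs = set()
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if _dist_sq(points[i], points[j]) <= cutoff_sq:
                pairs.add((i, j))
    return pairs


def neighbors_cell_list(points: list[Point], cutoff: float) -> set[tuple[int, int]]:
    """Cell-list search: bin particles into cubic cells of side `cutoff`,
    then compare each particle only with those in the 27 surrounding cells."""
    cutoff_sq = cutoff * cutoff
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        cells[tuple(int(c // cutoff) for c in p)].append(idx)

    offsets = list(product((-1, 0, 1), repeat=3))
    pairs = set()
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in offsets:
            # .get() avoids creating empty cells while iterating.
            others = cells.get((cx + dx, cy + dy, cz + dz))
            if not others:
                continue
            for i in members:
                for j in others:
                    if i < j and _dist_sq(points[i], points[j]) <= cutoff_sq:
                        pairs.add((i, j))
    return pairs


if __name__ == "__main__":
    random.seed(0)
    pts = [(random.random(), random.random(), random.random()) for _ in range(500)]
    # Both searches should report the same neighbor pairs.
    print(neighbors_brute_force(pts, 0.1) == neighbors_cell_list(pts, 0.1))
```

The cell list pays a fixed cost for binning and for visiting 27 cells per occupied cell. That overhead is one plausible reason an all-pairs search can win when each process holds only a few particles, which is consistent with the fine-granularity finding reported in the abstract.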


2013 ◽  
Vol 29 (3) ◽  
pp. 798-806 ◽  
Author(s):  
Michael Otte ◽  
Nikolaus Correll
