Near Optimal Work-Stealing Tree Scheduler for Highly Irregular Data-Parallel Workloads

Languages and Compilers for Parallel Computing - Lecture Notes in Computer Science ◽

10.1007/978-3-319-09967-5_4 ◽

2014 ◽

pp. 55-86 ◽

Cited By ~ 5

Author(s):

Aleksandar Prokopec ◽

Martin Odersky

Keyword(s):

Work Stealing ◽

Data Parallel ◽

Irregular Data

Efficient Lock-Free Work-Stealing Iterators for Data-Parallel Collections

2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing ◽

10.1109/pdp.2015.65 ◽

2015 ◽

Cited By ~ 7

Author(s):

Aleksandar Prokopec ◽

Dmitry Petrashko ◽

Martin Odersky

Keyword(s):

Work Stealing ◽

Data Parallel

PROGRAMMING MODELS, COMPILERS, AND ALGORITHMS FOR IRREGULAR DATA-PARALLEL COMPUTATIONS

International Journal of High Speed Computing ◽

10.1142/s012905339400010x ◽

1994 ◽

Vol 06 (02) ◽

pp. 183-222 ◽

Cited By ~ 2

Author(s):

SIDDHARTHA CHATTERJEE

Keyword(s):

Parallel Computations ◽

Programming Models ◽

Data Parallel ◽

Irregular Data

Rectilinear Partitioning of Irregular Data Parallel Computations

Journal of Parallel and Distributed Computing ◽

10.1006/jpdc.1994.1126 ◽

1994 ◽

Vol 23 (2) ◽

pp. 119-134 ◽

Cited By ~ 41

Author(s):

D.M. Nicol

Keyword(s):

Parallel Computations ◽

Data Parallel ◽

Irregular Data

Irregular data-parallel objects in C++

Vector and Parallel Processing — VECPAR'96 - Lecture Notes in Computer Science ◽

10.1007/3-540-62828-2_113 ◽

1997 ◽

pp. 65-80

Author(s):

Jean-Luc Dekeyser ◽

Boris Kokoszko ◽

Jean-Luc Levaire ◽

Philippe Marquet

Keyword(s):

Data Parallel ◽

Irregular Data

Data-parallel line relaxation method for the Navier-Stokes equations

AIAA Journal ◽

10.2514/3.14012 ◽

1998 ◽

Vol 36 ◽

pp. 1603-1609 ◽

Cited By ~ 3

Author(s):

Michael J. Wright ◽

Graham V. Candler ◽

Deepak Bose

Keyword(s):

Stokes Equations ◽

Relaxation Method ◽

Parallel Line ◽

Navier Stokes ◽

Navier Stokes Equations ◽

Data Parallel

Compute cache for data parallel acceleration

Proceedings of the 12th International Workshop on Network on Chip Architectures - NoCArc ◽

10.1145/3356045.3365385 ◽

2019 ◽

Author(s):

Reetu Das

Keyword(s):

Data Parallel ◽

Parallel Acceleration

Work-stealing with configurable scheduling strategies

ACM SIGPLAN Notices ◽

10.1145/2517327.2442562 ◽

2013 ◽

Vol 48 (8) ◽

pp. 315-316 ◽

Cited By ~ 1

Author(s):

Martin Wimmer ◽

Daniel Cederman ◽

Jesper Larsson Träff ◽

Philippas Tsigas

Keyword(s):

Work Stealing ◽

Scheduling Strategies

Multi-GPU approach to global induction of classification trees for large-scale data mining

Applied Intelligence ◽

10.1007/s10489-020-01952-5 ◽

2021 ◽

Author(s):

Krzysztof Jurczuk ◽

Marcin Czajkowski ◽

Marek Kretowski

Keyword(s):

Data Mining ◽

Large Scale ◽

Real Life ◽

Population Based ◽

Tree Structure ◽

Global Approach ◽

Data Parallel ◽

Large Scale Data ◽

The Impact ◽

Scale Data

Abstract: This paper concerns the evolutionary induction of decision trees (DT) for large-scale data. Such a global approach is one of the alternatives to the top-down inducers. It searches for the tree structure and tests simultaneously and thus gives improvements in the prediction and size of resulting classifiers in many situations. However, it is the population-based and iterative approach that can be too computationally demanding to apply for big data mining directly. The paper demonstrates that this barrier can be overcome by smart distributed/parallel processing. Moreover, we ask the question whether the global approach can truly compete with the greedy systems for large-scale data. For this purpose, we propose a novel multi-GPU approach. It incorporates the knowledge of global DT induction and evolutionary algorithm parallelization together with efficient utilization of memory and computing GPU’s resources. The searches for the tree structure and tests are performed simultaneously on a CPU, while the fitness calculations are delegated to GPUs. Data-parallel decomposition strategy and CUDA framework are applied. Experimental validation is performed on both artificial and real-life datasets. In both cases, the obtained acceleration is very satisfactory. The solution is able to process even billions of instances in a few hours on a single workstation equipped with 4 GPUs. The impact of data characteristics (size and dimension) on convergence and speedup of the evolutionary search is also shown. When the number of GPUs grows, nearly linear scalability is observed, which suggests that data size boundaries for evolutionary DT mining are fading.
