Resource-Aware Device Allocation of Data-Parallel Applications on Heterogeneous Systems

Donghyeon Kim; Seokwon Kang; Junsu Lim; Sunwook Jung; Woosung Kim; Yongjun Park

doi:10.3390/electronics9111825

Electronics ◽

10.3390/electronics9111825 ◽

2020 ◽

Vol 9 (11) ◽

pp. 1825

Author(s):

Donghyeon Kim ◽

Seokwon Kang ◽

Junsu Lim ◽

Sunwook Jung ◽

Woosung Kim ◽

...

Keyword(s):

Heterogeneous Systems ◽

Parallel Applications ◽

Small Data ◽

Data Set ◽

Computing Device ◽

Parallel Application ◽

Multiple Gpus ◽

Data Parallel ◽

Multiple Data ◽

Resource Aware

As recent heterogeneous systems comprise multi-core CPUs and multiple GPUs, efficient allocation of multiple data-parallel applications has become a primary goal in achieving both maximum total performance and efficiency. However, efficiently orchestrating multiple applications is highly challenging because detailed runtime status, such as the expected remaining time and available memory size of each computing device, is hidden. To solve these problems, we propose a dynamic data-parallel application allocation framework called ADAMS. Evaluations show that our framework improves the average total device execution time by 1.85× over the round-robin policy in the non-shared-memory system with a small data set.
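The contrast the abstract draws between a round-robin policy and an allocation that uses each device's expected remaining time and memory size can be illustrated with a minimal sketch. This is not the ADAMS algorithm: the `Device` and `App` classes, the per-device runtime estimates, and the greedy earliest-finish rule are all hypothetical, chosen only to show why hidden runtime status matters.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    memory_mb: int  # total memory of the device (assumed known)


@dataclass
class App:
    name: str
    memory_mb: int  # memory the application needs
    runtime: dict   # hypothetical runtime estimate in seconds, per device name


def schedule(apps, devices, policy):
    """Place apps one by one; return (placement, makespan)."""
    busy_until = {d.name: 0.0 for d in devices}  # expected remaining time per device
    placement = {}
    for i, app in enumerate(apps):
        # Only devices with enough memory are candidates.
        fits = [d for d in devices if d.memory_mb >= app.memory_mb]
        if not fits:
            raise ValueError(f"{app.name} does not fit on any device")
        if policy == "round_robin":
            dev = fits[i % len(fits)]
        else:
            # Resource-aware: pick the device expected to finish this app earliest.
            dev = min(fits, key=lambda d: busy_until[d.name] + app.runtime[d.name])
        busy_until[dev.name] += app.runtime[dev.name]
        placement[app.name] = dev.name
    return placement, max(busy_until.values())


devices = [Device("cpu", 16000), Device("gpu0", 8000)]
apps = [
    App("A", 1000, {"cpu": 10, "gpu0": 2}),
    App("B", 1000, {"cpu": 8, "gpu0": 2}),
    App("C", 12000, {"cpu": 6, "gpu0": 1}),  # too large for gpu0
    App("D", 1000, {"cpu": 9, "gpu0": 3}),
]

rr_placement, rr_makespan = schedule(apps, devices, "round_robin")
aware_placement, aware_makespan = schedule(apps, devices, "resource_aware")
```

In this toy workload the resource-aware policy keeps the large application on the CPU and sends the rest to the GPU, which shortens the overall finish time relative to round-robin. The numbers are illustrative and unrelated to the 1.85× figure reported in the abstract.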

Static Mapping with Dynamic Switching of Multiple Data-Parallel Applications on Embedded Many-Core SoCs

IEICE Transactions on Information and Systems ◽

10.1587/transinf.2014edp7012 ◽

2014 ◽

Vol E97.D (11) ◽

pp. 2827-2834 ◽

Cited By ~ 2

Author(s):

Ittetsu TANIGUCHI ◽

Junya KAIDA ◽

Takuji HIEDA ◽

Yuko HARA-AZUMI ◽

Hiroyuki TOMIYAMA

Keyword(s):

Parallel Applications ◽

Data Parallel ◽

Dynamic Switching ◽

Multiple Data ◽

Many Core

Static Mapping of Multiple Data-Parallel Applications on Embedded Many-Core SoCs

IEICE Transactions on Information and Systems ◽

10.1587/transinf.e96.d.2268 ◽

2013 ◽

Vol E96.D (10) ◽

pp. 2268-2271

Author(s):

Junya KAIDA ◽

Yuko HARA-AZUMI ◽

Takuji HIEDA ◽

Ittetsu TANIGUCHI ◽

Hiroyuki TOMIYAMA ◽

...

Keyword(s):

Parallel Applications ◽

Data Parallel ◽

Multiple Data ◽

Many Core

Work Distribution of Data-Parallel Applications on Heterogeneous Systems

Lecture Notes in Computer Science - High Performance Computing ◽

10.1007/978-3-319-46079-6_6 ◽

2016 ◽

pp. 69-81

Author(s):

Suejb Memeti ◽

Sabri Pllana

Keyword(s):

Heterogeneous Systems ◽

Parallel Applications ◽

Data Parallel ◽

Work Distribution

Extending OpenMP for Task Parallelism

Parallel Processing Letters ◽

10.1142/s012962640300132x ◽

2003 ◽

Vol 13 (03) ◽

pp. 341-352 ◽

Cited By ~ 2

Author(s):

Ami Marowka

Keyword(s):

Parallel Programs ◽

Parallel Applications ◽

Task Graph ◽

Data Parallelism ◽

Task Parallelism ◽

Design Decision ◽

Parallel Application ◽

Additional Degree ◽

Data Parallel ◽

Precedence Relations

In a wide variety of scientific parallel applications, both task and data parallelism must be exploited to achieve the best possible performance on a multiprocessor machine. These applications induce task-graph parallelism with coarse-grain granularity. Nevertheless, using the available task-graph parallelism and combining it with data parallelism can increase the performance of parallel applications considerably since an additional degree of parallelism is exploited. The OpenMP standard supports data parallelism but does not support task-graph parallelism. In this paper we present an integration of task-graph parallelism in OpenMP by extending the parallel sections constructs to include task-index and precedence-relations matrix clauses. There are many ways in which task-graph parallelism can be supported in a programming environment. A fundamental design decision is whether the programmer has to write programs with explicit precedence relations, or if the responsibility of precedence relations generation is delegated to the compiler. One of the benefits provided by parallel programming models like OpenMP is that they liberate the programmer from dealing with the underlying details of communication and synchronization, which are cumbersome and error-prone tasks. If task-graph parallelism is to find acceptance, writing task-graph parallel programs must be no harder than writing data parallel programs, and therefore, in our design, precedence relations are described through simple programmer annotations, with implementation details handled by the system. This paper concludes with a description of several parallel application kernels that were developed to study the practical aspects of task-graph parallelism in OpenMP. The examples demonstrate that exploiting data and task parallelism in a single framework is the key to achieving good performance in a variety of applications.
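The idea of describing a task graph through a precedence-relations matrix can be sketched outside OpenMP. The example below is not the clause syntax proposed in the paper; it is a minimal Python illustration in which `precedence[i][j] == 1` is assumed to mean "task `i` must finish before task `j` starts", and the function name `run_task_graph` is hypothetical. Tasks whose predecessors have all completed are run together as one concurrent wave.

```python
from concurrent.futures import ThreadPoolExecutor


def run_task_graph(tasks, precedence):
    """Run callables in `tasks` while honouring a precedence matrix.

    Returns (waves, results): the groups of task indices that ran
    concurrently, and each task's return value.
    """
    n = len(tasks)
    # Number of unfinished predecessors for every task.
    remaining = [sum(precedence[i][j] for i in range(n)) for j in range(n)]
    results = [None] * n
    waves = []
    ready = [j for j in range(n) if remaining[j] == 0]
    with ThreadPoolExecutor() as pool:
        while ready:
            # Tasks in `ready` are mutually independent, so run them concurrently.
            outputs = list(pool.map(lambda j: tasks[j](), ready))
            for j, out in zip(ready, outputs):
                results[j] = out
            waves.append(list(ready))
            next_ready = []
            for i in ready:
                for j in range(n):
                    if precedence[i][j]:
                        remaining[j] -= 1
                        if remaining[j] == 0:
                            next_ready.append(j)
            ready = next_ready
    if sum(len(w) for w in waves) != n:
        raise ValueError("precedence matrix contains a cycle")
    return waves, results


# Diamond-shaped graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3.
tasks = [lambda: "load", lambda: "filter", lambda: "transform", lambda: "merge"]
precedence = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
waves, results = run_task_graph(tasks, precedence)
```

Here tasks 1 and 2 form the middle wave and can overlap, while each task body could itself contain a data-parallel loop, which is the combination of the two kinds of parallelism that the abstract argues for.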

Energy efficiency of load balancing for data-parallel applications in heterogeneous systems

The Journal of Supercomputing ◽

10.1007/s11227-016-1864-y ◽

2016 ◽

Vol 73 (1) ◽

pp. 330-342 ◽

Cited By ~ 10

Author(s):

Borja Pérez ◽

Esteban Stafford ◽

José Luis Bosque ◽

Ramón Beivide

Keyword(s):

Energy Efficiency ◽

Load Balancing ◽

Heterogeneous Systems ◽

Parallel Applications ◽

Data Parallel

Improving Performance of Data-Parallel Applications on CPU-GPU Heterogeneous Systems

10.23860/thesis-duarte-ronald-2013 ◽

2013 ◽

Author(s):

Ronald Duarte

Keyword(s):

Heterogeneous Systems ◽

Parallel Applications ◽

Data Parallel

Simplifying programming and load balancing of data parallel applications on heterogeneous systems

Proceedings of the 9th Annual Workshop on General Purpose Processing using Graphics Processing Unit - GPGPU '16 ◽

10.1145/2884045.2884051 ◽

2016 ◽

Cited By ~ 11

Author(s):

Borja Pérez ◽

José Luis Bosque ◽

Ramón Beivide

Keyword(s):

Load Balancing ◽

Heterogeneous Systems ◽

Parallel Applications ◽

Data Parallel

Maintenance of a Genetic Data Set on Multiple Data Base and Statistical Analysis Systems

Acta Obstetricia Et Gynecologica Scandinavica ◽

10.3109/00016348209156160 ◽

1982 ◽

Vol 61 (s109) ◽

pp. 34-34

Author(s):

Samuel J. Agronow ◽

Federico C. Mariona ◽

Frederick C. Koppitch ◽

Kazutoshi Mayeda

Keyword(s):

Statistical Analysis ◽

Data Base ◽

Genetic Data ◽

Data Set ◽

Multiple Data

The Lyapunov Exponent Variance of an Electronic Manufacturing Enterprise’s Daily Qualified Rate Time Series by Improved Small Data Sets Approach

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.197.271 ◽

2012 ◽

Vol 197 ◽

pp. 271-277

Author(s):

Zhu Ping Gong

Keyword(s):

Time Series ◽

Lyapunov Exponent ◽

Chaotic Time Series ◽

Quality System ◽

Quality Level ◽

Largest Lyapunov Exponent ◽

Small Data ◽

Data Set ◽

Electronic Manufacturing ◽

Small Data Set

The small data sets approach is used to estimate the largest Lyapunov exponent (LLE). First, the mean-period drawback of the small data sets approach was corrected. On this basis, the LLEs of the daily qualified rate time series of HZ, an electronic manufacturing enterprise, were estimated. All of the estimated LLEs were positive, which indicates that the time series is chaotic and that the corresponding production process is a chaotic process. The variance of the LLEs revealed the struggle between the divergent nature of the quality system and the quality control effort. The LLEs increased sharply as the quality level worsened, coinciding with the company shutdown. HZ’s daily qualified rate, a chaotic time series, shows that the quality system is predictable in the short run.
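For readers unfamiliar with the estimation procedure, the following is a minimal sketch of the commonly used small-data-sets style of LLE estimation (delay embedding, nearest neighbours outside a temporal window, and the slope of the averaged log-divergence curve). It does not reproduce the paper's improved handling of the mean period: the parameter `min_sep` simply stands in for that window, and the embedding settings and the logistic-map test series are assumptions made for illustration.

```python
import math


def logistic_series(n, x0=0.3, r=4.0):
    """Generate n samples of the logistic map, a standard chaotic test signal."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def largest_lyapunov(series, dim=2, delay=1, min_sep=10, steps=5):
    """Estimate the LLE (per sample) as the slope of the mean log divergence."""
    # Delay embedding of the scalar series.
    m = len(series) - (dim - 1) * delay
    vecs = [[series[i + k * delay] for k in range(dim)] for i in range(m)]
    usable = m - steps  # points that can be followed for `steps` samples

    # Nearest neighbour of each point, excluding temporally close points.
    pairs = []
    for i in range(usable):
        best_j, best_d = None, float("inf")
        for j in range(usable):
            if abs(i - j) <= min_sep:
                continue
            d = math.dist(vecs[i], vecs[j])
            if 0.0 < d < best_d:
                best_j, best_d = j, d
        if best_j is not None:
            pairs.append((i, best_j))

    # Average log distance between each pair after k samples.
    curve = []
    for k in range(steps + 1):
        logs = []
        for i, j in pairs:
            d = math.dist(vecs[i + k], vecs[j + k])
            if d > 0.0:
                logs.append(math.log(d))
        curve.append(sum(logs) / len(logs))

    # Least-squares slope of the divergence curve.
    ks = list(range(steps + 1))
    k_mean = sum(ks) / len(ks)
    c_mean = sum(curve) / len(curve)
    num = sum((k - k_mean) * (c - c_mean) for k, c in zip(ks, curve))
    den = sum((k - k_mean) ** 2 for k in ks)
    return num / den


lle = largest_lyapunov(logistic_series(400))
```

A positive slope indicates that nearby trajectories diverge exponentially on average, which is the criterion the abstract uses to call the series chaotic. The number of steps used for the fit should stay within the range where the divergence curve is still roughly linear.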

Classification of jujube defects in small data sets based on transfer learning

Neural Computing and Applications ◽

10.1007/s00521-021-05715-2 ◽

2021 ◽

Author(s):

Jianping Ju ◽

Hong Zheng ◽

Xiaohang Xu ◽

Zhongyuan Guo ◽

Zhaohui Zheng ◽

...

Keyword(s):

Transfer Learning ◽

Loss Function ◽

Training Model ◽

Parameter Distribution ◽

Test Accuracy ◽

Small Data ◽

Data Sets ◽

Data Set ◽

Small Data Sets

Although convolutional neural networks have achieved success in the field of image classification, there are still challenges in the field of agricultural product quality sorting such as machine vision-based jujube defects detection. The performance of jujube defect detection mainly depends on the feature extraction and the classifier used. Due to the diversity of the jujube materials and the variability of the testing environment, the traditional method of manually extracting the features often fails to meet the requirements of practical application. In this paper, a jujube sorting model in small data sets based on convolutional neural network and transfer learning is proposed to meet the actual demand of jujube defects detection. Firstly, the original images collected from the actual jujube sorting production line were pre-processed, and the data were augmented to establish a data set of five categories of jujube defects. The original CNN model is then improved by embedding the SE module and using the triplet loss function and the center loss function to replace the softmax loss function. Finally, the depth pre-training model on the ImageNet image data set was used to conduct training on the jujube defects data set, so that the parameters of the pre-training model could fit the parameter distribution of the jujube defects image, and the parameter distribution was transferred to the jujube defects data set to complete the transfer of the model and realize the detection and classification of the jujube defects. The classification results are visualized by heatmap through the analysis of classification accuracy and confusion matrix compared with the comparison models. The experimental results show that the SE-ResNet50-CL model optimizes the fine-grained classification problem of jujube defect recognition, and the test accuracy reaches 94.15%. The model has good stability and high recognition accuracy in complex environments.
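The two loss functions named in the abstract can be written down in a few lines. The sketch below uses the textbook forms of the center loss and the triplet loss on plain Python lists. It is not the paper's training code: the use of squared Euclidean distances, the margin value, and the function names are assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def center_loss(features, labels, centers):
    """Half the summed squared distance of each feature to its class center."""
    return 0.5 * sum(
        squared_distance(x, centers[y]) for x, y in zip(features, labels)
    )


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on (anchor-positive distance) - (anchor-negative distance) + margin."""
    return max(
        0.0,
        squared_distance(anchor, positive)
        - squared_distance(anchor, negative)
        + margin,
    )


# Two samples from two classes, with hypothetical class centers.
features = [[1.0, 0.0], [0.0, 2.0]]
labels = [0, 1]
centers = {0: [0.0, 0.0], 1: [0.0, 1.0]}
c_loss = center_loss(features, labels, centers)

# The negative is as close to the anchor as the positive, so the margin is violated.
t_loss = triplet_loss([0.0, 0.0], [1.0, 0.0], [0.0, 1.0], margin=1.0)
```

The center loss pulls features of the same class toward a shared center, and the triplet loss pushes samples of different classes apart by at least the margin, which is why such losses are often considered for fine-grained classes that a softmax loss alone separates poorly.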
