Resource-Aware Device Allocation of Data-Parallel Applications on Heterogeneous Systems

Electronics ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 1825
Author(s):  
Donghyeon Kim ◽  
Seokwon Kang ◽  
Junsu Lim ◽  
Sunwook Jung ◽  
Woosung Kim ◽  
...  

As recent heterogeneous systems comprise multi-core CPUs and multiple GPUs, efficient allocation of multiple data-parallel applications has become a primary goal to achieve both maximum total performance and efficiency. However, the efficient orchestration of multiple applications is highly challenging because detailed runtime status, such as the expected remaining time and available memory size of each computing device, is hidden. To solve these problems, we propose a dynamic data-parallel application allocation framework called ADAMS. Evaluations show that our framework improves the average total device execution time by 1.85× over the round-robin policy in the non-shared-memory system with a small data set.
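The contrast between a round-robin policy and an allocation that consults device status can be illustrated with a minimal sketch. This is not the ADAMS algorithm, which the abstract does not describe; the `Device` and `App` classes, the memory figures, and the runtime estimates are all hypothetical, and the "resource-aware" policy here is a simple greedy rule that assumes the expected remaining time and available memory of each device are known.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    free_mem: int            # available memory in MB (hypothetical unit)
    busy_until: float = 0.0  # expected time at which the device becomes idle


@dataclass
class App:
    name: str
    mem: int       # memory footprint in MB
    runtime: dict  # estimated runtime in seconds, keyed by device name


def round_robin(apps, devices):
    """Assign apps to devices in cyclic order, ignoring device status."""
    plan = {}
    for i, app in enumerate(apps):
        dev = devices[i % len(devices)]
        dev.busy_until += app.runtime[dev.name]
        plan[app.name] = dev.name
    return plan


def resource_aware(apps, devices):
    """Greedy sketch: pick the device with the earliest expected finish time
    among the devices whose memory can hold the app. Apps queued on one
    device run one after another, so memory is checked per app only."""
    plan = {}
    for app in apps:
        candidates = [d for d in devices if d.free_mem >= app.mem]
        if not candidates:
            raise RuntimeError(f"no device has enough memory for {app.name}")
        best = min(candidates, key=lambda d: d.busy_until + app.runtime[d.name])
        best.busy_until += app.runtime[best.name]
        plan[app.name] = best.name
    return plan


def makespan(devices):
    """Time at which the last device finishes its queue."""
    return max(d.busy_until for d in devices)


if __name__ == "__main__":
    apps = [
        App("a", 1000, {"cpu": 10.0, "gpu": 2.0}),
        App("b", 1000, {"cpu": 10.0, "gpu": 2.0}),
        App("c", 8000, {"cpu": 5.0, "gpu": 1.0}),  # too large for the GPU
    ]
    for policy in (round_robin, resource_aware):
        devices = [Device("cpu", 16000), Device("gpu", 4000)]
        print(policy.__name__, policy(apps, devices), makespan(devices))
```

The greedy rule is only one of many possible policies; it is chosen here because it uses exactly the two pieces of runtime status the abstract names as hidden.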

2014 ◽  
Vol E97.D (11) ◽  
pp. 2827-2834 ◽  
Author(s):  
Ittetsu TANIGUCHI ◽  
Junya KAIDA ◽  
Takuji HIEDA ◽  
Yuko HARA-AZUMI ◽  
Hiroyuki TOMIYAMA

2013 ◽  
Vol E96.D (10) ◽  
pp. 2268-2271
Author(s):  
Junya KAIDA ◽  
Yuko HARA-AZUMI ◽  
Takuji HIEDA ◽  
Ittetsu TANIGUCHI ◽  
Hiroyuki TOMIYAMA ◽  
...  

2003 ◽  
Vol 13 (03) ◽  
pp. 341-352 ◽  
Author(s):  
AMI MAROWKA

In a wide variety of scientific parallel applications, both task and data parallelism must be exploited to achieve the best possible performance on a multiprocessor machine. These applications induce task-graph parallelism with coarse-grain granularity. Nevertheless, using the available task-graph parallelism and combining it with data parallelism can increase the performance of parallel applications considerably since an additional degree of parallelism is exploited. The OpenMP standard supports data parallelism but does not support task-graph parallelism. In this paper we present an integration of task-graph parallelism in OpenMP by extending the parallel sections constructs to include task-index and precedence-relations matrix clauses. There are many ways in which task-graph parallelism can be supported in a programming environment. A fundamental design decision is whether the programmer has to write programs with explicit precedence relations, or if the responsibility of precedence relations generation is delegated to the compiler. One of the benefits provided by parallel programming models like OpenMP is that they liberate the programmer from dealing with the underlying details of communication and synchronization, which are cumbersome and error-prone tasks. If task-graph parallelism is to find acceptance, writing task-graph parallel programs must be no harder than writing data parallel programs, and therefore, in our design, precedence relations are described through simple programmer annotations, with implementation details handled by the system. This paper concludes with a description of several parallel application kernels that were developed to study the practical aspects of task-graph parallelism in OpenMP. The examples demonstrate that exploiting data and task parallelism in a single framework is the key to achieving good performance in a variety of applications.
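The idea of describing task-graph parallelism through a precedence-relations matrix can be sketched outside OpenMP. The following is an illustrative Python analogue, not the clause syntax proposed in the paper: the function name `run_task_graph` and the convention that `precedence[i][j] == 1` means task `i` must finish before task `j` starts are assumptions made for this example.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_task_graph(tasks, precedence, max_workers=4):
    """Run callables in `tasks` concurrently while honoring `precedence`.

    precedence[i][j] == 1 means task i must complete before task j starts
    (an assumed convention). Returns task indices in completion order.
    """
    n = len(tasks)
    # remaining[j] = number of predecessors of task j that have not finished
    remaining = [sum(precedence[i][j] for i in range(n)) for j in range(n)]
    order = []
    running = {}  # maps each in-flight future to its task index

    with ThreadPoolExecutor(max_workers=max_workers) as pool:

        def submit_ready():
            for j in range(n):
                if remaining[j] == 0:
                    remaining[j] = -1  # mark as submitted
                    running[pool.submit(tasks[j])] = j

        submit_ready()
        while running:
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for fut in done:
                i = running.pop(fut)
                fut.result()  # re-raise any exception from the task
                order.append(i)
                # Release the successors of the finished task.
                for j in range(n):
                    if precedence[i][j] and remaining[j] > 0:
                        remaining[j] -= 1
            submit_ready()

    if len(order) != n:
        raise ValueError("precedence matrix contains a cycle")
    return order


if __name__ == "__main__":
    # Diamond-shaped graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
    precedence = [
        [0, 1, 1, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 1],
        [0, 0, 0, 0],
    ]
    tasks = [lambda k=k: print(f"task {k}") for k in range(4)]
    print(run_task_graph(tasks, precedence))
```

As in the design the abstract describes, the programmer only states which tasks precede which; the scheduling and synchronization are left to the runtime, here a thread pool.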


2016 ◽  
Vol 73 (1) ◽  
pp. 330-342 ◽  
Author(s):  
Borja Pérez ◽  
Esteban Stafford ◽  
José Luis Bosque ◽  
Ramón Beivide

1982 ◽  
Vol 61 (s109) ◽  
pp. 34-34
Author(s):  
Samuel J. Agronow ◽  
Federico C. Mariona ◽  
Frederick C. Koppitch ◽  
Kazutoshi Mayeda

2012 ◽  
Vol 197 ◽  
pp. 271-277
Author(s):  
Zhu Ping Gong

The small-data-set approach is used to estimate the largest Lyapunov exponent (LLE). First, the mean-period drawback of the small-data-set method was corrected. On this basis, the LLEs of the daily qualified-rate time series of HZ, an electronic manufacturing enterprise, were estimated; all of the LLEs were positive, which indicates that the time series is chaotic and that the corresponding production process is a chaotic process. The variance of the LLEs revealed the struggle between the divergent nature of the quality system and the quality-control effort. The LLEs increased sharply as the quality level worsened, coinciding with the company shutdown. HZ's daily qualified rate, a chaotic time series, shows that the quality system is predictable in the short run.


Author(s):  
Jianping Ju ◽  
Hong Zheng ◽  
Xiaohang Xu ◽  
Zhongyuan Guo ◽  
Zhaohui Zheng ◽  
...  

Although convolutional neural networks have achieved success in the field of image classification, there are still challenges in the field of agricultural product quality sorting such as machine vision-based jujube defects detection. The performance of jujube defect detection mainly depends on the feature extraction and the classifier used. Due to the diversity of the jujube materials and the variability of the testing environment, the traditional method of manually extracting the features often fails to meet the requirements of practical application. In this paper, a jujube sorting model in small data sets based on convolutional neural network and transfer learning is proposed to meet the actual demand of jujube defects detection. Firstly, the original images collected from the actual jujube sorting production line were pre-processed, and the data were augmented to establish a data set of five categories of jujube defects. The original CNN model is then improved by embedding the SE module and using the triplet loss function and the center loss function to replace the softmax loss function. Finally, the depth pre-training model on the ImageNet image data set was used to conduct training on the jujube defects data set, so that the parameters of the pre-training model could fit the parameter distribution of the jujube defects image, and the parameter distribution was transferred to the jujube defects data set to complete the transfer of the model and realize the detection and classification of the jujube defects. The classification results are visualized by heatmap through the analysis of classification accuracy and confusion matrix compared with the comparison models. The experimental results show that the SE-ResNet50-CL model optimizes the fine-grained classification problem of jujube defect recognition, and the test accuracy reaches 94.15%. The model has good stability and high recognition accuracy in complex environments.

