task parallelism
Recently Published Documents


TOTAL DOCUMENTS: 136 (FIVE YEARS: 19)

H-INDEX: 16 (FIVE YEARS: 2)

2021 ◽  
Vol 20 (5s) ◽  
pp. 1-22
Author(s):  
Zewei Chen ◽  
Hang Lei ◽  
Maolin Yang ◽  
Yong Liao ◽  
Lei Qiao

Parallel tasks have attracted growing attention in recent years, and scheduling them with shared resources is of significant importance to real-time systems. As an efficient mechanism for providing mutual exclusion in parallel processing, spin-locks are ubiquitous in multi-processor real-time systems. However, spin-locks suffer from a scalability problem, and intra-task parallelism further exacerbates analytical pessimism. To overcome these deficiencies, we propose a Hierarchical Hybrid Locking Protocol (H2LP) under federated scheduling. The proposed H2LP integrates the classical Multiprocessor Stack Resource Policy (MSRP) and uses a token mechanism to reduce global contention. We provide a complete analysis framework supporting both heavy and light tasks under federated scheduling, and develop a blocking analysis based on state-of-the-art linear optimization techniques. Empirical evaluations showed that H2LP outperformed the other state-of-the-art locking protocols in most configurations when considering exclusive clustering. Furthermore, our partitioned approach for light tasks can substantially improve schedulability by mitigating the over-provisioning problem.
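The abstract does not detail H2LP itself, but the token idea behind FIFO-ordered spin-locks can be illustrated with a classic ticket lock. Below is a minimal Python sketch (the class and names are ours, and `time.sleep(0)` stands in for hardware spinning); it illustrates token-based mutual exclusion, not the paper's protocol:

```python
import itertools
import threading
import time

class TicketSpinLock:
    """FIFO ticket lock: each thread draws a ticket (token) and waits
    until its number is served. Illustrative only; real spin-locks use
    hardware atomics rather than Python-level synchronization."""

    def __init__(self):
        self._tickets = itertools.count()
        self._dispenser = threading.Lock()  # protects ticket dispensing
        self._now_serving = 0

    def acquire(self):
        with self._dispenser:
            my_ticket = next(self._tickets)
        while self._now_serving != my_ticket:
            time.sleep(0)  # yield while "spinning" (a real core would busy-wait)

    def release(self):
        self._now_serving += 1  # pass the token to the next waiter

lock = TicketSpinLock()
counter = 0

def worker():
    global counter
    for _ in range(1000):
        lock.acquire()
        counter += 1  # critical section guarded by the token
        lock.release()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 4000 if mutual exclusion held
```

The FIFO ordering bounds each waiter's blocking by the number of earlier ticket holders, which is the kind of property a blocking analysis like the one above reasons about.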


2021 ◽  
Vol 14 (11) ◽  
pp. 2327-2340
Author(s):  
Side Li ◽  
Arun Kumar

Many applications that use large-scale machine learning (ML) increasingly prefer different models for subgroups (e.g., countries) to improve accuracy, fairness, or other desiderata. We call this emerging popular practice learning over groups, analogizing to GROUP BY in SQL, albeit for ML training instead of SQL aggregates. From the systems standpoint, this practice compounds the already data-intensive workload of ML model selection (e.g., hyperparameter tuning). Often, thousands of models may need to be trained, necessitating high-throughput parallel execution. Alas, most ML systems today focus on training one model at a time or, at best, on parallelizing hyperparameter tuning. This status quo leads to resource wastage, low throughput, and high runtimes. In this work, we take the first step towards enabling and optimizing learning over groups from the data systems standpoint for three popular classes of ML: linear models, neural networks, and gradient-boosted decision trees. Analytically and empirically, we compare the standard approaches to executing this workload today: task parallelism and data parallelism. We find neither is universally dominant. We put forth a novel hybrid approach we call grouped learning that avoids redundancy in communications and I/O using a novel form of parallel gradient descent we call Gradient Accumulation Parallelism (GAP). We prototype our ideas in a system we call Kingpin, built on top of existing ML tools and the flexible massively parallel runtime Ray. An extensive empirical evaluation on large ML benchmark datasets shows that Kingpin matches or is 4x to 14x faster than state-of-the-art ML systems, including Ray's native execution and PyTorch DDP.
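As a concrete picture of the task-parallel baseline the authors compare against (not Kingpin or GAP themselves), the sketch below trains one independent model per group in a process pool; the group keys, data shapes, and ordinary-least-squares model are hypothetical stand-ins:

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def train_group(args):
    """Fit one linear model (ordinary least squares) for a single group."""
    group_key, X, y = args
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return group_key, coef

def learn_over_groups(groups):
    """Task parallelism: each group's model is an independent task,
    so groups train concurrently with no coordination between them."""
    with ProcessPoolExecutor() as pool:
        return dict(pool.map(train_group, groups))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # hypothetical GROUP BY key -> (features, labels): one model per country
    groups = [(f"country_{i}", rng.normal(size=(1000, 8)), rng.normal(size=1000))
              for i in range(16)]
    models = learn_over_groups(groups)
    print(len(models), "models trained")
```

The redundancy the paper targets is visible here: every task independently loads and ships its own data, which grouped learning and GAP are designed to avoid.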


Author(s):  
Ramon Nepomuceno ◽  
Renan Sterle ◽  
Guilherme Valarini ◽  
Marcio Pereira ◽  
Herve Yviquel ◽  
...  
Keyword(s):  

2021 ◽  
Vol 21 ◽  
pp. 1-13
Author(s):  
Pin Xu ◽  
Masato Edahiro ◽  
Kondo Masaki

In this paper, we propose a method to automatically generate parallelized code from Simulink models, exploiting both task and data parallelism. Building on previous research, we propose a model-based parallelizer (MBP) that exploits task parallelism and assigns tasks to CPU cores using a hierarchical clustering method. We also propose a method in which data-parallel SYCL code is generated from Simulink models; computations with data parallelism are expressed in the form of S-Function Builder blocks and are executed in a heterogeneous computing environment. Most parts of the procedure can be automated with scripts, and the two methods can be applied together. In the evaluation, the data-parallel programs generated using our proposed method achieved a maximum speedup of approximately 547 times compared to sequential programs, without observable differences in the computed results. In addition, the programs generated by exploiting both task and data parallelism were confirmed to achieve better performance than those exploiting only one of the two.
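Neither MBP nor the generated SYCL code appears in this listing; as a loose Python analogue (the block functions and signal size are invented), the sketch below runs two independent model blocks as parallel tasks, while each block is internally data-parallel through vectorized NumPy operations:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

# Two independent "blocks" of a hypothetical model: neither consumes the
# other's output, so they can be scheduled as parallel tasks.
def block_a(signal):
    return np.sin(signal) * 0.5  # element-wise op: data parallelism

def block_b(signal):
    return np.convolve(signal, np.ones(5) / 5, mode="same")  # moving average

signal = np.linspace(0.0, 10.0, 1_000_000)
with ThreadPoolExecutor(max_workers=2) as pool:  # task parallelism
    fut_a = pool.submit(block_a, signal)
    fut_b = pool.submit(block_b, signal)
    merged = fut_a.result() + fut_b.result()  # join at a downstream block
print(merged[:3])
```

NumPy's vectorized kernels largely release the interpreter lock, so the two tasks can genuinely overlap; in the paper's setting, the same two levels are exploited by clustering Simulink blocks onto cores and emitting SYCL kernels for the data-parallel parts.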


Author(s):  
K. N. Bhatt ◽  
Sanket S Naik Dessai ◽  
V. S. Yerragudi

Face recognition is a biometric application for recognizing identity. A face recognition application holds a set of images, called a database, which the user stores in a cloud database. In a cloud computing environment, the database can be stored in the cloud to obtain a huge storage area; the problem is that, because of this huge size, processing over the storage takes too much time. This paper aims to develop face recognition in a mobile cloud environment by exploiting data and task parallelism in existing face recognition algorithms, and to design and develop a parallel PCA-based face recognition algorithm. The parallel PCA face recognition algorithm is deployed on the cloud server, which performs PCA at the user's request, matches the image on the cloud server, and returns the response to the user in less time and with reduced latency. The developed parallel PCA face recognition algorithm minimizes the overall response time of face recognition. The performance of the developed system is tested and analyzed on real face images. To analyze the developed system, centralized and distributed server methods are developed and compared. The conclusion drawn is that the distributed server improves efficiency as well as computing power compared with a centralized server system. The comparison of centralized and distributed servers is carried out by observing the time taken while varying the number of images in the training dataset.
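The cloud deployment itself is not reproduced here; the sketch below is a minimal eigenfaces-style PCA in Python (image sizes, chunk counts, and function names are hypothetical) that parallelizes the projection of query images across worker processes, in the spirit of the data-parallel approach described:

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def fit_pca(faces, k=16):
    """Eigenfaces: faces is (n_images, n_pixels). Returns the mean face and
    the top-k principal axes, via SVD of the mean-centered data."""
    mean = faces.mean(axis=0)
    _, _, vt = np.linalg.svd(faces - mean, full_matrices=False)
    return mean, vt[:k]

def project(args):
    """Project one chunk of query images into the eigenface subspace."""
    chunk, mean, axes = args
    return (chunk - mean) @ axes.T

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    gallery = rng.random((200, 64 * 64))  # hypothetical 64x64 face images
    mean, axes = fit_pca(gallery)

    queries = rng.random((1000, 64 * 64))
    chunks = np.array_split(queries, 4)  # data parallelism over query images
    with ProcessPoolExecutor(max_workers=4) as pool:
        coded = np.vstack(list(pool.map(project,
                                        [(c, mean, axes) for c in chunks])))
    print(coded.shape)  # (1000, 16): low-dimensional codes for matching
```

Recognition then reduces to nearest-neighbor search in the low-dimensional code space; distributing the chunks over servers instead of local processes gives the distributed variant that the paper compares against the centralized one.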


SoftwareX ◽  
2020 ◽  
Vol 12 ◽  
pp. 100517
Author(s):  
P.E. Hadjidoukas ◽  
A. Bartezzaghi ◽  
F. Scheidegger ◽  
R. Istrate ◽  
C. Bekas ◽  
...  
Keyword(s):  
