level parallelism
Recently Published Documents


TOTAL DOCUMENTS: 581 (FIVE YEARS: 81)

H-INDEX: 29 (FIVE YEARS: 4)

2022 · Vol 27 (3) · pp. 1-23
Author(s): Mari-Liis Oldja, Jangryul Kim, Dowhan Jeong, Soonhoi Ha

Although dataflow models are known to thrive at exploiting the task-level parallelism of an application, it is difficult to exploit data parallelism, which is well represented by loop structures, since those structures are not explicitly specified in existing dataflow models. The SDF/L model overcomes this shortcoming by specifying loop structures explicitly in a hierarchical fashion. We introduce a technique for scheduling an application represented by the SDF/L model onto heterogeneous processors. In the proposed method, we explore the mapping of tasks using an evolutionary meta-heuristic and schedule hierarchically in a bottom-up fashion, creating parallel loop schedules at the lower levels first and then reusing them when constructing the schedule at a higher level. The efficiency of the proposed scheduling methodology is verified with benchmark examples and randomly generated SDF/L graphs.
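The bottom-up composition described in this abstract can be pictured with a small sketch. The Python fragment below is a simplified, assumed illustration rather than the authors' algorithm: each hierarchical loop node schedules its body first with a plain greedy list scheduler, and the resulting makespan is reused as a single composite task cost at the enclosing level. The evolutionary mapping exploration and the heterogeneous-processor model of the paper are omitted, and all task costs, iteration counts, and the two-processor platform are hypothetical.

```python
# A simplified, assumed sketch of bottom-up hierarchical scheduling on an
# SDF/L-like hierarchy (not the authors' implementation): inner loop bodies are
# scheduled first, and each resulting makespan is reused as one composite task
# cost at the enclosing level. Loop iterations are treated as sequential
# repetitions here purely for brevity.

def list_schedule(task_costs, num_procs):
    """Greedy list scheduling: longest task first onto the earliest-free
    processor; returns the makespan."""
    finish = [0.0] * num_procs
    for cost in sorted(task_costs, reverse=True):
        p = finish.index(min(finish))          # earliest-available processor
        finish[p] += cost
    return max(finish)

def schedule_node(node, num_procs):
    """Bottom-up: schedule children first, then reuse each child's schedule
    as a single composite task at the current level."""
    if "children" not in node:                 # leaf actor: plain execution cost
        return node["cost"]
    child_costs = [schedule_node(c, num_procs) for c in node["children"]]
    body_makespan = list_schedule(child_costs, num_procs)
    return body_makespan * node.get("iterations", 1)

# Hypothetical hierarchy: one actor plus a 10-iteration loop of three actors.
graph = {"children": [
    {"cost": 4.0},
    {"iterations": 10, "children": [{"cost": 1.0}, {"cost": 2.0}, {"cost": 1.5}]},
]}
print(schedule_node(graph, num_procs=2))       # composite makespan of the top level
```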


Author(s): Tiago Knorst, Julio Vicenzi, Michael G. Jordan, Jonathan H. de Almeida, Guilherme Korol, ...

2021 · Vol 18 (4) · pp. 1-26
Author(s): Aninda Manocha, Tyler Sorensen, Esin Tureci, Opeoluwa Matthews, Juan L. Aragón, ...

Graph structures are a natural representation of important and pervasive data. While graph applications have significant parallelism, their characteristic pointer-indirect loads to neighbor data hinder scalability to large datasets on multicore systems. A scalable and efficient system must tolerate latency while leveraging data parallelism across millions of vertices. Modern Out-of-Order (OoO) cores inherently tolerate a fraction of long latencies, but become clogged when running severely memory-bound applications. Combined with large power/area footprints, this limits their parallel scaling potential and, consequently, the gains that existing software frameworks can achieve. Conversely, accelerator and memory hierarchy designs provide performant hardware specializations, but cannot support diverse application demands. To address these shortcomings, we present GraphAttack, a hardware-software data supply approach that accelerates graph applications on in-order multicore architectures. GraphAttack proposes compiler passes to (1) identify idiomatic long-latency loads and (2) slice programs along these loads into data Producer/Consumer threads to map onto pairs of parallel cores. Each pair shares a communication queue; the Producer asynchronously issues long-latency loads, whose results are buffered in the queue and used by the Consumer. This scheme drastically increases memory-level parallelism (MLP) to mitigate latency bottlenecks. In equal-area comparisons, GraphAttack outperforms OoO cores, do-all parallelism, prefetching, and prior decoupling approaches, achieving a 2.87× speedup and 8.61× gain in energy efficiency across a range of graph applications. These improvements scale; GraphAttack achieves a 3× speedup over 64 parallel cores. Lastly, it has pragmatic design principles; it enhances in-order architectures that are gaining increasing open-source support.
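The Producer/Consumer slicing that GraphAttack performs through compiler passes and hardware queues can be mimicked, loosely, in software. The Python sketch below is an assumed illustration only: one thread plays the Producer, issuing the indirect neighbor-data loads ahead of use and buffering the results in a bounded queue (standing in for the hardware communication queue), while the paired Consumer performs the per-vertex work from buffered data, so load issue is decoupled from load use. The CSR-style arrays, vertex values, and the sum reduction are hypothetical.

```python
# A minimal software sketch (assumed, not GraphAttack's hardware/compiler flow)
# of Producer/Consumer decoupling: the Producer issues the long-latency
# indirect loads and buffers results; the Consumer uses them without stalling.
import threading, queue

indices   = [0, 2, 5, 6]                # CSR-style row offsets (3 vertices)
neighbors = [1, 2, 0, 2, 1, 0]          # neighbor ids
values    = [10.0, 20.0, 30.0]          # per-vertex data reached indirectly

buf = queue.Queue(maxsize=8)            # stands in for the communication queue
SENTINEL = object()

def producer():
    # Issue the idiomatic indirect loads values[neighbors[e]] ahead of use.
    for v in range(len(indices) - 1):
        loaded = [values[neighbors[e]] for e in range(indices[v], indices[v + 1])]
        buf.put((v, loaded))
    buf.put(SENTINEL)

def consumer(out):
    # Consume buffered neighbor data and do the per-vertex work.
    while (item := buf.get()) is not SENTINEL:
        v, loaded = item
        out[v] = sum(loaded)

result = {}
t_p = threading.Thread(target=producer)
t_c = threading.Thread(target=consumer, args=(result,))
t_p.start(); t_c.start(); t_p.join(); t_c.join()
print(result)                           # {0: 50.0, 1: 60.0, 2: 10.0}
```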


2021 · Vol 64 (12) · pp. 36-38
Author(s): Mark D. Hill, Vijay Janapa Reddi

Charging computer scientists to develop the science needed to best achieve the performance and cost goals of accelerator-level parallelism hardware and software.


2021
Author(s): Vincent Dumont, Casey Garner, Anuradha Trivedi, Chelsea Jones, Vidya Ganapati, ...

2021
Author(s): Stijn Eyerman, Wim Heirman, Sam Van Den Steen, Ibrahim Hur

Author(s): Krishan Kumar, Renu

Multithreading is the ability of a central processing unit (CPU), or of a single core within a multi-core processor, to execute multiple processes or threads concurrently, with appropriate operating-system support. This approach differs from multiprocessing: with multithreading, processes and threads share the resources of one or several cores, including the computing units, CPU caches, and translation lookaside buffer (TLB). Whereas multiprocessing systems include multiple complete processing units, multithreading aims to increase the utilization of a single core by exploiting thread-level as well as instruction-level parallelism. The objective of this research is to increase the efficiency of scheduling dependent tasks using enhanced multithreading. Gang scheduling of parallel implicit-deadline periodic task systems on identical multiprocessor platforms is considered; in this scheduling problem, parallel tasks use several processors simultaneously. The first algorithm is based on linear programming and is the first to be proved optimal for the considered gang scheduling problem; furthermore, it runs in polynomial time for a fixed number m of processors, and an efficient implementation is fully detailed. The second algorithm is an approximation algorithm based on a fixed-priority rule that is competitive under resource-augmentation analysis for computing an optimal schedule pattern; precisely, its speedup factor is bounded by (2 − 1/m). Both algorithms are also evaluated through intensive numerical experiments. In our research we enhance the capability of gang scheduling by integrating a multi-core processor and cache model, and simulate its performance in MATLAB.
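As a concrete picture of the gang-scheduling constraint discussed above (not the paper's linear-programming or fixed-priority algorithms), the following Python sketch runs a slot-by-slot simulation in which a parallel task may execute only when all of the processors it needs are simultaneously free. The task parameters, the period-based priority order, and the four-processor platform are illustrative assumptions.

```python
# A minimal, assumed sketch of the gang-scheduling constraint for implicit-deadline
# periodic tasks: a task's threads run together, so it executes in a slot only if
# all of its required processors are free at once.

def gang_schedule(tasks, num_procs, horizon):
    """tasks: dicts with 'name', 'procs' (cores needed per slot),
    'wcet' (slots of work per period), 'period'. Returns per-slot assignments."""
    remaining = {t["name"]: 0 for t in tasks}
    timeline = []
    for slot in range(horizon):
        for t in tasks:                        # release a new job at each period start
            if slot % t["period"] == 0:
                remaining[t["name"]] = t["wcet"]
        free = num_procs
        running = []
        # Shorter period = higher priority (rate-monotonic-style order).
        for t in sorted(tasks, key=lambda t: t["period"]):
            if remaining[t["name"]] > 0 and t["procs"] <= free:
                running.append(t["name"])      # gang: claim all needed cores at once
                free -= t["procs"]
                remaining[t["name"]] -= 1
        timeline.append(running)
    return timeline

tasks = [{"name": "A", "procs": 2, "wcet": 1, "period": 2},
         {"name": "B", "procs": 3, "wcet": 2, "period": 4}]
print(gang_schedule(tasks, num_procs=4, horizon=8))
```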


2021 · Vol 150 (4) · pp. A169-A169
Author(s): Andrew S. Wixom, Kyle Myers, Micah Shepherd
