Fine-Grained Parallelism: Latest Research Papers

Boolean satisfiability (SAT) is an important performance-hungry problem with applications in many problem domains. However, most work on parallelizing SAT solvers has focused on coarse-grained, mostly embarrassing, parallelism. Here, we study fine-grained parallelism that can speed up existing sequential SAT solvers, which all happen to be of the so-called Conflict-Directed Clause Learning variety. We show the potential for speedups of up to 382× across a variety of problem instances. We hope that these results will stimulate future research, particularly with respect to a computer architecture open problem we present.

Simultaneous Smoothing and Untangling of 2D Meshes Based on Explicit Element Geometric Transformation and Element Stitching

Applied Sciences ◽

10.3390/app10145019 ◽

2020 ◽

Vol 10 (14) ◽

pp. 5019 ◽

Cited By ~ 1

Author(s):

Shuli Sun ◽

Zhihong Gou ◽

Mingguang Geng

Keyword(s):

Numerical Solutions ◽

Geometric Transformation ◽

Individual Element ◽

Mesh Quality ◽

Graphic Processor Unit ◽

Fine Grained ◽

Polygonal Elements ◽

Fine Grained Parallelism ◽

Processor Unit ◽

Arbitrary Polygon

Mesh quality can affect both the accuracy and efficiency of numerical solutions. This paper proposes a geometry-based smoothing and untangling method for 2D meshes based on explicit element geometric transformation and element stitching. First, a new explicit element geometric transformation (EEGT) operation for polygonal elements is presented. Applied iteratively to an arbitrary polygon, even an inverted one, the transformation improves its regularity and quality. Then an element stitching scheme is introduced, which averages the temporary nodes produced by the individual element transformations using carefully chosen element weights. Combining the two yields a new smoothing and untangling approach for 2D meshes. The proper choice of averaging weights for element stitching ensures that elements transition smoothly and uniformly throughout the computational domain. Numerical results show that the proposed method produces high-quality meshes with no inverted elements, even from highly tangled meshes. In addition, its inherent regularity and fine-grained parallelism make it suitable for implementation on a graphics processing unit (GPU).

Scaling DNN-Based Video Analysis By Coarse-Grained And Fine-Grained Parallelism

2020 IEEE International Conference on Multimedia and Expo (ICME) ◽

10.1109/icme46284.2020.9102768 ◽

2020 ◽

Cited By ~ 1

Author(s):

Phanwadee Sinthong ◽

Kanak Mahadik ◽

Somdeb Sarkhel ◽

Saayan Mitra

Keyword(s):

Video Analysis ◽

Coarse Grained ◽

Fine Grained ◽

Fine Grained Parallelism

Exploiting potential of deep neural networks by layer-wise fine-grained parallelism

Future Generation Computer Systems ◽

10.1016/j.future.2019.07.054 ◽

2020 ◽

Vol 102 ◽

pp. 210-221 ◽

Cited By ~ 1

Author(s):

Wenbin Jiang ◽

Yangsong Zhang ◽

Pai Liu ◽

Jing Peng ◽

Laurence T. Yang ◽

...

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Fine Grained ◽

Fine Grained Parallelism

Fine-grained parallelism and higher core performance: advantages of vector dataflow processor

Program systems theory and applications ◽

10.25209/2079-3316-2019-10-4-201-217 ◽

2019 ◽

Vol 10 (4) ◽

pp. 201-217 ◽

Cited By ~ 1

Author(s):

Николай Иванович Дикарев ◽

Борис Михайлович Шабанов ◽

Александр Сергеевич Шмелёв

Keyword(s):

Fine Grained ◽

Fine Grained Parallelism ◽

Intel Xeon

The headroom for improving the performance of modern processors is now practically exhausted, as shown by the lack of growth in both clock frequency and the number of instructions executed per cycle, which together determine the scalar performance of a processor core. In the vector processor with a dataflow control architecture (the vector dataflow processor) now under development, core performance can be raised to 256 flops per cycle per core, which is 8 times higher than that of the latest Intel Xeon processors. This is achieved through a higher share of vector computation. The paper shows that the ratio of achieved to peak performance on bitonic sort, matrix multiplication, and 2D stencil programs is higher for the vector dataflow processor than for the best processors of conventional architecture.

Exploration of Fine-Grained Parallelism for Load Balancing Eager K-truss on GPU and CPU

2019 IEEE High Performance Extreme Computing Conference (HPEC) ◽

10.1109/hpec.2019.8916473 ◽

2019 ◽

Cited By ~ 4

Author(s):

Mark Blanco ◽

Tze Meng Low ◽

Kyungjoo Kim

Keyword(s):

Load Balancing ◽

Fine Grained ◽

Fine Grained Parallelism

Exposing Fine-Grained Parallelism in Sequential Gaussian Simulation

10.3997/2214-4609.201903296 ◽

2019 ◽

Author(s):

M. Khait ◽

K. Esler

Keyword(s):

Sequential Gaussian Simulation ◽

Fine Grained ◽

Fine Grained Parallelism ◽

Gaussian Simulation

Unleashing Fine-Grained Parallelism on Embedded Many-Core Accelerators with Lightweight OpenMP Tasking

IEEE Transactions on Parallel and Distributed Systems ◽

10.1109/tpds.2018.2814602 ◽

2018 ◽

Vol 29 (9) ◽

pp. 2150-2163 ◽

Cited By ~ 8

Author(s):

Giuseppe Tagliavini ◽

Daniele Cesarini ◽

Andrea Marongiu

Keyword(s):

Fine Grained ◽

Fine Grained Parallelism ◽

Many Core

An Optimal Microarchitecture for Stencil Computation with Data Reuse and Fine-Grained Parallelism

Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays - FPGA '18 ◽

10.1145/3174243.3174964 ◽

2018 ◽

Author(s):

Yuze Chi ◽

Peipei Zhou ◽

Jason Cong

Keyword(s):

Data Reuse ◽

Stencil Computation ◽

Fine Grained ◽

Fine Grained Parallelism

fine grained parallelism
Recently Published Documents

Enabling Extremely Fine-grained Parallelism via Scalable Concurrent Queues on Modern Many-core Architectures

Study of Fine-grained Nested Parallelism in CDCL SAT Solvers

Simultaneous Smoothing and Untangling of 2D Meshes Based on Explicit Element Geometric Transformation and Element Stitching

Scaling DNN-Based Video Analysis By Coarse-Grained And Fine-Grained Parallelism

Exploiting potential of deep neural networks by layer-wise fine-grained parallelism

Fine-grained parallelism and higher core performance: advantages of vector dataflow processor

Exploration of Fine-Grained Parallelism for Load Balancing Eager K-truss on GPU and CPU

Exposing Fine-Grained Parallelism in Sequential Gaussian Simulation

Unleashing Fine-Grained Parallelism on Embedded Many-Core Accelerators with Lightweight OpenMP Tasking

An Optimal Microarchitecture for Stencil Computation with Data Reuse and Fine-Grained Parallelism
