fine-grained parallelism
Recently Published Documents


TOTAL DOCUMENTS: 60 (FIVE YEARS: 8)

H-INDEX: 9 (FIVE YEARS: 1)

Author(s):  
Poornima Nookala ◽  
Peter Dinda ◽  
Kyle C. Hale ◽  
Kyle Chard ◽  
Ioan Raicu

2021 ◽  
Vol 8 (3) ◽  
pp. 1-18
Author(s):  
James Edwards ◽  
Uzi Vishkin

Boolean satisfiability (SAT) is an important, performance-hungry problem with applications in many domains. However, most work on parallelizing SAT solvers has focused on coarse-grained, mostly embarrassingly parallel approaches. Here, we study fine-grained parallelism that can speed up existing sequential SAT solvers, all of which happen to be of the so-called Conflict-Directed Clause Learning variety. We show the potential for speedups of up to 382× across a variety of problem instances. We hope that these results will stimulate future research, particularly with respect to an open problem in computer architecture that we present.


2020 ◽  
Vol 10 (14) ◽  
pp. 5019 ◽  
Author(s):  
Shuli Sun ◽  
Zhihong Gou ◽  
Mingguang Geng

Mesh quality can affect both the accuracy and the efficiency of numerical solutions. This paper proposes a geometry-based smoothing and untangling method for 2D meshes that combines explicit element geometric transformation with element stitching. A new explicit element geometric transformation (EEGT) operation for polygonal elements is presented first. Applied iteratively to an arbitrary polygon, even an inverted one, the transformation improves its regularity and quality. An element stitching scheme is then introduced: the temporary nodes produced by the individual element transformations are averaged using carefully chosen element weights. Together, these two steps yield a new mesh smoothing and untangling approach for 2D meshes. A proper choice of averaging weights for element stitching ensures that elements transition smoothly and uniformly throughout the computational domain. Numerical results show that the proposed method produces high-quality meshes with no inverted elements, even from highly tangled input meshes. In addition, its inherent regularity and fine-grained parallelism make it suitable for implementation on a graphics processing unit (GPU).
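The transform-then-stitch structure described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `stitch`, the `transform` and `weight` callbacks, and the centroid-collapse transform used in the example are all hypothetical stand-ins, and the actual EEGT operation and weight choice are not given in the abstract. Boundary nodes, which a real smoother would normally hold fixed, are not treated specially here.

```python
from collections import defaultdict


def stitch(nodes, elements, transform, weight):
    """One smoothing pass (illustrative sketch, not the paper's method).

    nodes:     list of (x, y) tuples
    elements:  list of tuples of node indices (polygons)
    transform: hypothetical per-element operation, polygon -> temporary polygon
    weight:    hypothetical per-element weight, polygon -> float
    """
    # node index -> [sum of w*x, sum of w*y, sum of w]
    acc = defaultdict(lambda: [0.0, 0.0, 0.0])

    # Each element is transformed on its own; iterations of this loop
    # do not depend on one another.
    for elem in elements:
        poly = [nodes[i] for i in elem]
        temp = transform(poly)  # temporary nodes for this element only
        w = weight(poly)
        for i, (x, y) in zip(elem, temp):
            a = acc[i]
            a[0] += w * x
            a[1] += w * y
            a[2] += w

    # Stitching: each mesh node becomes the weighted average of the
    # temporary nodes proposed by the elements that share it.
    new_nodes = list(nodes)
    for i, (sx, sy, sw) in acc.items():
        if sw > 0:
            new_nodes[i] = (sx / sw, sy / sw)
    return new_nodes


def collapse_to_centroid(poly):
    """Placeholder transform for the demo only (NOT the EEGT operation)."""
    cx = sum(x for x, _ in poly) / len(poly)
    cy = sum(y for _, y in poly) / len(poly)
    return [(cx, cy)] * len(poly)


# Two triangles sharing the edge between nodes 1 and 2.
demo_nodes = [(0, 0), (1, 0), (0, 1), (1, 1)]
demo_elements = [(0, 1, 2), (1, 3, 2)]
smoothed = stitch(demo_nodes, demo_elements, collapse_to_centroid, lambda p: 1.0)
```

Because the per-element loop has no cross-iteration dependencies, one plausible GPU mapping is one thread per element with an atomic scatter-add into the accumulators; that mapping is an assumption of this sketch, not a detail taken from the abstract.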


2020 ◽  
Vol 102 ◽  
pp. 210-221 ◽  
Author(s):  
Wenbin Jiang ◽  
Yangsong Zhang ◽  
Pai Liu ◽  
Jing Peng ◽  
Laurence T. Yang ◽  
...  

2019 ◽  
Vol 10 (4) ◽  
pp. 201-217 ◽  
Author(s):  
Николай Иванович Дикарев ◽  
Борис Михайлович Шабанов ◽  
Александр Сергеевич Шмелёв

At present, the headroom for increasing the performance of modern processors is practically exhausted. This shows in the lack of growth in both clock frequency and the number of instructions executed per cycle, the two factors that determine the scalar performance of a processor core. In the vector processor with a dataflow control architecture (vector dataflow processor) now under development, the performance of a processor core can be raised to 256 flops per cycle per core, which is 8 times higher than that of the latest Intel Xeon processors. This is achieved through a higher share of vector computations. The paper shows that the ratio of sustained to peak performance on bitonic sort, matrix multiplication, and 2D stencil programs is higher for the vector dataflow processor than for the best processors of conventional architecture.
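Bitonic sort, one of the three benchmark programs named in the abstract above, is a standard example of fine-grained parallelism: within each stage of the sorting network, every compare-exchange touches a disjoint pair of elements. The following is a minimal sequential Python sketch of the textbook algorithm, offered only as an illustration; it is not the authors' benchmark code, and the function name `bitonic_sort` is chosen here for the example.

```python
def bitonic_sort(values):
    """Sort a sequence whose length is a power of two with a bitonic network."""
    a = list(values)
    n = len(a)
    if n & (n - 1):
        raise ValueError("length must be a power of two")

    k = 2  # size of the bitonic sequences being merged
    while k <= n:
        j = k // 2  # distance between compared elements in this stage
        while j >= 1:
            # All iterations of this loop are independent: each index i
            # is paired with exactly one partner, i XOR j.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a


result = bitonic_sort([3, 7, 4, 8, 6, 2, 1, 5])
```

The inner `for` loop is the part that vector or GPU hardware would execute in parallel, since no two iterations in a stage read or write the same element.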

