Study of Fine-grained Nested Parallelism in CDCL SAT Solvers

2021 ◽  
Vol 8 (3) ◽  
pp. 1-18
Author(s):  
James Edwards ◽  
Uzi Vishkin

Boolean satisfiability (SAT) is an important and computationally demanding problem with applications in many domains. However, most work on parallelizing SAT solvers has focused on coarse-grained, mostly embarrassingly parallel, approaches. Here, we study fine-grained parallelism that can speed up existing sequential SAT solvers, which all happen to be of the Conflict-Driven Clause Learning (CDCL) variety. We show the potential for speedups of up to 382× across a variety of problem instances. We hope that these results will stimulate future research, particularly with respect to an open computer-architecture problem we present.
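
The paper reports measurements rather than code; as a rough illustration of what fine-grained parallelism inside a CDCL solver can mean, the sketch below parallelizes one round of unit propagation (Boolean constraint propagation) across clauses. All names are illustrative, Python threads only model the structure (they would not yield real speedup under the GIL), and this is not the authors' implementation.

```python
# Illustrative sketch, not the authors' implementation: fine-grained
# parallelism inside one CDCL step by classifying all clauses in
# parallel, then committing forced (unit) literals sequentially.
from concurrent.futures import ThreadPoolExecutor

def clause_status(clause, assignment):
    """Return 'sat', 'conflict', a forced unit literal, or None."""
    unassigned = []
    for lit in clause:               # DIMACS-style signed integer literals
        val = assignment.get(abs(lit))
        if val is None:
            unassigned.append(lit)
        elif (lit > 0) == val:
            return "sat"             # clause already satisfied
    if not unassigned:
        return "conflict"            # every literal is falsified
    if len(unassigned) == 1:
        return unassigned[0]         # unit clause: this literal is forced
    return None                      # still unresolved

def parallel_bcp(clauses, assignment, workers=8):
    """Repeat: scan clauses in parallel (the fine-grained part), then
    apply forced literals sequentially, until fixpoint or conflict."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            statuses = list(pool.map(lambda c: clause_status(c, assignment), clauses))
            if "conflict" in statuses:
                return "conflict"
            forced = [s for s in statuses if isinstance(s, int)]
            if not forced:
                return "stable"
            for lit in forced:
                var, val = abs(lit), lit > 0
                if assignment.get(var, val) != val:
                    return "conflict"     # two units disagree
                assignment[var] = val

# Hypothetical usage: decide x1 = False, then propagate.
clauses = [[1, 2], [-1, 2], [-2, 3]]
assignment = {1: False}
print(parallel_bcp(clauses, assignment), assignment)
# -> stable {1: False, 2: True, 3: True}
```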

Author(s):  
Zhuliang Yao ◽  
Shijie Cao ◽  
Wencong Xiao ◽  
Chen Zhang ◽  
Lanshun Nie

In trained deep neural networks, unstructured pruning can remove redundant weights to lower storage cost; however, it requires customized hardware to speed up practical inference. Another line of work accelerates sparse model inference on general-purpose hardware by adopting coarse-grained sparsity, pruning or regularizing consecutive weights for efficient computation, but this often sacrifices model accuracy. In this paper, we propose a novel fine-grained sparsity approach, Balanced Sparsity, that achieves high model accuracy with efficient execution on commodity hardware. Our approach is adapted to the highly parallel architecture of GPUs, showing strong potential for sparsity in the wide deployment of deep learning services. Experimental results show that Balanced Sparsity achieves up to 3.1x practical speedup for model inference on GPUs while retaining the same high model accuracy as fine-grained sparsity.
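
As a minimal sketch of the balanced fine-grained pruning idea described above, the snippet below splits every weight row into fixed-size blocks and zeroes the same number of smallest-magnitude entries in each block, so GPU thread groups see identical amounts of work. Function names and the block size are illustrative assumptions, not the paper's code.

```python
# Minimal sketch of balanced fine-grained pruning (illustrative, not
# the paper's code): every block of each row keeps the same number of
# largest-magnitude weights, so GPU thread groups stay load-balanced.
import numpy as np

def balanced_prune(weights, block_size=32, sparsity=0.5):
    pruned = weights.copy()
    drop_per_block = int(block_size * sparsity)
    rows, cols = weights.shape
    assert cols % block_size == 0, "row length must divide into blocks"
    for r in range(rows):
        for start in range(0, cols, block_size):
            block = pruned[r, start:start + block_size]   # view into pruned
            drop = np.argsort(np.abs(block))[:drop_per_block]
            block[drop] = 0.0                             # zero the smallest
    return pruned

W = np.random.randn(4, 64).astype(np.float32)
W_sparse = balanced_prune(W, block_size=32, sparsity=0.5)
assert (W_sparse == 0).sum() == W.size // 2   # 50% zeros, evenly distributed
```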


Author(s):  
JIANYONG CHEN ◽  
QIUZHEN LIN ◽  
QINGBIN HU

In this paper, a novel clonal algorithm for multiobjective optimization (NCMO) is presented, built on improved search operators: a dynamic mutation probability, a dynamic simulated binary crossover (D-SBX) operator, and a hybrid mutation operator combining Gaussian and polynomial mutations (GP-HM). The main idea behind these operators is to perform coarser-grained search at the initial stage in order to speed up convergence toward the Pareto-optimal front. Once the solutions get close to the Pareto-optimal front, finer-grained search is performed in order to reduce the gaps between the solutions and the front. To this end, a cooling schedule is adopted that gradually reduces the operator parameters to a minimal threshold, keeping a desirable balance between fine-grained and coarse-grained search. By this means, the exploratory capabilities of NCMO are enhanced. Simulation results show that NCMO performs remarkably well when compared with various recently developed state-of-the-art multiobjective optimization algorithms.
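
A hedged sketch of the cooling-schedule idea described above: an operator parameter decays from an aggressive initial value toward a minimal threshold as generations progress, shifting search from coarse-grained to fine-grained. The exponential decay form and all constants here are assumptions, not the paper's exact schedule.

```python
# Hedged sketch of a cooling schedule for a search operator parameter;
# the exponential form and constants are assumptions, not the paper's.
import math

def cooled_parameter(p_init, p_min, generation, max_gen, decay=3.0):
    """Decay a parameter (e.g., mutation probability) from p_init toward
    the floor p_min, shifting search from coarse- to fine-grained."""
    t = generation / max_gen                  # progress in [0, 1]
    return max(p_min, p_init * math.exp(-decay * t))

for g in (0, 50, 100):                        # early, middle, late
    print(g, round(cooled_parameter(0.9, 0.05, g, 100), 3))
# -> 0 0.9 / 50 0.201 / 100 0.05
```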


Author(s):  
K. Liagkouras ◽  
K. Metaxiotis

In this paper, we present a novel Interval-Based Mutation (IBMU) operator. The proposed operator performs coarse-grained search at the initial stage in order to speed up convergence toward more promising regions of the search landscape; more fine-grained search is then performed in order to guide the solutions toward the Pareto front. Computational experiments indicate that the proposed mutation operator outperforms conventional approaches on several well-known benchmark problems.
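
The following is a minimal sketch of what an interval-based mutation can look like: a gene is perturbed uniformly within an interval around its current value, and the interval shrinks with the generation count so exploration is coarse early and fine late. The linear shrinking rule is an illustrative assumption, not necessarily IBMU's exact definition.

```python
# Illustrative interval-based mutation: the mutation interval around a
# gene shrinks with the generation count (linear rule assumed here;
# IBMU's exact rule is defined in the paper).
import random

def interval_mutate(x, lower, upper, generation, max_gen):
    width = (upper - lower) * max(1.0 - generation / max_gen, 0.01)
    lo = max(lower, x - width / 2.0)          # clamp interval to bounds
    hi = min(upper, x + width / 2.0)
    return random.uniform(lo, hi)

print(interval_mutate(0.5, 0.0, 1.0, generation=0,  max_gen=100))  # coarse
print(interval_mutate(0.5, 0.0, 1.0, generation=95, max_gen=100))  # fine
```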


Author(s):  
Sebastião Miranda ◽  
Jonas Feldt ◽  
Frederico Pratas ◽  
Ricardo A Mata ◽  
Nuno Roma ◽  
...  

A novel perturbative Monte Carlo mixed quantum mechanics (QM)/molecular mechanics (MM) approach has recently been developed to simulate molecular systems in complex environments. However, the accuracy required to simulate such complex molecular systems efficiently usually comes at the cost of long execution times. To alleviate this problem, a new parallelization strategy for multi-level Monte Carlo molecular simulations on heterogeneous systems is proposed here. It simultaneously exploits fine-grained (data-level), coarse-grained (Markov-chain-level), and task-grained (pure QM, pure MM, and QM/MM procedures) parallelism to ensure efficient execution on heterogeneous systems composed of central processing units and multiple, possibly different, graphical processing units. This is achieved by using the OpenCL library together with appropriate dynamic load-balancing schemes. In an evaluation with real benchmarking data, a speed-up of 56x was observed in the computational bottleneck, which results in a global speed-up of 38x for the whole simulation, reducing the time of a typical simulation from 80 hours to only 2 hours.
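
As a loose sketch of the dynamic load-balancing idea (and only of its structure; the actual system is built on OpenCL, not Python threads), one worker per device can pull Markov-chain tasks from a shared queue, so faster devices naturally process more tasks. All names here are placeholders.

```python
# Structural sketch only: dynamic load balancing across heterogeneous
# devices via a shared task queue (the real system uses OpenCL; names
# and the execute() callback are placeholders).
import queue
import threading

def run_heterogeneous(tasks, devices, execute):
    q = queue.Queue()
    for t in tasks:
        q.put(t)
    results, lock = [], threading.Lock()

    def worker(device):
        while True:
            try:
                task = q.get_nowait()        # pull the next MC task
            except queue.Empty:
                return                       # queue drained, worker exits
            out = execute(device, task)      # pure QM, pure MM, or QM/MM step
            with lock:
                results.append(out)

    threads = [threading.Thread(target=worker, args=(d,)) for d in devices]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results                           # faster devices did more tasks
```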


2014 ◽  
Vol 988 ◽  
pp. 695-701
Author(s):  
Hai Jie Yu ◽  
Sheng Su

We survey the literature on collaborative pricing and scheduling decisions from both coarse-grained and fine-grained views, and find that jointly deciding pricing and production scheduling can increase total profits by more than 30%. The main ideas and a research framework for the problem are proposed.


10.29007/1tc6 ◽  
2018 ◽  
Author(s):  
Natarajan Shankar

Modularity plays a central role in logical reasoning. We want to be able to reuse proofs, proof patterns, theories, and specialized reasoning procedures. Architectures that support modularity have been developed at all levels of inference: SAT solvers, theory solvers, combination solvers and rewriters, SMT solvers, simplifiers, rewriters, and tactics-based interactive theorem provers. Prior work has mostly focused on fine-grained modular inference. However, with the availability of a diverse range of high-quality inference tools, it has become important to systematically integrate these big components into robust toolchains. At SRI, we have been developing a framework called the Evidential Tool Bus (ETB) as a distributed platform for the coarse-grained integration of inference components into flexible, scriptable workflows. The talk describes the architecture of ETB along with some motivating applications.


Author(s):  
Melki Sadekh Mansuan ◽  
Benfano Soewito

The purpose of this research was to address several problems in the rendering process, such as slow rendering times and complex calculations, which make rendering inefficient. This research analyzed the efficiency of the rendering process through an experimental study implementing a distributed rendering system with fine-grained and coarse-grained parallel decomposition in a computer laboratory. The primary data used were the rendering times obtained from rendering three animation scenes. A descriptive analysis compared performance using the parallel performance metrics of speedup and efficiency. The results show that the distributed rendering method succeeds in increasing rendering speed, with a speedup of 9.43 and a processor-utilization efficiency of 94%, when applied to the problem of slow rendering times.
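
The reported figures are mutually consistent under the standard definition of parallel efficiency as speedup divided by processor count; assuming the experiment used 10 processors (an inference from the numbers, not stated above):

```latex
E = \frac{S}{N} = \frac{9.43}{10} = 0.943 \approx 94\%
```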


Author(s):  
Rudranarayan M. Mukherjee ◽  
Paul Crozier ◽  
Kurt S. Anderson

This is the second in a series of two papers on using multibody dynamics algorithms and methods for coarse-grained molecular dynamics simulations. The first paper presented the theoretical discussion of this topic; this paper presents results obtained from simulating several biomolecular and bulk-material systems using multibody dynamics algorithms. The systems studied include water boxes, alkane chains, alanine dipeptide, and carboxyl-terminal fragments of the Calmodulin, Ribosomal, and Rhodopsin proteins. The atomistic representations of these systems include several thousand degrees of freedom, and results of several nanosecond-scale simulations are presented. The stability and validity of the simulations are studied through conservation of energy, thermodynamic properties, and conformational analysis. In these simulations, a speedup of an order of magnitude is realized under conservative error bounds. A discussion is presented of the open-source software developed to facilitate future research combining multibody dynamics with molecular dynamics.


2015 ◽  
Vol 2015 ◽  
pp. 1-12
Author(s):  
D. C. Kiran ◽  
S. Gurunarayanan ◽  
Janardan Prasad Misra ◽  
Abhijeet Nawal

This work discusses various compiler-level global scheduling techniques for multicore processors. The main contribution is to delegate the job of exploiting fine-grained parallelism to the compiler, thereby reducing hardware overhead and programming complexity. This goal is achieved by decomposing a sequential program into multiple subblocks and constructing a subblock dependency graph (SDG). The proposed schedulers select subblocks from the SDG and schedule them on different cores while ensuring the correct order of execution of subblocks. In conjunction with these parallelization techniques, locality optimizations are performed to minimize communication overhead between the cores. The observed results indicate better and more balanced speedup per watt.
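
A minimal sketch of scheduling subblocks from an SDG onto cores, under assumed data structures (predecessor lists and per-subblock costs): a subblock becomes ready once all its predecessors finish, and ready subblocks are assigned to the earliest-free core. The paper's actual schedulers also perform locality optimizations, which this sketch omits.

```python
# Illustrative list scheduling of a subblock dependency graph (SDG);
# data structures and cost model are assumptions, not the paper's.
import heapq

def schedule_sdg(deps, cost, num_cores):
    """deps maps each subblock to its predecessors; cost gives its
    estimated runtime. Ready subblocks go to the earliest-free core."""
    indeg = {b: len(ps) for b, ps in deps.items()}
    succs = {b: [] for b in deps}
    for b, ps in deps.items():
        for p in ps:
            succs[p].append(b)
    finish = {}
    cores = [(0.0, c) for c in range(num_cores)]   # (free_at, core_id)
    heapq.heapify(cores)
    ready = [b for b, d in indeg.items() if d == 0]
    order = []
    while ready:
        b = ready.pop(0)
        free_at, core = heapq.heappop(cores)
        # a subblock may start only after its core is free AND all
        # predecessor subblocks have finished (correct execution order)
        start = max(free_at, max((finish[p] for p in deps[b]), default=0.0))
        finish[b] = start + cost[b]
        order.append((b, core, start, finish[b]))
        heapq.heappush(cores, (finish[b], core))
        for s in succs[b]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)
    return order

# Hypothetical SDG: B and C depend on A; D depends on B and C.
deps = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
cost = {"A": 2.0, "B": 3.0, "C": 1.0, "D": 2.0}
for row in schedule_sdg(deps, cost, num_cores=2):
    print(row)
```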

