parallel program
Recently Published Documents


TOTAL DOCUMENTS

371
(FIVE YEARS 33)

H-INDEX

23
(FIVE YEARS 1)

2022 ◽  
Vol 2022 ◽  
pp. 1-13
Author(s):  
Jianhua Li ◽  
Guanlong Liu ◽  
Zhiyuan Zhen ◽  
Zihao Shen ◽  
Shiliang Li ◽  
...  

Molecular docking aims to predict possible drug candidates for many diseases, and it is computationally intensive. In particular, when simulating the ligand-receptor binding process, the binding pocket of the receptor is divided into subcubes, and docking the ligand into every subcube produces a large number of molecular docking tasks, which are extremely time-consuming. In this study, we propose a heterogeneous parallel scheme of molecular docking for the ligand-receptor binding process to accelerate the simulation. The scheme includes two layers of parallelism: a coarse-grained layer implemented with the message-passing interface (MPI) and a fine-grained layer that runs on the graphics processing unit (GPU). At the coarse-grained layer, a docking task inside one lattice is assigned to one unique MPI process, and a grouped master-slave mode is used to allocate and schedule the tasks. At the fine-grained layer, GPU accelerators carry out the computationally intensive evaluation of scoring functions and the related conformation spatial transformations within a single docking task. Experiments on the ligand-receptor binding process show that, on a multicore server with GPUs, the parallel program achieves a speedup of up to 45 times in flexible docking and up to 54.5 times in semiflexible docking. On a distributed memory system, the docking time for both flexible and semiflexible docking decreases gradually as the number of nodes used by the parallel program increases. The scalability of the parallel program is also verified across multiple nodes on a distributed memory system and is approximately linear.
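
The coarse-grained layer described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: Python threads stand in for MPI processes, the groups are processed one after another rather than concurrently, and `split_pocket`, `dock_in_subcube`, `run_group`, and `grouped_master_worker` are hypothetical names. The scoring function is a placeholder, and the GPU layer is not modeled.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def split_pocket(n_per_axis):
    """Enumerate the subcube indices of a pocket cut n_per_axis times along each axis."""
    return list(product(range(n_per_axis), repeat=3))


def dock_in_subcube(cube):
    """Placeholder docking task (assumption): returns (cube, score), lower is better."""
    x, y, z = cube
    return cube, float((x - 1) ** 2 + (y - 1) ** 2 + (z - 1) ** 2)


def run_group(tasks, workers_per_group):
    """A group master hands its tasks to the group's workers and keeps the best result."""
    with ThreadPoolExecutor(max_workers=workers_per_group) as pool:
        results = list(pool.map(dock_in_subcube, tasks))
    return min(results, key=lambda r: r[1])


def grouped_master_worker(n_per_axis=3, n_groups=3, workers_per_group=2):
    """Top-level master: split the tasks across groups, then reduce the group results."""
    cubes = split_pocket(n_per_axis)
    # Round-robin assignment of docking tasks to groups.
    groups = [cubes[g::n_groups] for g in range(n_groups)]
    group_best = [run_group(tasks, workers_per_group) for tasks in groups]
    return min(group_best, key=lambda r: r[1])


if __name__ == "__main__":
    print(grouped_master_worker())
```

Each subcube is one independent task, which is why the work divides cleanly across groups and workers.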


2022 ◽  
pp. 361-453
Author(s):  
Peter S. Pacheco ◽  
Matthew Malensek

2021 ◽  
Vol 28 (4) ◽  
pp. 394-412
Author(s):  
Andrew M. Mironov

The paper presents a new mathematical model of parallel programs that makes it possible, in particular, to verify parallel programs written in a certain subset of the MPI parallel programming interface. The model is based on the concepts of sequential and distributed processes. A parallel program is modeled as a distributed process in which sequential processes communicate by asynchronously sending and receiving messages over channels. The main advantage of the model is its ability to simulate and verify parallel programs that generate an indefinite number of sequential processes. The model is illustrated by verifying an MPI matrix multiplication program.
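
To make the process view concrete, here is a minimal sketch of sequential processes exchanging asynchronous messages over channels, applied to matrix multiplication. It does not use MPI or the paper's formal model. Threads stand in for sequential processes and `queue.Queue` objects for channels, and `worker` and `matmul_distributed` are hypothetical names.

```python
import queue
import threading


def worker(inbox, outbox, b):
    """Sequential process: receive (row_index, row) messages until a None sentinel."""
    n_inner, n_cols = len(b), len(b[0])
    while True:
        msg = inbox.get()
        if msg is None:
            break
        i, row = msg
        result = [sum(row[k] * b[k][j] for k in range(n_inner)) for j in range(n_cols)]
        outbox.put((i, result))


def matmul_distributed(a, b, n_workers=2):
    """Root process: send rows of a over channels, gather result rows in any order."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    outbox = queue.Queue()
    threads = [threading.Thread(target=worker, args=(q, outbox, b)) for q in inboxes]
    for t in threads:
        t.start()
    for i, row in enumerate(a):
        # Asynchronous send: put() on an unbounded queue does not wait for the receiver.
        inboxes[i % n_workers].put((i, row))
    for q in inboxes:
        q.put(None)  # tell each worker that no more rows will arrive
    c = [None] * len(a)
    for _ in a:
        i, row = outbox.get()
        c[i] = row
    for t in threads:
        t.join()
    return c


if __name__ == "__main__":
    print(matmul_distributed([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Because `n_workers` is a runtime parameter, the number of sequential processes is not fixed in the program text, which is the situation the abstract says the model is designed to handle.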


2021 ◽  
pp. 009539972110509
Author(s):  
Michael McGann

Quasi-markets in employment services often follow social policy turns toward activation. Critics see this as no accident, arguing that marketization is intended to raise the odds that workfare policies will be implemented. Drawing on surveys of Irish frontline activation workers, this study harnesses a natural policy experiment whereby Ireland introduced a Payment-by-Results quasi-market alongside a parallel program contracted without outcomes-based contracting. Although the demandingness of activation remains modest in Ireland, the study finds that regulatory approaches are more common under market governance conditions, which in turn has been associated with significant workforce changes and stronger systems of performance monitoring.


2021 ◽  
Vol 16 (92) ◽  
pp. 60-71
Author(s):  
Alexander S. Fedulov ◽  
Yaroslav A. Fedulov ◽  
Anastasiya S. Fedulova ◽  
...  

This work addresses the problem of implementing an efficient parallel program that solves the assigned task using the maximum available amount of computing cluster resources, in order to obtain a corresponding performance gain over the sequential version of the algorithm. The main objective was to study the joint use of the parallelization technologies OpenMP and MPI, taking into account the characteristics of the problems being solved, to increase the performance of parallel algorithms and programs on a computing cluster. The article gives a brief overview of approaches to calculating the complexity functions of sequential programs. To determine the complexity of parallel programs, an approach based on operational analysis was used. The features of OpenMP and MPI as technologies for parallelizing sequential programs are described. The main software and hardware factors affecting the execution speed of parallel programs on the nodes of a computing cluster are presented. The paper focuses on how the ratio of computational to exchange operations in a program affects performance. For this research, parallel OpenMP and MPI test programs were developed in which the total number of operations and the ratio between computational and exchange operations can be set. A computing cluster consisting of several nodes was used as the hardware and software platform. The experiments confirmed the effectiveness of the hybrid model of a parallel program in multi-node systems with heterogeneous memory, using OpenMP in the shared-memory subsystems and MPI in the distributed-memory subsystems.
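
The role of the computation-to-exchange ratio can be illustrated with a toy cost model. This is an assumption made for illustration, not the complexity model or the test programs from the article: computation is taken to divide evenly across workers, exchange operations are taken to be serial, and the function names and per-operation costs `t_comp` and `t_exch` are hypothetical.

```python
def split_operations(total_ops, comp_to_exch_ratio):
    """Split a total operation count into (computational, exchange) parts for a given ratio."""
    n_exch = total_ops / (1 + comp_to_exch_ratio)
    n_comp = total_ops - n_exch
    return n_comp, n_exch


def estimated_speedup(total_ops, comp_to_exch_ratio, workers, t_comp=1.0, t_exch=10.0):
    """Toy estimate: sequential time over parallel time under the stated assumptions."""
    n_comp, n_exch = split_operations(total_ops, comp_to_exch_ratio)
    t_seq = n_comp * t_comp                        # the sequential version does no exchanges
    t_par = n_comp * t_comp / workers + n_exch * t_exch
    return t_seq / t_par


if __name__ == "__main__":
    for ratio in (10, 100, 1000):
        print(ratio, round(estimated_speedup(110_000, ratio, workers=4), 3))
```

Under these assumed costs, 1100 operations at a ratio of 10 on 4 workers give an estimated speedup of 0.8, slower than the sequential run, and the estimate approaches the worker count only as the ratio grows.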


2021 ◽  
Vol 24 (1) ◽  
pp. 157-183
Author(s):  
Никита Андреевич Катаев

Automation of parallel programming is important at every stage of parallel program development. These stages include profiling of the original program, program transformation that allows higher performance to be achieved after parallelization, and, finally, construction and optimization of the parallel program. It is also important to choose a suitable parallel programming model to express the parallelism available in a program. On the one hand, the parallel programming model should be capable of mapping the parallel program onto a variety of existing hardware resources. On the other hand, it should simplify the development of assistant tools and allow the user to explore the parallel program that the assistant tools generate in a semi-automatic way. The SAPFOR (System FOR Automated Parallelization) system combines various approaches to the automation of parallel programming. Moreover, it allows the user to guide the parallelization if necessary. SAPFOR produces parallel programs according to the high-level DVMH parallel programming model, which simplifies the development of efficient parallel programs for heterogeneous computing clusters. This paper focuses on the approach to semi-automatic parallel programming that SAPFOR implements. We discuss the architecture of the system and present the interactive subsystem, which is used to guide SAPFOR through program parallelization. We used the interactive subsystem to parallelize programs from the NAS Parallel Benchmarks in a semi-automatic way. Finally, we compare the performance of manually written parallel programs with that of the programs SAPFOR builds.


2021 ◽  
Author(s):  
Can Heng Zhang ◽  
Fang Hua Li ◽  
Wen Zheng Zhang ◽  
Jia Wei Zhang

Author(s):  
Ajit Singh

Communication between parallel programs is an indispensable part of parallel computing. SW26010 is a heterogeneous many-core processor used to build the Sunway TaihuLight supercomputer and is well suited to parallel computing. When designing and implementing a coroutine scheduling system on the SW26010 processor to improve its concurrency, communication between coroutines must be in place first. This paper therefore proposes a communication system for data and information exchange between coroutines on the SW26010 processor, consisting of the following parts. First, a producer-consumer channel is designed and implemented on top of a ring buffer, with a synchronization mechanism for the multi-producer, multi-consumer case based on the different atomic operations available on the MPE (management processing element) and the CPE (computing processing element) of SW26010. Second, a wake-up mechanism between producer and consumer is designed, which reduces the time the program spends waiting for communication. Finally, the channel's performance is tested and analyzed for different numbers of producers and consumers, leading to the conclusion that channel performance decreases as the number of producers and consumers increases.
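
A ring-buffer channel with a wake-up mechanism, as outlined above, might look like the following sketch. It is not the SW26010 implementation. Instead of the MPE and CPE atomic operations the abstract mentions, it uses a lock with two condition variables, and threads stand in for coroutines. `RingChannel` and `run_demo` are hypothetical names.

```python
import threading


class RingChannel:
    """Bounded channel over a fixed-size ring buffer for many producers and consumers."""

    def __init__(self, capacity):
        self._buf = [None] * capacity
        self._head = 0    # next slot to read
        self._tail = 0    # next slot to write
        self._count = 0   # number of occupied slots
        lock = threading.Lock()
        self._not_full = threading.Condition(lock)
        self._not_empty = threading.Condition(lock)

    def send(self, item):
        with self._not_full:
            while self._count == len(self._buf):
                self._not_full.wait()    # producer sleeps until a slot is freed
            self._buf[self._tail] = item
            self._tail = (self._tail + 1) % len(self._buf)
            self._count += 1
            self._not_empty.notify()     # wake one waiting consumer

    def recv(self):
        with self._not_empty:
            while self._count == 0:
                self._not_empty.wait()   # consumer sleeps until an item arrives
            item = self._buf[self._head]
            self._buf[self._head] = None
            self._head = (self._head + 1) % len(self._buf)
            self._count -= 1
            self._not_full.notify()      # wake one waiting producer
            return item


def run_demo(n_producers=3, n_consumers=2, items_each=50, capacity=4):
    """Push items from several producers through one channel and collect them."""
    ch = RingChannel(capacity)
    received, received_lock = [], threading.Lock()

    def produce(pid):
        for i in range(items_each):
            ch.send((pid, i))

    def consume():
        while True:
            item = ch.recv()
            if item is None:             # sentinel: no more items for this consumer
                return
            with received_lock:
                received.append(item)

    producers = [threading.Thread(target=produce, args=(p,)) for p in range(n_producers)]
    consumers = [threading.Thread(target=consume) for _ in range(n_consumers)]
    for t in producers + consumers:
        t.start()
    for t in producers:
        t.join()
    for _ in consumers:
        ch.send(None)
    for t in consumers:
        t.join()
    return received
```

Waiting on a condition variable rather than spinning is one way to realize the wake-up idea: a blocked producer or consumer consumes no CPU time until the other side signals it.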

