Communication Coroutines For Parallel Program Using DW26010 Many Core Processor

Author(s):  
Ajit Singh

Communication between parallel programs is an indispensable part of parallel computing. The SW26010 is a heterogeneous many-core processor used to build the Sunway TaihuLight supercomputer and is well suited to parallel computing. A coroutine scheduling system is being designed and implemented on the SW26010 to improve its concurrency, and such a system first requires a means of communication between coroutines. This paper therefore proposes a communication system for exchanging data and information between coroutines on the SW26010, consisting of the following parts. First, a producer-consumer channel based on a ring buffer is designed and implemented, together with a synchronization mechanism for the multi-producer, multi-consumer case that builds on the different atomic operations available on the MPE (management processing element) and the CPE (computing processing element) of the SW26010. Second, a wake-up mechanism between producer and consumer is designed, which reduces the time a program spends waiting for communication. Finally, the performance of the channel is tested and analyzed for different numbers of producers and consumers; the conclusion is that channel performance decreases as the number of producers and consumers increases.
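The channel described in this abstract can be pictured with a minimal sketch. It assumes ordinary Python threads and a lock with condition variables in place of the MPE/CPE atomic operations, so it illustrates the ring-buffer and wake-up idea only, not the SW26010 implementation. The names `RingChannel` and `run_demo` are hypothetical.

```python
import threading


class RingChannel:
    """Bounded channel backed by a fixed-size ring buffer (illustrative only)."""

    def __init__(self, capacity):
        self._buf = [None] * capacity
        self._capacity = capacity
        self._head = 0   # index of the next item to read
        self._count = 0  # number of items currently stored
        lock = threading.Lock()
        # Two condition variables share one lock, so a producer wakes only
        # consumers and a consumer wakes only producers.
        self._not_full = threading.Condition(lock)
        self._not_empty = threading.Condition(lock)

    def send(self, item):
        with self._not_full:
            while self._count == self._capacity:
                self._not_full.wait()      # sleep instead of spinning
            tail = (self._head + self._count) % self._capacity
            self._buf[tail] = item
            self._count += 1
            self._not_empty.notify()       # wake one waiting consumer

    def recv(self):
        with self._not_empty:
            while self._count == 0:
                self._not_empty.wait()
            item = self._buf[self._head]
            self._buf[self._head] = None
            self._head = (self._head + 1) % self._capacity
            self._count -= 1
            self._not_full.notify()        # wake one waiting producer
            return item


def run_demo(producers=3, consumers=2, per_producer=100, capacity=8):
    """Push per_producer items from each producer; return all items, sorted."""
    chan = RingChannel(capacity)
    received = []
    received_lock = threading.Lock()
    total = producers * per_producer

    def produce(pid):
        for i in range(per_producer):
            chan.send((pid, i))

    def consume(n):
        for _ in range(n):
            item = chan.recv()
            with received_lock:
                received.append(item)

    # Split the total number of items among the consumers.
    shares = [total // consumers + (1 if c < total % consumers else 0)
              for c in range(consumers)]
    threads = [threading.Thread(target=produce, args=(p,)) for p in range(producers)]
    threads += [threading.Thread(target=consume, args=(n,)) for n in shares]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(received)
```

Every sender and receiver contends for the same lock in this sketch, which is one plausible reason throughput would drop as the number of producers and consumers grows.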

Author(s):  
S. Blom ◽  
S. Darabi ◽  
M. Huisman ◽  
M. Safari

A commonly used approach to develop deterministic parallel programs is to augment a sequential program with compiler directives that indicate which program blocks may potentially be executed in parallel. This paper develops a verification technique to reason about such compiler directives, in particular to show that they do not change the behaviour of the program. Moreover, the verification technique is tool-supported and can be combined with proving functional correctness of the program. To develop our verification technique, we propose a simple intermediate representation (syntax and semantics) that captures the main forms of deterministic parallel programs. This language distinguishes three kinds of basic blocks: parallel, vectorised and sequential blocks, which can be composed using three different composition operators: sequential, parallel and fusion composition. We show how a widely used subset of OpenMP can be encoded into this intermediate representation. Our verification technique builds on the notion of iteration contract to specify the behaviour of basic blocks; we show that if iteration contracts are manually specified for single blocks, then that is sufficient to automatically reason about data race freedom of the composed program. Moreover, we also show that it is sufficient to establish functional correctness on a linearised version of the original program to conclude functional correctness of the parallel program. Finally, we exemplify our approach on an example OpenMP program, and we discuss how tool support is provided.
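The intuition behind an iteration contract can be sketched as follows: each loop iteration declares which locations it may read and write, and the loop is free of data races when no iteration writes a location that another iteration reads or writes. The code below is a brute-force check over concrete iterations under that assumption; it is not the deductive, tool-supported technique the paper describes, and the function name `race_free` is hypothetical.

```python
from itertools import combinations


def race_free(n_iters, reads, writes):
    """Return True if no two distinct iterations conflict.

    reads(i) and writes(i) return the sets of memory locations that
    iteration i declares it may read and write (its "contract").
    """
    footprints = [(set(reads(i)), set(writes(i))) for i in range(n_iters)]
    for (r1, w1), (r2, w2) in combinations(footprints, 2):
        # A conflict is a write in one iteration touching anything the
        # other iteration reads or writes.
        if w1 & (r2 | w2) or w2 & r1:
            return False
    return True


# Loop body a[i] = b[i] + 1: every iteration touches its own cells only.
independent = race_free(
    8,
    reads=lambda i: {("b", i)},
    writes=lambda i: {("a", i)},
)

# Loop body a[i] = a[i - 1] + 1: iteration i reads what i - 1 writes.
dependent = race_free(
    8,
    reads=lambda i: {("a", i - 1)},
    writes=lambda i: {("a", i)},
)
```

Here `independent` is `True` and `dependent` is `False`, matching the usual intuition about which of the two loops may safely be marked parallel.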


2021 ◽  
Vol 18 (1) ◽  
pp. 22-30
Author(s):  
Erna Nurmawati ◽  
Robby Hasan Pangaribuan ◽  
Ibnu Santoso

One way to deal with missing values or incomplete data is to impute the data using the EM algorithm. Because large volumes of data must be processed quickly, the serial EM program needs to be parallelized. In the parallel program architecture of the EM algorithm in this study, the controller interacts only with the EM module, whereas the EM module itself uses the matrix and vector modules intensively. Parallelization is done with OpenMP in the EM module, which results in shorter compute time for the parallel program than for the serial program. Compared with running on 2 (two) threads, running on 4 (four) threads increases speedup and reduces compute time but also reduces efficiency.
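The last observation follows from the standard definitions: speedup is serial time divided by parallel time, and efficiency is speedup divided by the number of threads. The sketch below uses hypothetical timings chosen only to show how speedup can rise while efficiency falls; they are not measurements from the study.

```python
def speedup(t_serial, t_parallel):
    """Speedup S = T_serial / T_parallel."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, threads):
    """Efficiency E = S / number of threads."""
    return speedup(t_serial, t_parallel) / threads


# Hypothetical timings in seconds (assumed for illustration only).
T_SERIAL = 12.0
T_TWO_THREADS = 8.0
T_FOUR_THREADS = 6.0

s2 = speedup(T_SERIAL, T_TWO_THREADS)         # 1.5
s4 = speedup(T_SERIAL, T_FOUR_THREADS)        # 2.0
e2 = efficiency(T_SERIAL, T_TWO_THREADS, 2)   # 0.75
e4 = efficiency(T_SERIAL, T_FOUR_THREADS, 4)  # 0.5
```

Doubling the thread count shortens the run in this example, but each thread contributes less, so efficiency drops from 0.75 to 0.5.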


2021 ◽  
Vol 24 (1) ◽  
pp. 157-183
Author(s):  
Никита Андреевич Катаев

Automation of parallel programming is important at any stage of parallel program development. These stages include profiling of the original program, program transformation, which allows us to achieve higher performance after program parallelization, and, finally, construction and optimization of the parallel program. It is also important to choose a suitable parallel programming model to express parallelism available in a program. On the one hand, the parallel programming model should be able to map the parallel program to a variety of existing hardware resources. On the other hand, it should simplify the development of the assistant tools and it should allow the user to explore the parallel program the assistant tools generate in a semi-automatic way. The SAPFOR (System FOR Automated Parallelization) system combines various approaches to automation of parallel programming. Moreover, it allows the user to guide the parallelization if necessary. SAPFOR produces parallel programs according to the high-level DVMH parallel programming model, which simplifies the development of efficient parallel programs for heterogeneous computing clusters. This paper focuses on the approach to semi-automatic parallel programming that SAPFOR implements. We discuss the architecture of the system and present the interactive subsystem, which is useful for guiding SAPFOR through program parallelization. We used the interactive subsystem to parallelize programs from the NAS Parallel Benchmarks in a semi-automatic way. Finally, we compare the performance of manually written parallel programs with programs the SAPFOR system builds.


2012 ◽  
Vol 433-440 ◽  
pp. 2892-2898
Author(s):  
Guang Lei Fei ◽  
Jian Guo Ning ◽  
Tian Bao Ma

Parallel computing has been applied in many fields, and a PC cluster based on the MPI (Message Passing Interface) library under the Linux operating system is a cost-effective platform for parallel computing. In this paper, the key algorithm of a parallel program for explosion and impact problems is presented. Techniques for resolving data dependences and for communication between subdomains are proposed. Tests show that the portability of the MMIC-3D parallel program is satisfactory and that, compared with a single computer, the PC cluster greatly improves calculation speed and enlarges the scale of the computation.


2003 ◽  
Vol 13 (03) ◽  
pp. 473-484 ◽  
Author(s):  
KONRAD HINSEN

One of the main obstacles to a more widespread use of parallel computing in computational science is the difficulty of implementing, testing, and maintaining parallel programs. The combination of a simple parallel computation model, BSP, and a high-level programming language, Python, simplifies these tasks significantly. It allows the rapid development facilities of Python to be applied to parallel programs, providing interactive development as well as interactive debugging of parallel programs.
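The BSP model mentioned here structures a program as a sequence of supersteps: each process computes locally and posts messages, and the messages become visible only after a barrier. The sketch below simulates that structure sequentially in plain Python; it does not use the Python BSP library the article discusses, and `bsp_run` and `global_sum` are hypothetical names.

```python
def bsp_run(num_procs, initial_states, step, num_supersteps):
    """Simulate BSP supersteps sequentially.

    step(step_no, pid, state, inbox) must return (new_state, outgoing),
    where outgoing is a list of (destination_pid, message) pairs.
    """
    states = list(initial_states)
    inboxes = [[] for _ in range(num_procs)]
    for step_no in range(num_supersteps):
        next_inboxes = [[] for _ in range(num_procs)]
        for pid in range(num_procs):
            # Local computation phase: a process sees only its own state
            # and the messages delivered at the previous barrier.
            states[pid], outgoing = step(step_no, pid, states[pid], inboxes[pid])
            for dest, msg in outgoing:
                next_inboxes[dest].append(msg)
        # Barrier: posted messages are delivered for the next superstep.
        inboxes = next_inboxes
    return states


def global_sum(values):
    """Every process ends up holding the sum of all initial values."""
    p = len(values)

    def step(step_no, pid, state, inbox):
        if step_no == 0:
            # Superstep 0: send the local value to every process.
            return state, [(dest, state) for dest in range(p)]
        # Superstep 1: all values have arrived; reduce them locally.
        return sum(inbox), []

    return bsp_run(p, values, step, num_supersteps=2)


result = global_sum([1, 2, 3, 4])  # [10, 10, 10, 10]
```

Because communication happens only at superstep boundaries, a program written this way can be stepped through and inspected one superstep at a time, which is in the spirit of the interactive development and debugging the abstract describes.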


2001 ◽  
Vol 12 (03) ◽  
pp. 285-306 ◽  
Author(s):  
NORIYUKI FUJIMOTO ◽  
TOMOKI BABA ◽  
TAKASHI HASHIMOTO ◽  
KENICHI HAGIHARA

In this paper, we report a performance gap between a schedule with small makespan on the task scheduling model and the corresponding parallel program on distributed memory parallel machines. The main reason for the gap is the software overhead in interprocessor communication. As a result, speedup ratios of schedules on the model do not approximate those of parallel programs on the machines well. The purpose of the paper is to obtain a task scheduling algorithm that generates a schedule with small makespan that also approximates the corresponding parallel program well. For this purpose, we propose algorithm BCSH, which generates only bulk synchronous schedules. In those schedules, no-communication phases and communication phases appear alternately. All interprocessor communications are done only in the latter phases, and thus the corresponding parallel programs can easily make better use of the message packaging technique. Message packaging reduces the many software overheads of messages from one source processor to the same destination processor to almost a single software overhead, and improves the performance of a parallel program significantly. Finally, we show some experimental results of performance gaps on BCSH, Kruatrachue's algorithm DSH, and Ahmad et al.'s algorithm ECPFD. The schedules produced by DSH and ECPFD are famous for their small makespans, but message packaging cannot be effectively applied to the corresponding programs. The results show that a bulk synchronous schedule with small makespan has the advantages that the gap is small and the corresponding program is a high performance parallel one.
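The benefit of message packaging can be illustrated with a simple linear cost model: each message sent pays a fixed software overhead plus a per-byte cost, and messages with the same source and destination in one communication phase are merged so the overhead is paid once. The model and its parameters are assumptions made for illustration and are not taken from the paper; `comm_cost` is a hypothetical name.

```python
from collections import defaultdict


def comm_cost(messages, overhead, per_byte, package=False):
    """Cost of one communication phase.

    messages is a list of (source, destination, size_in_bytes) tuples.
    With package=True, messages sharing a (source, destination) pair are
    merged into one message, so the fixed overhead is paid once per pair.
    """
    if not package:
        return sum(overhead + per_byte * size for _, _, size in messages)
    grouped = defaultdict(int)
    for src, dst, size in messages:
        grouped[(src, dst)] += size
    return sum(overhead + per_byte * size for size in grouped.values())


# Hypothetical phase: three messages from processor 0 to 1, one from 2 to 1.
phase = [(0, 1, 100), (0, 1, 200), (0, 1, 50), (2, 1, 100)]

unpackaged = comm_cost(phase, overhead=50, per_byte=1)               # 650
packaged = comm_cost(phase, overhead=50, per_byte=1, package=True)   # 550
```

The transferred volume is identical in both cases; only the number of fixed overheads changes, from four to two in this example.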


2020 ◽  
Vol 10 (1) ◽  
pp. 56-66
Author(s):  
Alexander Ishchenko

The article analyzes the current state of information exchange in the control system of a mechanized brigade during its combat operations in the anti-terrorist operation and the operation of the combined forces of the Armed Forces of Ukraine in the east of the country. It is determined that the communication system of the mechanized brigade has a low level of survivability. The cause is the large amount of damage to communications tools from enemy fire and from limited technical reliability. Maintaining a given level of survivability of the communication system is possible primarily through the timely restoration of damaged communications tools. Existing models that can be used to estimate the number of recovered communications tools are analyzed. Their strengths and weaknesses are identified and directions for improvement are formulated. A simulation model of the process of repairing communications tools in the repair unit of a mechanized brigade is developed on a personal computer in the AnyLogic 7.0.2 Professional software environment; unlike existing models, it takes into account the intensity of communications failures due to enemy fire during combat periods, as well as communications tools that fail due to limited technical reliability. The type of queuing system used to describe the recovery process in the repair unit is justified. The algorithm of the simulation model is given. The number of recovered communications tools is calculated with the simulation model for a hypothetical example. The dependence of the probability that communications tools are serviced on the number of repair technicians has been determined.


2015 ◽  
Vol 19 (4) ◽  
pp. 48-58
Author(s):  
A. I. Legalov ◽  
O. V. Nepomnyaschy ◽  
I. V. Matkovsky ◽  
M. S. Kropacheva

The peculiarities of transforming functional dataflow parallel programs into programs with finite resources are analysed. It is considered how these transformations are affected by the usage of asynchronous lists, the return of delayed lists and the variation of the data arrival pace relative to the time of its processing. These transformations allow us to generate multiple programs with static parallelism based on one and the same functional dataflow parallel program.


Author(s):  
И.Н. ПАНТЕЛЕЙМОНОВ ◽  
А.А. МОНАСТЫРЕНКО ◽  
А.В. БЕЛОЗЕРЦЕВ ◽  
В.В. БОЦВА ◽  
Л.В. ЩЕРБАТЫХ ◽  
...  

The paper is devoted to the problems of organizing communication in a personal mobile satellite communication system based on low-orbit relay satellites, which provides broadband access to data transmission networks using a small personal subscriber terminal. A brief analysis of the existing systems of personal mobile satellite communications is given, the requirements for the communication system are defined, the main directions of information exchange organization are considered, and a solution for creating a subscriber terminal is proposed.

