Scheduling Multiple-version Programs on Multiple Processors

Author(s):  
Piotr Jedrzejowicz ◽  
Ewa Ratajczak
Radiation ◽  
2021 ◽  
Vol 1 (2) ◽  
pp. 79-94
Author(s):  
Peter K. Rogan ◽  
Eliseos J. Mucaki ◽  
Ben C. Shirley ◽  
Yanxin Li ◽  
Ruth C. Wilkins ◽  
...  

The dicentric chromosome (DC) assay accurately quantifies exposure to radiation; however, manual and semi-automated assignment of DCs has limited its use for a potential large-scale radiation incident. The Automated Dicentric Chromosome Identifier and Dose Estimator (ADCI) software automates unattended DC detection and determines radiation exposures, fulfilling IAEA criteria for triage biodosimetry. This study evaluates the throughput of high-performance ADCI (ADCI-HT) to stratify exposures of populations in 15 simulated population scale radiation exposures. ADCI-HT streamlines dose estimation using a supercomputer by optimal hierarchical scheduling of DC detection for varying numbers of samples and metaphase cell images in parallel on multiple processors. We evaluated processing times and accuracy of estimated exposures across census-defined populations. Image processing of 1744 samples on 16,384 CPUs required 1 h 11 min 23 s and radiation dose estimation based on DC frequencies required 32 sec. Processing of 40,000 samples at 10 exposures from five laboratories required 25 h and met IAEA criteria (dose estimates were within 0.5 Gy; median = 0.07). Geostatistically interpolated radiation exposure contours of simulated nuclear incidents were defined by samples exposed to clinically relevant exposure levels (1 and 2 Gy). Analysis of all exposed individuals with ADCI-HT required 0.6–7.4 days, depending on the population density of the simulation.


Author(s):  
Chafik Arar ◽  
Mohamed Salah Khireddine

The paper proposes a new reliable fault-tolerant scheduling algorithm for real-time embedded systems. The proposed algorithm is based on static scheduling that allows to include the dependencies and the execution cost of tasks and data dependencies in its scheduling decisions. Our scheduling algorithm is dedicated to multi-bus heterogeneous architectures with multiple processors linked by several shared buses. This scheduling algorithm is considering only one bus fault caused by hardware faults and compensated by software redundancy solutions. The proposed algorithm is based on both active and passive backup copies to minimize the scheduling length of data on buses. In the experiments, the proposed methods are evaluated in terms of data scheduling length for a set of DSP benchmarks. The experimental results show the effectiveness of our technique.


2012 ◽  
Vol 2012 ◽  
pp. 1-12 ◽  
Author(s):  
Shaily Mittal ◽  
Nitin

Nowadays, Multiprocessor System-on-Chip (MPSoC) architectures are mainly focused on by manufacturers to provide increased concurrency, instead of increased clock speed, for embedded systems. However, managing concurrency is a tough task. Hence, one major issue is to synchronize concurrent accesses to shared memory. An important characteristic of any system design process is memory configuration and data flow management. Although, it is very important to select a correct memory configuration, it might be equally imperative to choreograph the data flow between various levels of memory in an optimal manner. Memory map is a multiprocessor simulator to choreograph data flow in individual caches of multiple processors and shared memory systems. This simulator allows user to specify cache reconfigurations and number of processors within the application program and evaluates cache miss and hit rate for each configuration phase taking into account reconfiguration costs. The code is open source and in java.


Author(s):  
Ning Yang ◽  
Shiaaulir Wang ◽  
Paul Schonfeld

A Parallel Genetic Algorithm (PGA) is used for a simulation-based optimization of waterway project schedules. This PGA is designed to distribute a Genetic Algorithm application over multiple processors in order to speed up the solution search procedure for a very large combinational problem. The proposed PGA is based on a global parallel model, which is also called a master-slave model. A Message-Passing Interface (MPI) is used in developing the parallel computing program. A case study is presented, whose results show how the adaption of a simulation-based optimization algorithm to parallel computing can greatly reduce computation time. Additional techniques which are found to further improve the PGA performance include: (1) choosing an appropriate task distribution method, (2) distributing simulation replications instead of different solutions, (3) avoiding the simulation of duplicate solutions, (4) avoiding running multiple simulations simultaneously in shared-memory processors, and (5) avoiding using multiple processors which belong to different clusters (physical sub-networks).


Author(s):  
Ian Bell ◽  
Matthias Kunick

The current trend in computer architecture is for increasingly parallel computation while the clock frequency stagnates. The increase in computing speed is achieved by dividing a process into several threads which are executed in parallel on multiple processors, processors with multiple cores, cores that are able to handle multiple threads (hyper-threading), graphical processing units (GPU), or co-processors. In order to take advantage of these new architectures, algorithms that have historically been implemented for serial evaluation need to be refactored for parallelization. In this work, a native multithreading framework in C++11 for scientific and engineering model development is presented.


Author(s):  
S�ren P. Olesen ◽  
Jens Gregor ◽  
Michael G. Thomason ◽  
Gary T. Smith

Sign in / Sign up

Export Citation Format

Share Document