distributed memory
Recently Published Documents


TOTAL DOCUMENTS

1955
(FIVE YEARS 140)

H-INDEX

49
(FIVE YEARS 4)

2022 ◽  
Vol 2022 ◽  
pp. 1-13
Author(s):  
Jianhua Li ◽  
Guanlong Liu ◽  
Zhiyuan Zhen ◽  
Zihao Shen ◽  
Shiliang Li ◽  
...  

Molecular docking aims to predict possible drug candidates for many diseases and is computationally intensive. In particular, when simulating the ligand-receptor binding process, the binding pocket of the receptor is divided into subcubes, and docking the ligand into every subcube generates many molecular docking tasks, which are extremely time-consuming. In this study, we propose a heterogeneous parallel scheme of molecular docking for the ligand-receptor binding process to accelerate the simulation. The scheme has two layers of parallelism: a coarse-grained layer implemented with the message-passing interface (MPI) and a fine-grained layer running on the graphics processing unit (GPU). At the coarse-grained layer, the docking task inside one subcube is assigned to one unique MPI process, and a grouped master-slave mode is used to allocate and schedule the tasks. At the fine-grained layer, GPU accelerators handle the computationally intensive evaluation of scoring functions and the related conformational spatial transformations within a single docking task. Experiments on the ligand-receptor binding process show that, on a multicore server with GPUs, the parallel program achieves speedups of up to 45 times for flexible docking and up to 54.5 times for semiflexible docking, and that on a distributed-memory system the docking time for both flexible and semiflexible docking decreases steadily as the number of nodes increases. The scalability of the parallel program, verified on multiple nodes of a distributed-memory system, is approximately linear.
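The grouped master-slave scheduling described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: Python threads and a shared task queue stand in for MPI ranks, and `score_subcube` is a hypothetical placeholder for the GPU-accelerated scoring of one subcube.

```python
# Sketch of the coarse-grained layer: a pool of workers pulls docking
# tasks (one per subcube of the binding pocket) from a shared queue
# until all subcubes are scored. A real implementation would use MPI
# point-to-point messages between a master rank and worker ranks.
import queue
import threading

def score_subcube(subcube_id):
    # Hypothetical stand-in for the GPU-accelerated scoring function.
    return subcube_id * 0.5

def master_slave_docking(num_subcubes, num_workers=4):
    tasks = queue.Queue()
    for i in range(num_subcubes):
        tasks.put(i)
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                task = tasks.get_nowait()
            except queue.Empty:
                return  # no tasks left: this worker is done
            score = score_subcube(task)
            with lock:
                results[task] = score

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

scores = master_slave_docking(8)
```

The dynamic pull-based allocation is what makes the scheme load-balanced: fast workers simply take more subcubes, which matters when docking times vary between pocket regions.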


2022 ◽  
pp. 89-157
Author(s):  
Peter S. Pacheco ◽  
Matthew Malensek
Keyword(s):  

Author(s):  
Justin Grandinetti ◽  
Taylor Abrams-Rollinson

Introduced in July 2016, Pokémon GO is widely considered the killer app for contemporary augmented reality. Popular attention to the game has waned in recent years, but Pokémon GO remains enormously successful in terms of both player base and revenue generation. Whether individuals experienced the game for a short time or remain dedicated hardcore players, Pokémon GO exists as memories of time and place, imbuing familiar sites and routes with new meaning and temporal connection. Attending to these complex interrelationships of place, space, mobility, humans, technologies, infrastructures, environments, and memory, we situate Pokémon GO as what Hayles (2016) calls a cognitive assemblage: a sociotechnical system of interconnectivity in which cognition is an exteriorized process occurring across multiple levels, sites, and boundaries. In turn, we conceptualize cognition (and specifically memory) not as confined within a delimited hominid body, but as operating through contextual relations, at multiple sites, and in a constant state of becoming. By reflecting on our own experiences as part of the distributed memory of Pokémon GO, we situate memory as a momentary convergence of signals made possible by infrastructures, inscribed on servers and silicon, and made part of algorithmic suggestion and learning AI. Additionally, our own memories and experiences serve to highlight the experiential complexity of cognitive assemblages in relation to structures of feeling, as well as new temporal and spatial relations.


2021 ◽  
Author(s):  
Piotr Dziekan ◽  
Piotr Zmijewski

Abstract. A numerical cloud model with Lagrangian particles coupled to an Eulerian flow is adapted for distributed-memory systems. Eulerian and Lagrangian calculations can be done in parallel on CPUs and GPUs, respectively. Scaling efficiency and the fraction of CPU and GPU calculations performed in parallel both exceed 50 % for up to 40 nodes. Thanks to the use of GPUs, a sophisticated Lagrangian microphysics model slows the simulation down by only 50 % compared to a simplistic bulk microphysics model. The overhead of communications between cluster nodes is mostly related to the pressure solver. The presented method of adaptation to computing clusters can be used in any numerical model with Lagrangian particles coupled to an Eulerian fluid flow.
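The Eulerian-Lagrangian coupling at the heart of such models can be illustrated with a minimal sketch (not the paper's code): particles are advected using a velocity field stored on an Eulerian grid, here in 1D with linear interpolation and a forward-Euler step. All names and values are hypothetical.

```python
# Minimal 1D Eulerian-Lagrangian coupling: the Eulerian side provides
# a gridded velocity field; the Lagrangian side moves particles through
# it. In the paper's setup these two computations run concurrently on
# CPUs and GPUs, respectively.

def interp_velocity(u_grid, dx, x):
    """Linearly interpolate cell-edge velocities u_grid at position x."""
    i = int(x // dx)
    frac = x / dx - i
    return (1.0 - frac) * u_grid[i] + frac * u_grid[i + 1]

def advect_particles(particles, u_grid, dx, dt):
    """One forward-Euler advection step for all Lagrangian particles."""
    return [x + dt * interp_velocity(u_grid, dx, x) for x in particles]

# Flow that speeds up from 1.0 to 2.0 across a 3-cell grid with dx = 1.
u = [1.0, 1.0, 2.0, 2.0]
new_pos = advect_particles([0.5, 1.5], u, dx=1.0, dt=0.1)
```

In a distributed-memory setting, each node would own a slab of the grid plus the particles inside it, with particles handed to a neighbouring node when advection carries them across a subdomain boundary.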


2021 ◽  
Author(s):  
Ghada Dessouky ◽  
Mihailo Isakov ◽  
Michel A. Kinsy ◽  
Pouya Mahmoody ◽  
Miguel Mark ◽  
...  
Keyword(s):  

Algorithms ◽  
2021 ◽  
Vol 14 (12) ◽  
pp. 342
Author(s):  
Alessandro Varsi ◽  
Simon Maskell ◽  
Paul G. Spirakis

Resampling is a well-known statistical algorithm that is commonly applied in the context of Particle Filters (PFs) in order to perform state estimation for non-linear non-Gaussian dynamic models. As the models become more complex and accurate, the run-time of PF applications increases accordingly. Parallel computing can help to address this. However, resampling (and, hence, PFs as well) necessarily involves a bottleneck, the redistribution step, which is notoriously challenging to parallelize using textbook parallel computing techniques. A state-of-the-art redistribution takes O((log2 N)^2) computations on Distributed Memory (DM) architectures, which most supercomputers adopt, whereas redistribution can be performed in O(log2 N) on Shared Memory (SM) architectures, such as GPUs or mainstream CPUs. In this paper, we propose a novel parallel redistribution for DM that achieves O(log2 N) time complexity. We also present empirical results indicating that our novel approach outperforms the O((log2 N)^2) approach.
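To make the bottleneck concrete, here is what the redistribution step computes, shown sequentially for clarity. The paper's contribution is performing this in O(log2 N) on distributed memory; this O(N) single-process sketch (not the authors' algorithm) only illustrates the operation itself.

```python
# Redistribution: after resampling decides how many copies ncopies[i]
# each particle i survives with, the particle set is rewritten so the
# output again holds N = sum(ncopies) particles. Parallelizing this
# duplication across distributed nodes is the hard part the paper
# addresses.

def redistribute(particles, ncopies):
    """Duplicate particles[i] exactly ncopies[i] times."""
    out = []
    for p, c in zip(particles, ncopies):
        out.extend([p] * c)
    return out

new_particles = redistribute(['a', 'b', 'c', 'd'], [2, 0, 1, 1])
```

The irregularity is visible even in this tiny example: particle 'a' is written twice while 'b' vanishes, so output positions depend on a prefix sum of the copy counts, which is why naive parallel versions need repeated communication rounds.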


Fluids ◽  
2021 ◽  
Vol 6 (11) ◽  
pp. 395
Author(s):  
Hui Liu ◽  
Zhangxin Chen ◽  
Xiaohu Guo ◽  
Lihua Shen

Reservoir simulation solves a set of fluid flow equations through porous media, which are partial differential equations from the petroleum engineering industry described by Darcy's law. This paper introduces the model, numerical methods, algorithms, and parallel implementation of a thermal reservoir simulator designed for numerical simulations of a thermal reservoir with multiple components in a three-dimensional domain using distributed-memory parallel computers. Its full mathematical model is introduced, with correlations for important properties and well modeling. Efficient numerical methods (discretization scheme, matrix decoupling methods, and preconditioners), parallel computing technologies, and implementation details are presented. The numerical methods applied in this paper are efficient and scalable, and are suitable for large-scale thermal reservoir simulations with tens of thousands of CPU cores (MPI processes). The simulator is designed for giant models with billions or even trillions of grid blocks using hundreds of thousands of CPUs, which is our main focus. For validation, the simulator is compared against CMG STARS, one of the most popular and mature commercial thermal simulators. Numerical experiments show that our results match those of the commercial simulator, which confirms the correctness of our methods and implementations. A SAGD simulation with 7406 well pairs is also presented to study the effectiveness of our numerical methods. Scalability tests demonstrate that our simulator can handle giant models with billions of grid blocks using 100,800 CPU cores with good scalability.
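Darcy's law, which underlies the flow equations mentioned above, relates volumetric flux to the pressure gradient: q = -(k/mu) dp/dx. A minimal 1D finite-difference illustration follows; the grid, permeability, and viscosity values are hypothetical and chosen only to show the discretization, not taken from the paper.

```python
# Discrete Darcy flux between adjacent cells of a 1D pressure field:
#   q_{i+1/2} = -(k / mu) * (p[i+1] - p[i]) / dx
# This face flux is the basic building block that a reservoir
# simulator assembles into its mass- and energy-balance equations.

def darcy_flux(p, k, mu, dx):
    """Face fluxes (m/s) between adjacent cells of pressure field p (Pa)."""
    return [-(k / mu) * (p[i + 1] - p[i]) / dx for i in range(len(p) - 1)]

# Linear pressure drop from 2.0e5 Pa to 1.0e5 Pa over 4 cells of 10 m,
# with permeability k = 1e-13 m^2 and viscosity mu = 1e-3 Pa*s.
q = darcy_flux([2.0e5, 1.75e5, 1.5e5, 1.25e5, 1.0e5], k=1e-13, mu=1e-3, dx=10.0)
```

Because the pressure drop here is linear, every face carries the same flux, which is the expected steady-state behaviour for incompressible single-phase flow.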


2021 ◽  
Author(s):  
Gaston Irrmann ◽  
Sébastien Masson ◽  
Éric Maisonnave ◽  
David Guibert ◽  
Erwan Raffin

Abstract. Communications in distributed-memory supercomputers still limit the scalability of geophysical models. Considering the recent trends of the semiconductor industry, we think this problem is here to stay. We present the optimisations that have been implemented in the current reference version of the ocean model, NEMO 4.0, to improve its scalability. Thanks to the collaboration of oceanographers and HPC experts, we identified and removed the unnecessary communications in two bottleneck routines: the computation of the free-surface pressure gradient and the forcing in the straits or unstructured open boundaries. Since a wrong parallel decomposition choice could undermine computing performance, we impose its automatic definition in all cases, including when subdomains containing land points only are excluded from the decomposition. For a smaller audience of developers and vendors, we propose a new benchmark configuration, easy to use while offering the full complexity of operational versions.
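The exclusion of land-only subdomains mentioned above can be sketched in a toy form: split a rectangular grid into nx-by-ny subdomains and keep only those containing at least one ocean point, so no MPI process is wasted on pure land. The mask and splitting logic below are illustrative, not NEMO's actual code.

```python
# Toy domain decomposition: mask[j][i] is True over ocean, False over
# land. Subdomains whose cells are all land are dropped from the
# decomposition, mirroring NEMO's removal of land-only subdomains.

def active_subdomains(mask, nx, ny):
    """Return (j, i) indices of subdomains containing any ocean point."""
    rows, cols = len(mask), len(mask[0])
    hj, hi = rows // ny, cols // nx   # subdomain height and width
    keep = []
    for j in range(ny):
        for i in range(nx):
            block = [mask[jj][ii]
                     for jj in range(j * hj, (j + 1) * hj)
                     for ii in range(i * hi, (i + 1) * hi)]
            if any(block):
                keep.append((j, i))
    return keep

# 4x4 grid whose right half is land: with a 2x1 split, only the
# left subdomain survives.
mask = [[True, True, False, False]] * 4
domains = active_subdomains(mask, nx=2, ny=1)
```

In practice the decomposition search also has to balance the ocean workload across the surviving subdomains, which is why NEMO automates the choice rather than leaving it to the user.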

