General-Purpose Processors
Recently Published Documents


TOTAL DOCUMENTS: 85 (five years: 6)
H-INDEX: 16 (five years: 2)

Energies, 2020, Vol. 13 (17), pp. 4573
Author(s): Anna Franczyk, Damian Gwiżdż, Andrzej Leśniak

This paper aims to provide a quantitative understanding of the performance of numerical modeling of the wave field equation using general-purpose processors. In particular, this article presents the most important aspects related to the memory workloads and execution time of the numerical modeling of both acoustic and fully elastic waves in isotropic and anisotropic media. The results presented in this article were calculated for the staggered-grid finite difference method. Our results show that the more realistic the seismic wave simulation, the greater its demand on the memory and computational capacity of the computing environment. The results presented here allow the memory requirements and computational time of wavefield modeling to be estimated for the considered model (acoustic, elastic or anisotropic), so that its feasibility in a given computing environment and within an acceptable time can be assessed. Understanding numerical modeling performance is especially important when graphics processing units (GPUs) are used to satisfy the intensive calculations of three-dimensional seismic forward modeling.
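As a concrete illustration of the staggered-grid finite difference method referred to above, the following is a minimal second-order 2D acoustic sketch in C++ (constant medium; grid size, time step and source are illustrative placeholders, and it is not the authors' code). Even this toy version needs three full-grid field arrays, which hints at why 3D elastic and anisotropic models, with many more field and material-parameter arrays, drive the memory demands quantified in the paper.

#include <cmath>
#include <cstdio>
#include <vector>

// Minimal 2D acoustic staggered-grid finite-difference sketch (pressure p and
// particle velocities vx, vz, second-order in space and time). Grid size,
// medium parameters and the Ricker source are illustrative placeholders.
int main() {
    const int nx = 200, nz = 200, nt = 500;
    const double dx = 5.0, dt = 0.0005;           // grid spacing (m), time step (s)
    const double c = 2000.0, rho = 2000.0;        // velocity (m/s), density (kg/m^3)
    const double kappa = rho * c * c;             // bulk modulus
    const double pi = 3.14159265358979323846;

    std::vector<double> p(nx * nz, 0.0), vx(nx * nz, 0.0), vz(nx * nz, 0.0);
    auto idx = [nx](int i, int k) { return k * nx + i; };

    for (int it = 0; it < nt; ++it) {
        // inject a Ricker wavelet at the grid centre
        const double f0 = 25.0, t = it * dt;
        const double a = pi * f0 * (t - 1.0 / f0);
        p[idx(nx / 2, nz / 2)] += (1.0 - 2.0 * a * a) * std::exp(-a * a);

        // update particle velocities from the pressure gradient
        for (int k = 0; k < nz - 1; ++k)
            for (int i = 0; i < nx - 1; ++i) {
                vx[idx(i, k)] -= dt / (rho * dx) * (p[idx(i + 1, k)] - p[idx(i, k)]);
                vz[idx(i, k)] -= dt / (rho * dx) * (p[idx(i, k + 1)] - p[idx(i, k)]);
            }

        // update pressure from the divergence of the velocity field
        for (int k = 1; k < nz; ++k)
            for (int i = 1; i < nx; ++i)
                p[idx(i, k)] -= dt * kappa / dx *
                    (vx[idx(i, k)] - vx[idx(i - 1, k)] +
                     vz[idx(i, k)] - vz[idx(i, k - 1)]);
    }

    std::printf("pressure at centre after %d steps: %g\n", nt, p[idx(nx / 2, nz / 2)]);
    return 0;
}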


2018, Vol. 4, pp. e160
Author(s): Dragan D. Nikolić

Numerical solutions of equation-based simulations require computationally intensive tasks such as evaluation of model equations, linear algebra operations and solution of systems of linear equations. The focus in this work is on parallel evaluation of model equations on shared memory systems such as general purpose processors (multi-core CPUs and manycore devices), streaming processors (Graphics Processing Units and Field Programmable Gate Arrays) and heterogeneous systems. The current approaches for evaluation of model equations are reviewed and their capabilities and shortcomings analysed. Since stream computing differs from traditional computing in that the system processes a sequential stream of elements, equations must be transformed into a data structure suitable for both types of processors. Postfix-notation expression stacks are recognised as a platform- and programming-language-independent method to describe, store in computer memory and evaluate general systems of differential and algebraic equations of any size. Each mathematical operation and its operands are described by a specially designed data structure, and every equation is transformed into an array of these structures (a Compute Stack). Compute Stacks are evaluated by a stack machine using a last-in-first-out (LIFO) value stack. The stack machine is implemented in the DAE Tools modelling software in the C99 language using two Application Programming Interfaces (APIs)/frameworks for parallelism. The Open Multi-Processing (OpenMP) API is used for parallelisation on general purpose processors, and the Open Computing Language (OpenCL) framework is used for parallelisation on streaming processors and heterogeneous systems. The performance of the sequential Compute Stack approach is compared to the direct C++ implementation and to the previous approach that uses evaluation trees. The new approach is 45% slower than the C++ implementation and more than five times faster than the previous one. The OpenMP and OpenCL implementations are tested on three medium-scale models using a multi-core CPU, a discrete GPU, an integrated GPU and heterogeneous computing setups. Execution times are compared and analysed, and the advantages of the OpenCL implementation running on a discrete GPU and on heterogeneous systems are discussed. It is found that evaluation of model equations using the parallel OpenCL implementation running on a discrete GPU is up to twelve times faster than the sequential version, while the overall simulation speed-up gained is more than three times.
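To make the Compute Stack idea more tangible, here is a small, self-contained C++ sketch of a postfix expression array evaluated by a stack machine. The item layout, names and supported operations are ours for illustration only, not DAE Tools' actual data structures; the point is that every equation becomes a flat array of plain structs, which is straightforward to evaluate in parallel (one equation per thread or work item) and to copy to an OpenCL device.

#include <cstdio>
#include <vector>

// Illustrative postfix "compute stack": each equation is flattened into an
// array of items; a tiny stack machine evaluates it with a LIFO value stack.
enum class Kind { Constant, Variable, Add, Mul };

struct StackItem {
    Kind   kind;
    double value;  // used when kind == Constant
    int    index;  // used when kind == Variable (index into the state vector)
};

double evaluate(const std::vector<StackItem>& eq, const std::vector<double>& x) {
    std::vector<double> s;  // value stack (last in, first out)
    for (const StackItem& it : eq) {
        switch (it.kind) {
            case Kind::Constant: s.push_back(it.value); break;
            case Kind::Variable: s.push_back(x[it.index]); break;
            case Kind::Add:
            case Kind::Mul: {
                double b = s.back(); s.pop_back();
                double a = s.back(); s.pop_back();
                s.push_back(it.kind == Kind::Add ? a + b : a * b);
                break;
            }
        }
    }
    return s.back();
}

int main() {
    // residual of the expression 2*x0 + x1, i.e. postfix: 2 x0 * x1 +
    std::vector<StackItem> eq = {
        {Kind::Constant, 2.0, 0},
        {Kind::Variable, 0.0, 0},
        {Kind::Mul,      0.0, 0},
        {Kind::Variable, 0.0, 1},
        {Kind::Add,      0.0, 0},
    };
    std::vector<double> x = {3.0, 4.0};
    std::printf("residual = %g\n", evaluate(eq, x));  // prints 10
    return 0;
}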


2018, Vol. 35 (3), pp. 421-432
Author(s): Ben Langmead, Christopher Wilks, Valentin Antonescu, Rone Charles

2017
Author(s): Ben Langmead, Christopher Wilks, Valentin Antonescu, Rone Charles

General-purpose processors can now contain many dozens of processor cores and support hundreds of simultaneous threads of execution. To make best use of these threads, genomics software must contend with new and subtle computer architecture issues. We discuss some of these and propose methods for improving thread scaling in tools that analyze each read independently, such as read aligners. We implement these methods in new versions of Bowtie, Bowtie 2 and HISAT. We greatly improve thread scaling in many scenarios, including on the recent Intel Xeon Phi architecture. We also highlight how bottlenecks are exacerbated by variable-record-length file formats like FASTQ and suggest changes that enable superior scaling.
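One of the bottlenecks alluded to above is contention on the shared input stream: with a variable-record-length format like FASTQ, record boundaries are only found by parsing, so threads tend to serialize on the input lock. A common mitigation, sketched below, is to have each thread claim a whole batch of records per lock acquisition instead of one record at a time. The file name, batch size and per-read work here are placeholders, and this is only an illustration of the general idea, not the Bowtie/HISAT implementation.

#include <atomic>
#include <fstream>
#include <iostream>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

static std::mutex input_mutex;
static std::atomic<size_t> total_reads{0};

// Pull up to max_records FASTQ records (4 lines each) from the shared stream,
// holding the input lock once per batch rather than once per record.
bool read_batch(std::istream& in, std::vector<std::string>& batch, size_t max_records) {
    std::lock_guard<std::mutex> lock(input_mutex);
    batch.clear();
    std::string line;
    while (batch.size() < max_records) {
        std::string record;
        for (int i = 0; i < 4; ++i) {
            if (!std::getline(in, line)) return !batch.empty();
            record += line;
            record += '\n';
        }
        batch.push_back(std::move(record));
    }
    return true;
}

void worker(std::istream& in, size_t batch_size) {
    std::vector<std::string> batch;
    while (read_batch(in, batch, batch_size)) {
        for (const std::string& rec : batch) {
            (void)rec;                       // placeholder for per-read work (alignment, etc.)
            ++total_reads;
        }
        if (batch.size() < batch_size) break;  // short batch means end of input
    }
}

int main() {
    std::ifstream in("reads.fastq");           // placeholder input file
    if (!in) { std::cerr << "cannot open reads.fastq\n"; return 1; }
    std::vector<std::thread> threads;
    for (int t = 0; t < 4; ++t)
        threads.emplace_back(worker, std::ref(in), size_t{64});
    for (std::thread& th : threads) th.join();
    std::cout << "processed " << total_reads.load() << " reads\n";
    return 0;
}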

