Parallel Fast Transform-Based Preconditioners for Large-Scale Power Grid Analysis on Graphics Processing Units (GPUs)

Leading high-performance computing systems achieve their status through the use of highly parallel devices such as NVIDIA graphics processing units or Intel Xeon Phi many-core CPUs. Performance portability across such architectures, as well as traditional CPUs, is vital for the application programmer. In this paper we describe targetDP, a lightweight abstraction layer that allows grid-based applications to target data-parallel hardware in a platform-agnostic manner. We demonstrate the effectiveness of this pragmatic approach by presenting performance results for a complex fluid application (with which the model was co-designed) and for a separate lattice quantum chromodynamics particle physics code. For each application, a single source code base achieves portable performance, as assessed within the context of the Roofline model. targetDP can be combined with the Message Passing Interface (MPI) for use on systems containing multiple nodes: we demonstrate this with scaling results on traditional and graphics processing unit-accelerated large-scale supercomputers.
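The Roofline model mentioned in this abstract bounds attainable performance by the smaller of the device's peak compute rate and the product of its peak memory bandwidth and the kernel's arithmetic intensity. The following is a minimal sketch of that bound only; the function name and the numbers in the usage note are illustrative assumptions and do not come from the paper.

```python
def roofline_bound(peak_flops, peak_bandwidth, arithmetic_intensity):
    """Upper bound on attainable performance under the Roofline model.

    peak_flops           -- peak compute rate of the device (FLOP/s)
    peak_bandwidth       -- peak memory bandwidth (bytes/s)
    arithmetic_intensity -- FLOPs performed per byte moved (FLOP/byte)

    All three arguments are hypothetical inputs supplied by the caller.
    """
    memory_bound = peak_bandwidth * arithmetic_intensity
    # A kernel cannot exceed either the compute ceiling or the memory ceiling.
    return min(peak_flops, memory_bound)
```

For instance, with a made-up device of 1000 units of peak compute and 100 units of bandwidth, a kernel with intensity 2 is limited by memory traffic, while one with intensity 20 is limited by compute.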

Large-scale transient stability simulation on graphics processing units

2009 IEEE Power & Energy Society General Meeting ◽

10.1109/pes.2009.5275844 ◽

2009 ◽

Cited By ~ 14

Author(s):

Vahid Jalili-Marandi ◽

Venkata Dinavahi

Keyword(s):

Graphics Processing Units ◽

Large Scale ◽

Transient Stability ◽

Graphics Processing

Large-scale analytical Fourier transform of photomask layouts using graphics processing units

10.1117/12.2192040 ◽

2015 ◽

Author(s):

Julia A. Sakamoto

Keyword(s):

Fourier Transform ◽

Graphics Processing Units ◽

Large Scale ◽

Graphics Processing

Toward large-scale Hybrid Monte Carlo simulations of the Hubbard model on graphics processing units

Computer Physics Communications ◽

10.1016/j.cpc.2011.04.014 ◽

2011 ◽

Vol 182 (8) ◽

pp. 1651-1656 ◽

Cited By ~ 5

Author(s):

Kyle A. Wendt ◽

Joaquín E. Drut ◽

Timo A. Lähde

Keyword(s):

Monte Carlo ◽

Monte Carlo Simulations ◽

Hubbard Model ◽

Graphics Processing Units ◽

Large Scale ◽

Hybrid Monte Carlo ◽

Graphics Processing

Large Scale Bioinformatics Data Mining with Parallel Genetic Programming on Graphics Processing Units

Studies in Computational Intelligence - Parallel and Distributed Computational Intelligence ◽

10.1007/978-3-642-10675-0_6 ◽

2010 ◽

pp. 113-141 ◽

Cited By ~ 12

Author(s):

William B. Langdon

Keyword(s):

Data Mining ◽

Genetic Programming ◽

Graphics Processing Units ◽

Large Scale ◽

Graphics Processing ◽

Parallel Genetic Programming

Data-Parallel Techniques for Agent-Based Tissue Modeling on Graphics Processing Units

Volume 3: 28th Computers and Information in Engineering Conference, Parts A and B ◽

10.1115/detc2008-49661 ◽

2008 ◽

Cited By ~ 4

Author(s):

Ryan S. Richards ◽

Mikola Lysenko ◽

Roshan M. D’Souza ◽

Gary An

Keyword(s):

Real Time ◽

Graphics Processing Units ◽

Large Scale ◽

Experimental Studies ◽

Test Case ◽

Cell Systems ◽

Von Neumann ◽

Agent Based ◽

Cell Behaviors ◽

Graphics Processing

Agent-Based Modeling has recently been recognized as a method for in-silico multi-scale modeling of biological cell systems. Agent-Based Models (ABMs) allow results from experimental studies of individual cell behaviors to be scaled into the macro-behavior of interacting cells in complex cell systems or tissues. Current-generation ABM simulation toolkits are designed to work on serial von Neumann architectures, which have poor scalability. The best systems can barely handle tens of thousands of agents in real time. Considering that there are models for which mega-scale populations have significantly different emergent behaviors than smaller population sizes, it is important to be able to simulate such large-scale populations in real time. In this paper we present a new framework for simulating ABMs on programmable graphics processing units (GPUs). Novel algorithms and data structures have been developed for agent-state representation, agent motion, and replication. As a test case, we have implemented an abstracted version of the Systematic Inflammatory Response System (SIRS) ABM. Compared to the original implementation on the NetLogo system, our implementation can handle an agent population that is over three orders of magnitude larger with close to 40 updates/sec. We believe that our system is the only one of its kind that is capable of efficiently handling realistic problem sizes in biological simulations.

Data Streaming Processing Window Joined With Graphics Processing Units (GPUs)

Encyclopedia of Information Science and Technology, Fifth Edition - Advances in Information Quality and Management ◽

10.4018/978-1-7998-3479-3.ch043 ◽

2021 ◽

pp. 602-623

Author(s):

Shen Lu ◽

Richard S. Segall

Keyword(s):

Big Data ◽

Data Streams ◽

Graphics Processing Units ◽

Data Stream ◽

Large Scale ◽

Graphics Processing Unit ◽

Processing Unit ◽

Data Streaming ◽

Large Scale Data ◽

Graphics Processing

Big data is large-scale data and can be either discrete or continuous. This article discusses the continuous case of big data, often called “data streaming.” More and more businesses will depend on being able to process and make decisions on streams of data. The article focuses on the algorithmic side of data stream processing, often called “stream analytics” or “stream mining.” A data stream Window Join can be accelerated by using a graphics processing unit (GPU) for higher-performance computing. Data streams are generated by two independent threads: one thread generates Data Stream A, and the other generates Data Stream B. A Window Join thread then merges the two data streams; this merging is the “Data Stream Window Join.” The Window Join can be implemented in parallel, which improves computing speed. Experiments are provided for Data Stream Window Joins using both static and dynamic data.
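The window join described in this abstract can be sketched as a symmetric sliding-window equi-join. The sketch below is a single-threaded illustration under stated assumptions, not the authors' implementation: the tuple layout `(timestamp, source, key, value)`, the time-based window, and the function name `window_join` are all hypothetical, and the producer threads and GPU offload are omitted for determinism.

```python
from collections import deque


def window_join(events, window):
    """Symmetric sliding-window equi-join (illustrative sketch).

    events -- iterable of (timestamp, source, key, value) tuples ordered by
              timestamp, where source is "A" or "B" (assumed layout)
    window -- maximum timestamp difference for two tuples to be joined

    Returns a list of (key, value_from_a, value_from_b) matches.
    """
    buffers = {"A": deque(), "B": deque()}
    matches = []
    for ts, source, key, value in events:
        other = "B" if source == "A" else "A"
        buf = buffers[other]
        # Evict tuples of the opposite stream that have left the window.
        while buf and ts - buf[0][0] > window:
            buf.popleft()
        # Probe the opposite buffer; each comparison is independent of the
        # others, which is the part a parallel implementation could spread
        # across threads.
        for _, other_key, other_value in buf:
            if other_key == key:
                if source == "A":
                    matches.append((key, value, other_value))
                else:
                    matches.append((key, other_value, value))
        # Keep the new tuple so later arrivals on the other stream can match it.
        buffers[source].append((ts, key, value))
    return matches
```

As a usage example, joining the interleaved events `(0, "A", "x", "a0")`, `(1, "B", "x", "b1")`, `(2, "B", "y", "b2")`, `(5, "A", "y", "a5")`, `(6, "B", "x", "b6")` with a window of 3 pairs `a0` with `b1` and `a5` with `b2`; `b6` finds no partner because `a0` has already left the window.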
