GPU Accelerated Nonlinear Electronic Circuits Solver for Transient Simulation of Systems with Large Number of Components

David Černý; Josef Dobeš

doi:10.3390/electronics9111819

Electronics ◽

10.3390/electronics9111819 ◽

2020 ◽

Vol 9 (11) ◽

pp. 1819

Author(s):

David Černý ◽

Josef Dobeš

Keyword(s):

Krylov Subspace ◽

Direct Method ◽

Iterative Algorithms ◽

Transient Simulation ◽

Compute Unified Device Architecture ◽

Circuit Simulator ◽

Simulation Speed ◽

Device Architecture ◽

Highly Nonlinear ◽

Improved Accuracy

GPU cards have been used for scientific calculations for many years. Despite their ever-increasing performance, some workloads still cause problems. This article addresses performance and memory issues that may occur during GPU calculations of iterative algorithms, together with their solutions. Specifically, the article focuses on optimizing the transient simulation of extra-large, highly nonlinear, time-dependent circuits in a SPICE-like electronic circuit simulator core enhanced with an NVIDIA CUDA (Compute Unified Device Architecture) interface and iterative Krylov subspace methods, with emphasis on improved accuracy. The article presents procedures for solving problems that may occur during this integration and negatively affect either the simulation speed or the accuracy of the calculation. Finally, it compares the iterative calculation procedure running on GPU cards with calculation by the direct method and with calculation on the CPU only.
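The abstract names Krylov subspace methods without saying which one the simulator uses. As a rough illustration only, the sketch below assumes BiCGSTAB, a Krylov method that handles the nonsymmetric matrices typical of circuit equations. It is plain CPU Python rather than CUDA, and the function names (`bicgstab`, `dot`) and the 3 × 3 test system are hypothetical, not taken from the article.

```python
import math


def dot(x, y):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(x, y))


def bicgstab(matvec, b, tol=1e-10, max_iter=100):
    """Solve A x = b with unpreconditioned BiCGSTAB.

    matvec(v) must return A @ v. Starts from x = 0 and stops when the
    residual norm falls below tol * ||b||.
    """
    n = len(b)
    x = [0.0] * n
    r = b[:]          # residual b - A x for the initial guess x = 0
    r_hat = r[:]      # fixed "shadow" residual
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    b_norm = math.sqrt(dot(b, b)) or 1.0
    for iteration in range(1, max_iter + 1):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = [r_i + beta * (p_i - omega * v_i) for r_i, p_i, v_i in zip(r, p, v)]
        v = matvec(p)
        alpha = rho_new / dot(r_hat, v)
        s = [r_i - alpha * v_i for r_i, v_i in zip(r, v)]
        if math.sqrt(dot(s, s)) <= tol * b_norm:
            x = [x_i + alpha * p_i for x_i, p_i in zip(x, p)]
            return x, iteration
        t = matvec(s)
        omega = dot(t, s) / dot(t, t)
        x = [x_i + alpha * p_i + omega * s_i for x_i, p_i, s_i in zip(x, p, s)]
        r = [s_i - omega * t_i for s_i, t_i in zip(s, t)]
        rho = rho_new
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, iteration
    raise RuntimeError("BiCGSTAB did not converge")


# Hypothetical nonsymmetric test system; its exact solution is (1/6, 1/3, 5/6).
a = [[4.0, 1.0, 0.0], [-1.0, 4.0, 1.0], [0.0, -1.0, 4.0]]
x, iterations = bicgstab(lambda vec: [dot(row, vec) for row in a], [1.0, 2.0, 3.0])
```

Only matrix-vector products and inner products touch the matrix, which is why methods of this kind map naturally onto GPU sparse kernels; a production solver would normally add a preconditioner.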

Ultrasound color flow imaging based on compute unified device architecture

Journal of Computer Applications ◽

10.3724/sp.j.1087.2011.00856 ◽

2011 ◽

Vol 31 (3) ◽

pp. 856-859

Author(s):

Zheng-juan FAN ◽

Chao-wei TAN ◽

LIU Dong C

Keyword(s):

Flow Imaging ◽

Compute Unified Device Architecture ◽

Color Flow ◽

Color Flow Imaging ◽

Device Architecture

Detecting and counting people using real-time directional algorithms implemented by compute unified device architecture

Neurocomputing ◽

10.1016/j.neucom.2016.08.137 ◽

2017 ◽

Vol 248 ◽

pp. 105-111 ◽

Cited By ~ 8

Author(s):

Yasemin Poyraz Kocak ◽

Selcuk Sevgen

Keyword(s):

Real Time ◽

Compute Unified Device Architecture ◽

Device Architecture

GPU-accelerated deformable registration of cone-beam CT images: design and implementation in the compute unified device architecture

10.17918/etd-2865 ◽

2021 ◽

Author(s):

Nakul Jain

Keyword(s):

Cone Beam Ct ◽

Ct Images ◽

Deformable Registration ◽

Cone Beam ◽

Compute Unified Device Architecture ◽

Design And Implementation ◽

Device Architecture

Parallel Character Reconstruction Expending Compute Unified Device Architecture

Advances in Computing and Information Technology - Advances in Intelligent Systems and Computing ◽

10.1007/978-3-642-31552-7_64 ◽

2013 ◽

pp. 639-648

Author(s):

Anita Pal ◽

Kamal Kumar Srivastava ◽

Atul Kumar

Keyword(s):

Compute Unified Device Architecture ◽

Character Reconstruction ◽

Device Architecture

A New Look-up Table Method of Holographic Algorithms Based on Compute Unified Device Architecture Parallel Computing

Acta Optica Sinica ◽

10.3788/aos201535.0209001 ◽

2015 ◽

Vol 35 (2) ◽

pp. 0209001

Author(s):

蒋晓瑜 Jiang Xiaoyu ◽

丛彬 Cong Bin ◽

裴闯 Pei Chuang ◽

闫兴鹏 Yan Xingpeng ◽

赵锴 Zhao Kai

Keyword(s):

Parallel Computing ◽

Compute Unified Device Architecture ◽

Holographic Algorithms ◽

Table Method ◽

Device Architecture ◽

Look Up Table ◽

New Look

A Three-Dimensional Visualization Framework for Underground Geohazard Recognition on Urban Road-Facing GPR Data

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi9110668 ◽

2020 ◽

Vol 9 (11) ◽

pp. 668

Author(s):

Zhenwu Wang ◽

Benting Wan ◽

Mengjie Han

Keyword(s):

Three Dimensional ◽

Pso Algorithm ◽

Sampling Point ◽

Compute Unified Device Architecture ◽

Point Selection ◽

Triangulated Irregular Network ◽

Device Architecture ◽

Ground Penetrating ◽

Geometry Method ◽

Difficult Issue

The identification of underground geohazards has long been a difficult issue in the field of underground public safety. This study proposes an interactive visualization framework for underground geohazard recognition on urban roads, which builds a complete recognition workflow incorporating data collection, preprocessing, modeling, rendering and analysis. In this framework, two proposed sampling point selection methods are adopted to improve the interpolation accuracy of the Kriging algorithm on ground penetrating radar (GPR) data. An improved Kriging algorithm is put forward, which applies a particle swarm optimization (PSO) algorithm to optimize the Kriging parameters and uses the Compute Unified Device Architecture (CUDA) to run the PSO algorithm in parallel on the GPU in order to raise the interpolation efficiency. Furthermore, a layer-constrained triangulated irregular network algorithm is proposed to construct the 3D geohazard bodies, and the space geometry method is used to compute their volume information. The study also presents an implementation system to demonstrate the application of the framework and its related algorithms. This system makes a significant contribution to the demonstration and understanding of underground geohazard recognition in a three-dimensional environment.
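To make the PSO step concrete, here is a minimal, sequential sketch of particle swarm optimization. The abstract does not give the objective or the swarm settings, so everything here is an assumption: the function `pso_minimize`, its default coefficients, and the stand-in quadratic objective (in the paper's setting the objective would instead score a candidate set of Kriging parameters).

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=200,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize objective over a box given as [(low, high), ...]."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # per-particle best positions
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


# Stand-in objective with its minimum at (1, -2).
best, value = pso_minimize(lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2,
                           [(-5.0, 5.0), (-5.0, 5.0)])
```

Each particle's objective evaluation is independent of the others within an iteration, which is the part a GPU implementation can run in parallel.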

Performance gains with Compute Unified Device Architecture-enabled eddy current correction for diffusion MRI.

Neuroreport ◽

10.1097/wnr.0000000000001475 ◽

2020 ◽

Vol 31 (10) ◽

pp. 746-753

Author(s):

Jerome J. Maller ◽

Stuart M. Grieve ◽

Simon J. Vogrin ◽

Thomas Welton

Keyword(s):

Eddy Current ◽

Diffusion Mri ◽

Compute Unified Device Architecture ◽

Device Architecture ◽

Performance Gains ◽

Eddy Current Correction

Parallel Direct Solution of the Covariance-Localized Ensemble Square Root Kalman Filter Equations with Matrix Functions

Monthly Weather Review ◽

10.1175/mwr-d-18-0022.1 ◽

2018 ◽

Vol 146 (9) ◽

pp. 2819-2836 ◽

Cited By ~ 1

Author(s):

Jeffrey L. Steward ◽

Jose E. Roman ◽

Alejandro Lamas Daviña ◽

Altuǧ Aksoy

Keyword(s):

Kalman Filter ◽

Krylov Subspace ◽

Direct Method ◽

Local Analysis ◽

Matrix Functions ◽

Square Root ◽

Memory Usage ◽

Reduced Order ◽

Large Numbers ◽

Eigenvectors And Eigenvalues

Recently, the serial approach to solving the square root ensemble Kalman filter (ESRF) equations in the presence of covariance localization was found to depend on the order of observations. As shown previously, correctly updating the localized posterior covariance in serial requires additional effort and computational expense. A recent work by Steward et al. details an all-at-once direct method to solve the ESRF equations in parallel. This method uses the eigenvectors and eigenvalues of the forward observation covariance matrix to solve the difficult portion of the ESRF equations. The remaining assimilation is easily parallelized, and the analysis does not depend on the order of observations. While this allows for long localization lengths that would render local analysis methods inefficient, in theory an eigenpair-based method scales as the cube of the number of observations, making it infeasible for large numbers of observations. In this work, we extend this method to use the theory of matrix functions to avoid eigenpair computations. The Arnoldi process is used to evaluate the covariance-localized ESRF equations on the reduced-order Krylov subspace basis. This method is shown to converge quickly and apparently regains a linear scaling with the number of observations. The method scales similarly to the widely used serial approach of Anderson and Collins in wall time but not in memory usage. To reduce memory usage, this method can potentially be used without an explicit matrix. In addition, hybrid ensemble and climatological covariances can be incorporated.
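The Arnoldi process mentioned above can be sketched in a few lines. This is a generic textbook version with modified Gram-Schmidt orthogonalization, not the paper's ESRF implementation; the helper names and the small tridiagonal example matrix are assumptions made for illustration.

```python
import math


def dot(x, y):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(x, y))


def matvec(a, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [dot(row, x) for row in a]


def arnoldi(a, b, m):
    """Run m steps of the Arnoldi process on matrix a and start vector b.

    Returns (basis, h): basis holds up to m + 1 orthonormal vectors that
    span the Krylov subspace, and h is the (m + 1) x m upper Hessenberg
    matrix satisfying A v_k = sum_i h[i][k] * v_i.
    """
    beta = math.sqrt(dot(b, b))
    basis = [[b_i / beta for b_i in b]]
    h = [[0.0] * m for _ in range(m + 1)]
    for k in range(m):
        w = matvec(a, basis[k])
        for i in range(k + 1):  # modified Gram-Schmidt against earlier vectors
            h[i][k] = dot(basis[i], w)
            w = [w_j - h[i][k] * v_j for w_j, v_j in zip(w, basis[i])]
        h[k + 1][k] = math.sqrt(dot(w, w))
        if h[k + 1][k] < 1e-12:  # breakdown: the subspace is already invariant
            break
        basis.append([w_j / h[k + 1][k] for w_j in w])
    return basis, h


# Hypothetical 4 x 4 example matrix and start vector.
a = [[2.0, -1.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0],
     [0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, -1.0, 2.0]]
basis, h = arnoldi(a, [1.0, 0.0, 0.0, 0.0], 3)
```

A matrix function applied to the start vector is then approximated from the small Hessenberg matrix instead of from eigenpairs of the full matrix, which is the cost saving the abstract refers to.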

Research and Application of Carbide Blade Surface Quality Primitive Digital Automatic Detection

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.971-973.1700 ◽

2014 ◽

Vol 971-973 ◽

pp. 1700-1705

Author(s):

Xue Jun Tian ◽

Zhi Peng Dong ◽

Feng Ye

Keyword(s):

Hard Alloy ◽

Rapid Detection ◽

Production Efficiency ◽

Surface Defects ◽

Cutting Tools ◽

Production Quality ◽

Detection Methods ◽

Compute Unified Device Architecture ◽

Blade Surface ◽

Device Architecture

Hard alloy has been widely applied as a cutter material, and cemented carbide cutting tools have become the main tools for processing enterprises in our country. During blade production, traditional manual detection methods for surface defects can no longer satisfy the demands of production quality and production efficiency. Online automated rapid detection has been realized based on the Compute Unified Device Architecture (CUDA) by utilizing the computing capability of the GPU.

A GPU Accelerated Red-Black SOR Algorithm for Computational Fluid Dynamics Problems

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.320.335 ◽

2011 ◽

Vol 320 ◽

pp. 335-340 ◽

Cited By ~ 14

Author(s):

Ji Tang Liu ◽

Zhao Song Ma ◽

Shi Hai Li ◽

Ying Zhao

Keyword(s):

High Performance ◽

Memory Allocation ◽

Compute Unified Device Architecture ◽

Problem Size ◽

Benchmark Data ◽

Device Architecture ◽

Computational Performance ◽

Speed Up ◽

Sequential Code ◽

Dynamics Problems

GPUs are high-performance co-processors to the CPU for scientific computing, including CFD. We present an optimistic shared memory allocation strategy to solve 2D CFD problems using the Red-Black SOR method on a GPU with CUDA (Compute Unified Device Architecture). Lid-driven results are compared with the benchmark data. The speed-up ratio for the same problem size on an NVIDIA GTX480 and an Intel Core-Dual 3.0 GHz processor is discussed: the GPU is 120 times faster than the sequential code on the CPU for a problem size of 756 × 756. Based on this work, we conclude that using the memory hierarchy properly plays a key role in improving the computational performance of the GPU.
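The red-black ordering itself is easy to show on the CPU. The sketch below is an assumption-laden illustration, not the authors' CUDA kernel: it applies Red-Black SOR to the 2D Laplace equation on a small square grid instead of the lid-driven cavity problem, and the names `red_black_sor` and `make_grid` and the relaxation factor 1.5 are hypothetical choices.

```python
def red_black_sor(u, omega=1.5, tol=1e-10, max_sweeps=10_000):
    """Solve the 2D Laplace equation in place on grid u (a list of rows).

    Boundary values of u stay fixed. Returns the number of sweeps used.
    """
    n_rows, n_cols = len(u), len(u[0])
    for sweep in range(1, max_sweeps + 1):
        max_change = 0.0
        for color in (0, 1):  # 0 = "red" points, 1 = "black" points
            for i in range(1, n_rows - 1):
                # First interior column j with (i + j) % 2 == color.
                start = 1 + (i + 1 + color) % 2
                for j in range(start, n_cols - 1, 2):
                    average = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                      + u[i][j - 1] + u[i][j + 1])
                    change = omega * (average - u[i][j])
                    u[i][j] += change
                    max_change = max(max_change, abs(change))
        if max_change < tol:
            return sweep
    raise RuntimeError("Red-Black SOR did not converge")


def make_grid(n):
    """n x n grid of zeros with the top edge (corners excluded) held at 1."""
    u = [[0.0] * n for _ in range(n)]
    for j in range(1, n - 1):
        u[0][j] = 1.0
    return u


grid = make_grid(17)
sweeps = red_black_sor(grid)
```

Every point of one color depends only on points of the other color, so all red updates (and then all black updates) can run concurrently, one GPU thread per point. By symmetry, the value at the center of this particular grid converges to 0.25.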