Step Ring Based 3D Path Planning via GPU Simulation for Subtractive 3D Printing

In this paper, both software model visualization with path simulation and associated machining product are produced based on the step ring based 3-axis path planning to demonstrate model-driven graphics processing unit (GPU) feature in tool path planning and 3D image model classification by GPU simulation. Subtractive 3D printing (i.e., 3D machining) is represented as integration between 3D printing modeling and CNC machining via GPU simulated software. Path planning is applied through material surface removal visualization in high resolution and 3D path simulation via ring selective path planning based on accessibility of path through pattern selection. First, the step ring selects critical features to reconstruct computer aided design (CAD) model as STL (stereolithography) voxel, and then local optimization is attained within interested ring area for time and energy saving of GPU volume generation as compared to global all automatic path planning with longer latency. The reconstructed CAD model comes from an original sample (GATech buzz) with 2D image information. CAD model for optimization and validation is adopted to sustain manufacturing reproduction based on system simulation feedback. To avoid collision with the produced path from retraction path, we pick adaptive ring path generation and prediction in each planning iteration, which may also minimize material removal. Moreover, we did partition analysis and G-code optimization for large scale model and high density volume data. Image classification and grid analysis based on adaptive 3D tree depth are proposed for multi-level set partition of the model to define no cutting zones. After that, accessibility map is computed based on accessibility space for rotational angular space of path orientation to compare step ring based pass planning versus global all path planning.
Feature analysis via central processing unit (CPU) or GPU processor for GPU map computation contributes to high performance computing and cloud computing potential through parallel computing application of subtractive 3D printing in the future.

Step Ring-Based Three-Dimensional Path Planning Via Graphics Processing Unit Simulation for Subtractive Three-Dimensional Printing

Journal of Manufacturing Science and Engineering ◽

10.1115/1.4034662 ◽

2016 ◽

Vol 139 (3) ◽

Cited By ~ 1

Author(s):

Zhengkai Wu ◽

Thomas M. Tucker ◽

Chandra Nath ◽

Thomas R. Kurfess ◽

Richard W. Vuduc

Keyword(s):

3D Printing ◽

Path Planning ◽

Material Removal ◽

Graphics Processing Unit ◽

Three Dimensional ◽

Numerical Control ◽

Scale Model ◽

Processing Unit ◽

Cad Model ◽

Graphics Processing

In this paper, both software model visualization with path simulation and associated machining product are produced based on the step ring-based three-axis path planning to demonstrate model-driven graphics processing unit (GPU) feature in tool path planning and 3D image model classification by GPU simulation. Subtractive 3D printing (i.e., 3D machining) is represented as integration between 3D printing modeling and computer numerical control (CNC) machining via GPU simulated software. Path planning is applied through visualization of surface material removal in high-resolution and 3D path simulation via ring selective path planning based on accessibility of path through pattern selection. First, the step ring selects critical features to reconstruct computer-aided design (CAD) model as stereolithography (STL) voxel, and then, local optimization is attained within interested ring area for time and energy saving of GPU volume generation as compared to global automatic path planning with longer latency. The reconstructed CAD model comes from an original sample (GATech buzz) with 2D image information. CAD model for optimization and validation is adopted to sustain manufacturing reproduction based on system simulation feedback. To avoid collision with the produced path from retraction path, we pick adaptive ring path generation and prediction in each planning iteration, which may also minimize material removal. Moreover, we did partition analysis and G-code optimization for large-scale model and high density volume data. Image classification and grid analysis based on adaptive 3D tree depth are proposed for multilevel set partition of the model to define no cutting zones. After that, accessibility map is computed based on accessibility space for rotational angular space of path orientation to compare step ring-based pass planning versus global path planning of all geometries.
Feature analysis via central processing unit (CPU) or GPU processor for GPU map computation contributes to high-performance computing and cloud computing potential through parallel computing application of subtractive 3D printing in the future.

A Parallel-Computing Approach for Vector Road-Network Matching Using GPU Architecture

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi7120472 ◽

2018 ◽

Vol 7 (12) ◽

pp. 472 ◽

Cited By ~ 1

Author(s):

Bo Wan ◽

Lin Yang ◽

Shunping Zhou ◽

Run Wang ◽

Dezhi Wang ◽

...

Keyword(s):

Road Network ◽

Large Scale ◽

Graphics Processing Unit ◽

Road Networks ◽

Processing Unit ◽

Data Partition ◽

Matching Method ◽

The Road ◽

Central Processing ◽

Relaxation Matching

The road-network matching method is an effective tool for map integration, fusion, and update. Due to the complexity of road networks in the real world, matching methods often contain a series of complicated processes to identify homonymous roads and deal with their intricate relationship. However, traditional road-network matching algorithms, which are mainly central processing unit (CPU)-based approaches, may have performance bottleneck problems when facing big data. We developed a particle-swarm optimization (PSO)-based parallel road-network matching method on graphics-processing unit (GPU). Based on the characteristics of the two main stages (similarity computation and matching-relationship identification), data-partition and task-partition strategies were utilized, respectively, to fully use GPU threads. Experiments were conducted on datasets with 14 different scales. Results indicate that the parallel PSO-based matching algorithm (PSOM) could correctly identify most matching relationships with an average accuracy of 84.44%, which was at the same level as the accuracy of a benchmark—the probability-relaxation-matching (PRM) method. The PSOM approach significantly reduced the road-network matching time in dealing with large amounts of data in comparison with the PRM method. This paper provides a common parallel algorithm framework for road-network matching algorithms and contributes to integration and update of large-scale road-networks.

Analysis of Heat and Smoke Propagation and Oscillatory Flow through Ceiling Vents in a Large-Scale Compartment Fire

Applied Sciences ◽

10.3390/app9163305 ◽

2019 ◽

Vol 9 (16) ◽

pp. 3305 ◽

Cited By ~ 1

Author(s):

Claudio Zanzi ◽

Pablo Gómez ◽

Joaquín López ◽

Julio Hernández

Keyword(s):

Convective Heat ◽

Large Scale ◽

Natural Ventilation ◽

Heat Propagation ◽

Oscillatory Behavior ◽

Combustion Model ◽

Processing Unit ◽

Fire Model ◽

Central Processing ◽

Mass Fluxes

One question that often arises is whether a specialized code or a more general code may be equally suitable for fire modeling. This paper investigates the performance and capabilities of a specialized code (FDS) and a general-purpose code (FLUENT) to simulate a fire in the commercial area of an underground intermodal transportation station. In order to facilitate a more precise comparison between the two codes, especially with regard to ventilation issues, the number of factors that may affect the fire evolution is reduced by simplifying the scenario and the fire model. The codes are applied to the same fire scenario using a simplified fire model, which considers a source of mass, heat and species to characterize the fire focus, and whose results are also compared with those obtained using FDS and a combustion model. An oscillating behavior of the fire-induced convective heat and mass fluxes through the natural vents is predicted, whose frequency compares well with experimental results for the ranges of compartment heights and heat release rates considered. The results obtained with the two codes for the smoke and heat propagation patterns and convective fluxes through the forced and natural ventilation systems are discussed and compared to each other. The agreement is very good for the temperature and species concentration distributions and the overall flow pattern, whereas appreciable discrepancies are only found in the oscillatory behavior of the fire-induced convective heat and mass fluxes through the natural vents. The relative performance of the codes in terms of central processing unit (CPU) time consumption is also discussed.

High-performance computing in water resources hydrodynamics

Journal of Hydroinformatics ◽

10.2166/hydro.2020.163 ◽

2020 ◽

Vol 22 (5) ◽

pp. 1217-1235 ◽

Cited By ~ 3

Author(s):

M. Morales-Hernández ◽

M. B. Sharif ◽

S. Gangrade ◽

T. T. Dullo ◽

S.-C. Kao ◽

...

Keyword(s):

Water Resources ◽

High Performance Computing ◽

Graphics Processing Units ◽

High Performance ◽

Large Scale ◽

Test Case ◽

Processing Unit ◽

Central Processing ◽

Graphics Processing ◽

Performance Computing

This work presents a vision of future water resources hydrodynamics codes that can fully utilize the strengths of modern high-performance computing (HPC). The advances to computing power, formerly driven by the improvement of central processing unit processors, now focus on parallel computing and, in particular, the use of graphics processing units (GPUs). However, this shift to a parallel framework requires refactoring the code to make efficient use of the data as well as changing even the nature of the algorithm that solves the system of equations. These concepts along with other features such as the precision for the computations, dry regions management, and input/output data are analyzed in this paper. A 2D multi-GPU flood code applied to a large-scale test case is used to corroborate our statements and ascertain the new challenges for the next-generation parallel water resources codes.

An alternative approach for collaborative simulation execution on a CPU+GPU hybrid system

SIMULATION ◽

10.1177/0037549719885178 ◽

2019 ◽

Vol 96 (3) ◽

pp. 347-361

Author(s):

Wenjie Tang ◽

Wentong Cai ◽

Yiping Yao ◽

Xiao Song ◽

Feng Zhu

Keyword(s):

Hybrid System ◽

Large Scale ◽

Scheduling Algorithm ◽

Discrete Event ◽

Processing Unit ◽

Model Computation ◽

Central Processing ◽

Collaborative Simulation ◽

Alternative Approach ◽

Parallel Discrete Event

In the past few years, the graphics processing unit (GPU) has been widely used to accelerate time-consuming models in simulations. Since both model computation and simulation management are main factors that affect the performance of large-scale simulations, only accelerating model computation will limit the potential speedup. Moreover, models that can be well accelerated by a GPU could be insufficient, especially for simulations with many lightweight models. Traditionally, the parallel discrete event simulation (PDES) method is used to solve this class of simulation, but most PDES simulators only utilize the central processing unit (CPU) even though the GPU is commonly available now. Hence, we propose an alternative approach for collaborative simulation execution on a CPU+GPU hybrid system. The GPU supports both simulation management and model computation as CPUs. A concurrency-oriented scheduling algorithm was proposed to enable cooperation between the CPU and the GPU, so that multiple computation and communication resources can be efficiently utilized. In addition, GPU functions have also been carefully designed to adapt the algorithm. The combination of those efforts allows the proposed approach to achieve significant speedup compared to the traditional PDES on a CPU.

A robust system reliability analysis using partitioning and parallel processing of Markov chain

Artificial intelligence for engineering design analysis and manufacturing ◽

10.1017/s0890060414000493 ◽

2014 ◽

Vol 28 (4) ◽

pp. 311-322 ◽

Cited By ~ 1

Author(s):

Po Ting Lin ◽

Yu-Cheng Chou ◽

Yung Ting ◽

Shian-Shing Shyu ◽

Chang-Kuo Chen

Keyword(s):

Markov Chain ◽

Parallel Processing ◽

Reliability Analysis ◽

System Reliability ◽

Large Scale ◽

Transition Probability ◽

Transition Probability Matrix ◽

Processing Unit ◽

Central Processing ◽

System Reliability Analysis

This paper presents a robust reliability analysis method for systems of multimodular redundant (MMR) controllers using the method of partitioning and parallel processing of a Markov chain (PPMC). A Markov chain is formulated to represent the N distinct states of the MMR controllers. Such a Markov chain has N^2 directed edges, and each edge corresponds to a transition probability between a pair of start and end states. Because N can be easily increased substantially, the system reliability analysis may require large computational resources, such as the central processing unit usage and memory occupation. By the PPMC, a Markov chain's transition probability matrix can be partitioned and reordered, such that the system reliability can be evaluated through only the diagonal submatrices of the transition probability matrix. In addition, calculations regarding the submatrices are independent of each other and thus can be conducted in parallel to assure the efficiency. The simulation results show that, compared with the sequential method applied to an intact Markov chain, the proposed PPMC can improve the performance and produce allowable accuracy for the reliability analysis on large-scale systems of MMR controllers.
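The idea that diagonal submatrices can be processed independently can be illustrated with a small sketch. It is not the PPMC algorithm from the paper: it assumes an idealized case in which the reordered transition matrix is exactly block diagonal, and the two blocks, the step count, and all helper names (`mat_mul`, `mat_pow`, `block_diag`) are made up for illustration. Under that assumption, raising each block to a power separately gives the same result as raising the full matrix to that power.

```python
from concurrent.futures import ThreadPoolExecutor


def mat_mul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def mat_pow(m, steps):
    """Return m raised to a non-negative integer power by repeated multiplication."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        result = mat_mul(result, m)
    return result


def block_diag(blocks):
    """Place square blocks along the diagonal of an otherwise zero matrix."""
    size = sum(len(b) for b in blocks)
    full = [[0.0] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                full[offset + i][offset + j] = value
        offset += len(b)
    return full


# Hypothetical diagonal submatrices (each row sums to 1).
blocks = [
    [[0.9, 0.1],
     [0.0, 1.0]],
    [[0.8, 0.2, 0.0],
     [0.0, 0.7, 0.3],
     [0.0, 0.0, 1.0]],
]
steps = 5

# Each block is independent of the others, so the work can be farmed out.
with ThreadPoolExecutor() as pool:
    block_powers = list(pool.map(lambda b: mat_pow(b, steps), blocks))

# Reference: the sequential computation on the intact matrix.
full_matrix = block_diag(blocks)
full_power = mat_pow(full_matrix, steps)
num_edges = len(full_matrix) ** 2  # N states give N^2 directed edges
```

The thread pool only shows the independence of the block computations; a real implementation would likely use processes or a GPU for actual speedup.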

Experiments on multi-agent capture of a stochastically moving object using modified projective path planning

Robotica ◽

10.1017/s0263574712000239 ◽

2012 ◽

Vol 31 (2) ◽

pp. 267-284 ◽

Cited By ~ 2

Author(s):

Vijaysingh Shinde ◽

Ashish Dutta ◽

Anupam Saxena

Keyword(s):

Path Planning ◽

Past Research ◽

Moving Object ◽

Processing Unit ◽

Time Step ◽

Object A ◽

Central Processing ◽

Form Closure ◽

Minimum Number ◽

Planning Algorithm

Most of the past research in swarm robotics has considered object capture and transport using a specified and very large number of agents. The objects therein were either stationary or moving deterministically (i.e., along a known path). In most previous efforts, the obstacles were also considered stationary. Here we present a modified projective path planning algorithm and illustrate via laboratory experiments that an object exhibiting stochastic (unplanned) but low-speed motion can be restrained by a limited number of agents guided in real-time across randomly moving obstacles. Relaxation of certain restrictions in the grasping objective allows for the determination of a minimum number and placement of agents around the perimeter of any generically shaped prismatic object. A closed loop experiment is designed using a single overhead camera that provides the visual feedback and helps determine the instantaneous positions of all entities in the workspace. Control signals are sent to the robots via wireless modules by a central processing unit to navigate and guide them to their respective new positions in the subsequent time-step. Agents continue to receive signals until they restrict the moving object in form closure.

Large Scale Finite Element Analysis Via Assembly-Free Deflated Conjugate Gradient

Journal of Computing and Information Science in Engineering ◽

10.1115/1.4028591 ◽

2014 ◽

Vol 14 (4) ◽

Cited By ~ 11

Author(s):

Praveen Yadav ◽

Krishnan Suresh

Keyword(s):

Finite Element Analysis ◽

Finite Element ◽

Conjugate Gradient ◽

Degrees Of Freedom ◽

Large Scale ◽

Solid Mechanics ◽

Processing Unit ◽

Element Analysis ◽

Central Processing ◽

Computational Bottleneck

Large-scale finite element analysis (FEA) with millions of degrees of freedom (DOF) is becoming commonplace in solid mechanics. The primary computational bottleneck in such problems is the solution of large linear systems of equations. In this paper, we propose an assembly-free version of the deflated conjugate gradient (DCG) for solving such equations, where neither the stiffness matrix nor the deflation matrix is assembled. While assembly-free FEA is a well-known concept, the novelty pursued in this paper is the use of assembly-free deflation. The resulting implementation is particularly well suited for large-scale problems and can be easily ported to multicore central processing unit (CPU) and graphics-programmable unit (GPU) architectures. For demonstration, we show that one can solve a 50 × 10^6 degree of freedom system on a single GPU card, equipped with 3 GB of memory. The second contribution is an extension of the “rigid-body agglomeration” concept used in DCG to a “curvature-sensitive agglomeration.” The latter exploits classic plate and beam theories for efficient deflation of highly ill-conditioned problems arising from thin structures.
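As a rough illustration of what "assembly-free" means, the sketch below runs a plain conjugate gradient solve in which the stiffness matrix is never stored; its action on a vector is computed element by element. This is only the matrix-free part: deflation is omitted entirely, and the problem (a 1D bar clamped at one end with a unit tip load), the element stiffness `k`, and the function names are assumptions chosen to keep the example small.

```python
def apply_stiffness(u, k=1.0):
    """Return K @ u for a 1D bar of len(u) two-node elements, clamped at node 0.

    Entry i of u is the displacement of node i + 1. The global stiffness
    matrix is never built; each element adds its own contribution.
    """
    y = [0.0] * len(u)
    for e in range(len(u)):  # element e joins node e and node e + 1
        left = u[e - 1] if e > 0 else 0.0  # clamped node has zero displacement
        force = k * (u[e] - left)          # axial force carried by the element
        if e > 0:
            y[e - 1] -= force
        y[e] += force
    return y


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(apply_a, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A given only x -> A x."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the zero initial guess
    p = list(r)
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        ap = apply_a(p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


num_elements = 8
load = [0.0] * (num_elements - 1) + [1.0]  # unit force at the free tip
displacement = conjugate_gradient(apply_stiffness, load)
```

With unit element stiffness and a unit tip load, every element carries the same force, so the displacement should grow linearly along the bar. Memory use is proportional to the number of unknowns rather than to the number of matrix entries, which is the property that makes the approach attractive on memory-limited GPUs.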

A Deflated Assembly Free Approach to Large-Scale Implicit Structural Dynamics

Journal of Computational and Nonlinear Dynamics ◽

10.1115/1.4029110 ◽

2015 ◽

Vol 10 (6) ◽

Cited By ~ 1

Author(s):

Amir M. Mirzendehdel ◽

Krishnan Suresh

Keyword(s):

Structural Dynamics ◽

Large Scale ◽

Processing Unit ◽

Element Analysis ◽

Central Processing ◽

Bottle Neck ◽

Gpu Architectures ◽

Many Core ◽

Matrix Free ◽

Fast Inversion

The primary computational bottle-neck in implicit structural dynamics is the repeated inversion of the underlying stiffness matrix. In this paper, a fast inversion technique is proposed by merging four distinct but complementary concepts: (1) voxelization with adaptive local refinement, (2) assembly-free (a.k.a. matrix-free or element-by-element) finite element analysis (FEA), (3) assembly-free deflated conjugate gradient (AF-DCG), and (4) multicore parallelization. In particular, we apply these concepts to the well-known Newmark-beta method, and the resulting AF-DCG is well-suited for large-scale problems. It can be easily ported to many-core central processing unit (CPU) and multicore graphics-programmable unit (GPU) architectures, as demonstrated through numerical experiments.
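To show where the repeated solve arises, here is a minimal sketch of the Newmark-beta update for a single undamped degree of freedom in free vibration. It is not the paper's AF-DCG solver: with one unknown, the "inversion" collapses to a division by the effective stiffness `k_eff`, whereas in a large model that same step is a linear system solve at every time step. The mass, stiffness, step size, and function name are illustrative assumptions; `beta = 1/4` and `gamma = 1/2` are the standard average-acceleration choice.

```python
def newmark_sdof(m, k, u0, v0, dt, steps, beta=0.25, gamma=0.5):
    """Integrate m * a + k * u = 0 with the Newmark-beta method.

    Returns a list of (displacement, velocity) pairs, including the
    initial state.
    """
    u, v = u0, v0
    a = -k * u / m                    # initial acceleration from equilibrium
    c1 = 1.0 / (beta * dt * dt)
    c2 = 1.0 / (beta * dt)
    c3 = 0.5 / beta - 1.0
    k_eff = k + m * c1                # effective stiffness, constant for fixed dt
    history = [(u, v)]
    for _ in range(steps):
        rhs = m * (c1 * u + c2 * v + c3 * a)
        u_new = rhs / k_eff           # in a full model: solve K_eff u_new = rhs
        a_new = c1 * (u_new - u) - c2 * v - c3 * a
        v = v + dt * ((1.0 - gamma) * a + gamma * a_new)
        u, a = u_new, a_new
        history.append((u, v))
    return history


mass, stiffness = 1.0, 4.0
history = newmark_sdof(mass, stiffness, u0=1.0, v0=0.0, dt=0.05, steps=200)
energies = [0.5 * mass * v * v + 0.5 * stiffness * u * u for u, v in history]
```

For a linear undamped system the average-acceleration variant conserves the total energy, so `energies` should stay essentially constant over the run.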

NeuroGPU: Accelerating multi-compartment, biophysically detailed neuron simulations on GPUs

10.1101/727560 ◽

2019 ◽

Cited By ~ 1

Author(s):

Roy Ben-Shalom ◽

Nikhil S. Artherya ◽

Alexander Ladd ◽

Christopher Cross ◽

Hersh Sanghevi ◽

...

Keyword(s):

Numerical Optimization ◽

Large Scale ◽

Graphics Processing Unit ◽

Computing Time ◽

Processing Unit ◽

Neuron Models ◽

Model Quality ◽

Central Processing ◽

Multi Scale ◽

Graphics Processing

The membrane potential of individual neurons depends on a large number of interacting biophysical processes operating on spatial-temporal scales spanning several orders of magnitude. The multi-scale nature of these processes dictates that accurate prediction of membrane potentials in specific neurons requires utilization of detailed simulations. Unfortunately, constraining parameters within biologically detailed neuron models can be difficult, leading to poor model fits. This obstacle can be overcome partially by numerical optimization or detailed exploration of parameter space. However, these processes, which currently rely on central processing unit (CPU) computation, often incur exponential increases in computing time for marginal improvements in model behavior. As a result, model quality is often compromised to accommodate compute resources. Here, we present a simulation environment, NeuroGPU, that takes advantage of the inherent parallelized structure of graphics processing units (GPUs) to accelerate neuronal simulation. NeuroGPU can simulate most biologically detailed models 800x faster than traditional simulators when using multiple GPU cores, and even 10-200 times faster when implemented on relatively inexpensive GPU systems. We demonstrate the power of NeuroGPU through large-scale parameter exploration to reveal the response landscape of a neuron. Finally, we accelerate numerical optimization of biophysically detailed neuron models to achieve highly accurate fitting of models to simulation and experimental data. Thus, NeuroGPU enables the rapid simulation of multi-compartment, biophysically detailed neuron models on commonly used computing systems accessible by many scientists.
