A Novel Parallel Algorithm with Map Segmentation for Multiple Geographical Feature Label Placement Problem

2021 · Vol 10 (12) · pp. 826
Author(s): Mohammad Naser Lessani, Jiqiu Deng, Zhiyong Guo

Multiple geographical feature label placement (MGFLP) is an NP-hard problem whose difficulty degrades both label-position accuracy and algorithm run time. The complexity compounds as the number of features to label increases, causing the execution time of the algorithms to grow exponentially. Additionally, in large-scale solutions the algorithm can become trapped in local minima, which poses significant challenges for automatic label placement. To address these challenges, this paper proposes a novel parallel algorithm based on map segmentation, which decomposes the MGFLP problem into more tractable subproblems. Parallel computing is then utilized to handle each decomposed subproblem simultaneously on a separate central processing unit (CPU) to speed up label placement. The optimization component of the proposed algorithm is a hybrid of discrete differential evolution and genetic algorithms. Our results on real-world datasets confirm the usability and scalability of the algorithm and illustrate its excellent performance. Moreover, the algorithm achieved superlinear speedup compared to previous studies that applied this hybrid algorithm.
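
To make the segmentation-plus-parallelism idea concrete, here is a minimal Python sketch: point features are binned into a grid of map segments, and each segment is labeled on a separate CPU via a process pool. The `solve_segment` function is a hypothetical placeholder for the paper's hybrid discrete differential evolution/genetic optimizer; all names and the grid scheme are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of map segmentation: split features into grid cells,
# then label each cell on a separate CPU with a process pool.
from multiprocessing import Pool

def segment_features(features, nx, ny, bounds):
    """Assign (x, y) features to an nx-by-ny grid of map segments."""
    xmin, ymin, xmax, ymax = bounds
    cells = {}
    for x, y in features:
        i = min(int((x - xmin) / (xmax - xmin) * nx), nx - 1)
        j = min(int((y - ymin) / (ymax - ymin) * ny), ny - 1)
        cells.setdefault((i, j), []).append((x, y))
    return list(cells.values())

def solve_segment(segment):
    # Placeholder for the paper's hybrid discrete differential
    # evolution / genetic optimizer run within one segment.
    return [(x, y, "label") for x, y in segment]

if __name__ == "__main__":
    features = [(1.0, 2.0), (3.5, 7.2), (8.1, 4.4)]
    segments = segment_features(features, nx=2, ny=2, bounds=(0, 0, 10, 10))
    with Pool() as pool:                     # one worker per CPU core
        placements = pool.map(solve_segment, segments)
```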

2018 · Vol 7 (12) · pp. 472
Author(s): Bo Wan, Lin Yang, Shunping Zhou, Run Wang, Dezhi Wang, ...

The road-network matching method is an effective tool for map integration, fusion, and update. Because real-world road networks are complex, matching methods often involve a series of complicated processes to identify homonymous roads and resolve their intricate relationships. However, traditional road-network matching algorithms, which are mainly central processing unit (CPU)-based approaches, can hit performance bottlenecks when facing big data. We developed a particle-swarm optimization (PSO)-based parallel road-network matching method on the graphics processing unit (GPU). Based on the characteristics of the two main stages (similarity computation and matching-relationship identification), data-partition and task-partition strategies were applied, respectively, to make full use of GPU threads. Experiments were conducted on datasets at 14 different scales. Results indicate that the parallel PSO-based matching algorithm (PSOM) correctly identified most matching relationships with an average accuracy of 84.44%, on par with the benchmark probability-relaxation-matching (PRM) method. The PSOM approach significantly reduced road-network matching time on large datasets in comparison with the PRM method. This paper provides a common parallel algorithmic framework for road-network matching and contributes to the integration and updating of large-scale road networks.
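
As a rough illustration of the data-partition stage, the sketch below computes a pairwise road-similarity matrix in one vectorized NumPy pass, standing in for one-thread-per-pair GPU execution. The similarity measure (inverse mean endpoint distance) and all names are illustrative assumptions, not the paper's formulation.

```python
# CPU sketch of the similarity-computation stage: score every pair of
# candidate roads at once, mimicking one-GPU-thread-per-pair parallelism.
import numpy as np

def similarity_matrix(roads_a, roads_b):
    """roads_*: (n, 2, 2) arrays of road segments as endpoint pairs."""
    a = roads_a[:, None, :, :]                        # (na, 1, 2, 2)
    b = roads_b[None, :, :, :]                        # (1, nb, 2, 2)
    d = np.linalg.norm(a - b, axis=-1).mean(axis=-1)  # mean endpoint distance
    return 1.0 / (1.0 + d)                            # higher = more similar

roads_a = np.random.rand(100, 2, 2)
roads_b = np.random.rand(120, 2, 2)
S = similarity_matrix(roads_a, roads_b)   # candidate matches: S.argmax(axis=1)
```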


Author(s): Zhengkai Wu, Thomas M. Tucker, Chandra Nath, Thomas R. Kurfess, Richard W. Vuduc

In this paper, software model visualization with path simulation and the associated machined product are produced from step-ring-based 3-axis path planning, demonstrating model-driven graphics processing unit (GPU) capabilities in tool-path planning and 3D image-model classification via GPU simulation. Subtractive 3D printing (i.e., 3D machining) is treated as the integration of 3D printing modeling and CNC machining through GPU-simulated software. Path planning proceeds through high-resolution visualization of material surface removal and 3D path simulation, with ring-selective path planning based on path accessibility through pattern selection. First, the step ring selects critical features to reconstruct the computer-aided design (CAD) model as STL (stereolithography) voxels; local optimization is then performed within the ring area of interest, saving time and energy in GPU volume generation compared with global, fully automatic path planning and its longer latency. The reconstructed CAD model comes from an original sample (GATech buzz) with 2D image information. The CAD model used for optimization and validation supports manufacturing reproduction based on system-simulation feedback. To avoid collisions between the produced path and the retraction path, adaptive ring-path generation and prediction are applied in each planning iteration, which may also minimize material removal. Moreover, partition analysis and G-code optimization are performed for large-scale models and high-density volume data. Image classification and grid analysis based on adaptive 3D tree depth are proposed for multi-level-set partitioning of the model to define no-cutting zones. An accessibility map is then computed over the rotational angular space of path orientations to compare step-ring-based path planning against global all-path planning. Finally, feature analysis on central processing unit (CPU) or GPU processors for GPU map computation points toward future high-performance computing and cloud-computing applications of subtractive 3D printing through parallel computing.
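
The ring-selective idea, restricting planning to a local region instead of the whole model, can be sketched as a mask over a voxel grid. This is only a hedged illustration; the grid layout, radii, and all names are assumptions, not the authors' data structures.

```python
# Sketch of ring selection: mark a ring-shaped region of a voxel grid so
# path planning can be restricted to that local area of interest.
import numpy as np

def ring_mask(shape, center, r_inner, r_outer):
    """Boolean mask of voxels whose XY distance from center is in [r_inner, r_outer]."""
    zz, yy, xx = np.indices(shape)
    d = np.hypot(xx - center[0], yy - center[1])
    return (d >= r_inner) & (d <= r_outer)

voxels = np.ones((16, 64, 64), dtype=bool)   # solid stock, z-y-x order
mask = ring_mask(voxels.shape, center=(32, 32), r_inner=10, r_outer=18)
local_region = voxels & mask                 # plan paths only inside the ring
```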


2019 · Vol 9 (16) · pp. 3305
Author(s): Claudio Zanzi, Pablo Gómez, Joaquín López, Julio Hernández

One question that often arises is whether a specialized code or a more general-purpose code is equally suitable for fire modeling. This paper investigates the performance and capabilities of a specialized code (FDS) and a general-purpose code (FLUENT) in simulating a fire in the commercial area of an underground intermodal transportation station. To enable a more precise comparison between the two codes, especially with regard to ventilation issues, the number of factors that may affect the fire evolution is reduced by simplifying the scenario and the fire model. The codes are applied to the same fire scenario using a simplified fire model, which represents the fire source as a source of mass, heat, and species; the results are also compared with those obtained using FDS and a combustion model. An oscillating behavior of the fire-induced convective heat and mass fluxes through the natural vents is predicted, whose frequency compares well with experimental results for the ranges of compartment heights and heat release rates considered. The results obtained with the two codes for the smoke and heat propagation patterns and the convective fluxes through the forced and natural ventilation systems are discussed and compared. The agreement is very good for the temperature and species-concentration distributions and the overall flow pattern, whereas appreciable discrepancies appear only in the oscillatory behavior of the fire-induced convective heat and mass fluxes through the natural vents. The relative performance of the codes in terms of central processing unit (CPU) time consumption is also discussed.


Author(s): Yung Chin Shih, Eduardo Vila Gonçalves Filho

Abstract Recently, new types of layouts have been proposed in the literature to handle a large number of products. Among these is the fractal layout, which aims to minimize routing distances. Researchers have already addressed its design; however, current approaches typically execute the allocation of fractal cells on the shop floor many times until the best allocation is found, which becomes a significant disadvantage for large numbers of fractal cells owing to the combinatorics involved. This paper proposes a criterion based on similarity among fractal cells, developed and implemented in a Tabu search heuristic, to allocate cells on the shop floor in feasible computational time. Once the proposed procedure is modeled, the operations of each workpiece are separated into n subsets and submitted to simulation. The results (traveling distance and makespan) are compared with those of a distributed layout and a functional layout. In general, the results show a trade-off: as total routing distance decreases, makespan increases. With the proposed method, depending on the value of the segregated fractal-cell similarity, it is possible to reduce both performance parameters. Finally, we conclude that the proposed procedure is quite promising because the allocation of fractal cells demands little central processing unit (CPU) time.
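
For readers unfamiliar with Tabu search, the sketch below shows the skeleton of such a heuristic on a toy quadratic-assignment-style allocation: swap two cell assignments per move and forbid reversing recent swaps for a fixed tenure. The cost function here is a generic placeholder, not the paper's similarity-based criterion; all names are illustrative.

```python
# Toy Tabu search for allocating fractal cells to shop-floor sites.
import itertools, random

def swap(a, i, j):
    b = a[:]; b[i], b[j] = b[j], b[i]
    return b

def tabu_allocate(cost, n_sites, iters=200, tenure=10):
    """cost(assign) -> float; assign[i] = site of fractal cell i."""
    assign = list(range(n_sites))
    best, best_cost = assign[:], cost(assign)
    tabu = {}
    for t in range(iters):
        moves = [(i, j) for i, j in itertools.combinations(range(n_sites), 2)
                 if tabu.get((i, j), -1) < t]          # skip tabu swaps
        i, j = min(moves, key=lambda m: cost(swap(assign, *m)))
        assign = swap(assign, i, j)
        tabu[(i, j)] = t + tenure                      # forbid reversing soon
        if cost(assign) < best_cost:
            best, best_cost = assign[:], cost(assign)
    return best

random.seed(0)
dist = [[abs(i - j) for j in range(6)] for i in range(6)]
flow = [[random.random() for _ in range(6)] for _ in range(6)]
qap_cost = lambda a: sum(flow[i][j] * dist[a[i]][a[j]]
                         for i in range(6) for j in range(6))
print(tabu_allocate(qap_cost, n_sites=6))
```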


2019 · Vol 142 (2)
Author(s): A. Shekarian, A. Varvani-Farahani

Abstract The present study evaluates local ratcheting and stress relaxation of medium-carbon steel samples under various asymmetric load levels by means of two kinematic hardening rules, Chaboche (CH) and Ahmadzadeh-Varvani (A-V). Neuber's rule was coupled with the hardening rules to predict ratcheting and stress relaxation in the vicinity of the notch root. Stress-strain hysteresis loops generated by the CH and A-V models were employed to simultaneously track ratcheting progress over stress cycles and stress relaxation at the notch root while the strain range was kept constant in each cycle. Higher cyclic load levels applied at the notch root accelerated shakedown over a smaller number of cycles and resulted in a lower relaxation rate. The larger notch diameter of 9 mm, on the other hand, induced a lower stress concentration and a smaller plastic zone at the notch root, promoting ratcheting progress with less material constraint over loading cycles compared with the notch diameter of d = 3 mm. Ratcheting predicted by the A-V and CH models coupled with Neuber's rule was found to be in good agreement with the experimental data. The choice between the A-V and CH hardening rules for assessing ratcheting was attributed to the number of terms/coefficients, the complexity of their frameworks, and the computational/central processing unit (CPU) time required to run a ratcheting program.
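
For context, Neuber's rule relates the local stress σ and strain ε at the notch root to the nominal stress S through the elastic stress concentration factor K_t. It is solved together with a cyclic stress-strain curve; the Ramberg-Osgood form below is a standard generic example, not necessarily the paper's exact material model:

```latex
% Neuber's rule at the notch root:
\sigma \, \varepsilon = \frac{(K_t \, S)^2}{E}
% solved simultaneously with a cyclic stress-strain curve,
% e.g., Ramberg--Osgood:
\varepsilon = \frac{\sigma}{E} + \left( \frac{\sigma}{K'} \right)^{1/n'}
```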


2021 · Vol 11 (12) · pp. 5543
Author(s): Ning Xi, Yinjie Sun, Lei Xiao, Gang Mei

Mesh quality is a critical issue in numerical computing because it directly impacts both computational efficiency and accuracy. Tetrahedral meshes are widely used in engineering and science applications. However, in large-scale and complicated application scenarios, meshes contain huge numbers of tetrahedra, and improving mesh quality becomes computationally expensive. Laplacian mesh smoothing is a simple mesh-optimization method that improves mesh quality by relocating nodes. In this paper, by exploiting the parallelism of the modern graphics processing unit (GPU), we design a parallel adaptive Laplacian smoothing algorithm for improving the quality of large-scale tetrahedral meshes. In the proposed adaptive algorithm, the aspect ratio serves as the metric for judging mesh quality after each iteration, ensuring that every smoothing pass improves the mesh. The adaptive algorithm thereby avoids the ordinary Laplacian algorithm's tendency to create invalid elements in concave regions. We conducted five groups of comparative experiments to evaluate the performance of the proposed parallel algorithm. The results demonstrate that the proposed adaptive algorithm is up to 23 times faster than the serial algorithms, and the quality of the tetrahedral mesh is satisfactorily improved after adaptive Laplacian smoothing. Compared with the ordinary Laplacian algorithm, the proposed adaptive Laplacian algorithm is more widely applicable and can effectively handle tetrahedra of extremely poor quality. This indicates that the proposed parallel algorithm can improve mesh quality in large-scale and complicated application scenarios.
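
A serial sketch of the adaptive guard described above: each node moves toward the centroid of its neighbors, and the move is kept only if a quality metric does not worsen. The paper executes this per node in parallel on the GPU and uses the tetrahedral aspect ratio as the metric; the quality function below is a stand-in, and all names are illustrative assumptions.

```python
# Adaptive Laplacian smoothing sketch: accept a node move only if the
# local quality metric does not get worse (the paper's metric is the
# tetrahedral aspect ratio; a placeholder is used here).
import numpy as np

def smooth(nodes, neighbors, quality, iters=10):
    """nodes: (n, 3) coords; neighbors[i]: node ids adjacent to node i."""
    for _ in range(iters):
        for i, nbrs in neighbors.items():
            old = nodes[i].copy()
            q_before = quality(nodes, i)
            nodes[i] = nodes[nbrs].mean(axis=0)   # Laplacian move
            if quality(nodes, i) < q_before:      # adaptive guard:
                nodes[i] = old                    # reject a worsening move
    return nodes

nodes = np.random.rand(5, 3)
neighbors = {2: [0, 1, 3, 4]}                     # only node 2 is interior
q = lambda nd, i: -np.linalg.norm(nd[i] - nd[neighbors[i]].mean(axis=0))
smooth(nodes, neighbors, q)
```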


SPE Journal · 2020 · Vol 25 (03) · pp. 1220-1240
Author(s): Feifei Zhang, Yidi Wang, Yuezhi Wang, Stefan Miska, Mengjiao Yu

Summary This paper presents an approach that combines a two-dimensional (2D) computational fluid dynamics (CFD) model and a one-dimensional (1D) continuous model for simulating cuttings transport during the drilling of oil and gas wells. The 2D CFD model simulates the flow profile and the suspended-cuttings concentration profile in the cross section of the wellbore, while the 1D continuous model simulates cuttings transport in the axial direction. Different cuttings sizes are handled by a newly proposed superposition method. Experimental tests conducted on a 203 × 114 × 25 mm³ flow loop are used to validate the model from three perspectives: the single-phase pressure drop, the steady-state cuttings-bed height, and the transient pressure changes. Compared to layer models, the new approach captures accurate flow details in the narrow flow region and overcomes the tendency of traditional models to underpredict bed height at high flow rates. The computational time increases by a factor on the order of 10⁴ to 10⁵, from milliseconds to seconds, but remains acceptable for engineering applications, and the model provides close to three-dimensional (3D) accuracy at a much shorter central processing unit (CPU) time than 3D CFD models.
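
As a hedged illustration of the 1D continuous part, the sketch below advances a suspended-cuttings concentration along the wellbore axis with an explicit upwind scheme and a simple settling sink standing in for exchange with the cuttings bed. The grid, velocity, boundary condition, and exchange model are assumptions for illustration, not the paper's formulation.

```python
# Explicit upwind advection of suspended-cuttings concentration c(z)
# along the wellbore axis, with a toy deposition sink.
import numpy as np

def step(c, v, dz, dt, exchange):
    """One explicit upwind step of dc/dt + v * dc/dz = exchange(c)."""
    cn = c.copy()
    cn[1:] -= v * dt / dz * (c[1:] - c[:-1])   # upwind difference, v > 0
    return cn + dt * exchange(c)

nz, dz, v, dt = 200, 0.5, 1.2, 0.1             # CFL number v*dt/dz = 0.24 < 1
c = np.zeros(nz)
deposit = lambda conc: -0.01 * conc            # net settling to the bed (toy)
for _ in range(500):
    c = step(c, v, dz, dt, deposit)
    c[0] = 0.05                                # inlet concentration at the bit
```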


2020 · Vol 22 (5) · pp. 1217-1235
Author(s): M. Morales-Hernández, M. B. Sharif, S. Gangrade, T. T. Dullo, S.-C. Kao, ...

Abstract This work presents a vision of future water-resources hydrodynamics codes that fully utilize the strengths of modern high-performance computing (HPC). Advances in computing power, formerly driven by improvements in central processing unit (CPU) processors, now focus on parallel computing and, in particular, on graphics processing units (GPUs). However, this shift to a parallel framework requires refactoring codes to use data efficiently and even changing the nature of the algorithm that solves the system of equations. These concepts, along with other features such as computational precision, dry-region management, and input/output data handling, are analyzed in this paper. A 2D multi-GPU flood code applied to a large-scale test case is used to corroborate our statements and to identify the challenges facing the next generation of parallel water-resources codes.


SIMULATION · 2019 · Vol 96 (3) · pp. 347-361
Author(s): Wenjie Tang, Wentong Cai, Yiping Yao, Xiao Song, Feng Zhu

In the past few years, the graphics processing unit (GPU) has been widely used to accelerate time-consuming models in simulations. Since both model computation and simulation management are major factors affecting the performance of large-scale simulations, accelerating model computation alone limits the potential speedup. Moreover, the models that a GPU can accelerate effectively may be too few, especially in simulations with many lightweight models. Traditionally, parallel discrete event simulation (PDES) is used for this class of simulation, but most PDES simulators utilize only the central processing unit (CPU), even though GPUs are now commonly available. Hence, we propose an alternative approach for collaborative simulation execution on a hybrid CPU+GPU system, in which the GPU supports simulation management as well as model computation, just as the CPU does. A concurrency-oriented scheduling algorithm is proposed to enable cooperation between the CPU and the GPU so that multiple computation and communication resources are utilized efficiently. In addition, the GPU functions have been carefully designed to fit the algorithm. Together, these efforts allow the proposed approach to achieve significant speedup compared to traditional PDES on a CPU.
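
A toy sketch of concurrency-oriented scheduling: events sharing the earliest timestamp are popped together; those tagged as GPU-friendly are batched for a "GPU" executor while the rest run on the "CPU" executor, and both drain concurrently. A thread pool stands in for real GPU kernels here; the event tagging and all names are illustrative assumptions, not the paper's scheduler.

```python
# Toy CPU+GPU co-scheduling: run same-timestamp events concurrently,
# batching GPU-friendly ones. A thread pool mimics the GPU executor.
from concurrent.futures import ThreadPoolExecutor
import heapq

events = [(t, "gpu" if t % 2 else "cpu", t) for t in range(10)]
heapq.heapify(events)

def run_cpu(payload):
    return ("cpu", payload)

def run_gpu_batch(batch):
    return [("gpu", p) for p in batch]

with ThreadPoolExecutor() as pool:
    while events:
        t0 = events[0][0]
        batch = []
        while events and events[0][0] == t0:   # same-timestamp events are
            batch.append(heapq.heappop(events))  # safe to run in parallel
        futures = [pool.submit(run_cpu, p) for _, k, p in batch if k == "cpu"]
        gpu_payloads = [p for _, k, p in batch if k == "gpu"]
        if gpu_payloads:
            futures.append(pool.submit(run_gpu_batch, gpu_payloads))
        results = [f.result() for f in futures]  # synchronize this timestamp
```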


Author(s): Po Ting Lin, Yu-Cheng Chou, Yung Ting, Shian-Shing Shyu, Chang-Kuo Chen

Abstract This paper presents a robust reliability-analysis method for systems of multimodular redundant (MMR) controllers using the method of partitioning and parallel processing of a Markov chain (PPMC). A Markov chain is formulated to represent the N distinct states of the MMR controllers. Such a Markov chain has N² directed edges, each corresponding to a transition probability between a pair of start and end states. Because N can easily grow substantially, the system reliability analysis may require large computational resources, such as central processing unit (CPU) usage and memory. With PPMC, the Markov chain's transition probability matrix can be partitioned and reordered such that the system reliability is evaluated through only the diagonal submatrices of the transition probability matrix. In addition, the calculations on the submatrices are independent of one another and can therefore be conducted in parallel for efficiency. Simulation results show that, compared with the sequential method applied to an intact Markov chain, the proposed PPMC improves performance while producing acceptable accuracy for reliability analysis on large-scale systems of MMR controllers.
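
The partition-and-parallelize idea can be sketched as follows: reorder the transition matrix into diagonal blocks and evaluate each block independently, here with a process pool. The 4-state chain, the survival measure (probability of remaining inside a block's states), and the series combination of block results are illustrative assumptions, not the paper's model.

```python
# PPMC-style sketch: evaluate diagonal submatrices of a reordered
# transition matrix independently and in parallel.
import numpy as np
from multiprocessing import Pool

def block_survival(block, steps=100, start=0):
    """Probability of remaining inside this block's states after `steps`."""
    p = np.zeros(block.shape[0]); p[start] = 1.0
    for _ in range(steps):
        p = p @ block            # mass leaving the block is treated as lost
    return p.sum()

P = np.array([[0.98, 0.02, 0.00, 0.00],
              [0.01, 0.97, 0.02, 0.00],
              [0.00, 0.00, 0.99, 0.01],
              [0.00, 0.00, 0.02, 0.98]])
blocks = [P[:2, :2], P[2:, 2:]]          # diagonal submatrices after reorder

if __name__ == "__main__":
    with Pool() as pool:
        r = pool.map(block_survival, blocks)   # independent, hence parallel
    system_reliability = float(np.prod(r))     # series-system assumption
```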

