Parallel Computing Enables Whole-Trip Train Dynamics Optimizations

Due to the high computing demand of whole-trip train dynamics simulations and the iterative nature of optimizations, whole-trip train dynamics optimizations using sequential computing schemes are practically impossible. This paper reports advancements in whole-trip train dynamics optimizations enabled by using the parallel computing technique. A parallel computing scheme for whole-trip train dynamics optimizations is presented and discussed. Two case studies using parallel multiobjective particle swarm optimization (pMOPSO) and parallel multiobjective genetic algorithm (pMOGA), respectively, were performed to optimize a friction draft gear design. Linear speed-up was achieved by using parallel computing to cut down the computing time from 18 months to just 11 days. Optimized results using pMOPSO and pMOGA were in agreement with each other; Pareto fronts were identified to provide technical evidence for railway manufacturers and operators.
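The abstract does not describe how the parallel scheme is implemented. As a minimal sketch of the general idea, assuming each candidate design in a swarm or population can be simulated independently, the evaluations can be dispatched concurrently. The names `evaluate_population` and `toy_simulation` are hypothetical, and the toy objective function merely stands in for a whole-trip simulation.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_population(candidates, simulate, max_workers=4,
                        executor_cls=ThreadPoolExecutor):
    """Evaluate every candidate design concurrently.

    Results are returned in the same order as `candidates`.
    """
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(simulate, candidates))


def toy_simulation(design):
    """Hypothetical stand-in for an expensive simulation.

    Takes a (stiffness, damping) pair and returns two objective values.
    """
    stiffness, damping = design
    return (stiffness ** 2 + damping, (stiffness - 1.0) ** 2 + damping ** 2)


if __name__ == "__main__":
    swarm = [(0.0, 0.0), (1.0, 0.5)]
    print(evaluate_population(swarm, toy_simulation))
```

Threads are used here only to keep the sketch simple. For CPU-bound simulations in Python, passing `executor_cls=concurrent.futures.ProcessPoolExecutor` is the usual choice.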

Computing Schemes for Longitudinal Train Dynamics: Sequential, Parallel and Hybrid

Journal of Computational and Nonlinear Dynamics ◽

10.1115/1.4029716 ◽

2015 ◽

Vol 10 (6) ◽

Cited By ~ 16

Author(s):

Qing Wu ◽

Colin Cole

Keyword(s):

Parallel Computing ◽

Personal Computer ◽

Computational Efficiency ◽

Computing Time ◽

The Other ◽

Hybrid Scheme ◽

Sequential Scheme ◽

Parallel Scheme ◽

Train Dynamics

Conventionally, force elements in longitudinal train dynamics (LTD) are determined sequentially. In fact, these force elements are independent of each other, i.e., determining each one does not require inputs from the others. This independence makes LTD a candidate for parallel computing. A parallel scheme has been proposed and compared with the conventional sequential scheme in terms of computational efficiency. Tests show that the parallel scheme is not suitable for LTD: its computing time is about 165% of that of the sequential scheme on a four-CPU personal computer (PC). A modified parallel scheme, named the hybrid scheme, was then proposed. The computing time of the hybrid scheme is only 70% of that of the sequential scheme. A further advantage of the hybrid scheme is that it requires only two processors, which means it can be implemented on PCs.

DEVELOPMENT OF AN OPTIMIZATION TOOL FOR WELL PLACEMENT OPTIMIZATION IN GEOLOGICAL CARBON DIOXIDE SEQUESTRATION BY METAHEURISTICS —SPEED-UP BY LEVERAGING PARALLEL COMPUTING TECHNIQUE—

Journal of Japan Society of Civil Engineers Ser A2 (Applied Mechanics (AM)) ◽

10.2208/jscejam.77.1_21 ◽

2021 ◽

Vol 77 (1) ◽

pp. 21-34

Author(s):

Atsuhiro MIYAGI ◽

Hajime YAMAMOTO ◽

Youhei AKIMOTO

Keyword(s):

Carbon Dioxide ◽

Parallel Computing ◽

Carbon Dioxide Sequestration ◽

Well Placement ◽

Computing Technique ◽

Placement Optimization ◽

Well Placement Optimization ◽

Speed Up

Parallel multiobjective optimisations of draft gear designs

Proceedings of the Institution of Mechanical Engineers Part F Journal of Rail and Rapid Transit ◽

10.1177/0954409717690981 ◽

2017 ◽

Vol 232 (3) ◽

pp. 744-758 ◽

Cited By ~ 4

Author(s):

Qing Wu ◽

Colin Cole ◽

Maksym Spiryagin ◽

Tim McSweeney

Keyword(s):

Genetic Algorithm ◽

Fatigue Damage ◽

Particle Swarm ◽

Particle Swarm Optimisation ◽

Round Trip ◽

Draft Gear ◽

Train Dynamics ◽

Dynamics Simulations ◽

Optimisation Algorithms ◽

Operational Time

This paper presents the methodology and results of the parallel multiobjective optimisations of draft gear designs. The methodology used white-box draft gear models, whose parameters were used as the optimisation variables. Two optimisation algorithms were used: genetic algorithm and particle swarm optimisation. All the optimised draft gear designs were constrained by impact tests to ensure that the optimised designs also comply with the current acceptance standards for draft gears. The performance of draft gears was assessed using whole-trip longitudinal train dynamics simulations and coupler fatigue damage calculations. Each simulation covered a round trip (loaded one way, empty on return) over a total of 640 km of track, which involved about 10 h of operational time. Three optimisation objectives were considered: minimal fatigue damage for wagon connection systems of loaded trains, minimal in-train forces for loaded trains, and minimal longitudinal wagon accelerations for empty trains. Two case studies were presented, which optimised two types of draft gears (single-stage and double-stage draft gears) using genetic algorithm and particle swarm optimisation, respectively.
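With several objectives to minimise at once, candidate designs are usually compared by Pareto dominance rather than by a single score. The following is a minimal sketch of a non-dominated filter for minimisation problems. The function names and the sample objective values are illustrative assumptions and are not taken from the paper.

```python
def dominates(a, b):
    """Return True if `a` dominates `b` under minimisation.

    `a` must be no worse than `b` in every objective and strictly
    better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the non-dominated points, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical (fatigue damage, in-train force, acceleration) triples.
    designs = [(1.0, 5.0, 3.0), (2.0, 4.0, 3.0), (2.0, 6.0, 4.0)]
    print(pareto_front(designs))
```

In this example the third design is dominated by the first, while the first two trade off against each other and both remain on the front.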

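Pengembangan Deteksi Citra Mobil untuk Mengetahui Jumlah Tempat Parkir Menggunakan CUDA dan Modified YOLO (Development of Car Image Detection to Determine the Number of Parking Spaces Using CUDA and Modified YOLO)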

Jurnal Teknologi Informasi dan Ilmu Komputer ◽

10.25126/jtiik.2019641275 ◽

2019 ◽

Vol 6 (4) ◽

pp. 413 ◽

Cited By ~ 1

Author(s):

Sisco Jupiyandi ◽

Fadhil Rizqullah Saniputra ◽

Yoga Pratama ◽

Muhammad Robby Dharmawan ◽

Imam Cholissodin

Keyword(s):

Parallel Computing ◽

Computing Time ◽

Gpu Programming ◽

Parking Space ◽

Test Results ◽

Processing Data ◽

Average Accuracy ◽

Parking Lot ◽

Long Time ◽

Speed Up

The size of a parking lot and the number of cars in it can make it difficult for drivers to know which parking spaces are still available. Existing parking systems do not make the best use of parking space or of users' time. As the number of cars keeps growing, so does the need for parking space, and many existing systems cannot handle the resulting problems. The proposed system determines the number of slots in a parking lot accurately, making it easier for operators to identify empty spaces. It is also designed so that users can find a parking space quickly. The system combines GPU programming with Modified Yolo (M-Yolo). The GPU is used in M-Yolo to process images and data in parallel in order to detect cars and count them. The test results show that using the GPU instead of the CPU speeds up the average computing time by 0.179 seconds, with an average accuracy of 100%.

Preload on draft gear in freight trains

Proceedings of the Institution of Mechanical Engineers Part F Journal of Rail and Rapid Transit ◽

10.1177/0954409717738849 ◽

2017 ◽

Vol 232 (6) ◽

pp. 1615-1624 ◽

Cited By ~ 3

Author(s):

Qing Wu ◽

Colin Cole ◽

Maksym Spiryagin ◽

Weihua Ma

Keyword(s):

Fatigue Damage ◽

Structural Changes ◽

Step Size ◽

Power Train ◽

Impact Performance ◽

Draft Gear ◽

Train Dynamics ◽

The Difference ◽

Dynamics Simulations ◽

Impact Simulations

Adjusting draft gear preloads requires minimal or no structural changes to existing coupler systems. Improved or optimal preloads are therefore easier to implement than modifications to other parameters such as wedge angles and spring stiffness. This paper presents a method to model draft gear preloads and investigates the numerical step-size requirements for simulating them. The implications of preloads for draft gear impact performance, longitudinal train dynamics performance and coupler fatigue damage were also investigated. The results show that step sizes of less than 2.5 ms and 0.2 ms (with the fourth-order Runge–Kutta solver) are recommended for simulating preloads in longitudinal train dynamics simulations and wagon impact simulations, respectively. Wagon impact simulations indicate that increasing the draft gear preload can noticeably decrease the maximum draft gear deflection during wagon impacts. Longitudinal train dynamics simulations were conducted for a distributed power train with 214 vehicles on a 320 km long track. These simulations indicate that, when the preload is increased from 0 to 100 kN, the difference in maximum vehicle accelerations is insignificant. When the draft gear preload is further increased to 200 or 300 kN, maximum vehicle accelerations increase evidently. Draft gear preloads do not noticeably influence the maximum tensile coupler forces. However, preloads have evident implications for maximum compressive coupler forces, especially in the second half of the train. Coupler fatigue damage calculations show that the sum of coupler fatigue damage decreases evidently as the draft gear preload increases. The damage for the zero preload case is 8.7 times that of the 300 kN preload case.
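For readers unfamiliar with the solver named in the abstract, the following is a minimal sketch of a fixed-step classical fourth-order Runge–Kutta integrator. The test equation dy/dt = -y is an illustrative assumption and has nothing to do with the paper's draft gear model. It is integrated here with a 2.5 ms step only to show where the step size enters.

```python
import math


def rk4_step(f, t, y, h):
    """Advance y by one classical fourth-order Runge-Kutta step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def integrate(f, y0, t_end, h):
    """Integrate dy/dt = f(t, y) from t = 0 to t_end with fixed step h."""
    steps = round(t_end / h)
    t, y = 0.0, y0
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y


if __name__ == "__main__":
    # Toy decay equation with a 2.5 ms step; the exact answer is exp(-1).
    approx = integrate(lambda t, y: -y, 1.0, 1.0, 0.0025)
    print(approx, math.exp(-1.0))
```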

Methodology for Optimization of Friction Draft Gear Design

Volume 6: 10th International Conference on Multibody Systems, Nonlinear Dynamics, and Control ◽

10.1115/detc2014-34162 ◽

2014 ◽

Cited By ~ 2

Author(s):

Qing Wu ◽

Colin Cole ◽

Maksym Spiryagin

Keyword(s):

Dynamics Simulation ◽

Rolling Stock ◽

System Failures ◽

Gear Design ◽

Simulation Techniques ◽

Coupling System ◽

Heavy Haul ◽

Draft Gear ◽

Train Dynamics ◽

Heavy Haul Trains

Evidence gathered from industry indicates that railway coupling system failures have become a limitation for further development of heavy haul trains. Friction draft gears have implications for both longitudinal train dynamics and rolling stock fatigue; therefore, optimization of friction draft gears could be a possible way to overcome this limitation. In this paper, a methodology for optimizing friction draft gear design based on an advanced friction draft gear model is proposed. The methodology uses simulation techniques such as longitudinal train dynamics simulation, together with a Genetic Algorithm, to develop improved parameters.

Quantification of speed-up and accuracy of multi-CPU computational flow dynamics simulations of hemodynamics in a posterior communicating artery aneurysm of complex geometry

Journal of NeuroInterventional Surgery ◽

10.1136/neurintsurg-2012-010586 ◽

2013 ◽

Vol 5 (Suppl 3) ◽

pp. iii48-iii55 ◽

Cited By ~ 6

Author(s):

Christof Karmonik ◽

Christopher Yen ◽

Edgar Gabriel ◽

Sasan Partovi ◽

Marc Horner ◽

...

Keyword(s):

Complex Geometry ◽

Artery Aneurysm ◽

Flow Dynamics ◽

Posterior Communicating Artery ◽

Computational Flow Dynamics ◽

Speed Up ◽

Posterior Communicating Artery Aneurysm ◽

Dynamics Simulations

COMPUTATIONS OF PULSATILE AORTIC BLOOD FLOW PROBLEMS ON PARALLEL COMPUTERS

Biomedical Engineering Applications Basis and Communications ◽

10.4015/s1016237203000171 ◽

2003 ◽

Vol 15 (03) ◽

pp. 109-114

Author(s):

YANG-YAO NIU ◽

SHOU-CHENG TCHENG

Keyword(s):

Blood Flow ◽

Stokes Equations ◽

Computing Time ◽

Time Integration ◽

Parallel Computer ◽

Aortic Blood Flow ◽

Pc Cluster ◽

Flow Problems ◽

Aortic Blood ◽

Speed Up

In this study, parallel computing technology is applied to the simulation of aortic blood flow problems. A third-order upwind flux extrapolation with a dual-time integration method, based on an artificial compressibility solver, is used to solve the Navier-Stokes equations. The original FORTRAN code is converted to MPI code and tested on a 64-CPU IBM SP2 parallel computer and a 32-node PC cluster. The test results show a significant reduction in the computing time needed to run the model, and a super-linear speed-up rate is achieved with up to 32 CPUs on the PC cluster. The speed-up rate is as high as 49 when using 64 processors on the IBM SP2. The tests show the promising potential of parallel processing to provide prompt simulation of aortic flow problems.
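The speed-up figures quoted above can be related to parallel efficiency with two standard definitions: speed-up is serial time divided by parallel time, and efficiency is speed-up divided by the number of processors. The sketch below uses hypothetical helper names. It applies the efficiency definition to the reported speed-up of 49 on 64 processors and flags super-linear cases, meaning an efficiency above 1.

```python
def speedup(t_serial, t_parallel):
    """Speed-up: serial run time divided by parallel run time."""
    return t_serial / t_parallel


def efficiency(speedup_value, n_processors):
    """Parallel efficiency: speed-up per processor."""
    return speedup_value / n_processors


def is_super_linear(speedup_value, n_processors):
    """Speed-up is super-linear when it exceeds the processor count."""
    return speedup_value > n_processors


if __name__ == "__main__":
    # Reported figure: speed-up of 49 on 64 IBM SP2 processors.
    print(efficiency(49, 64))       # 0.765625
    print(is_super_linear(49, 64))  # False
```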

Dagstuhl-Seminar “Dynamically and Partially Reconfigurable Architectures” (Dynamisch und partiell rekonfigurierbare Architekturen)

it - Information Technology ◽

10.1524/itit.46.4.218.36077 ◽

2004 ◽

Vol 46 (4) ◽

Author(s):

Jürgen Becker

Keyword(s):

Information Technology ◽

Parallel Computing ◽

Computer Science ◽

Mobile Communication ◽

Data Stream ◽

Electrical Engineering ◽

Reconfigurable Architectures ◽

Automotive Application ◽

Partial Reconfiguration ◽

Computing Technique

Summary: The paper addresses readers from information technology, electrical engineering, computer science, and related areas. It introduces and classifies fine-, coarse-, and multi-grain reconfigurable architectures. This data-stream-based and transport-triggered parallel computing technique, combined with dynamic and partial reconfiguration features, shows promising perspectives for future CMOS-based microelectronic solutions in multimedia and infotainment, mobile communication, and automotive application domains, among others.

A Parallel Method for Accelerating Visualization for Vector Tiles

Abstracts of the ICA ◽

10.5194/ica-abs-1-124-2019 ◽

2019 ◽

Vol 1 ◽

pp. 1-1

Author(s):

Wei Hu ◽

Lin Li ◽

Chao Wu ◽

Hang Zhang ◽

Haihong Zhu

Keyword(s):

Parallel Computing ◽

Decomposition Method ◽

Computing Time ◽

Decomposition Methods ◽

Influential Factors ◽

Accurate Estimation ◽

Parallel Visualization ◽

Linear Regressions ◽

Geographical Feature ◽

Feature Visualization

Abstract. Vector tile technology is developing rapidly and has received increasing attention in recent years. Compared with raster tiles, vector tiles offer distinct advantages, such as flexible map styles, suitability for high-resolution screens and ease of interaction. Recent studies on vector tiles have mostly focused on improving efficiency on the server side and have overlooked efficiency on the client side, which directly affects user experience. Parallel computing provides a solution to this issue. Parallel visualization of vector tiles is a typical example of embarrassing parallelism, because no communication between computing units is needed during parallel computing. Therefore, the performance of parallel visualization of vector tiles depends mainly on how accurately the workload is estimated and how evenly it is decomposed onto the computing units.

Estimating the workload of vector tile visualization essentially means accurately estimating the computing time of geographical feature visualization in the tile. This article uses the computational weight to represent the computing time of geographical feature visualization. The visualization process for a geographical feature consists of three main steps: retrieving, symbolizing and rendering the feature. This article analyses the influential factors and builds the computational weight functions (CWFs) for the different types of geographical feature (point, linear and area) in the different visualization steps. Then, by analysing the linear relationship between the influential factors and the computing time of geographical feature visualization, the coefficients of the CWFs are obtained by linear regression. The goodness of fit of all the linear regressions is high (R² > 0.9), which means that the computing time of geographical feature visualization can be accurately estimated by the CWFs.

Once the computational weight of the vector tiles has been calculated, workload decomposition is the next key issue. The traditional decomposition methods widely used in spatial domain decomposition are based on evenly divided spatial areas, such as vertical decomposition and horizontal decomposition. However, because the distribution of geographical features is usually uneven, these traditional methods may introduce a large workload imbalance in parallel computing and degrade efficiency and performance. This article proposes a workload decomposition method based on the computational weight of vector tiles to improve the parallel visualization efficiency of vector tiles. Experiments show that the computational efficiency of parallel visualization of vector tiles with the proposed workload decomposition method is 18.6% higher than that with traditional decomposition methods.
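The abstract does not specify the decomposition algorithm. As a stand-in, the sketch below balances tiles across computing units with a simple greedy rule: the heaviest tile goes first, and each tile goes to the unit that is currently least loaded. The function name and the weights are hypothetical; in the article's setting the weights would come from the fitted CWFs.

```python
import heapq


def decompose_by_weight(weights, n_units):
    """Greedily assign tiles to computing units by computational weight.

    Tiles are taken heaviest first and each is given to the least-loaded
    unit. Returns (assignment, loads), where assignment[u] lists the tile
    indices for unit u and loads[u] is their total weight.
    """
    heap = [(0.0, unit) for unit in range(n_units)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_units)]
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    for idx in order:
        load, unit = heapq.heappop(heap)
        assignment[unit].append(idx)
        heapq.heappush(heap, (load + weights[idx], unit))
    loads = [sum(weights[i] for i in part) for part in assignment]
    return assignment, loads


if __name__ == "__main__":
    # Hypothetical per-tile computational weights.
    tile_weights = [8, 7, 6, 5, 4]
    parts, loads = decompose_by_weight(tile_weights, 2)
    print(parts, loads)
```

The greedy rule is a heuristic and does not guarantee an optimal balance, but it avoids the imbalance that arises when tiles are split purely by spatial area.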
