CUDA Technology: Latest Research Papers

Influence of gravity effect to the recovery rate at uranium in-situ leaching

Bulletin of the National Engineering Academy of the Republic of Kazakhstan ◽

10.47533/2020.1606-146x.125 ◽

2021 ◽

Vol 82 (4) ◽

pp. 148-157

Author(s):

M. B. Kurmanseiit ◽

M. S. Tungatarova ◽

K. A. Alibayeva ◽

...

Keyword(s):

Gravitational Effect ◽

Computational Grids ◽

Processing Unit ◽

Gravity Effect ◽

Central Processing ◽

Cuda Technology ◽

Uranium In Situ Leaching ◽

Leaching Solution ◽

The Difference

In-situ leaching is a method of extracting minerals by selectively dissolving them with a leaching solution directly where the mineral occurs. In practice, when deposits are developed by in-situ leaching, the solution sometimes sinks below the active thickness of the stratum. This may be caused by geological heterogeneity of the rock or by gravitational settling of the solution due to the density difference between the solution and groundwater. As the solution settles, recovery of the metal located in the upper part of the geological layers decreases. This article examines the effect of gravity on the flow regime during filtration of the solution through the rock. The influence of the gravitational effect on the flow is studied for different ratios of solution density to groundwater density, without taking into account the interaction of the solution with the rock. CUDA technology is used to speed up the calculations. The results show that CUDA speeds up the calculations by a factor of 40 to 80 compared with a central processing unit (CPU), depending on the computational grid.

GPU-Accelerated Laplace Equation Model Development Based on CUDA Fortran

Water ◽

10.3390/w13233435 ◽

2021 ◽

Vol 13 (23) ◽

pp. 3435

Author(s):

Boram Kim ◽

Kwang Seok Yoon ◽

Hyung-Jun Kim

Keyword(s):

Analytical Solution ◽

Laplace Equation ◽

Large Scale ◽

Numerical Models ◽

Model Development ◽

Computation Time ◽

High Accuracy ◽

Equation Model ◽

Computational Time ◽

Cuda Technology

In this study, a CUDA Fortran-based GPU-accelerated Laplace equation model was developed and applied to several cases. The Laplace equation is one of the equations that can describe groundwater flow physically, and it admits analytical solutions. Numerical models of this kind require a large amount of data to reproduce the flow physically with high accuracy, and they are costly in computational time. To shorten the computation time, CUDA technology was applied: large-scale parallel computations were performed on the GPU, and the program was written to reduce the number of data transfers between the CPU and GPU. A GPU consists of many ALUs specialized in graphics processing and, by using them together, can perform more concurrent computations than a CPU. The computation results of the GPU-accelerated model were compared with the analytical solution of the Laplace equation to verify the accuracy, and the two were in good agreement. As the number of grids increased, the computational time of the GPU-accelerated model decreased progressively relative to that of the CPU-based Laplace equation model. As a result, the computational time of the GPU-accelerated Laplace equation model was reduced by up to about 50 times.
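
The comparison against an analytical solution described in this abstract can be illustrated with a small CPU-only sketch. It is not the authors' CUDA Fortran model: the Jacobi scheme, the unit-square domain, the grid size and the test function u(x, y) = x² − y² are all assumptions chosen for illustration. Each interior update depends only on the previous iterate, which is the property that makes this kind of stencil easy to spread across GPU threads.

```python
def solve_laplace_jacobi(n, boundary, tol=1e-10, max_iter=20000):
    """Jacobi iteration for the Laplace equation on an n x n grid over the unit square.

    `boundary(x, y)` supplies Dirichlet values on the edges (hypothetical helper).
    """
    h = 1.0 / (n - 1)
    # Boundary nodes take the prescribed values; interior nodes start at zero.
    u = [[boundary(i * h, j * h) if i in (0, n - 1) or j in (0, n - 1) else 0.0
          for j in range(n)] for i in range(n)]
    for _ in range(max_iter):
        new = [row[:] for row in u]
        diff = 0.0
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # Five-point stencil: each node becomes the mean of its four neighbours.
                new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])
                diff = max(diff, abs(new[i][j] - u[i][j]))
        u = new
        if diff < tol:
            break
    return u


def exact(x, y):
    """A harmonic function used as the analytical reference (assumed test case)."""
    return x * x - y * y


n = 17
h = 1.0 / (n - 1)
u = solve_laplace_jacobi(n, exact)
max_error = max(abs(u[i][j] - exact(i * h, j * h)) for i in range(n) for j in range(n))
```

Because the inner double loop only reads the old array `u`, every `(i, j)` update is independent of the others within one sweep; a GPU version would typically assign one node per thread.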

Evaluation of the performance of algorithms for synthesizing radar images using Cuda technology

Doklady BGUIR ◽

10.35596/1729-7648-2021-19-6-92-96 ◽

2021 ◽

Vol 19 (6) ◽

pp. 92-96

Author(s):

S. V. Kozlov

Keyword(s):

Parallel Computing ◽

Computational Complexity ◽

Aperture Synthesis ◽

The Real ◽

Radar Images ◽

Cuda Technology ◽

Real Performance ◽

Radar Information

The implementation features of an algorithm for synthesizing detailed radar images for an aperture synthesis radar using built-in functions of the CUDA library are presented. The computational complexity is estimated from the standpoint of organizing parallel computing on Nvidia GPUs. The real performance of radar image synthesis is estimated, taking into account the volume and placement of the primary radar information.

Development of software and algorithms of parallel learning of artificial neural networks using CUDA technologies

Technology audit and production reserves ◽

10.15587/2706-5448.2021.239784 ◽

2021 ◽

Vol 5 (2(61)) ◽

pp. 21-25

Author(s):

Yaroslav Sokolovskyy ◽

Denys Manokhin ◽

Yaroslav Kaplunsky ◽

Olha Mokrytska

Keyword(s):

Neural Network ◽

Neural Networks ◽

Artificial Neural Networks ◽

Medical Image Analysis ◽

Cloud Services ◽

Small Data ◽

Central Processor ◽

Cuda Technology ◽

Artificial Neural ◽

Cloud Technologies

The object of research is the parallelization of the learning process of artificial neural networks to automate medical image analysis using the Python programming language, the PyTorch framework and Compute Unified Device Architecture (CUDA) technology. PyTorch is based on the Define-by-Run model. Available cloud technologies for the task and learning algorithms for artificial neural networks are analyzed. A modified U-Net architecture from the MedicalTorch library was used. It was chosen because the network needs to learn effectively from small data sets: in medicine, large datasets are hard to obtain because of confidentiality requirements for data of this nature. The resulting information system performs the tasks set for it and provides a user-friendly interface and the tools needed to simplify and automate data visualization and analysis. The efficiency of neural network training on the central processor (CPU) and on the graphics processor (GPU) with CUDA is compared. Cloud technology was used in the study; Google Colab and Microsoft Azure were the cloud services considered. Colab was used first to build a prototype, and the Azure service was then used to train the finished neural network architecture efficiently. Measurements were performed in both services. The Adam optimizer was used to train the model. Training time on the CPU was also measured to assess the acceleration from CUDA, and the speedup obtained through GPU computing and cloud technologies was estimated.
The model developed during the research showed satisfactory results on the Jaccard and Dice metrics. A key factor in the success of this study was cloud computing services.
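
For readers unfamiliar with the two segmentation metrics named in this abstract, a minimal sketch follows. The function name and the toy masks are hypothetical, and the abstract does not say how the authors computed the scores; this only shows the standard definitions on binary masks.

```python
def jaccard_and_dice(pred, target):
    """Return (Jaccard index, Dice coefficient) for two flat binary masks."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_sum = sum(1 for p in pred if p)
    target_sum = sum(1 for t in target if t)
    union = pred_sum + target_sum - intersection
    if union == 0:
        # Both masks are empty: treat as perfect agreement (a common convention).
        return 1.0, 1.0
    jaccard = intersection / union
    dice = 2 * intersection / (pred_sum + target_sum)
    return jaccard, dice


# Toy example: predicted and reference masks for six pixels.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 1, 1, 0]
jaccard, dice = jaccard_and_dice(pred, target)
```

The two scores are related by Dice = 2J / (1 + J), so they always rank a pair of segmentations in the same order.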

Acceleration of boundary element calculations for closed domain using nonlinear form functions and CUDA technology

Doklady BGUIR ◽

10.35596/1729-7648-2021-19-14-21 ◽

2021 ◽

Vol 19 (3) ◽

pp. 14-21

Author(s):

S. S. Sherbakov ◽

M. M. Polestchuk

Keyword(s):

Boundary Element ◽

Mutual Influence ◽

Multicore Processors ◽

Boundary Elements ◽

Mechanics Of Solids ◽

Closed Domain ◽

Is Implementation ◽

Nvidia Cuda ◽

Cuda Technology ◽

Significant Acceleration

The evolution of computer technology, in both hardware and software, makes it possible to obtain fast and accurate solutions to many applied problems in science. Acceleration of calculations is a widely used technique that is typically implemented through multithreading and multicore processors. NVidia CUDA technology, or simply CUDA, opens a way to accelerate the boundary element method (BEM) efficiently, since the method includes many independent stages. The main goal of the paper is the implementation and acceleration of the indirect boundary element method using three form functions. Calculation of the potential distribution inside a closed boundary under a defined boundary condition is considered. To accelerate the corresponding calculations, they were parallelized on a graphics accelerator using NVidia CUDA technology. The speedup of the parallel computations relative to sequential ones was studied for different numbers of boundary elements and computational nodes. A significant acceleration (up to 52 times) of the potential distribution calculation without loss of accuracy is shown. Acceleration of up to 22 times was achieved in calculating the mutual influence matrix for boundary elements. CUDA technology thus yields significant acceleration without loss of accuracy or convergence, which makes it a good way to parallelize BEM. The developed approach makes it possible to solve problems efficiently in different areas of physics such as acoustics, hydromechanics, electrodynamics, mechanics of solids and many others.

Development of vector algorithm using CUDA technology for three-dimensional retinal laser coagulation process modeling

Computer Optics ◽

10.18287/2412-6179-co-828 ◽

2021 ◽

Vol 45 (3) ◽

pp. 427-437

Author(s):

A.S. Shirokanev ◽

N.A. Andriyanov ◽

N.Y. Ilyasova

Keyword(s):

Mathematical Modeling ◽

Heat Conduction ◽

Complex Structure ◽

Three Dimensional ◽

Laser Coagulation ◽

Laser Exposure ◽

Three Dimensional Modeling ◽

Dimensional Modeling ◽

Coagulation Process ◽

Cuda Technology

For diabetic retinopathy treatment, laser coagulation is used in modern practice. During laser surgery, the parameters of laser exposure are selected manually by a doctor, which requires the doctor to have sufficient experience and knowledge to achieve a therapeutic effect. On the basis of mathematical modeling of the laser coagulation process, it is possible to estimate the crucial parameters without performing an operation. However, the retina has a rather complex structure, and even when low-cost numerical methods are used for modeling, it takes a long time to obtain a result. The development of time-efficient algorithms for three-dimensional modeling is therefore an urgent task, since such algorithms make a comprehensive study possible within a limited time. In this paper, we study the execution time of algorithms that implement several variants of the splitting method and the finite difference method, adapted to the heat conduction problem at hand. The study identifies the most efficient algorithm, which is then vectorized and implemented using CUDA technology. The study was carried out using an Intel Core i7-10875H and an Nvidia RTX 2080 MAX Q and showed that an analog of the vector algorithm, focused on solving a multidimensional heat conduction problem, provides an acceleration of no more than 1.5 times compared to the sequential version. The developed vector-based algorithm, focused on applying the sweep method in all directions of the three-dimensional problem, significantly reduces the time spent on copying into the memory of the video card and provides a 40-fold acceleration in comparison with the sequential three-dimensional modeling algorithm. On the basis of the same approach, a parallel algorithm of mathematical modeling was developed, which provided a 20-fold acceleration at full processor load.
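
The "sweep method" mentioned in this abstract is commonly known as the Thomas (tridiagonal matrix) algorithm. The sketch below shows a plain sequential version for a single tridiagonal system; it is an illustration under assumed coefficients, not the authors' vectorized CUDA implementation. In a splitting scheme for 3D heat conduction, one such system arises per grid line in each direction, and those lines are independent of one another.

```python
def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i].

    a[0] and c[-1] are ignored. Assumes len(d) >= 2 and a well-conditioned
    (for example diagonally dominant) matrix, so no pivoting is done.
    """
    n = len(d)
    cp = [0.0] * n
    dp = [0.0] * n
    # Forward sweep: eliminate the sub-diagonal.
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    # Backward sweep: substitute from the last unknown upwards.
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


# Illustrative system with the shape of an implicit heat-conduction step.
n = 5
a = [-1.0] * n
b = [4.0] * n
c = [-1.0] * n
x_true = [1.0, 2.0, 3.0, 4.0, 5.0]
# Build the right-hand side from a known solution so the result can be checked.
d = [
    (a[i] * x_true[i - 1] if i > 0 else 0.0)
    + b[i] * x_true[i]
    + (c[i] * x_true[i + 1] if i < n - 1 else 0.0)
    for i in range(n)
]
x = thomas_solve(a, b, c, d)
```

Each sweep is inherently sequential along its own line, so parallel versions usually gain speed by processing many lines at once rather than by parallelizing a single sweep.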

REAL-TIME DEEP NEURAL NETWORKS FOR MULTIPLE OBJECT TRACKING AND SEGMENTATION ON MONOCULAR VIDEO

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xliv-2-w1-2021-15-2021 ◽

2021 ◽

Vol XLIV-2/W1-2021 ◽

pp. 15-20

Author(s):

I. Basharov ◽

D. Yudin

Keyword(s):

Neural Networks ◽

Real Time ◽

Deep Neural Networks ◽

Unmanned Vehicles ◽

Geometric Constraints ◽

Found Objects ◽

Cuda Technology ◽

Monocular Video ◽

Found Object ◽

Instance Segmentation

Abstract. The paper is devoted to the task of multiple object tracking and segmentation on monocular video captured by the camera of an unmanned ground vehicle. The authors investigate various deep neural network architectures for solving this task. Special attention is paid to deep models that provide inference in real time. The authors propose an approach that combines the modern SOLOv2 instance segmentation model, a neural network model that generates an embedding for each found object, and a modified Hungarian tracking algorithm. The Hungarian algorithm was modified to take into account geometric constraints on the positions of the found objects across the image sequence. The investigated solution develops and improves on the state-of-the-art PointTrack method. The effectiveness of the proposed approach is demonstrated quantitatively and qualitatively on the popular KITTI MOTS dataset collected using the cameras of a driverless car. A software implementation of the approach was carried out. The formation of a two-dimensional point cloud in the found image segment was accelerated using NVidia CUDA technology. The proposed instance segmentation module has a mean processing time of 68 ms per image, and the embedding and tracking module 24 ms, on an NVidia Tesla V100 GPU. This indicates that the proposed solution is promising for on-board computer vision systems for both unmanned vehicles and various robotic platforms.
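
The idea of constraining an assignment by geometry can be sketched as follows. The abstract does not specify the constraints or the cost function, so everything here is an assumption: the cost is a made-up embedding distance, the constraint is a maximum displacement `max_shift` between frames, and a brute-force search over permutations stands in for the Hungarian algorithm (it finds the same optimum but is only practical for a handful of objects). Equal numbers of tracks and detections are assumed.

```python
from itertools import permutations
import math


def match_tracks(track_centers, det_centers, embed_cost, max_shift):
    """Assign detections to tracks, forbidding pairs that moved more than max_shift.

    Brute-force stand-in for the Hungarian algorithm; all names are hypothetical.
    """
    n = len(track_centers)
    big = 1e6  # Penalty that effectively forbids a pairing.
    cost = [[0.0] * n for _ in range(n)]
    for i, (tx, ty) in enumerate(track_centers):
        for j, (dx, dy) in enumerate(det_centers):
            shift = math.hypot(tx - dx, ty - dy)
            # Geometric gate: keep the embedding cost only for plausible moves.
            cost[i][j] = embed_cost[i][j] if shift <= max_shift else big
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))
    # Drop pairs that could only be matched by violating the gate.
    return [(i, j) for i, j in enumerate(best) if cost[i][j] < big]


# Two tracks and two detections; embeddings alone would prefer (0->0, 1->1),
# but those pairings imply an implausibly large jump between frames.
tracks = [(0.0, 0.0), (10.0, 0.0)]
detections = [(9.0, 1.0), (1.0, 0.0)]
embedding_cost = [[0.1, 0.5],
                  [0.2, 0.3]]
matches = match_tracks(tracks, detections, embedding_cost, max_shift=3.0)
```

In this toy case the gate overrides the appearance cost, which is the kind of correction a geometric constraint is meant to provide when embeddings of similar-looking objects are ambiguous.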

USING THE CUDA TECHNOLOGY TO SPEED UP COMPUTATIONS IN PROBLEMS OF CHEMICAL KINETICS

NEWS OF THE NATIONAL ACADEMY OF SCIENCES OF THE REPUBLIC OF KAZAKHSTAN ◽

10.32014/2021.2518-1726.19 ◽

2021 ◽

Vol 2 (336) ◽

pp. 39-47

Author(s):

M. Sarsembayev ◽

B. Urmashev ◽

O. Mamyrbayev ◽

M. Turdalyuly ◽

T. Sarsembayeva

Keyword(s):

Main Idea ◽

Time Interval ◽

Central Processor ◽

Web Browser ◽

Elementary Reactions ◽

Cuda Technology ◽

Speed Up ◽

Reacting Systems ◽

To Receive ◽

5Th Grade

The main idea of the implementation is to reduce the calculation time and thereby support a multi-user mode by placing the software on a server accessed via a web browser. To model the kinetics of chemically reacting systems, 4th- and 5th-order Runge-Kutta methods were used. To quantify the advantage of this development, programs were written in C# for sequential computation on a central processor, and the CUDA platform was used for parallel computation on graphics processors. On the GPU, the computation was parallelized by distributing reactions to individual threads, each of which computed the change in concentration of a certain substance over a given time interval. Parallelization is performed over all elementary reactions, so as the number of reactions in the mechanism increases, computation on the GPU shows a noticeable gain in time.
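
As an illustration of the integration scheme named in this abstract, here is a minimal sketch of the classical 4th-order Runge-Kutta method applied to a single first-order reaction A → B. The reaction, the rate constant `K`, the step count and all function names are assumptions for the example; the authors' C# and CUDA programs are not shown in the source.

```python
import math


def rk4_step(f, t, y, h):
    """Advance the state vector y by one classical RK4 step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (p + 2 * q + 2 * r + s)
            for yi, p, q, r, s in zip(y, k1, k2, k3, k4)]


K = 2.0  # Assumed rate constant of A -> B, in 1/s.


def rates(t, y):
    """Mass-action kinetics for A -> B: dA/dt = -K*A, dB/dt = +K*A."""
    conc_a, _conc_b = y
    r = K * conc_a
    return [-r, r]


def integrate(f, y0, t_end, steps):
    """Integrate from t = 0 to t_end with a fixed number of RK4 steps."""
    h = t_end / steps
    t, y = 0.0, list(y0)
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y


conc = integrate(rates, [1.0, 0.0], t_end=1.0, steps=100)
exact_a = math.exp(-K * 1.0)  # Analytical solution A(t) = A0 * exp(-K t).
```

In a mechanism with many elementary reactions, the rate of each reaction can be evaluated independently inside `rates`, which is where a per-reaction thread distribution like the one described above would apply.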

NUMERICAL IMPLEMENTATION OF A PARALLEL ALGORITHM FOR SOLVING THE PROBLEM OF POLLUTANT TRANSPORT IN A RESERVOIR ON A HIGH-PERFORMANCE COMPUTER SYSTEM

Vestnik komp'iuternykh i informatsionnykh tekhnologii ◽

10.14489/vkit.2021.04.pp.027-036 ◽

2021 ◽

pp. 27-36

Author(s):

A. V. Nikitina ◽

A. E. Chistyakov ◽

A. M. Atayan

Keyword(s):

Message Passing Interface ◽

Operating Time ◽

Computation Time ◽

Computing System ◽

Computational Grid ◽

Computational Grids ◽

Processing Unit ◽

Parallel Solution ◽

Central Processing ◽

Cuda Technology

The purpose of this work is to create a software package for the distributed solution of the problem of pollutant transport in a reservoir with complex bathymetry and technological structures. An algorithm has been developed for the parallel solution of the pollutant transport problem in a reservoir on a graphics accelerator controlled by the CUDA (Compute Unified Device Architecture) system; a comparative analysis of the algorithms on a CPU (Central Processing Unit) and on a GPU (Graphics Processing Unit) made it possible to evaluate their performance. The software implementation of the modules included in the package is described, and the main classes and implemented methods are documented. The numerical experiments showed that solving the pollutant transport problem with CUDA technology is ineffective for small grids (up to 100 × 100 computational nodes). For large grids (1000 × 1000 computational nodes), CUDA technology reduces the computation time by an order of magnitude. An analysis of the experiments carried out with the developed software components showed that the algorithm for matter transport in shallow water ran up to 24.92 times faster on the GPU than a similar algorithm on the CPU, which is achieved on a grid of 1000 × 1000 computational nodes. Methods for decomposing grid regions are proposed for solving computationally laborious diffusion-convection problems, including pollutant transport in a reservoir with complex bathymetry and technological objects; the methods take into account the architecture and parameters of the MSC (Multiprocessor Computing System) hosted at the infrastructure facility of the STU (Scientific and Technological University) "Sirius" (Sochi, Russia).
The time required to transmit and receive floating-point data, a property of the computing system, was taken into account. An algorithm for the parallel solution of the problem using MPI (Message Passing Interface) has been developed, and its efficiency has been assessed. The speedup of the proposed algorithm was obtained as a function of the number of computers (processors) involved and the size of the computational grid. The maximum number of computers used was 24, and the maximum computational grid size was 10 000 × 10 000 nodes. The developed algorithm showed low efficiency for small computational grids (up to 100 × 100 computational nodes). For large computational grids (from 1000 × 1000 computational nodes), MPI reduces the computation time several-fold.
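
Decomposition of a grid region among processors can be sketched in its simplest form as a split into horizontal strips. The abstract does not describe the decomposition actually used, so the one-dimensional strip layout and the function name below are assumptions; the sketch uses no MPI calls and only computes which rows each rank would own.

```python
def row_ranges(n_rows, n_procs):
    """Split n_rows grid rows into n_procs contiguous strips of near-equal size.

    Returns a list of half-open (start, stop) row intervals, one per rank.
    """
    base, extra = divmod(n_rows, n_procs)
    ranges = []
    start = 0
    for rank in range(n_procs):
        # The first `extra` ranks take one additional row to absorb the remainder.
        count = base + (1 if rank < extra else 0)
        ranges.append((start, start + count))
        start += count
    return ranges


# Largest configuration quoted in the abstract: 10 000 rows over 24 processors.
parts = row_ranges(10000, 24)
```

With strips like these, each rank exchanges only its boundary rows with its two neighbours per time step, which is why the cost of sending floating-point data matters more on small grids, where there is little interior work to offset it.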

Poroelastoplastic modeling of borehole shear bands on high order curvilinear meshes using CUDA technology

10.5194/egusphere-egu21-15600 ◽

2021 ◽

Author(s):

Anatoly Vershinin ◽

Vladimir Levin ◽

Yury Podladchikov

Keyword(s):

Geometric Model ◽

Shear Bands ◽

Spectral Element ◽

Russian Science ◽

Element Mesh ◽

Cuda Technology ◽

Elastoplastic Matrix ◽

The Galerkin Method ◽

Academy Of Sciences ◽

Schmidt Institute

The presentation describes an approach to modeling the development of zones of localized plastic deformation within a poroelastoplastic model that generalizes Biot's model. A distinctive feature of this model is a two-way coupling between mechanical processes occurring in a porous elastoplastic matrix and a saturating viscous fluid. For the numerical solution of the problem, a variational formulation based on the Galerkin method and the isoparametric spectral element method (SEM) is used to discretize the geometric model and PDEs on curvilinear unstructured SEM meshes. SEM orders up to the 15th were used for calculations. The software implementation of the developed SEM-based algorithm is performed using CUDA. A spectral element mesh is naturally mapped to a CUDA grid of SMs, and accordingly, each spectral element is mapped to a streaming block, within which individual nodes are processed by the corresponding threads within the block. The research for this article was performed partially at the Schmidt Institute of Physics of the Earth of the Russian Academy of Sciences and supported by the Russian Science Foundation under grant No. 19-77-10062.
