cuda architecture
Recently Published Documents



Author(s):  
Jiffriya Mohamed Abdul Cader ◽  
Roshan G. Ragel ◽  
Hasindu Gamaarachchi ◽  
Akmal Jahan Mohamed Abdul Cader

2021 ◽  
Vol 4 ◽  
pp. 16-22
Author(s):  
Mykola Semylitko ◽  
Gennadii Malaschonok

The SVD (Singular Value Decomposition) algorithm is used in recommendation systems, machine learning, image processing, and various matrix algorithms whose inputs can be very large, including Big Data workloads. Given its structure, the algorithm can be run on the large number of computing threads that only video cards provide. CUDA is a parallel computing platform and application programming interface model created by Nvidia. It allows software developers and software engineers to use a CUDA-enabled graphics processing unit for general-purpose processing, an approach termed GPGPU (general-purpose computing on graphics processing units). The GPU provides much higher instruction throughput and memory bandwidth than the CPU within a similar price and power envelope. Many applications leverage these higher capabilities to run faster on the GPU than on the CPU. Other computing devices, like FPGAs, are also very energy efficient, but they offer much less programming flexibility than GPUs. The developed modification uses the CUDA architecture, which is designed for a large number of simultaneous calculations and so allows very large matrices to be processed quickly. The parallel SVD algorithm for a tridiagonal matrix, based on Givens rotations, provides high accuracy. The algorithm also includes a number of optimizations to memory handling and to the multiplication algorithms that significantly reduce computation time by discarding empty iterations. This article proposes an approach that reduces computation time and, consequently, resources and costs. The developed algorithm can be used through a simple and convenient API in C++ and Java, and could be improved further by using dynamic parallelism or by parallelizing the multiplication operations. The results can also be used by other developers for comparison, as all conditions of the research are described in detail and the code is freely available.
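
As a rough illustration of the Givens rotation that the abstract above names as its building block, the following minimal Python sketch zeroes one subdiagonal entry of a small tridiagonal matrix. It is a serial, textbook version rather than the authors' CUDA implementation; the helper names `givens` and `apply_givens_rows` and the sample matrix are assumptions made for this example.

```python
import math


def givens(a, b):
    """Return (c, s) so that the rotation [[c, s], [-s, c]] maps (a, b) to (r, 0)."""
    r = math.hypot(a, b)
    if r == 0.0:
        # Nothing to rotate: use the identity rotation.
        return 1.0, 0.0
    return a / r, b / r


def apply_givens_rows(m, i, k, c, s):
    """Rotate rows i and k of the matrix m (a list of lists) in place."""
    for j in range(len(m[i])):
        upper, lower = m[i][j], m[k][j]
        m[i][j] = c * upper + s * lower
        m[k][j] = -s * upper + c * lower


# Hypothetical 3x3 tridiagonal matrix used only for illustration.
m = [[4.0, 1.0, 0.0],
     [3.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]

# Choose the rotation from the diagonal entry and the entry below it,
# then apply it to rows 0 and 1 so that m[1][0] becomes (numerically) zero.
c, s = givens(m[0][0], m[1][0])
apply_givens_rows(m, 0, 1, c, s)
```

Each rotation touches only two rows, and the inner loop over columns is independent per column, which is one plausible reason this kind of update maps well onto many GPU threads.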


Author(s):  
Zhou Zhang

Soft-body simulation is widely used in animation and in modeling prostheses, organs, and so on. The most common approach is to use 3D software packages. However, their simulation models and data processing speed are limited. Therefore, a model based on the mass-spring mechanism is proposed. To realize real-time rendering, parallel computing based on the CUDA architecture is introduced. In addition, to increase the accuracy of the simulation, Verlet integration is employed. The aim of the work is to check whether the massively parallel computing method based on the CUDA architecture improves rendering performance. To meet the minimum requirement for visual comfort, all tests ran at a refresh rate of at least 60 Hz. The soft body of mass particles and springs has a uniform width and depth, but a much smaller height. It was modeled to fall under the influence of gravity and then to impact a rigid object. The serial and parallel methods did not differ significantly when there were fewer than 2,000 rendering nodes, but the difference became apparent when the number of nodes reached 10,000. Therefore, the simulation efficiency of a soft body is improved by the proposed method.
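
The mass-spring model and Verlet integration mentioned in the abstract above can be sketched in a few lines. This is a one-dimensional, single-particle Python toy, not the paper's CUDA model: the spring constant, mass, rest length and the function names `verlet_step` and `spring_accel` are all assumed values and names chosen for illustration. Only the 60 Hz time step echoes the abstract.

```python
def verlet_step(x, x_prev, accel, dt):
    """Position Verlet update: x(t + dt) = 2 x(t) - x(t - dt) + a(t) dt^2."""
    return 2.0 * x - x_prev + accel * dt * dt


def spring_accel(x, rest_len, k, mass, g):
    """Acceleration of a mass hanging below a fixed anchor at 0 (x points downward)."""
    return g - (k / mass) * (x - rest_len)


# Hypothetical parameters; dt corresponds to a 60 Hz update rate.
k, mass, g, rest_len, dt = 50.0, 1.0, 9.81, 1.0, 1.0 / 60.0

x_prev = x = rest_len  # start at rest with the spring at its natural length
positions = []
for _ in range(600):  # ten simulated seconds
    a = spring_accel(x, rest_len, k, mass, g)
    x, x_prev = verlet_step(x, x_prev, a, dt), x
    positions.append(x)
```

In a full soft body, every particle performs the same update using forces from its neighbouring springs, so one thread per particle is a natural decomposition for a parallel version.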


Electronics ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 1782
Author(s):  
Aurelio López-Fernández ◽  
Domingo S. Rodríguez-Baena ◽  
Francisco Gómez-Vela

Nowadays, Biclustering is one of the most widely used machine learning techniques for discovering local patterns in datasets from areas such as energy consumption, marketing, social networks and bioinformatics, among others. Particularly in bioinformatics, Biclustering techniques have become extremely time-consuming, and the number of results they generate has become huge, due to the continuous increase in the size of databases over the last few years. For this reason, validation techniques must be adapted to this new environment in order to help researchers focus their efforts on a specific subset of results in an efficient, fast and reliable way. This situation may well be considered a Big Data context. In this setting, multiple machine learning techniques have been implemented using Graphics Processing Unit (GPU) technology and the CUDA architecture to accelerate the processing of large databases. However, as far as we know, this technology has not yet been applied to any bicluster validation technique. In this work, gMSR, a multi-GPU version of one of the most widely used bicluster validation measures, Mean Squared Residue (MSR), is presented. It takes advantage of all the hardware and memory resources offered by GPU devices. Because of this, gMSR is able to validate a massive number of biclusters in any Biclustering-based study within a Big Data context.
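
For readers unfamiliar with the measure, the sketch below computes the Mean Squared Residue of one bicluster in plain Python, assuming the usual Cheng and Church definition: the mean over the bicluster of (a_ij - rowmean_i - colmean_j + overallmean)^2. It is a serial reference version, not gMSR itself; the function name `mean_squared_residue` and the sample data are assumptions for this example.

```python
def mean_squared_residue(matrix, rows, cols):
    """MSR of the bicluster formed by the given row and column indices."""
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)

    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    total_mean = sum(row_means) / n_rows

    residue_sum = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + total_mean
            residue_sum += residue * residue
    return residue_sum / (n_rows * n_cols)


# Hypothetical data: rows 0 and 1 differ by a constant shift, row 2 does not fit.
data = [[1.0, 2.0, 3.0],
        [2.0, 3.0, 4.0],
        [5.0, 1.0, 9.0]]

perfect = mean_squared_residue(data, [0, 1], [0, 1, 2])   # additive pattern
noisy = mean_squared_residue(data, [0, 1, 2], [0, 1, 2])  # includes the outlier row
```

A lower score indicates a more coherent bicluster. Because each bicluster is scored independently, scoring many of them is an embarrassingly parallel workload, which fits the multi-GPU setting the abstract describes.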


2020 ◽  
Author(s):  
Esdras La-Roque ◽  
Cassio Batista ◽  
Josivaldo Araújo

This paper presents a parallel strategy with a heuristic approach to reduce the execution-time bottleneck of a previous work that solves a routing and wavelength assignment problem in wavelength-division multiplexing networks with a sequential genetic algorithm. For the parallelization, GPU processing on the CUDA architecture and the CUDA C programming language were adopted. The parallel version ran between 35 and 40 times faster than the sequential version of the genetic algorithm.


2020 ◽  
Vol 25 (1) ◽  
pp. 1-11
Author(s):  
Sura Alrawy ◽  
Fakhrulddin Ali

2020 ◽  
Vol 3 (1) ◽  
Author(s):  
A. Maciel ◽  
R. V. Vieira

This paper presents the adaptive filtering of cardiovascular disease signals, in which ECG signals are processed and cleaned by the Compact Genetic Algorithm Based on Abstract Data Types (CGAADT), implemented in MATLAB on a GPU/CUDA architecture and evaluated on examples from the MIT-BIH database. The results show that CGAADT can improve the filtering, cleaning, detection and diagnosis of arrhythmias using a single algorithm. It adopts a population representation with fixed-size chromosomes, pre-established by fragmentation of the GPU base when implemented in high-performance systems, with the aim of improving the health systems offered to patients with cardiovascular problems.
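
To give a sense of what a compact genetic algorithm does, here is a minimal Python sketch of the textbook version, which replaces an explicit population with a probability vector. It is not CGAADT and does no ECG filtering; the toy objective (counting ones), the function name `compact_ga` and all parameter values are assumptions made purely for illustration.

```python
import random


def compact_ga(fitness, n_bits, pop_size, max_iters, rng):
    """Textbook compact GA: evolve a probability vector instead of a population."""
    probs = [0.5] * n_bits
    step = 1.0 / pop_size  # simulated population size sets the update step
    for _ in range(max_iters):
        # Sample two candidate bit strings from the current probability vector.
        a = [1 if rng.random() < p else 0 for p in probs]
        b = [1 if rng.random() < p else 0 for p in probs]
        winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
        # Shift each probability toward the winner where the two candidates differ.
        for i in range(n_bits):
            if winner[i] != loser[i]:
                probs[i] += step if winner[i] == 1 else -step
                probs[i] = min(1.0, max(0.0, probs[i]))
        # Stop once every probability has settled at 0 or 1.
        if all(p in (0.0, 1.0) for p in probs):
            break
    return probs


# Toy run with a fixed seed: maximise the number of ones in a 16-bit string.
rng = random.Random(0)
probs = compact_ga(sum, n_bits=16, pop_size=64, max_iters=20000, rng=rng)
```

The state is a single vector of fixed length, which is consistent with the fixed-size chromosome representation the abstract mentions and keeps memory use small on a GPU.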


Cryptography ◽  
2020 ◽  
pp. 193-213
Author(s):  
Srinivasa K. G. ◽  
Siddesh G. M. ◽  
Srinidhi Hiriyannaiah ◽  
Anusha Morappanavar ◽  
Anurag Banerjee

The world of digital communication consists of various applications which use the internet as the backbone for communication. These applications hold data about their users that is confidential and whose integrity must be maintained to protect against unauthorized access and use. In the information hiding field of research, cryptography is one of the most widely used techniques for securing internet applications, addressing challenges such as confidentiality, integrity and authentication. In this paper, we present a novel approach to symmetric key cryptography using a genetic algorithm implemented on the CUDA architecture.

