graphical processing unit
Recently Published Documents



Author(s):  
João Victor Daher Daibes ◽  
Milton Brown Do Coutto Filho ◽  
Julio Cesar Stacchini de Souza ◽  
Esteban Walter Gonzalez Clua ◽  
Rainer Zanghi

2021 ◽  
Author(s):  
Wanderson Berbert ◽  
Luciano Bertini ◽  
Alessandro Copetti

Machine learning has become an essential tool for any decision-making system. Because of the performance limitations imposed by traditional architectures based on Central Processing Units (CPUs), acceleration methods using a Graphical Processing Unit (GPU) or an Application Specific Integrated Circuit (ASIC) have been employed for more critical applications. However, when applied to embedded systems, these approaches present limitations related to physical size and complexity. To solve these problems, Field Programmable Gate Array (FPGA) technology has shown promise owing to its high efficiency, true parallelism, reconfigurability, and flexibility. Accordingly, this study aims, in addition to providing an in-depth review of the literature, to present FPGA-based architectures that seek to minimize these limitations, maximizing efficiency without significant loss of performance so as to make their use in embedded systems viable. Results show performance gains above 95% when using specialized hardware developed on an FPGA running the K-Nearest Neighbor (KNN) machine learning algorithm.
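For readers unfamiliar with KNN, the following is a minimal software sketch of the classifier itself, not the FPGA architecture from the paper. The function name `knn_predict`, the Euclidean metric, and the default `k=3` are assumptions made for illustration. The distance to each training point is computed independently of the others, which is the kind of work that parallel hardware can spread across many units.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    Hypothetical helper for illustration; Euclidean distance is an assumption.
    """
    # One distance per training point; each is independent of the others.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbours' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    labels = ["a", "a", "a", "b", "b", "b"]
    print(knn_predict(points, labels, (0.2, 0.2)))
    print(knn_predict(points, labels, (5.5, 5.5)))
```

The sort dominates the cost here; a hardware design would more likely keep a small running list of the k best candidates, but that choice is outside what the abstract states.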


Author(s):  
Soumya Ranjan Nayak ◽  
S Sivakumar ◽  
Akash Kumar Bhoi ◽  
Gyoo-Soo Chae ◽  
Pradeep Kumar Mallick

The graphical processing unit (GPU) has gained popularity among researchers in the field of decision making and knowledge discovery systems. However, most earlier studies have limitations in GPU memory utilization, computational time, and accuracy. The main contribution of this paper is a novel algorithm called the Mixed Mode Database Miner (MMDBM) classifier, which applies multithreading concepts to a large number of attributes. The proposed method uses the quick sort algorithm in GPU parallel computing to overcome the limitations of the state of the art. It applies a dynamic rule generation approach to construct the decision tree from the predicted rules. Moreover, the implementation results are compared with both SLIQ and MMDBM using Java and GPU, with the acceleration ratio computed on the BP dataset. The primary objective of this work is to improve performance while reducing processing time. The results are also analyzed for various numbers of GPU threads using eight different datasets from the UCI Machine Learning Repository. The proposed MMDBM algorithm has been validated on these eight datasets, with accuracy of 91.3% on diabetes, 89.1% on breast cancer, 96.6% on iris, 89.9% on labor, 95.4% on vote, 89.5% on credit card, 78.7% on supermarket, and 78.7% on BP; it also takes less computational time on the given datasets. The outcome of this work will help the research community develop more effective multithreaded GPU solutions for GPU mining that handle large datasets in minimal processing time. Therefore, this can be considered a more reliable and precise method for GPU computing.


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-17
Author(s):  
Xiang Yu ◽  
Yu Qiao ◽  
Qingpeng Li ◽  
Gang Xu ◽  
Chuanxiong Kang ◽  
...  

Comprehensive learning particle swarm optimization (CLPSO) is a powerful metaheuristic for global optimization. This paper studies parallelizing CLPSO with the Open Computing Language (OpenCL) on the integrated Intel HD Graphics 520 (IHDG520) graphical processing unit (GPU), which has a low clock rate. We implement a coarse-grained all-GPU model that maps each particle to a separate work item. Two enhancement strategies are proposed to shorten the model’s execution time: generating random numbers on the central processor and transferring them to the GPU, and reducing the number of instructions in the kernel. This paper further investigates parallelizing deterministic optimization for implicit stochastic optimization of China’s Xiaowan Reservoir. The deterministic optimization is performed on an ensemble of 62 years of historical inflow records with monthly time steps, is solved by CLPSO, and is parallelized by a coarse-grained multipopulation model extended from the all-GPU model. The multipopulation model involves a large number of work items. Because of the capacity limit of the buffer that transfers data from the central processor to the GPU and the size of the global memory region, the random number generation strategy is modified to generate a small number of random numbers that the large number of work items can flexibly exploit. Experiments conducted on various benchmark functions and the case study demonstrate that the proposed all-GPU and multipopulation parallelization models are appropriate, and that the multipopulation model takes significantly less execution time than the corresponding sequential model.
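The particle-per-work-item mapping and the host-side random number strategy can be illustrated with a simplified sequential sketch. This is not the authors' OpenCL kernel: the names `move_particle` and `run`, the sphere objective, and the parameter values (`w`, `c`, `pc`) are assumptions, and refinements of full CLPSO such as the refreshing gap and a decreasing inertia weight are omitted. The body of `move_particle` is what a single work item would execute; the plain loop over `i` stands in for the parallel launch.

```python
import random


def sphere(x):
    """Benchmark objective assumed for illustration: sum of squares."""
    return sum(v * v for v in x)


def move_particle(i, pos, vel, pbest, pbest_val, rand, w, c, pc):
    """Work of one hypothetical work item: move particle i in place."""
    n = len(pos)
    for d in range(len(pos[i])):
        r_learn, r_a, r_b, r_vel = rand[i][d]  # pre-generated on the host
        exemplar = i  # default: learn from the particle's own best
        if r_learn < pc:
            # Tournament between two random particles' personal bests.
            a = min(int(r_a * n), n - 1)
            b = min(int(r_b * n), n - 1)
            exemplar = a if pbest_val[a] <= pbest_val[b] else b
        vel[i][d] = w * vel[i][d] + c * r_vel * (pbest[exemplar][d] - pos[i][d])
        pos[i][d] += vel[i][d]


def run(n=20, dims=5, iters=100, w=0.7, c=1.5, pc=0.3, seed=0):
    """Return (initial best value, final best value) for a seeded run."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5.0, 5.0) for _ in range(dims)] for _ in range(n)]
    vel = [[0.0] * dims for _ in range(n)]
    pbest = [p[:] for p in pos]
    pbest_val = [sphere(p) for p in pos]
    initial_best = min(pbest_val)
    for _ in range(iters):
        # Host side: generate every random number the "kernel" will consume.
        rand = [[[rng.random() for _ in range(4)] for _ in range(dims)]
                for _ in range(n)]
        # Device side (simulated): one call per particle, order-independent.
        for i in range(n):
            move_particle(i, pos, vel, pbest, pbest_val, rand, w, c, pc)
        # Update personal bests only after every particle has moved.
        for i in range(n):
            val = sphere(pos[i])
            if val < pbest_val[i]:
                pbest_val[i] = val
                pbest[i] = pos[i][:]
    return initial_best, min(pbest_val)


if __name__ == "__main__":
    print(run())
```

Separating the move phase from the personal-best update keeps each call to `move_particle` independent of the others within an iteration, which is the property a one-particle-per-work-item mapping relies on.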

