acceleration methods
Recently Published Documents


Total documents: 117 (five years: 28)
H-index: 14 (five years: 1)

2021, Vol 17 (3), pp. 1-28
Author(s): Aviv Nachman, Sarai Sheinvald, Ariel Kolikant, Gala Yadgar

Deduplication decreases the physical occupancy of files in a storage volume by removing duplicate copies of data chunks, but creates data-sharing dependencies that complicate standard storage management tasks. Specifically, data migration plans must consider the dependencies between files that are remapped to new volumes and files that are not. Thus far, only greedy approaches have been suggested for constructing such plans, and it is unclear how they compare to one another and how much they can be improved. We set out to bridge this gap for seeding, that is, migration in which the target volume is initially empty. We prove that even this basic instance of data migration is NP-hard in the presence of deduplication. We then present GoSeed, a formulation of seeding as an integer linear programming (ILP) problem, and three acceleration methods for applying it to real-sized storage volumes. Our experimental evaluation shows that, while the greedy approaches perform well on “easy” problem instances, the cost of their solution can be significantly higher than that of GoSeed’s solution on “hard” instances, for which they are sometimes unable to find a solution at all.


2021, pp. 34-45
Author(s): Dominika Przewlocka-Rus, Marcin Kowalczyk, Tomasz Kryjak

2021
Author(s): Dominika Przewlocka, Marcin Kowalczyk, Tomasz Kryjak

Deep learning algorithms are a key component of many state-of-the-art vision systems, especially as Convolutional Neural Networks (CNNs) outperform most other solutions in terms of accuracy. To apply such algorithms in real-time applications, one has to address the challenges of memory and computational complexity. To deal with the first issue, we use networks with reduced precision, specifically a binary neural network (also known as XNOR). To satisfy the computational requirements, we propose to use highly parallel and low-power FPGA devices. In this work, we explore the possibility of accelerating XNOR networks for traffic sign classification. The trained binary networks are implemented, using two different approaches, on the ZCU 104 development board, which is equipped with a Zynq UltraScale+ MPSoC device. First, we propose a custom HDL accelerator for XNOR networks, which enables inference at almost 450 fps. Even better results are obtained with the second method, the Xilinx FINN accelerator, which processes input images at around 550 fps. Both approaches provide over 96% accuracy on the test set.


2021, Vol 5 (1-2), pp. 1-245
Author(s): Alexandre d’Aspremont, Damien Scieur, Adrien Taylor

2021, Vol 247, pp. 02037
Author(s): Luke Cornejo, Benjamin Collins, Shane Stimpson

Work is ongoing to improve the performance of MPACT, the deterministic neutron transport solver in the Virtual Environment for Reactor Analysis (VERA). As other parts of the code have been improved, the coarse mesh finite difference (CMFD) method has come to take up a significant portion of the runtime. Multilevel-in-energy CMFD and multilevel-in-space CMFD solvers have been used to improve CMFD solver performance. A new multilevel-in-space-and-energy CMFD solver is being introduced that combines components of these two methods. W-cycles and partial W-cycles are being investigated to further improve the efficiency of the multilevel-in-energy CMFD solver. The performance of these methods is demonstrated on full-core reactor physics problems of interest to VERA.


2021
Author(s): Alexandre d’Aspremont, Damien Scieur, Adrien Taylor
