register file
Recently Published Documents

Total documents: 482 (five years: 32)
H-index: 24 (five years: 1)

2021
Author(s): Ilya Tuzov, Pablo Andreu, Laura Medina, Tomas Picornell, Antonio Robles, ...

2021
Author(s): Martin Košt'ál, Michal Sojka

Electronics, 2021, Vol. 10 (18), pp. 2286
Author(s): Yohan Ko

From early design phases to final release, the reliability of modern embedded systems against soft errors should be carefully considered. Several schemes have been proposed to protect embedded systems against soft errors, but they are not always functional or robust, and they incur expensive overheads in terms of hardware area, performance, and power consumption. Thus, system designers need to estimate reliability quantitatively in order to apply appropriate protection techniques to resource-constrained embedded systems. Vulnerability modeling based on lifetime analysis is one of the most efficient ways to quantify system reliability against soft errors. However, lifetime analysis can be inaccurate, mainly because it fails to comprehensively capture several system-level masking effects. This study analyzes and characterizes microarchitecture-level and software-level masking effects by developing an automated framework that performs exhaustive fault injections (i.e., soft errors) on top of the cycle-accurate gem5 simulator. We injected faults into the register file because errors in the register file can easily propagate to other components in a processor. We found that only 5% of injected faults cause system failures on average across benchmarks, mainly from the MiBench suite. Further analyses showed that 71% of soft errors are overwritten by write operations before being used, and another 20% are never used by the CPU after injection. The remainder are masked by several software-level masking effects, such as dynamically dead instructions, compare and logical instructions whose results do not change, and incorrect control flows that do not affect program outputs.
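To make the masking categories above concrete, the following is a minimal Python sketch of single-bit fault injection into a register-file model and the classification of outcomes (overwritten before use, never used, or propagated). It is only an illustration of the idea, not the authors' gem5-based framework; names such as inject_bit_flip and classify_outcome are hypothetical.

```python
import random

# Illustrative sketch of single-bit fault injection into a register file model.
# Not the gem5-based framework from the paper; all names are hypothetical.

NUM_REGS = 32
REG_WIDTH = 32  # bits per register

def inject_bit_flip(regfile, reg_idx, bit_idx):
    """Flip one bit of one register and return the corrupted register file."""
    corrupted = list(regfile)
    corrupted[reg_idx] ^= (1 << bit_idx)
    return corrupted

def classify_outcome(trace, reg_idx, inject_cycle):
    """Classify a fault by the first access to the faulty register after injection.

    `trace` is a list of (cycle, op, reg) tuples, with op being 'read' or 'write'.
    A write before any read masks the fault; no access at all means it is unused.
    """
    for cycle, op, reg in trace:
        if cycle <= inject_cycle or reg != reg_idx:
            continue
        if op == 'write':
            return 'masked_by_overwrite'
        if op == 'read':
            return 'propagated'   # may still be masked at the software level
    return 'unused'

# Example: a fault injected at cycle 10 into r5, which is overwritten at cycle 12.
trace = [(12, 'write', 5), (20, 'read', 5)]
regfile = [0] * NUM_REGS
faulty = inject_bit_flip(regfile, reg_idx=5, bit_idx=random.randrange(REG_WIDTH))
print(classify_outcome(trace, reg_idx=5, inject_cycle=10))  # masked_by_overwrite
```

A full campaign would repeat this for every register, bit position, and injection cycle, which is the exhaustive sweep the abstract describes.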


2021
Author(s): Ayazulla Khan Patan, Dimitrios Stathis, Pudi Dhilleswararao, Yu Yang, Srinivas Boppu, ...

2021, Vol. 18 (3), pp. 1-22
Author(s): Ricardo Alves, Stefanos Kaxiras, David Black-Schaffer

Achieving low load-to-use latency with low energy and storage overheads is critical for performance. Existing techniques either prefetch into the pipeline (via address prediction and validation) or provide data reuse in the pipeline (via register sharing or L0 caches). These techniques provide a range of tradeoffs between latency, reuse, and overhead. In this work, we present a pipeline prefetching technique that achieves state-of-the-art performance and data reuse without additional data storage, data movement, or validation overheads by adding address tags to the register file. Our addition of register file tags allows us to forward (reuse) load data from the register file with no additional data movement, keep the data alive in the register file beyond the instruction’s lifetime to increase temporal reuse, and coalesce prefetch requests to achieve spatial reuse. Further, we show that we can use the existing memory order violation detection hardware to validate prefetches and data forwards without additional overhead. Our design achieves the performance of existing pipeline prefetching while also forwarding 32% of the loads from the register file (compared to 15% in state-of-the-art register sharing), delivering a 16% reduction in L1 dynamic energy (1.6% total processor energy), with an area overhead of less than 0.5%.
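As a rough illustration of the core mechanism, the sketch below models a register file whose entries carry address tags, so a later load to the same address can be forwarded from the register file instead of accessing the L1 cache. This is a simplified functional model under assumed names (TaggedRegisterFile, l1_read), not the paper's microarchitecture; it omits prefetch validation, store invalidation, and request coalescing.

```python
# Sketch of forwarding load data from an address-tagged register file.
# Simplified functional model; names and structure are assumptions.

class TaggedRegisterFile:
    def __init__(self, num_regs):
        self.values = [0] * num_regs
        self.addr_tags = {}  # load address -> physical register holding that data

    def write_load(self, reg, addr, value):
        """Record a completed load: store the value and tag the register with its address."""
        self.values[reg] = value
        # Drop any stale tag that pointed at this register before retagging it.
        self.addr_tags = {a: r for a, r in self.addr_tags.items() if r != reg}
        self.addr_tags[addr] = reg

    def try_forward(self, addr):
        """Return (hit, value): forward from the register file if an address tag matches."""
        reg = self.addr_tags.get(addr)
        if reg is None:
            return False, None
        return True, self.values[reg]

def load(rf, addr, l1_read):
    hit, value = rf.try_forward(addr)
    if hit:
        return value          # reuse data already in the register file, no L1 access
    return l1_read(addr)      # miss: fall back to the L1 cache / prefetch path

# Usage: the second load to 0x1000 is served from the register file.
rf = TaggedRegisterFile(num_regs=32)
rf.write_load(reg=3, addr=0x1000, value=42)
print(load(rf, 0x1000, l1_read=lambda a: 0))  # 42, forwarded without touching L1
```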


2021, pp. 105076
Author(s): A. Mohammaden, M.E. Fouda, Ihsen Alouani, Lobna A. Said, Ahmed G. Radwan

Author(s): Yara M. Abdelaal, M. Fayez, Samy Ghoniemy, Ehab Abozinadah, H. M. Faheem

Face detection algorithms vary in speed and performance on GPUs. Different algorithms can report different speeds on different GPUs, and these differences are not governed by linear or near-linear approximations. This is due to many factors, such as register file size, GPU occupancy rate, memory speed, and double-precision throughput. This paper studies the most common face detection algorithms, LBP and Haar-like features, and examines the bottlenecks associated with deploying both algorithms on different GPU architectures. The study focuses on these bottlenecks and the techniques to resolve them based on the specifications of the different GPUs.
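To make the occupancy factor mentioned above concrete, the following is a small Python sketch of the standard occupancy arithmetic: per-thread register usage bounds how many warps can be resident on a streaming multiprocessor at once. The SM parameters used here (register file size, warp limits) are illustrative assumptions, not figures from the paper.

```python
# Rough occupancy arithmetic: per-thread register usage limits resident warps per SM.
# The SM parameters below are illustrative assumptions.

def occupancy(regs_per_thread,
              threads_per_block=256,
              regfile_per_sm=65536,   # 32-bit registers per SM (assumed)
              max_warps_per_sm=64,
              warp_size=32):
    warps_per_block = threads_per_block // warp_size
    regs_per_block = regs_per_thread * threads_per_block
    # How many blocks fit in the register file, and how many warps that yields.
    blocks_by_regs = regfile_per_sm // regs_per_block
    resident_warps = min(blocks_by_regs * warps_per_block, max_warps_per_sm)
    return resident_warps / max_warps_per_sm

# A kernel using 32 registers/thread vs. one using 96 registers/thread:
print(occupancy(32))  # 1.0  -> the register file is not the limiter
print(occupancy(96))  # 0.25 -> register pressure caps the resident warps
```

Lower occupancy reduces the GPU's ability to hide memory latency, which is one way a register-hungry detection kernel can end up memory-bound on one architecture and compute-bound on another.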

