branch predictor
Recently Published Documents


TOTAL DOCUMENTS

107
(FIVE YEARS 19)

H-INDEX

11
(FIVE YEARS 1)

2021 ◽  
Vol 36 (5) ◽  
pp. 1022-1036
Author(s):  
Lu-Tan Zhao ◽  
Rui Hou ◽  
Kai Wang ◽  
Yu-Lan Su ◽  
Pei-Nan Li ◽  
...  

Micromachines ◽  
2021 ◽  
Vol 12 (3) ◽  
pp. 292
Author(s):  
Wenheng Ma ◽  
Qiao Cheng ◽  
Yudi Gao ◽  
Lan Xu ◽  
Ningmei Yu

Embedded processors are used in a wide range of systems, running different tasks under different workloads. A more complex micro-architecture delivers better peak performance at the cost of higher power consumption. Shutting down units designed for performance enhancement can improve energy efficiency in low-workload scenarios. In this paper, we evaluated the energy distribution in various embedded processors. According to the analysis, pipeline registers and the dynamic branch predictor, both employed for better peak performance, have a large impact on energy efficiency. We therefore propose an ultra-low-power processor with a variable micro-architecture. The processor is based on a 4-stage pipeline core with a Gshare branch predictor; in high-performance mode, all units are active. In normal mode, the Gshare predictor is shut down and Always-Not-Taken prediction is used. In low-power mode, some of the pipeline registers are bypassed to avoid unnecessary energy dissipation and improve execution efficiency. A mode register (MR) indicates the current working mode, and switching between modes is controlled by software. The proposed core is implemented in 40 nm technology and simulated with the traces of 17 benchmarks in Embench. The average power consumption is 41.7 μW in low-power mode (L-mode), 59.7 μW in normal mode (N-mode) and 71.1 μW in high-performance mode (H-mode). The results show that N-mode and L-mode consume 16.08% and 41.37% less power than H-mode on average. In the best case, they consume 25.36% and 49.30% less power than H-mode. Considering the execution efficiency evaluated by instructions per cycle (IPC), the proposed processor consumes 7.78% or 51.57% less energy per instruction than the baseline core. The area of the proposed processor is only 7.19% larger than that of the baseline core, and only 3.08% more power is consumed in H-mode.
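
The two prediction schemes named in this abstract can be illustrated with a small software model. The following is a minimal sketch, not the authors' hardware: the class name, the 10-bit index width, the 2-bit saturating counters, and the `enabled` flag (a stand-in for the mode register) are all assumptions made for illustration. Gshare indexes a counter table with the branch address XORed with the global branch history; with the predictor disabled, the model falls back to Always-Not-Taken.

```python
class GsharePredictor:
    """Illustrative Gshare model with an Always-Not-Taken fallback.

    All sizes and names here are assumptions, not taken from the paper.
    """

    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        self.history = 0                          # global branch history register
        self.counters = [1] * (1 << index_bits)   # 2-bit counters, start weakly not-taken
        self.enabled = True                       # hypothetical stand-in for the mode register

    def _index(self, pc):
        # Gshare: XOR the word-aligned branch address with the global history.
        return ((pc >> 2) ^ self.history) & self.mask

    def predict(self, pc):
        if not self.enabled:
            return False                          # Always-Not-Taken
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        if not self.enabled:
            return                                # predictor is shut down; no state changes
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
        # Shift the actual outcome into the global history.
        self.history = ((self.history << 1) | int(taken)) & self.mask
```

Training the model on a branch that is always taken makes `predict` return `True` for that address once the history register has filled; setting `enabled = False` makes every prediction `False`, which mirrors the static fallback described for normal mode.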


Author(s):  
Sweety Nain ◽  
Prachi Chaudhary

Introduction: Accurate branch prediction has become essential in superscalar and deeply pipelined processors. Conditional instructions can break the continuous flow of execution through the pipeline stages, thereby decreasing processor performance. Discussion: This paper reviews the concept of branch prediction, its issues and challenges, and techniques for improving processor performance. It also presents the role of branch prediction in different processors and their features. Conclusion: The paper highlights how branch prediction is used in parallel processors to speed up the execution of conditional branch instructions and improve processor performance. It further surveys branch predictor techniques and their features, and presents the challenges, issues, and future techniques related to branch prediction.


Author(s):  
Hadeel SH. Mahmood

Instruction pipelining is one of the most prominent techniques for improving processor speed; nonetheless, pipeline stages constantly face stalls caused by nested conditional branches. During the execution of nested conditional branches, the behavior of the running branch depends on the history of the previous ones; among conditional branches, these therefore do the most to reduce the prediction accuracy of a branch predictor. The purpose of this research is to reduce the stall cycles caused by mispredicted correlated branches by introducing a hardware model of a branch predictor that combines local and global prediction techniques. This predictor integrates the prediction characteristics of the alloyed predictor with those of the correlated predictor. The predictor design, implemented in VHDL (Very High-Speed Integrated Circuit Hardware Description Language), was inserted into a previously designed MIPS (Microprocessor without Interlocked Pipeline Stages) processor, and its prediction accuracy was confirmed by executing a selection sort program that sorts 100 input numbers, in different combinations, into ascending order.
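
To make the idea of combining local and global history concrete, here is a minimal software sketch of an alloyed-style index, in which global history bits, per-branch local history bits, and branch address bits are concatenated to select a 2-bit counter. It is not the VHDL design described in the abstract: the class name, all bit widths, and the table sizes are assumptions chosen for illustration.

```python
class AlloyedPredictor:
    """Illustrative alloyed-history model: one counter table indexed by
    global history, local history, and branch address bits concatenated.

    All widths and names are assumptions, not taken from the paper.
    """

    def __init__(self, global_bits=4, local_bits=4, pc_bits=4, bht_bits=6):
        self.g_mask = (1 << global_bits) - 1
        self.l_mask = (1 << local_bits) - 1
        self.p_mask = (1 << pc_bits) - 1
        self.bht_mask = (1 << bht_bits) - 1
        self.local_bits = local_bits
        self.pc_bits = pc_bits
        self.ghr = 0                                # global history register
        self.bht = [0] * (1 << bht_bits)            # per-branch local histories
        self.pht = [1] * (1 << (global_bits + local_bits + pc_bits))

    def _index(self, pc):
        word = pc >> 2                              # word-aligned branch address
        local = self.bht[word & self.bht_mask]
        # Concatenate: [ global history | local history | address bits ]
        return ((self.ghr << (self.local_bits + self.pc_bits))
                | (local << self.pc_bits)
                | (word & self.p_mask))

    def predict(self, pc):
        return self.pht[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.pht[i] = min(3, self.pht[i] + 1)
        else:
            self.pht[i] = max(0, self.pht[i] - 1)
        slot = (pc >> 2) & self.bht_mask
        bit = int(taken)
        # Shift the outcome into both the local and the global history.
        self.bht[slot] = ((self.bht[slot] << 1) | bit) & self.l_mask
        self.ghr = ((self.ghr << 1) | bit) & self.g_mask
```

Because the history bits take part in the index, a branch that strictly alternates between taken and not taken maps its two phases to different counters, so the model learns the pattern after a short warm-up; a single 2-bit counter per branch would not.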


Author(s):  
Narasimha Adiga ◽  
James Bonanno ◽  
Adam Collura ◽  
Matthias Heizmann ◽  
Brian R. Prasky ◽  
...  
