High-performance embedded branch predictor by combining branch direction history and global branch history

2008 ◽  
Vol 2 (2) ◽  
pp. 142 ◽  
Author(s):  
J.W. Kwak ◽  
C.S. Jhon
2014 ◽  
Vol 72 (5) ◽  
pp. 1679-1693 ◽  
Author(s):  
Cong Thuan Do ◽  
Hong Jun Choi ◽  
Dong Oh Son ◽  
Jong Myon Kim ◽  
Cheol Hong Kim

Author(s):  
Sweety Nain ◽  
Prachi Chaudhary

Introduction: Accurate branch prediction technique has become compulsory in the superscalar and deep pipeline processors. The conditional instructions can break the continuous flow of execution in the pipeline stages, thereby decreasing processor performance. Discussion: This paper highlights the concept of branch prediction, some issues and challenges, and techniques for improving processor performance. Further, this paper also presents the role of branch prediction in different processors and their features. Conclusion: The concept of the branch prediction used in parallel processors to enhance the execution speed of the conditional branch instructions and improve the processor's performance is highlighted in this paper. Further, this paper highlights the branch predictor techniques with their features and presents the challenges, issues, and future techniques related to the branch prediction.


Micromachines ◽  
2021 ◽  
Vol 12 (3) ◽  
pp. 292
Author(s):  
Wenheng Ma ◽  
Qiao Cheng ◽  
Yudi Gao ◽  
Lan Xu ◽  
Ningmei Yu

Embedded processors are widely used in various systems working on different tasks with different workloads. A more complex micro-architecture leads to better peak performance and worse power consumption. Shutting down the units designed for performance enhancement could improve energy efficiency in low-workload scenarios. In this paper, we evaluated the energy distribution in various embedded processors. According to the analysis, pipeline registers and the dynamic branch predictor, which are employed for better peak performance, have great impacts on energy efficiency. Thus, we proposed an ultra-low-power processor with variable micro-architecture. The processor is based on a 4-stage pipeline core with a Gshare branch predictor, and all units work in high-performance mode. In normal mode, the Gshare predictor is shut down and Always-Not-Taken prediction is used. In low-power mode, some of the pipeline registers are bypassed to avoid unnecessary energy dissipation and improve executing efficiency. A mode register (MR) is designed to indicate current working mode. Switching between different modes is controlled by the software. The proposed core is implemented in 40 nm technology and simulated with the traces of 17 benchmarks in Embench. The average amounts of power consumed by the respective modes are 41.7 μW, 59.7 μW and 71.1 μW. The results show that normal mode (N-mode) and low-power mode (L-mode) consume 16.08% and 41.37% less power than high-performance mode (H-mode) on average. In best case scenarios, they could save 25.36% and 49.30% more power than H-mode. Considering the execution efficiency evaluated by instructions per cycle (IPC), the proposed processor consumes 7.78% or 51.57% less energy for each instruction than the baseline core. The area of the proposed processor is only 7.19% larger than the baseline core, and only 3.08% more power is consumed in H-mode.


Sign in / Sign up

Export Citation Format

Share Document