A Hybrid Branch Prediction Approach for High-Performance Processors

Author(s):  
Sweety Nain ◽  
Prachi Chaudhary

Background: In a parallel processor, the pipeline cannot fetch the instructions that follow a conditional branch in the next clock cycle, which leads to a pipeline stall. Conditional instructions are therefore a problem for the pipeline, because the correct path is known only after the branch executes. To predict branches accurately, a predictor for conditional branch instructions is proposed. Method: In this paper, a single branch prediction scheme and a correlation branch prediction scheme are applied to different trace files using saturating counters. Further, a hybrid branch prediction scheme is proposed, which uses both global and local branch information and provides more accuracy than the single and correlation branch prediction schemes. Results: Comparing the single and correlation schemes on the trace files shows that correlation branch prediction improves the accuracy rate by 2.25% over simple branch prediction. The proposed hybrid branch prediction scheme improves the accuracy rate by 3.68% over single branch prediction and by 1.43% over correlation branch prediction. Conclusion: The proposed hybrid branch prediction scheme gives a lower misprediction rate and a higher accuracy rate than the simple branch prediction scheme and the correlation branch prediction scheme.
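The three schemes named in the abstract can be illustrated with a minimal sketch. The abstract does not give table sizes, history lengths, or the selection mechanism, so everything below is an assumption: the class names, the 10-bit tables, a gshare-style XOR index for the correlation predictor, and a tournament-style chooser for the hybrid. It is not the authors' implementation.

```python
from typing import Iterable, Tuple

Trace = Iterable[Tuple[int, bool]]  # (branch address, taken?) pairs; format is assumed


def bump(counter: int, taken: bool) -> int:
    """Move a 2-bit saturating counter toward 3 (taken) or 0 (not taken)."""
    return min(counter + 1, 3) if taken else max(counter - 1, 0)


class LocalPredictor:
    """Single scheme: one 2-bit counter per entry, indexed by branch address."""

    def __init__(self, bits: int = 10) -> None:
        self.mask = (1 << bits) - 1
        self.table = [1] * (1 << bits)  # start at "weakly not taken"

    def predict(self, pc: int) -> bool:
        return self.table[pc & self.mask] >= 2

    def update(self, pc: int, taken: bool) -> None:
        i = pc & self.mask
        self.table[i] = bump(self.table[i], taken)


class GlobalPredictor:
    """Correlation scheme (gshare-style, assumed): index = address XOR history."""

    def __init__(self, bits: int = 10) -> None:
        self.mask = (1 << bits) - 1
        self.table = [1] * (1 << bits)
        self.history = 0  # most recent branch outcomes, one bit each

    def predict(self, pc: int) -> bool:
        return self.table[(pc ^ self.history) & self.mask] >= 2

    def update(self, pc: int, taken: bool) -> None:
        i = (pc ^ self.history) & self.mask
        self.table[i] = bump(self.table[i], taken)
        self.history = ((self.history << 1) | int(taken)) & self.mask


class HybridPredictor:
    """Hybrid scheme (tournament-style, assumed): a chooser counter per
    branch selects between the local and the global component."""

    def __init__(self, bits: int = 10) -> None:
        self.local = LocalPredictor(bits)
        self.glob = GlobalPredictor(bits)
        self.mask = (1 << bits) - 1
        self.chooser = [1] * (1 << bits)  # < 2: use local, >= 2: use global

    def predict(self, pc: int) -> bool:
        if self.chooser[pc & self.mask] >= 2:
            return self.glob.predict(pc)
        return self.local.predict(pc)

    def update(self, pc: int, taken: bool) -> None:
        local_ok = self.local.predict(pc) == taken
        global_ok = self.glob.predict(pc) == taken
        if local_ok != global_ok:  # train the chooser only when they disagree
            i = pc & self.mask
            self.chooser[i] = bump(self.chooser[i], global_ok)
        self.local.update(pc, taken)
        self.glob.update(pc, taken)


def accuracy(predictor, trace: Trace) -> float:
    """Fraction of branches predicted correctly, predicting before updating."""
    correct = total = 0
    for pc, taken in trace:
        correct += predictor.predict(pc) == taken
        predictor.update(pc, taken)
        total += 1
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Synthetic trace: one branch that alternates taken / not taken.
    trace = [(0x40, i % 2 == 0) for i in range(1000)]
    for name, p in [("single", LocalPredictor()),
                    ("correlation", GlobalPredictor()),
                    ("hybrid", HybridPredictor())]:
        print(name, accuracy(p, trace))
```

On this synthetic alternating pattern, the per-branch counter keeps flipping between its two middle states and mispredicts, while the history-indexed table learns the pattern after a short warm-up and the chooser then settles on the global component. The pattern is chosen only to make the difference visible; it says nothing about the accuracy figures reported in the abstract.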

Author(s):  
Sweety Nain ◽  
Prachi Chaudhary

Introduction: Accurate branch prediction has become essential in superscalar and deeply pipelined processors. Conditional instructions can break the continuous flow of execution through the pipeline stages, thereby decreasing processor performance. Discussion: This paper reviews the concept of branch prediction, its issues and challenges, and techniques for improving processor performance. It also presents the role of branch prediction in different processors and their features. Conclusion: The paper highlights how branch prediction is used in parallel processors to speed up the execution of conditional branch instructions and improve processor performance. It also surveys branch predictor techniques with their features and presents the challenges, issues, and future techniques related to branch prediction.


2021 ◽  
Vol 7 (3) ◽  
Author(s):  
S.G. Bobkov

The problems of creating high-performance embedded computing systems based on KOMDIV microprocessors are considered. Processor performance depends on three characteristics: clock cycle, clock cycles per instruction, and instruction count. For KOMDIV microprocessors, these characteristics are optimized with respect to the performance-to-power-consumption ratio and the requirements of embedded systems.
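The three characteristics listed above combine in the standard processor performance equation. The symbols below are illustrative names, not notation taken from the paper: $T_{\text{CPU}}$ is execution time, $N_{\text{instr}}$ the instruction count, $\text{CPI}$ the average clock cycles per instruction, and $t_{\text{clk}}$ the clock cycle time.

```latex
\[
  T_{\text{CPU}} = N_{\text{instr}} \times \text{CPI} \times t_{\text{clk}}
\]
% Illustrative numbers only (not KOMDIV data):
% N_instr = 10^9, CPI = 1.5, t_clk = 1 ns
\[
  T_{\text{CPU}} = 10^{9} \times 1.5 \times 1\,\text{ns} = 1.5\,\text{s}
\]
```

Lowering any one factor shortens execution time only if the other two do not grow to compensate, which is why the three are usually tuned together.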


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Yongyi Li ◽  
Shiqi Wang ◽  
Shuang Dong ◽  
Xueling Lv ◽  
Changzhi Lv ◽  
...  

At present, person reidentification based on attention mechanisms has attracted the interest of many scholars. Although an attention module can improve the representation ability and reidentification accuracy of a Re-ID model to a certain extent, this depends on the coupling between the attention module and the original network. In this paper, a person reidentification model that combines multiple attentions and multiscale residuals is proposed. The model introduces a combined attention fusion module and a multiscale residual fusion module into the backbone network ResNet 50 to enhance the feature flow between residual blocks and better fuse multiscale features. Furthermore, a global branch and a local branch are designed and applied to enhance the channel aggregation and position perception ability of the network by utilizing the dual ensemble attention module, while the fine-grained feature expression is obtained by using multiproportion block and reorganization. Thus, the global and local features are enhanced. The experimental results on the Market-1501 and DukeMTMC-reID datasets show that the metrics of the presented model, especially Rank-1 accuracy, reach 96.20% and 89.59%, respectively, which can be considered progress in Re-ID.


2015 ◽  
Vol 2015 ◽  
pp. 1-13 ◽  
Author(s):  
Alireza Monemi ◽  
Chia Yee Ooi ◽  
Muhammad Nadzir Marsono

Network-on-Chip (NoC) is fast emerging as an on-chip communication alternative for many-core Systems-on-Chip (SoCs). However, designing a high-performance, low-latency NoC with low area overhead remains a challenge. In this paper, we present a two-clock-cycle latency NoC microarchitecture. An efficient request masking technique is proposed to combine virtual channel (VC) allocation with switch allocation nonspeculatively. Our proposed NoC architecture is optimized in terms of area overhead, operating frequency, and quality-of-service (QoS). We evaluate our NoC against CONNECT, an open-source low-latency NoC design targeted at field-programmable gate arrays (FPGAs). The experimental results on several FPGA devices show that our NoC router outperforms CONNECT with a 50% reduction in logic cell (LC) utilization, while operating at 100% and 35%~20% higher frequency than the one- and two-clock-cycle latency CONNECT NoC routers, respectively. Moreover, the proposed NoC router achieves 2.3 times better performance than CONNECT.

