A Hybrid Branch Prediction Approach for High-Performance Processors

Recent Advances in Computer Science and Communications ◽

10.2174/2666255814666210210163616 ◽

2021 ◽

Vol 14 ◽

Author(s):

Sweety Nain ◽

Prachi Chaudhary

Keyword(s):

High Performance ◽

Clock Cycle ◽

Branch Prediction ◽

Accuracy Rate ◽

Local Branch ◽

Prediction Scheme ◽

Prediction Technique ◽

Global And Local ◽

Prediction Approach ◽

Single Branch

Background: In a parallel processor, the pipeline cannot fetch past a conditional instruction in the next clock cycle, which stalls the pipeline. Conditional instructions are a problem for the pipeline because the correct path is known only after the branch executes. To predict branches accurately, a predictor for conditional branch instructions is proposed. Method: In this paper, a single branch prediction scheme and a correlation branch prediction scheme are applied to different trace files using saturating counters. A hybrid branch prediction scheme is then proposed that uses both global and local branch information and is more accurate than the single and correlation branch prediction schemes. Results: First, the single and correlation branch prediction techniques are applied to the trace files using saturating counters. The comparison shows that correlation branch prediction improves the accuracy rate by 2.25% over simple branch prediction. The proposed hybrid scheme, which uses both global and local branch information, improves the accuracy rate by 3.68% over single branch prediction and by 1.43% over correlation branch prediction. Conclusion: The proposed hybrid branch prediction scheme gives a lower misprediction rate and a higher accuracy rate than the simple and correlation branch prediction schemes.
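The abstract names saturating counters and a hybrid of global and local branch information but does not give the design. The following is a minimal sketch of one common way such a predictor can be built, not the authors' implementation. The class names, the 10-bit table size, the XOR indexing with a global history register, and the 2-bit chooser are all assumptions made for illustration; the per-branch table stands in for the "local" information.

```python
class CounterTable:
    """Table of 2-bit saturating counters (0..3); values >= 2 predict taken."""

    def __init__(self, bits):
        self.mask = (1 << bits) - 1
        self.counters = [1] * (1 << bits)  # start at "weakly not taken"

    def predict(self, index):
        return self.counters[index & self.mask] >= 2

    def update(self, index, taken):
        i = index & self.mask
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)  # saturate at 3
        else:
            self.counters[i] = max(0, self.counters[i] - 1)  # saturate at 0


class HybridPredictor:
    """Hypothetical hybrid: a chooser picks between two component predictors."""

    def __init__(self, bits=10):
        self.bits = bits
        self.per_branch = CounterTable(bits)  # indexed by branch address only
        self.correlated = CounterTable(bits)  # indexed by address XOR global history
        self.chooser = CounterTable(bits)     # >= 2 means "trust the correlated table"
        self.history = 0                      # outcomes of the most recent branches

    def predict(self, pc):
        if self.chooser.predict(pc):
            return self.correlated.predict(pc ^ self.history)
        return self.per_branch.predict(pc)

    def update(self, pc, taken):
        g_index = pc ^ self.history
        per_branch_ok = self.per_branch.predict(pc) == taken
        correlated_ok = self.correlated.predict(g_index) == taken
        # Train the chooser only when the two components disagree.
        if per_branch_ok != correlated_ok:
            self.chooser.update(pc, correlated_ok)
        self.per_branch.update(pc, taken)
        self.correlated.update(g_index, taken)
        # Shift the newest outcome into the global history register.
        self.history = ((self.history << 1) | int(taken)) & ((1 << self.bits) - 1)


def accuracy(predictor, trace):
    """Fraction of correct predictions over a trace of (pc, taken) pairs."""
    correct = 0
    for pc, taken in trace:
        correct += predictor.predict(pc) == taken
        predictor.update(pc, taken)
    return correct / len(trace)


# Made-up trace: one branch at address 4 that alternates taken / not taken.
alternating_trace = [(4, i % 2 == 0) for i in range(1000)]
hybrid_accuracy = accuracy(HybridPredictor(), alternating_trace)
```

On this made-up alternating trace a per-branch 2-bit counter alone keeps mispredicting, while the history-indexed table learns the pattern after a short warm-up, so the chooser settles on it. Real trace files, such as those used in the paper, would replace `alternating_trace`.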

Towards the Improving Branch Instructions Identification in High-Performance Processors: Issues, Challenges and Techniques

Recent Advances in Computer Science and Communications ◽

10.2174/2666255814666210210164146 ◽

2021 ◽

Vol 14 ◽

Author(s):

Sweety Nain ◽

Prachi Chaudhary

Keyword(s):

Continuous Flow ◽

High Performance ◽

Branch Prediction ◽

Parallel Processors ◽

Processor Performance ◽

Branch Predictor ◽

Prediction Technique ◽

Execution Speed ◽

Conditional Branch

Introduction: Accurate branch prediction has become essential in superscalar and deeply pipelined processors. Conditional instructions can break the continuous flow of execution through the pipeline stages and thereby reduce processor performance. Discussion: This paper presents the concept of branch prediction, its issues and challenges, and techniques for improving processor performance. It also describes the role of branch prediction in different processors and their features. Conclusion: This paper highlights how branch prediction is used in parallel processors to speed up the execution of conditional branch instructions and improve processor performance. It also reviews branch predictor techniques and their features, and presents the challenges, issues, and future techniques related to branch prediction.

Branch prediction using both global and local branch history information

IEE Proceedings - Computers and Digital Techniques ◽

10.1049/ip-cdt:20020273 ◽

2002 ◽

Vol 149 (2) ◽

pp. 33 ◽

Cited By ~ 4

Author(s):

M.-C. Chang ◽

Y.-W. Chou

Keyword(s):

Branch Prediction ◽

Local Branch ◽

History Information ◽

Global And Local

Microprocessors KOMDIV for High Performance Embedded Systems

INFORMATION TECHNOLOGY IN INDUSTRY ◽

10.17762/itii.v7i3.71 ◽

2021 ◽

Vol 7 (3) ◽

Author(s):

S.G. Bobkov

Keyword(s):

Embedded Systems ◽

Power Consumption ◽

High Performance ◽

Clock Cycle ◽

Embedded Computing ◽

Computing Systems ◽

Processor Performance

The problems of creating high-performance embedded computing systems based on KOMDIV microprocessors are considered. Processor performance depends on three characteristics: clock cycle, clock cycles per instruction, and instruction count. For KOMDIV microprocessors, these characteristics are optimized using the performance/power consumption parameter and the requirements of embedded systems.
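The three characteristics named above combine in the standard textbook relation: execution time = instruction count × clock cycles per instruction × clock cycle time. The sketch below only illustrates that relation; the function name and the numbers are hypothetical and are not taken from the KOMDIV paper.

```python
def cpu_time_seconds(instruction_count, cycles_per_instruction, clock_cycle_seconds):
    """Execution time as the product of the three performance characteristics."""
    return instruction_count * cycles_per_instruction * clock_cycle_seconds


# Hypothetical workload: 1e9 instructions, 1.5 cycles per instruction,
# 1 ns clock cycle (a 1 GHz clock). The result is about 1.5 seconds.
example_time = cpu_time_seconds(1e9, 1.5, 1e-9)

# Halving the cycles per instruction halves the execution time.
faster_time = cpu_time_seconds(1e9, 0.75, 1e-9)
```

Reducing any one of the three factors while holding the other two fixed reduces execution time proportionally, which is why each can be treated as a separate optimization target.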

Person Reidentification Model Based on Multiattention Modules and Multiscale Residuals

Complexity ◽

10.1155/2021/6673461 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Yongyi Li ◽

Shiqi Wang ◽

Shuang Dong ◽

Xueling Lv ◽

Changzhi Lv ◽

...

Keyword(s):

Local Features ◽

Attention Mechanism ◽

Experimental Results ◽

Original Network ◽

Fine Grained ◽

Backbone Network ◽

Model Based ◽

Local Branch ◽

Feature Expression ◽

Global And Local

At present, person reidentification based on attention mechanisms has attracted the interest of many scholars. Although an attention module can improve the representation ability and reidentification accuracy of a Re-ID model to a certain extent, this depends on the coupling between the attention module and the original network. In this paper, a person reidentification model that combines multiple attentions and multiscale residuals is proposed. The model introduces a combined attention fusion module and a multiscale residual fusion module into the ResNet 50 backbone network to enhance the feature flow between residual blocks and better fuse multiscale features. Furthermore, a global branch and a local branch are designed to enhance the channel aggregation and position perception ability of the network by utilizing the dual ensemble attention module, and the fine-grained feature expression is obtained by using multiproportion block and reorganization. Thus, the global and local features are enhanced. The experimental results on the Market-1501 and DukeMTMC-reID datasets show that the metrics of the presented model, especially Rank-1 accuracy, reach 96.20% and 89.59%, respectively, which can be considered progress in Re-ID.

Low Latency Network-on-Chip Router Microarchitecture Using Request Masking Technique

International Journal of Reconfigurable Computing ◽

10.1155/2015/570836 ◽

2015 ◽

Vol 2015 ◽

pp. 1-13 ◽

Cited By ~ 14

Author(s):

Alireza Monemi ◽

Chia Yee Ooi ◽

Muhammad Nadzir Marsono

Keyword(s):

High Performance ◽

Clock Cycle ◽

Network On Chip ◽

Operating Frequency ◽

Low Latency ◽

Core System ◽

Low Area ◽

Area Overhead ◽

Logic Cells ◽

On Chip

Network-on-Chip (NoC) is fast emerging as an on-chip communication alternative for many-core System-on-Chips (SoCs). However, designing a high-performance, low-latency NoC with low area overhead has remained a challenge. In this paper, we present a two-clock-cycle latency NoC microarchitecture. An efficient request masking technique is proposed to combine virtual channel (VC) allocation with switch allocation nonspeculatively. Our proposed NoC architecture is optimized in terms of area overhead, operating frequency, and quality-of-service (QoS). We evaluate our NoC against CONNECT, an open source low-latency NoC design targeted at field-programmable gate arrays (FPGAs). The experimental results on several FPGA devices show that our NoC router outperforms CONNECT with a 50% reduction in logic cell (LC) utilization, while it works with 100% and 35%~20% higher operating frequency compared to the one- and two-clock-cycle latency CONNECT NoC routers, respectively. Moreover, the proposed NoC router achieves 2.3 times better performance compared to CONNECT.
