Article

Integrating a misprediction recovery cache (MRC) into a superscalar pipeline

Authors:

James O. Bondi,

Ashwini K. Nanda,

Simonjit Dutta

MICRO 29: Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture

Pages 14 - 23

Published: 02 December 1996

Abstract

In modern processors, deep pipelines couple with superscalar techniques to allow each pipe stage to process multiple instructions. When such a pipe must be flushed and refilled, as when predicted program flow beyond a branch is subsequently recognized as wrong, the temporary performance loss is significant. While modern branch target buffer (BTB) technology makes this flush/refill penalty fairly rare, the penalty that accrues from the remaining branch mispredictions is a serious impediment to even higher processor performance. Advanced mechanisms that reduce this residual misprediction penalty can be of enormous value in future microprocessor designs. One promising new mechanism, the Misprediction Recovery Cache (MRC), was proposed previously. In this paper, we focus especially on MRC integration into existing pipelines.

Index Terms

Integrating a misprediction recovery cache (MRC) into a superscalar pipeline
1. Computer systems organization
  1. Architectures
    1. Serial architectures
  2. Embedded and cyber-physical systems
    1. Embedded systems

Published In

MICRO 29: Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture

December 1996

359 pages

ISBN:0818676418

Chairmen:
Stephen Melvin (Zytek Communications Corp.),
Steve Beaty (Hewlett-Packard Corp.)

Copyright © 1996 Institute of Electrical and Electronics Engineers, Inc. All rights reserved.

Sponsors

SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing
IEEE-CS\TCMM: TC on Microprocessors & Microcomputers

Publisher

IEEE Computer Society

United States

Conference

MICRO96

MICRO96: 29th Annual International Symposium on Microarchitecture

December 2 - 4, 1996

Paris, France

Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%

Cited By

Armstrong D, Kim H, Mutlu O, Patt Y (2004). Wrong Path Events. Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture, pp. 119-128. doi:10.1109/MICRO.2004.38. Online publication date: 4-Dec-2004
https://dl.acm.org/doi/10.1109/MICRO.2004.38
Aragón J, González J, González A, Smith J, Ebcioglu K, Pingali K, Nicolau A (2002). Dual path instruction processing. Proceedings of the 16th international conference on Supercomputing, pp. 220-229. doi:10.1145/514191.514223. Online publication date: 22-Jun-2002
https://dl.acm.org/doi/10.1145/514191.514223
Reinman G, Calder B, Austin T (2001). Optimizations Enabled by a Decoupled Front-End Architecture. IEEE Transactions on Computers 50:4, pp. 338-355. doi:10.1109/12.919279. Online publication date: 1-Apr-2001
https://dl.acm.org/doi/10.1109/12.919279
Reinman G, Austin T, Calder B (1999). A scalable front-end architecture for fast instruction delivery. ACM SIGARCH Computer Architecture News 27:2, pp. 234-245. doi:10.1145/307338.300999. Online publication date: 1-May-1999
https://dl.acm.org/doi/10.1145/307338.300999
Reinman G, Austin T, Calder B, Gottlieb A, Dally W (1999). A scalable front-end architecture for fast instruction delivery. Proceedings of the 26th annual international symposium on Computer architecture, pp. 234-245. doi:10.1145/300979.300999. Online publication date: 2-May-1999
https://dl.acm.org/doi/10.1145/300979.300999
Rotenberg E, Bennett S, Smith J (1999). A Trace Cache Microarchitecture and Evaluation. IEEE Transactions on Computers 48:2, pp. 111-120. doi:10.1109/12.752652. Online publication date: 1-Feb-1999
https://dl.acm.org/doi/10.1109/12.752652
Klauser A, Paithankar A, Grunwald D (1998). Selective eager execution on the PolyPath architecture. ACM SIGARCH Computer Architecture News 26:3, pp. 250-259. doi:10.1145/279361.279393. Online publication date: 16-Apr-1998
https://dl.acm.org/doi/10.1145/279361.279393
Klauser A, Paithankar A, Grunwald D, Valero M, Sohi G (1998). Selective eager execution on the PolyPath architecture. Proceedings of the 25th annual international symposium on Computer architecture, pp. 250-259. doi:10.1145/279358.279393. Online publication date: 16-Apr-1998
https://dl.acm.org/doi/10.1145/279358.279393
Friendly D, Patel S, Patt Y, Smotherman M, Conte T (1997). Alternative fetch and issue policies for the trace cache fetch mechanism. Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, pp. 24-33. doi:10.5555/266800.266803. Online publication date: 1-Dec-1997
https://dl.acm.org/doi/10.5555/266800.266803
