GB2281422A - Processor ordering consistency for a processor performing out-of-order instruction execution - Google Patents
- Publication number
- GB2281422A, GB9408016A
- Authority
- GB
- United Kingdom
- Prior art keywords
- load
- memory
- physical
- circuit
- instruction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3824—Operand accessing
- G06F9/3834—Maintaining memory consistency
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Advance Control (AREA)
- Multi Processors (AREA)
Description
PROCESSOR ORDERING CONSISTENCY FOR A PROCESSOR PERFORMING
OUT-OF-ORDER INSTRUCTION EXECUTION
BACKGROUND OF THE INVENTION
1. FIELD OF THE INVENTION
The present invention pertains to the field of computer systems. More particularly, this invention relates to maintaining processor ordering consistency for a processor employing out-of-order instruction execution in a multiprocessor computer system.
2. BACKGROUND
Inter-processor communication in a typical multiprocessor computer system is modeled as information transfer between one or more producer processors and one or more consumer processors. In a typical multiprocessor computer system, the producer processor transfers information to the consumer processors via messages stored in a shared memory subsystem.
Each processor in such a multiprocessor system usually must conform to a common processor ordering model to ensure consistent information flow to the consumer processors through the memory subsystem. A processor ordering model requires that each consumer processor observe stores from the producer processor in the same order.
For example, in a common inter-processor communication transaction, the producer processor writes message data to the memory subsystem, and then sets a message flag in the memory subsystem to indicate valid message data. Each consumer processor reads the message flag, and then reads the message data if the message flag indicates valid message data.
A processor ordering model that requires each consumer processor to observe stores from the producer processor in the same order ensures that each consumer processor observes the message data store before the message flag store. Such a processor ordering model causes each consumer processor to read valid message data if the message data store occurs before the message flag store.
Typical prior processors in a multiprocessor system implement in-order instruction execution pipelines. Such an in-order processor usually fetches an instruction stream, executes the instruction stream in a sequential program order, and dispatches loads and stores in the sequential program order. Such in-order processing of the instruction stream ensures that each consumer processor in the multiprocessor system observes stores from the producer processor in the same order because each consumer processor executes load instructions to read the message flag and the message data in the same order.
A processor may implement an out-of-order instruction execution pipeline to improve instruction execution performance. Such an out-of-order processor fetches an instruction stream and executes ready instructions in the instruction stream ahead of earlier instructions that are not ready. A ready instruction is typically an instruction having fully assembled source data and having available execution resources.
Such out-of-order execution improves processor performance because the instruction execution pipeline of the processor does not stall while assembling source data for a non-ready instruction. For example, a non-ready instruction awaiting source data from a floating-point operation does not stall the execution of later instructions in the instruction stream that are ready to execute.
A processor that implements an out-of-order instruction execution pipeline generates out-of-order result data because the instructions in the instruction stream are executed out of order. An out-of-order processor may implement a reorder register file to impose the original program order on the result data after instruction execution.
Out-of-order instruction execution by processors in a multiprocessor system may cause violations of the processor ordering model. The consumer processors that execute load instructions out of order may observe stores from the producer processor in differing order.
For example, a consumer processor that executes a load instruction for the message flag before a load instruction for the message data effectively observes the producer processor stores to the message data and the message flag in a different order than a consumer processor that executes a load instruction for the message data before a load instruction for the message flag.
Such a violation of the processor ordering model may cause the consumer processors to read differing message data. One of the consumer processors may load the message data before the producer processor stores the message data, and may load the message flag after the producer processor stores the message flag. In such a case, the consumer processor loads invalid message data and loads a message flag indicating valid message data. As a consequence, the consumer processor erroneously processes invalid message data.
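The failure case described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the shared memory is modeled as a Python dictionary, and the names `run_consumer`, `load_order` and `store_point` are hypothetical and do not appear in the patent. The producer stores the message data and then the message flag; the consumer issues its two loads in the given order.

```python
def run_consumer(load_order, store_point):
    """Simulate one consumer whose two loads execute in `load_order`.

    `store_point` is the number of consumer loads that execute before the
    producer's two stores (data first, then flag) become visible.
    All names here are illustrative assumptions.
    """
    memory = {"data": "stale", "flag": 0}
    observed = {}
    for i, addr in enumerate(load_order):
        if i == store_point:
            # Producer stores in program order: data, then flag.
            memory["data"] = "message"
            memory["flag"] = 1
        observed[addr] = memory[addr]
    return observed["flag"], observed["data"]


# Program order: load the flag, then the data.
in_order = run_consumer(["flag", "data"], store_point=1)
# Out-of-order execution: the data load runs ahead of the flag load.
out_of_order = run_consumer(["data", "flag"], store_point=1)
```

With the producer's stores landing between the two loads, the in-order consumer sees a clear flag and ignores the data, while the out-of-order consumer sees a set flag paired with stale data, which is the invalid combination described above.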
SUMMARY AND OBJECTS OF THE INVENTION
One object of the present invention is to maintain processor ordering in a multiprocessor computer system for a processor having an out of order instruction execution pipeline.
Another object of the present invention is to maintain processor ordering for a processor having an out of order instruction execution pipeline, wherein each consumer processor in the multiprocessor computer system is required to observe memory stores from a producer processor in the same order.
A further object of the present invention is to maintain processor ordering in a multiprocessor computer system for a processor having an out of order instruction execution pipeline by detecting external memory store operations targeted for a memory address corresponding to an executed and unretired load memory instruction.
These and other objects of the invention are provided by a method for processor ordering in a multiprocessor computer system. A processor having an out-of-order instruction execution pipeline fetches an instruction stream from an external memory in a sequential program order. The instruction stream includes load memory instructions, wherein each load memory instruction specifies a load memory operation from a memory address over a multiprocessor bus of the multiprocessor computer system.
The processor assembles at least one source data value for each load memory instruction, such that the source data value specifies the memory address for the corresponding load memory instruction. The processor executes each load memory instruction after the corresponding source data value is assembled regardless of the sequential program order of the load memory instruction. Each executed load memory instruction generates a result data value.
The processor snoops the multiprocessor bus for an external store operation to the memory address of each executed load memory instruction.
The processor commits the result data value of each executed load memory instruction to an architectural state in the sequential program order if the external store operation to the memory address of the executed load memory instruction is not detected.
The processor discards the result data value of each executed load memory instruction if the external store operation to the memory address of the executed load memory instruction is detected before the result data value is committed to the architectural state. The processor then reexecutes the instruction stream starting at the load memory instruction corresponding to the discarded result data value.
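The snoop, commit and discard steps summarized above might be sketched as follows. This is an illustrative model, not the circuitry of the invention: the `ExecutedLoad` record and the `snoop` and `retire` functions are hypothetical names, and every load in the list is assumed to be executed but not yet retired.

```python
from dataclasses import dataclass


@dataclass
class ExecutedLoad:
    seq: int              # position in sequential program order
    address: int          # memory address that was loaded
    result: int           # speculative result data value
    snoop_hit: bool = False


def snoop(loads, store_address):
    """Mark every executed, unretired load whose address matches an external store."""
    for load in loads:
        if load.address == store_address:
            load.snoop_hit = True


def retire(loads):
    """Commit results in program order until a snooped load is reached.

    Returns (committed, restart_seq); restart_seq is None when no load was hit,
    otherwise it is the program position at which reexecution would begin.
    """
    committed = []
    for load in sorted(loads, key=lambda l: l.seq):
        if load.snoop_hit:
            return committed, load.seq
        committed.append((load.seq, load.result))
    return committed, None
```

For example, with three executed loads at addresses 0x100, 0x200 and 0x300, an external store to 0x200 lets the first load commit and restarts execution at the second.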
Other objects, features and advantages of the present invention will be apparent from the accompanying drawings, and from the detailed description that follows below.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings in which like references indicate similar elements, and in which:
Figure 1 illustrates a multiprocessor computer system comprising a set of processors and a memory subsystem;
Figure 2 is a block diagram of a processor in the multiprocessor computer system;
Figure 3 illustrates the functions of the register alias circuit which converts the logical micro-ops into corresponding physical micro-ops by mapping the logical sources and destinations into physical sources and destinations;
Figure 4 illustrates the reorder circuit which contains a reorder buffer comprising a set of ROB entries (RE0 through REn) that buffer speculative result data from the out-of-order speculative execution of physical micro-ops;
Figure 5 illustrates the reservation and dispatch circuit which contains a reservation dispatch table comprising a set of reservation station entries RS0 through RSx for assembling and dispatching micro-ops;
Figure 6 illustrates the real register circuit which contains a set of committed state registers that buffer committed result data values;
Figure 7 illustrates a load memory circuit which comprises an address generation circuit, a memory ordering circuit, a data translate look-aside buffer (DTLB) circuit, and a data cache circuit;
Figure 8 illustrates the memory ordering circuit which contains a load buffer comprising a set of load buffer entries LB0 through LBn;
Figure 9 illustrates the snoop detection circuitry in the memory ordering circuit which includes a snoop detect circuit corresponding to each load buffer entry LB0-LBn;
Figure 10 illustrates notification circuitry in the memory ordering circuit that generates the memory ordering restart signals;
Figure 11 illustrates processing of a load micro-op ld 0x100, EBX, EAX issued by the instruction fetch and micro-op issue circuit;
Figure 12 illustrates the dispatch and retirement of the linear load memory micro-op ld 32100, 42, lbid=4 corresponding to the load micro-op ld 0x100, EBX, EAX.
DETAILED DESCRIPTION
Figure 1 illustrates a multiprocessor computer system 20. The multiprocessor computer system 20 comprises a set of processors 22-24, and a memory subsystem 26. The processors 22-24 and the memory subsystem 26 communicate over a multiprocessor bus 28.
Each processor 22-24 fetches a stream of macro instructions from the memory subsystem 26 over the multiprocessor bus 28. Each processor 22-24 executes the corresponding stream of macro instructions and maintains data storage in the memory subsystem 26.
Figure 2 illustrates the processor 22. The processor 22 comprises a front-end section including a bus interface circuit 30 and an instruction fetch and micro-op issue circuit 32. The processor 22 also comprises a register renaming section including a register alias circuit 34 and an allocator circuit 36. The processor 22 also comprises an out-of-order execution section comprising a reservation and dispatch circuit 38, an execution circuit 40, a reorder circuit 42, and a real register circuit 44.
The bus interface circuit 30 enables transfer of address, data and control information over the multiprocessor bus 28. The instruction fetch and micro-op issue circuit 32 fetches a stream of macro instructions from the memory subsystem 26 over the multiprocessor bus 28 through the bus interface circuit 30. The instruction fetch and micro-op issue circuit 32 implements speculative branch prediction to maximize macro-instruction fetch throughput.
For one embodiment, the stream of macro instructions fetched over the multiprocessor bus 28 comprises a stream of Intel Architecture Microprocessor macro instructions. The Intel Architecture Microprocessor macro instructions operate on a set of architectural registers, including an EAX register, an EBX register, an ECX register, and an EDX register, etc.
The instruction fetch and micro-op issue circuit 32 converts the macro instructions of the incoming stream into an in-order stream of logical micro operations, hereinafter referred to as logical micro-ops. The instruction fetch and micro-op issue circuit 32 generates one or more logical micro-ops for each incoming macro instruction. The logical micro-ops corresponding to each macro instruction are reduced instruction set micro operations that perform the function of the corresponding macro instruction. The logical micro-ops specify arithmetic and logical operations as well as load and store operations to the memory subsystem 26.
The instruction fetch and micro-op issue circuit 32 transfers the in-order stream of logical micro-ops to the register alias circuit 34 and the allocator circuit 36 over a logical micro-op bus 50. For one embodiment, the instruction fetch and micro-op issue circuit 32 issues up to four in-order logical micro-ops during each clock cycle of the processor 22. Alternatively, the in-order logical micro-ops may be limited to four during each clock cycle to minimize integrated circuit die area for the processor 22.
The instruction fetch and micro-op issue circuit 32 contains a micro instruction sequencer and an associated control store. The micro instruction sequencer implements micro programs for performing a variety of functions for the processor 22, including fault recovery functions and processor ordering functions.
Each logical micro-op generated by the instruction fetch and micro-op issue circuit 32 comprises an opcode, a pair of logical sources and a logical destination. Each logical source may specify a register or provide an immediate data value. The register logical sources and the logical destinations of the logical micro-ops specify architectural registers of the original macro instructions. In addition, the register logical sources and the logical destinations of the logical micro-ops specify temporary registers for microcode implemented by the micro instruction sequencer of the instruction fetch and micro-op issue circuit 32.
The register alias circuit 34 receives the in-order logical micro-ops over the logical micro-op bus 50, and generates a corresponding set of in-order physical micro-ops by renaming the logical sources and logical destinations of the logical micro-ops. To do so, the register alias circuit 34 maps the logical sources and the logical destination of each logical micro-op into physical sources and a physical destination, and transfers the in-order physical micro-ops over a physical micro-op bus 52.
Each physical micro-op comprises the opcode of the corresponding logical micro-op, a pair of physical sources, and a physical destination. Each physical source may specify a physical register or provide an immediate data value. The register physical sources of the physical micro-ops specify physical registers contained in the reorder circuit 42 and committed state registers contained in the real register circuit 44. The physical destinations of the physical micro-ops specify physical registers contained in the reorder circuit 42.
The register alias circuit 34 transfers the logical destinations of the logical micro-ops over a logical destination bus 54. The logical destinations transferred over the logical destination bus 54 identify the architectural registers that correspond to the physical destinations on the physical micro-op bus 52.
The allocator circuit 36 tracks the available resources in the reorder circuit 42, the reservation and dispatch circuit 38, and the execution circuit 40.
The allocator circuit 36 assigns physical destinations in the reorder circuit 42 and reservation station entries in the reservation and dispatch circuit 38 to the physical micro-ops on the physical micro-op bus 52. The allocator circuit 36 also assigns load buffer entries in a memory ordering buffer in the execution circuit 40 to the physical micro-ops on the physical micro-op bus 52 that have an opcode specifying a load memory operation.
The allocator circuit 36 transfers allocated physical destinations to the register alias circuit 34 over a physical destination bus 56. The allocated physical destinations specify physical registers in the reorder circuit 42 for buffering speculative results for the physical micro-ops. The allocated physical destinations are used by the register alias circuit 34 to rename the logical destinations of the logical micro-ops to physical destinations.
The allocator circuit 36 allocates the physical registers of the reorder circuit 42 to the physical micro-ops in the same order that logical micro-ops are received over the logical micro-op bus 50. The allocator circuit 36 maintains an allocation pointer for allocating physical registers of the reorder circuit 42. The allocation pointer points to a next set of consecutive physical registers in the reorder circuit 42 for each set of logical micro-ops received over the logical micro-op bus 50. The ordering of the physical registers assigned to the physical micro-ops in the reorder circuit 42 reflects the ordering of the original logical micro-ops.
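The in-order allocation described above can be modeled with a single advancing pointer. The sketch below is an assumption-laden illustration: the `Allocator` class is hypothetical, the reorder buffer is assumed to wrap around as a circular buffer, and the check for free entries (which the allocator circuit would need) is omitted for brevity.

```python
class Allocator:
    """Hypothetical model of an allocation pointer over a circular reorder buffer."""

    def __init__(self, rob_size):
        self.rob_size = rob_size
        self.alloc_ptr = 0  # next physical register to hand out

    def allocate(self, count):
        """Return `count` consecutive physical destinations in issue order."""
        pdsts = [(self.alloc_ptr + i) % self.rob_size for i in range(count)]
        self.alloc_ptr = (self.alloc_ptr + count) % self.rob_size
        return pdsts
```

Because each group of micro-ops receives the next consecutive entries, the index order of the allocated physical registers mirrors the program order of the logical micro-ops, which is what later allows retirement to proceed in order.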
The allocator circuit 36 specifies the reservation station entries for the physical micro-ops on the physical micro-op bus 52 by transferring reservation station entry select signals to the reservation and dispatch circuit 38 over a reservation station select bus 66.
The allocator circuit 36 assigns a load buffer entry to each physical micro-op on the physical micro-op bus 52 that specifies a load memory opcode. The allocator circuit 36 assigns the load buffer entries by transferring load buffer identifiers to the reservation and dispatch circuit 38 over a load buffer ID bus 72.
The reservation and dispatch circuit 38 holds the physical micro-ops awaiting execution by the execution circuit 40. The reservation and dispatch circuit 38 receives the in-order physical micro-ops over the physical micro-op bus 52, assembles the source data for the physical micro-ops, and dispatches the physical micro-ops to the execution circuit 40.
The reservation and dispatch circuit 38 receives the physical micro-ops over the physical micro-op bus 52 and stores the physical micro-ops in available reservation station entries. The reservation and dispatch circuit 38 assembles source data for the physical micro-ops, and dispatches the physical micro-ops to appropriate execution units in the execution circuit 40 when the source data is assembled.
The reservation and dispatch circuit 38 receives the source data for the pending physical micro-ops from the reorder circuit 42 and the real register circuit 44 over a source data bus 58. The reservation and dispatch circuit 38 also receives source data for the pending physical micro-ops from the execution circuit 40 over a result bus 62 during a write back of speculative results from the execution circuit 40 to the reorder circuit 42.
The reservation and dispatch circuit 38 schedules the physical micro-ops having completely assembled source data for execution. The reservation and dispatch circuit 38 dispatches the ready physical micro-ops to the execution circuit 40 over a micro-op dispatch bus 60. The reservation and dispatch circuit 38 schedules execution of physical micro-ops out of order according to the availability of the source data for the physical micro-ops, and according to the availability of execution unit resources in the execution circuit 40.
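The readiness rule described above (dispatch when both sources are assembled and an execution unit is free) might look like the following. The `RSEntry` record and `dispatch_ready` function are hypothetical, and picking ready entries in table order is an assumed policy that the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class RSEntry:
    name: str
    src1_valid: bool  # source 1 data has been assembled
    src2_valid: bool  # source 2 data has been assembled


def dispatch_ready(entries, free_units):
    """Remove and return the names of ready micro-ops, limited by free execution units."""
    ready = [e for e in entries if e.src1_valid and e.src2_valid]
    dispatched = ready[:free_units]
    for e in dispatched:
        entries.remove(e)
    return [e.name for e in dispatched]
```

A micro-op that is still waiting for a source, such as one depending on a floating-point result, stays in its reservation station entry while later ready micro-ops are dispatched past it.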
The execution circuit 40 writes back the speculative results from the out-of-order execution of the physical micro-ops to the reorder circuit 42 over the result bus 62. The write back of speculative results by the execution circuit 40 is out of order due to the out-of-order dispatching of physical micro-ops by the reservation and dispatch circuit 38 and the differing number of processor 22 cycles required for execution of the differing types of physical micro-ops.
For one embodiment, the execution circuit 40 comprises a set of five execution units EU0-EU4. The reservation and dispatch circuit 38 dispatches up to five physical micro-ops concurrently to the execution units EU0-EU4 over the micro-op dispatch bus 60.
The execution unit EU0 performs arithmetic logic unit (ALU) functions including integer multiply and divide as well as floating-point add, subtract, multiply and divide micro-ops. The execution unit EU1 performs ALU integer functions and jump operations. The execution unit EU2 performs integer and floating-point load operations from memory as well as load linear address functions and segment register operations. The execution unit EU3 performs integer and floating-point store and segmentation register operations. The execution unit EU4 performs integer and floating-point store data operations.
The reorder circuit 42 contains the physical registers that buffer speculative results for the physical micro-ops. Each physical register in the reorder circuit 42 accommodates integer data values and floating-point data values.
The real register circuit 44 contains committed state registers that correspond to the architectural registers of the original stream of macro instructions. Each committed state register in the real register circuit 44 accommodates integer data values and floating-point data values.
For one embodiment, the committed state registers of the real register circuit 44 comprise the EAX, EBX, ECX, and EDX registers, etc. of the Intel Architecture Microprocessor, as well as architectural flags for the Intel Architecture Microprocessor. The real register circuit 44 also contains committed state registers for the microcode registers used by microcode executing in the instruction fetch and micro-op issue circuit 32.
The reorder circuit 42 and the real register circuit 44 receive the physical micro-ops over the physical micro-op bus 52. The physical sources of the physical micro-ops specify physical registers in the reorder circuit 42 and committed state registers in the real register file 44 that hold the source data for the physical micro-ops.
The reorder circuit 42 and the real register circuit 44 read the source data specified by the physical sources, and transfer the source data to the reservation and dispatch circuit 38 over a source data bus 58. Each physical source of the physical micro-ops includes a real register file valid (RRFV) flag that indicates whether the source data is contained in a physical register in the reorder circuit 42 or a committed state register in the real register file 44.
The physical destinations of the physical micro-ops on the physical micro-op bus 52 specify physical registers in the reorder circuit 42 for buffering the speculative results of the out of order execution of the physical micro-ops.
The reorder circuit 42 receives the physical destinations of the physical micro-ops over the physical micro-op bus 52, and clears the physical registers specified by the physical destinations.
The reorder circuit 42 receives the logical destinations corresponding to the physical micro-ops over the logical destination bus 54, and stores the logical destinations into the physical registers specified by the physical destinations of the physical micro-ops. The logical destinations in the physical registers of the reorder circuit 42 specify committed state registers in the real register circuit 44 for retirement of the physical micro-ops.
A retire logic circuit 46 imposes order on the physical micro-ops by committing the speculative results held in the physical registers of the reorder circuit 42 to an architectural state in the same order as the original logical micro-ops were received. The retire logic circuit 46 causes transfer of the speculative result data in the reorder circuit 42 to corresponding committed state registers in the real register circuit 44 over a retirement bus 64. For one embodiment, the retire logic circuit 46 retires up to four physical registers during each cycle of the processor 22. For another embodiment, the retire logic circuit 46 retires up to three physical registers during each cycle of the processor 22 to minimize integrated circuit die space.
The retire logic circuit 46 also causes the reorder circuit 42 to transfer the macro instruction pointer delta values for the retiring physical micro-ops over a macro instruction pointer offset bus 120 during retirement.
The restart circuit 48 receives macro instruction pointer delta values over the macro instruction pointer offset bus 120. The restart circuit 48 calculates a committed instruction pointer value according to the macro instruction pointer deltas for the retiring ROB entries.
The retire logic circuit 46 maintains a retirement pointer to the physical registers in the reorder circuit 42. The retirement pointer points to sets of consecutive physical registers for retirement. The retirement pointer follows the allocation pointer through the physical registers in the reorder circuit 42 as the retire logic retires the speculative results of the physical registers to the committed state. The retire logic circuit 46 retires the physical registers in order because the physical registers were allocated to the physical micro-ops in order.
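The way the retirement pointer trails the allocation pointer can be sketched as below. The function `retire_in_order` and its parameters are hypothetical. Two assumptions are made that the text does not state outright: the reorder buffer wraps around circularly, and retirement in a cycle stops at the first entry whose result is not yet valid.

```python
def retire_in_order(rob_valid, retire_ptr, alloc_ptr, width=4):
    """Retire up to `width` consecutive entries starting at `retire_ptr`.

    rob_valid[i] is True once entry i holds a written-back speculative result.
    Returns (retired_indices, new_retire_ptr).
    """
    retired = []
    size = len(rob_valid)
    while (len(retired) < width
           and retire_ptr != alloc_ptr      # never pass the allocation pointer
           and rob_valid[retire_ptr]):      # stop at the oldest unfinished entry
        retired.append(retire_ptr)
        retire_ptr = (retire_ptr + 1) % size
    return retired, retire_ptr
```

In this model an entry that finished early (for example entry 3 below an unfinished entry 2) still waits, so results reach the committed state strictly in allocation order.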
The retire logic circuit 46 broadcasts the retirement physical destinations specified by the retirement pointer over a retire notification bus 70. The memory ordering buffer in the execution circuit 40 receives the retirement physical destinations, and issues a set of memory ordering restart signals 76. The memory ordering restart signals 76 indicate whether a memory load operation corresponding to one of the retiring physical destinations has caused a possible processor ordering violation. The memory ordering restart signals 76 indicate which of the retiring physical destinations has caused the possible processor ordering violation.
The memory ordering restart signals 76 are received by the restart circuit 48. If the memory ordering restart signals 76 indicate a possible processor ordering violation, the restart circuit 48 issues a reorder clear signal 78. The reorder clear signal 78 causes the reorder circuit 42 to clear the speculative result data for the unretired physical micro-ops. The reorder clear signal 78 causes the reservation and dispatch circuit 38 to clear the pending physical micro-ops that await dispatch to the execution circuit 40. The reorder clear signal 78 also causes the allocator circuit 36 to reset the allocation pointer for allocating the physical registers in the reorder circuit 42, and causes the retire logic circuit 46 to reset the retirement pointer for retiring the physical registers.
If the memory ordering restart signals 76 indicate a possible processor ordering violation, the restart circuit 48 uses the macro instruction pointer delta values received over the macro instruction pointer offset bus 120 to calculate a restart instruction pointer value. The restart instruction pointer value specifies the macro instruction corresponding to the physical micro-op that caused the possible memory ordering violation. The restart circuit 48 transfers the restart instruction pointer value to the instruction fetch and micro-op issue circuit 32 over a restart vector bus 122.
The instruction fetch and micro-op issue circuit 32 receives the restart instruction pointer value over the restart vector bus 122. The reorder clear signal 78 causes the micro-instruction sequencer of the instruction fetch and micro-op issue circuit 32 to reissue the in-order stream of logical micro-ops that were cleared from the reorder circuit 42 before retirement. The instruction fetch and micro-op issue circuit 32 reissues the logical micro-ops by fetching a macro instruction stream starting at the macro instruction address specified by the restart instruction pointer value, by converting the macro instruction stream into logical micro-ops, and by transferring the logical micro-ops over the logical micro-op bus 50.
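One plausible reading of the restart instruction pointer calculation is sketched below. The exact arithmetic is not given in this passage, so the following is an assumption: the restart pointer is taken to be the committed instruction pointer advanced by the deltas of the micro-ops that retire ahead of the violating one. The function name `restart_ip` and its parameters are hypothetical.

```python
def restart_ip(committed_ip, retiring_deltas, violating_index):
    """Return the address at which fetch would resume after a possible violation.

    retiring_deltas[i] is the macro instruction pointer delta carried by the
    i-th retiring micro-op; entries before `violating_index` are assumed to
    retire normally and therefore advance the committed instruction pointer.
    """
    return committed_ip + sum(retiring_deltas[:violating_index])
```

Under this reading, the violating load and everything after it are cleared and refetched, while older micro-ops in the same retirement group still commit.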
If the memory ordering restart signals 76 do not indicate a possible processor ordering violation, then the retirement of the physical registers specified by the retiring physical destinations proceeds. The reorder circuit 42 tests the valid flags for the retiring physical destinations. The reorder circuit 42 retires the speculative result data for each retiring physical register if the valid flag of the retiring physical register indicates valid speculative data. The reorder circuit 42 retires a physical register by causing transfer of the speculative result data to the committed state registers in the real register circuit 44 specified by the logical destinations of the physical register.
The register alias circuit 34 and the allocator circuit 36 receive the retiring physical destinations over the retire notification bus 70. The register alias circuit 34 accordingly updates the register alias table to reflect the retirement. The allocator circuit 36 marks the retired physical registers in the reorder circuit 42 as available for allocation.
Figure 3 is a diagram that illustrates the functions of the register alias circuit 34 The register alias circuit 34 receives logical micro-ops in order over the logical micro-op bus 50, converts the logical micro-ops into corresponding physical micro-ops by mapping the logical sources and destinations into physical sources and destinations, and then transfers the physical micro- ops in order over the physical micro-op bus 52.
The register alias circuit 34 implements a register alias table 80. The register alias table 80 performs logical to physical register renaming by mapping the logical sources and destinations of the logical micro-ops to the physical sources and destinations of the corresponding physical micro-ops.
The physical sources and destinations of the physical micro-ops specify physical registers in the reorder circuit 42 and committed state registers in the real register circuit 44.
The entries in the register alias table 80 correspond to the architectural registers of the original macro instruction stream. For one embodiment, the EAX, EBX, ECX, and EDX entries of the register alias table 80 correspond to the EAX, EBX, ECX, and EDX registers of the Intel Architecture Microprocessor.
Each entry in the register alias table 80 contains a reorder buffer (ROB) pointer. The ROB pointer specifies a physical register in the reorder circuit 42 that holds the speculative result data for the corresponding architectural register. Each entry in the register alias table 80 also contains a real register file valid (RRFV) flag that indicates whether the speculative result data for the corresponding architectural register has been retired to the appropriate committed state register in the real register circuit 44.
The register alias circuit 34 receives a set of in order logical micro-ops (lmop_0 through lmop_3) over the logical micro-op bus 50. Each logical micro-op comprises an op code, a pair of logical sources lsrc1 and lsrc2, a logical destination ldst, and a macro instruction pointer delta mipd. The logical sources lsrc1 and lsrc2 and the logical destination ldst each specify an architectural register of the original stream of macro-instructions.
The register alias circuit 34 also receives a set of allocated physical destinations (alloc_pdst_0 through alloc_pdst_3) from the allocator circuit 36 over the physical destination bus 56. The physical destinations alloc_pdst_0 through alloc_pdst_3 specify newly allocated physical registers in the reorder circuit 42 for the logical micro-ops lmop_0 through lmop_3. The physical registers in the reorder circuit 42 specified by the physical destinations alloc_pdst_0 through alloc_pdst_3 will hold speculative result data for the physical micro-ops corresponding to the logical micro-ops lmop_0 through lmop_3.
The register alias circuit 34 transfers a set of in order physical micro-ops (pmop_0 through pmop_3) over the physical micro-op bus 52. Each physical micro-op comprises an op code, a pair of physical sources psrc1 and psrc2, and a physical destination pdst. The physical sources psrc1 and psrc2 each specify a physical register in the reorder circuit 42 or a committed state register in the real register circuit 44. The physical destination pdst specifies a physical register in the reorder circuit 42 to hold speculative result data for the corresponding physical micro-op.
The register alias circuit 34 generates the physical micro-ops pmop_0 through pmop_3 by mapping the logical sources of the logical micro-ops lmop_0 through lmop_3 to the physical registers of the reorder circuit 42 and the committed state registers of the real register circuit 44 as specified by the register alias table 80. The register alias circuit 34 merges the physical destinations alloc_pdst_0 through alloc_pdst_3 into the physical micro-ops pmop_0 through pmop_3.
The opcodes of the physical micro-ops pmop_0 through pmop_3 are the same as the corresponding opcodes of the logical micro-ops lmop_0 through lmop_3. For example, the register alias circuit 34 generates pmop_0 such that the op code of pmop_0 equals the opcode of lmop_0.
For example, the register alias circuit 34 generates the physical source psrc1 for the physical micro-op pmop_0 by reading the register alias table 80 entry specified by the logical source lsrc1 of the lmop_0. If the RRFV flag of the specified register alias table 80 entry is not set, then the register alias circuit 34 transfers the ROB pointer from the specified register alias table 80 entry along with the RRFV flag over the physical micro-op bus 52 as the physical source psrc1 for the pmop_0. If the RRFV bit is set, then the register alias circuit 34 transfers a pointer to the committed state register in the real register circuit 44 that corresponds to the logical source lsrc1 along with the RRFV flag over the physical micro-op bus 52 as the physical source psrc1 for the pmop_0.
The register alias circuit 34 generates the physical source psrc2 for the physical micro-op pmop_0 by reading the register alias table 80 entry that corresponds to the logical source lsrc2 of the lmop_0. If the RRFV flag is not set, then the register alias circuit 34 transfers the ROB pointer from the specified register alias table 80 entry along with the RRFV flag over the physical micro-op bus 52 as the physical source psrc2 for the pmop_0. If the RRFV bit is set, then the register alias circuit 34 transfers a pointer to the committed state register in the real register circuit 44 that corresponds to the logical source lsrc2 along with the RRFV flag over the physical micro-op bus 52 as the physical source psrc2 for the pmop_0.
The register alias circuit 34 stores the physical destination alloc_pdst_0 into the ROB pointer field of the register alias table 80 entry specified by the logical destination ldst of the lmop_0, and clears the corresponding RRFV bit.
The cleared RRFV bit indicates that the current state of the corresponding architectural register is speculatively held in the physical register of the reorder circuit 42 specified by the corresponding ROB pointer.
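The renaming steps described above for pmop_0 may be illustrated by the following sketch. It is a simplified software model rather than the disclosed circuit. The names RatEntry and rename and the tuple encoding of a physical source are hypothetical. The sketch renames one micro-op at a time, whereas the register alias circuit 34 handles lmop_0 through lmop_3 as a group; treating them sequentially is an assumption made for brevity.

```python
from dataclasses import dataclass


@dataclass
class RatEntry:
    rob_ptr: int = 0    # ROB pointer: physical register with speculative data
    rrfv: bool = True   # set: value is retired in the real register file


def rename(rat, lmop, alloc_pdst):
    """Map one logical micro-op to a physical micro-op and update the table."""
    def map_source(lsrc):
        entry = rat[lsrc]
        # RRFV set: point at the committed state register.
        # RRFV clear: point at the ROB entry holding the speculative value.
        return ("RRF", lsrc) if entry.rrfv else ("ROB", entry.rob_ptr)

    pmop = {
        "opcode": lmop["opcode"],          # opcode passes through unchanged
        "psrc1": map_source(lmop["lsrc1"]),
        "psrc2": map_source(lmop["lsrc2"]),
        "pdst": alloc_pdst,
    }
    # Sources are read before the destination entry is overwritten.
    rat[lmop["ldst"]] = RatEntry(rob_ptr=alloc_pdst, rrfv=False)
    return pmop


rat = {name: RatEntry() for name in ("EAX", "EBX", "ECX", "EDX")}
p0 = rename(rat, {"opcode": "add", "lsrc1": "EAX", "lsrc2": "EBX", "ldst": "EAX"}, 7)
p1 = rename(rat, {"opcode": "sub", "lsrc1": "EAX", "lsrc2": "ECX", "ldst": "EDX"}, 8)
```

In this example the second micro-op reads EAX from ROB entry 7, because the first micro-op redirected the EAX entry and cleared its RRFV flag, while ECX is still read from the committed state register.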
The register alias circuit 34 transfers a set of logical destinations ldst_0 through ldst_3 and corresponding macro instruction pointer deltas mipd_0 through mipd_3 over the logical destination bus 54. The logical destinations ldst_0 through ldst_3 are the logical destinations ldst of the logical micro-ops lmop_0 through lmop_3.
The macro instruction pointer deltas mipd_0 through mipd_3 are the macro instruction pointer deltas mipd of the logical micro-ops lmop_0 through lmop_3. The macro instruction pointer delta mipd_0 is the macro instruction pointer delta mipd of the lmop_0, the macro instruction pointer delta mipd_1 is the macro instruction pointer delta mipd of the lmop_1, etc. The macro instruction pointer deltas mipd_0 through mipd_3 identify the original macro instructions corresponding to the physical micro-ops pmop_0 through pmop_3.
Figure 4 illustrates the reorder circuit 42. The reorder circuit 42 implements a reorder buffer 82 comprising a set of ROB entries (RE0 through REn). The ROB entries RE0 through REn are physical registers that buffer speculative result data from the out of order execution of physical micro-ops.
For one embodiment, the ROB entries RE0 through REn comprise a set of 64 physical registers. For another embodiment, the ROB entries RE0 through REn comprise a set of 40 physical registers.
Each ROB entry comprises a valid flag (V), a result data value, a set of flags, a flag mask, a logical destination (LDST), fault data, and an instruction pointer delta (IPDELTA).
The valid flag indicates whether the result data value for the corresponding ROB entry is valid. The reorder circuit 42 clears the valid flag for each newly allocated ROB entry to indicate an invalid result data value.
The reorder circuit 42 sets the valid flag when speculative result data is written back to the ROB entry from the execution circuit 40.
The result data value is a speculative result from the out of order execution of the corresponding physical micro-op. The result data value may be either an integer data value or a floating-point data value. For one embodiment, the result data value field of each ROB entry RE0 through REn comprises 86 bits to accommodate both integer and floating-point data values.
The flags and flag mask provide speculative architectural flag information. The speculative architectural flag information is transferred to the architectural flags of the real register circuit 44 upon retirement of the corresponding ROB entry.
The logical destination LDST specifies a committed state register in the real register circuit 44. The result data value of the corresponding ROB entry is transferred to the committed state register specified by LDST during retirement of the ROB entry.
The fault data contains fault information for the fault processing microcode executing in the instruction fetch and micro-op issue circuit 32.
When a fault occurs, the fault handling microcode reads the fault data to determine the cause of the fault.
The IPDELTA is a macro instruction pointer delta value that identifies the macro instruction corresponding to the physical register.
The reorder circuit 42 receives the physical micro-ops pmop_0 through pmop_3 over the physical micro-op bus 52. The reorder circuit 42 reads the source data specified by the physical micro-ops pmop_0 through pmop_3 from the reorder buffer 82. The reorder circuit 42 transfers the result data values and the valid flags from the ROB entries specified by the physical sources psrc1 and psrc2 of the physical micro-ops to the reservation and dispatch circuit 38 over the source data bus 58.
For example, the result data values and valid flags from the ROB entries specified by the physical sources psrc1 and psrc2 of the pmop_0 are transferred as source data src1/src2 data_0 over the source data bus 58. The source data src1/src2 data_0 provides the source data specified by the physical sources psrc1 and psrc2 of the pmop_0 if the corresponding valid flags indicate valid source data.
Similarly, the reorder circuit 42 transfers the result data values and valid flags from the appropriate ROB entries as the source data src1/src2 data_1 through source data src1/src2 data_3 over the source data bus 58 for the physical micro-ops pmop_1 through pmop_3.
The reorder circuit 42 clears the valid bits of the ROB entries specified by the physical destinations pdst of the physical micro-ops pmop_0 through pmop_3 received over the physical micro-op bus 52. The reorder circuit 42 clears the valid bits to indicate that the corresponding result data value is not valid because the physical micro-ops pmop_0 through pmop_3 that generate the result data value are being assembled in the reservation and dispatch circuit 38.
The reorder circuit 42 receives the logical destinations ldst_0 through ldst_3 and the macro instruction pointer deltas mipd_0 through mipd_3 over the logical destination bus 54. The reorder circuit 42 stores the logical destinations ldst_0 through ldst_3 into the LDST fields of the ROB entries specified by the physical destinations pdst of the physical micro-ops pmop_0 through pmop_3. The reorder circuit 42 stores the macro instruction pointer deltas mipd_0 through mipd_3 into the IPDELTA fields of the ROB entries specified by the physical destinations pdst of the physical micro-ops pmop_0 through pmop_3.
For example, the reorder circuit 42 stores the ldst_0 and the mipd_0 into the LDST and the IPDELTA of the ROB entry specified by the physical destination pdst of the pmop_0. The logical destination in the LDST field of a ROB entry specifies a committed state register in the real register circuit 44 for retirement of the corresponding ROB entry. The macro instruction pointer delta in the IPDELTA field of a ROB entry specifies the original macro instruction of the corresponding ROB entry.
The reorder circuit 42 receives write back speculative result information from the execution circuit 40 over the result bus 62. The write back speculative result information from the execution units EU0 through EU4 comprises result data values, physical destinations pdst, and fault data.
The reorder circuit 42 stores the write back speculative result information from the execution units EU0 through EU4 into the ROB entries specified by the physical destinations pdst on the result bus 62. For each execution unit EU0 through EU4, the reorder circuit 42 stores the result data value into the result data value field, and stores the fault data into the fault data field of the ROB entry specified by the physical destination pdst.
The result data values from the execution circuit 40 each include a valid flag. Each valid flag is stored in the valid flag field of the ROB entry specified by the physical destination pdst. The execution units EU0 through EU4 set the valid flags to indicate whether the corresponding result data values are valid.
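The allocation and write back behaviour of a ROB entry described above may be modelled as in the following sketch. The class and function names are hypothetical, the flags and flag mask fields are omitted for brevity, and the 40-entry size follows one of the embodiments mentioned above. The sketch is an aid to reading and does not describe the actual storage arrays.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RobEntry:
    valid: bool = False            # V: result data value is valid
    result: Optional[int] = None   # speculative result data value
    ldst: Optional[str] = None     # LDST: committed state register to retire to
    fault: Optional[str] = None    # fault data for the fault handling microcode
    ipdelta: int = 0               # IPDELTA: macro instruction pointer delta


def allocate(rob, pdst, ldst, mipd):
    """Prepare the entry named by pdst for a newly issued physical micro-op."""
    entry = rob[pdst]
    entry.valid = False            # result is not yet available
    entry.ldst = ldst
    entry.ipdelta = mipd


def write_back(rob, pdst, result, valid=True, fault=None):
    """Store write back information arriving from an execution unit."""
    entry = rob[pdst]
    entry.result = result
    entry.valid = valid            # the valid flag travels with the result
    entry.fault = fault


rob = [RobEntry() for _ in range(40)]
allocate(rob, 5, "EAX", 2)
pending = rob[5].valid             # False until the micro-op executes
write_back(rob, 5, 1234)
```

After the write back, entry 5 holds a valid speculative result for EAX that remains uncommitted until retirement.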
The reorder circuit 42 receives the retirement physical destinations over the retire notification bus 70. The retirement physical destinations cause the reorder circuit 42 to commit the speculative result data values in the ROB entries RE0 through REn to architectural state by transferring the speculative result data values to the real register circuit 44 over the retirement bus 64.
The retirement bus 64 carries the speculative results for a set of retirement micro-ops rm_0 through rm_4. Each retirement micro-op rm_0 through rm_4 comprises a result data value and a logical destination ldst from one of the ROB entries RE0 through REn.
The retirement physical destinations from the retire logic circuit 46 also cause the reorder circuit 42 to transfer the macro instruction pointer deltas for the retiring ROB entries to the restart circuit 48 over the macro instruction pointer offset bus 120.
The reorder circuit 42 receives the reorder clear signal 78 from the restart circuit 48. The reorder clear signal 78 causes the reorder circuit 42 to clear all of the ROB entries.
Figure 5 illustrates the reservation and dispatch circuit 38. The reservation and dispatch circuit 38 implements a reservation dispatch table 84 comprising a set of reservation station entries RS0 through RSx. The reservation and dispatch circuit 38 receives and stores the physical micro-ops pmop_0 through pmop_3 into available reservation station entries RS0 through RSx, assembles the source data for the physical micro-ops into the reservation station entries RS0 through RSx, and dispatches the ready physical micro-ops to the execution circuit 40. A physical micro-op is ready when the source data is fully assembled in a reservation station entry RS0 through RSx.
Each reservation station entry RS0 through RSx comprises an entry valid flag, an op code, a pair of source data values (SRC1/SRC2 DATA) and corresponding valid flags (V), a pair of physical sources (PSRC1/PSRC2), a physical destination (PDST), and a load buffer identifier (LBID).
The entry valid flag indicates whether the corresponding reservation station entry RS0 through RSx holds a physical micro-op awaiting dispatch.
The op code specifies an operation of the execution circuit 40 for the physical micro-op in the corresponding reservation station entry RS0 through RSx.
The SRC1/SRC2 DATA fields of the reservation station entries RS0 through RSx hold the source data values for the corresponding physical micro-ops. The corresponding valid flags indicate whether the source data values are valid.
The physical sources PSRC1/PSRC2 of each reservation station entry RS0 through RSx specify the physical destinations in the reorder circuit 42 that hold the source data for the corresponding physical micro-op. The reservation and dispatch circuit 38 uses the physical sources PSRC1/PSRC2 to detect write back of pending source data from the execution circuit 40 to the reorder circuit 42.
The physical destination PDST of each reservation station entry RS0 through RSx specifies a physical destination in the reorder circuit 42 to hold the speculative results for the corresponding physical micro-op.
The load buffer identifier LBID of each reservation station entry RS0 through RSx specifies a load buffer entry in the memory ordering circuit in the execution circuit 40. The load buffer entry is valid if the corresponding reservation station entry holds a load memory physical micro-op.
The reservation and dispatch circuit 38 receives the physical micro-ops pmop_0 through pmop_3 over the physical micro-op bus 52. The reservation and dispatch circuit 38 also receives the reservation station entry select signals 66 from the allocator circuit 36. The reservation station entry select signals 66 specify the new reservation station entries.
The reservation and dispatch circuit 38 stores the opcode and physical sources psrc1 and psrc2 for each physical micro-op pmop_0 through pmop_3 into the new reservation station entries RS0 through RSx specified by the reservation station entry select signals 66. The reservation and dispatch circuit 38 sets the entry valid flag for each new reservation station entry.
The reservation and dispatch circuit 38 receives load buffer identifiers for load memory physical micro-ops over the load buffer ID bus 72 from the allocator circuit 36. The reservation and dispatch circuit 38 stores the load buffer identifiers into the appropriate LBID fields of the new reservation station entries RS0 through RSx.
The reservation and dispatch circuit 38 receives the source data values and corresponding valid flags specified by the physical sources psrc1 and psrc2 of the physical micro-ops pmop_0 through pmop_3 from the reorder circuit 42 and the real register circuit 44 over the source data bus 58. The reservation and dispatch circuit 38 transfers the source data values and valid flags into the SRC1/SRC2 DATA fields and valid flags of the new reservation station entries corresponding to the physical micro-ops pmop_0 through pmop_3.
If the source data valid flags indicate that one or both of the source data values for a reservation station table entry RS0 through RSx is invalid, then the reservation and dispatch circuit 38 waits for the execution circuit 40 to execute previously dispatched physical micro-ops and generate the required source data values.
The reservation and dispatch circuit 38 monitors the physical destinations pdst on the result bus 62 as the execution circuit 40 writes back result data values to the reorder circuit 42. If a physical destination pdst on the result bus 62 corresponds to the physical destination of pending source data for a reservation station table entry RS0 through RSx, then the reservation and dispatch circuit 38 receives the result data value over the result bus 62 and stores the result data value into the corresponding SRC1/SRC2 DATA fields and valid flags. The reservation and dispatch circuit 38 dispatches the pending physical micro-ops to the execution circuit 40 if both source data values are valid.
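The capture of pending source data from the result bus 62 and the subsequent dispatch may be sketched as follows. The names RsEntry, capture_write_back, and dispatch_ready are hypothetical, and the sketch assumes that every value seen on the result bus is valid. Clearing the entry valid flag on dispatch is likewise an assumption; the description states only that the flag marks a micro-op awaiting dispatch.

```python
from dataclasses import dataclass, field


@dataclass
class RsEntry:
    opcode: str
    psrc: list     # PSRC1/PSRC2: ROB entries that will supply the sources
    pdst: int      # PDST: ROB entry that receives the result
    src_data: list = field(default_factory=lambda: [None, None])
    src_valid: list = field(default_factory=lambda: [False, False])
    entry_valid: bool = True   # holds a micro-op awaiting dispatch


def capture_write_back(table, pdst, value):
    """Fill any pending source whose physical source matches pdst."""
    for entry in table:
        if not entry.entry_valid:
            continue
        for i in (0, 1):
            if not entry.src_valid[i] and entry.psrc[i] == pdst:
                entry.src_data[i] = value
                entry.src_valid[i] = True


def dispatch_ready(table):
    """Return the micro-ops whose two sources are both valid."""
    ready = [e for e in table if e.entry_valid and all(e.src_valid)]
    for entry in ready:
        entry.entry_valid = False   # assumed: the entry is freed on dispatch
    return ready


# One waiting micro-op: source 1 already valid, source 2 pending on ROB entry 6.
table = [RsEntry("add", [5, 6], 9, [10, None], [True, False])]
before = dispatch_ready(table)
capture_write_back(table, 6, 20)
after = dispatch_ready(table)
```

The micro-op is held while its second source is pending and becomes ready as soon as physical destination 6 appears on the result bus.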
Figure 6 illustrates the real register circuit 44. The real register circuit 44 implements a real register file 86. The real register file 86 comprises a set of committed state registers that hold committed result data values. The committed state registers buffer committed results for the architectural registers of the original stream of macro-instructions fetched by the instruction fetch and micro-op issue circuit 32.
The result data value in each committed state register may be either an integer data value or a floating-point data value. For one embodiment, the result data value field of each committed state register comprises 86 bits to accommodate both integer and floating-point data values.
For one embodiment, the committed state registers comprise the EAX, EBX, ECX, and EDX committed state registers, etc., that correspond to the architectural registers of the Intel Architecture Microprocessor. The real register file 86 also comprises committed state flags that correspond to the architectural flags of the Intel Architecture Microprocessor. The real register file 86 also comprises microcode registers used by microcode executing in the instruction fetch and micro-op issue circuit 32.
The real register circuit 44 receives the physical micro-ops pmop_0 through pmop_3 over the physical micro-op bus 52. The real register circuit 44 reads the result data values from the committed state registers of the real register file 86 specified by the physical sources psrc1 and psrc2 of the physical micro-ops pmop_0 through pmop_3 if the RRFV flags indicate that the physical sources are retired.
The real register circuit 44 transfers the result data values from the committed state registers specified by the physical sources psrc1 and psrc2 of the physical micro-ops to the reservation and dispatch circuit 38 over the source data bus 58 if the RRFV flags indicate that the physical sources are retired in the real register file 86. The real register circuit 44 always sets the source data valid flags while transferring source data to the reservation and dispatch circuit 38 over the source data bus 58 because the result data in the committed state registers is always valid.
For example, the result data value from the committed state register specified by the physical source psrc1 of the pmop_0 is transferred as source data src1 data_0 over the source data bus 58 if the RRFV flag for the physical source psrc1 of the pmop_0 is set. The result data value from the committed state register specified by the physical source psrc2 of the pmop_0 is transferred as source data src2 data_0 over the source data bus 58 if the RRFV flag for the physical source psrc2 of the pmop_0 is set.
Similarly, the real register circuit 44 transfers source data src1/src2 data_1 through source data src1/src2 data_3 over the source data bus 58 to provide source data for the physical micro-ops pmop_1 through pmop_3 if the appropriate RRFV flags of the physical micro-ops pmop_1 through pmop_3 are set.
The real register circuit 44 receives the retirement micro-ops rm_0 through rm_3 from the reorder circuit 42 over the retirement bus 64. Each retirement micro-op rm_0 through rm_3 contains speculative results from one of the ROB entries RE0 through REn in the reorder buffer 82.
Each retirement micro-op rm_0 through rm_3 comprises a result data value and a logical destination ldst. The real register circuit 44 stores the result data values of the retirement micro-ops rm_0 through rm_3 into the committed state registers of the real register file 86 specified by the logical destinations ldst of the retirement micro-ops rm_0 through rm_3.
Figure 7 illustrates a load memory circuit in the execution circuit 40.
The load memory circuit comprises an address generation circuit 100, a memory ordering circuit 102, a data translate look-aside buffer (DTLB) circuit 104, and a data cache circuit 106.
The address generation circuit 100 receives dispatched load memory physical micro-ops from the reservation and dispatch circuit 38 over the micro-op dispatch bus 60. Each dispatched load memory physical micro-op on the micro-op dispatch bus 60 comprises an opcode, a pair of source data values src1_data and src2_data, a physical destination pdst, and a load buffer identifier lbid.
The address generation circuit 100 determines a linear address for each dispatched load memory physical micro-op according to the source data values src1_data and src2_data. The linear address may also be referred to as a virtual address. For one embodiment, the address generation circuit 100 implements memory segment registers and generates the linear address according to the memory segmentation of Intel Architecture Microprocessors.
The address generation circuit 100 transfers linear load memory micro-ops to the memory ordering circuit 102 over a linear operation bus 90. Each linear load memory operation on the linear operation bus 90 corresponds to a dispatched load memory physical micro-op received over the micro-op dispatch bus 60. Each linear load memory micro-op comprises the opcode of the corresponding load memory physical micro-op, the linear address l_addr determined from the corresponding source data values src1_data and src2_data, the corresponding physical destination pdst, and the corresponding load buffer identifier lbid.
The memory ordering circuit 102 contains a load buffer. The memory ordering circuit 102 receives the linear load memory micro-ops over the linear operation bus 90. The memory ordering circuit 102 stores the linear load memory micro-ops in the load buffer according to the corresponding load buffer identifier lbid. The memory ordering circuit 102 dispatches the linear load memory micro-ops from the load buffer to the DTLB circuit 104 over the linear operation bus 90.
The DTLB circuit 104 receives the dispatched linear load memory micro-ops from the memory ordering circuit 102 over the linear operation bus 90. The DTLB circuit 104 provides a physical address to the data cache circuit 106 over a read bus 94 for each linear load memory micro-op received from the memory ordering circuit 102.
The DTLB circuit 104 converts the corresponding linear address l_addr into a physical address for the memory subsystem 26. The DTLB circuit 104 maps the linear address l_addr of each linear load memory micro-op into a physical address according to a predetermined memory paging mechanism.
The DTLB circuit 104 transfers the mapped physical address corresponding to the linear address l_addr of each linear load memory micro-op to the memory ordering circuit 102 over a physical address bus 96. The memory ordering circuit 102 stores the physical address for each linear load memory micro-op in the corresponding load buffer entry. For one embodiment, the memory ordering circuit 102 stores a portion of the physical address for each linear load memory micro-op in the corresponding load buffer entry.
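The address mapping performed by the DTLB circuit 104 may be illustrated by the sketch below. The paging mechanism itself is not detailed in this description, so the sketch assumes 4 KB pages, consistent with the embodiment in which bits 0 through 11 of the linear address equal bits 0 through 11 of the physical address. The page_map dictionary, the function names, and the example addresses are hypothetical. The stored portion follows the embodiment in which bits 12 through 19 of the physical address are carried on the physical address bus 96.

```python
PAGE_SHIFT = 12                        # assumed 4 KB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1      # bits 0-11 pass through unchanged


def translate(l_addr, page_map):
    """Map a linear address to a physical address via a page-frame lookup."""
    frame = page_map[l_addr >> PAGE_SHIFT]
    return (frame << PAGE_SHIFT) | (l_addr & PAGE_MASK)


def stored_portion(p_addr):
    """Bits 12 through 19 of the physical address, as an 8-bit value."""
    return (p_addr >> 12) & 0xFF


page_map = {0x00401: 0x1A2B3}          # hypothetical linear page -> frame
p_addr = translate(0x00401ABC, page_map)
portion = stored_portion(p_addr)
```

With this mapping the linear address 0x00401ABC yields the physical address 0x1A2B3ABC, and only the value 0xB3 would be kept in the load buffer entry as the physical address portion.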
The data cache circuit 106 reads the data specified by the physical address on the read bus 94. If the physical address causes a cache miss, the data cache circuit 106 fetches the required cache line from the memory subsystem 26. The data cache circuit 106 receives cache lines from the memory subsystem 26 over an interface bus 74 through the bus interface circuit 30, which is coupled to the multiprocessor bus 28.
The data cache circuit 106 transfers the read result data, a corresponding valid bit, and fault data for the read access to the reorder circuit 42 and the reservation and dispatch circuit 38 over the result bus 62. The result bus 62 also carries the physical destination from the corresponding load buffer in the memory ordering circuit 102.
The memory ordering circuit 102 senses or "snoops" bus cycles on the multiprocessor bus 28 through the bus interface circuit 30 over the interface bus 74. The memory ordering circuit 102 "snoops" the multiprocessor bus 28 for an external store or read for ownership operation by one of the processors 23-24 that may cause a processor ordering violation for one of the dispatched linear load memory micro-ops. The memory ordering circuit 102 "snoops" the multiprocessor bus 28 for an external store operation targeted for the physical address of an already dispatched linear load memory micro-op stored in the load buffer.
During retirement of each load memory physical micro-op, the memory ordering circuit 102 generates the memory ordering restart signals 76 to indicate a possible processor ordering violation according to the snoop detection.
Figure 8 illustrates the memory ordering circuit 102. The memory ordering circuit 102 implements a load buffer 88 comprising a set of load buffer entries LB0 through LBn. Each load buffer entry LB0 through LBn holds a linear load memory micro-op from the address generation circuit 100.
Each load buffer entry LB0 through LBn comprises an opcode, a physical destination (PDST), a linear address, a physical address, a load status, and a snoop hit flag.
The memory ordering circuit 102 receives the linear load memory micro-ops over the micro-op dispatch bus 60. The memory ordering circuit 102 stores each linear load memory micro-op into a load buffer entry LB0 through LBn specified by the corresponding load buffer identifier lbid.
The memory ordering circuit 102 sets a "valid" status for each new linear load memory micro-op in the load buffer 88. The "valid" status indicates that the corresponding load buffer entry LB0 through LBn holds an unretired load memory micro-op.
The memory ordering circuit 102 stores the opcode, the physical destination pdst, and the linear address l_addr of each linear load memory micro-op into the corresponding fields of the load buffer entry LB0 through LBn specified by the load buffer identifier lbid of the linear load memory micro-op.
The memory ordering circuit 102 receives the physical addresses p_addr corresponding to the linear load memory micro-ops from the DTLB circuit 104 over the physical address bus 96. The memory ordering circuit 102 stores the physical address for each linear load memory micro-op into the physical address field of the corresponding load buffer entry LB0 through LBn.
For one embodiment, the physical addresses on the physical address bus 96 comprise bits 12 through 19 of the physical address generated by the DTLB circuit 104 for the corresponding linear load memory micro-ops.
The memory ordering circuit 102 dispatches the linear load memory micro-ops from the load buffer entries LB0 through LBn over the linear operation bus 90 according to the availability of resources in the DTLB circuit 104. The memory ordering circuit 102 sets a "complete" status for each linear load memory micro-op dispatched to the DTLB circuit 104.
The memory ordering circuit 102 "snoops" the multiprocessor bus 28 for external store operations that may cause a processor ordering violation.
The memory ordering circuit 102 "snoops" the multiprocessor bus 28 for external stores to one of the physical addresses specified by the load buffer entries LB0 through LBn having "complete" status. The memory ordering circuit 102 senses an external physical address snoop_addr and a corresponding snoop_addr_valid signal from the multiprocessor bus 28 over the interface bus 74. The snoop_addr_valid signal specifies a valid address for a store operation on the multiprocessor bus 28.
For one embodiment, the physical address on the multiprocessor bus comprises 40 bits (bits 0 through 39). Bits 0 through 11 of the linear address for a linear load memory micro-op equal bits 0 through 11 of the corresponding physical address. The memory ordering circuit 102 detects a processor ordering "snoop hit" by comparing bits 5 through 11 of the physical address of external store operations on the multiprocessor bus 28 with bits 5 through 11 of the linear address of the load buffer entries LB0 through LBn having "complete" status. The memory ordering circuit 102 also compares bits 12 through 19 of the physical address of external store operations on the multiprocessor bus 28 with the physical address bits 12 through 19 of the load buffer entries LB0 through LBn having "complete" status.
The memory ordering circuit 102 sets the snoop hit flag for the load buffer entries LB0 through LBn causing a processor ordering snoop hit. The memory ordering circuit 102 does not set the snoop hit flag if the load buffer entry LB0 through LBn causing a processor ordering snoop hit holds the oldest linear load memory micro-op in the load buffer 88. Snooping for the oldest linear load memory micro-op in the load buffer 88 is disabled by clearing a corresponding snoop enable flag in the appropriate load buffer entry LB0 through LBn.
The memory ordering circuit 102 receives the retirement physical destinations from the retire logic circuit 46 over the retire notification bus 70.
The memory ordering circuit 102 issues the memory ordering restart signals 76 to indicate a possible processor ordering violation if one of the load buffer entries LB0 through LBn specified by the retirement physical destinations has the corresponding snoop hit flag set.
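The retirement-time check described above may be expressed as the following sketch. The names LoadBufferEntry and memory_ordering_restart are hypothetical, and only the fields needed for the check are modelled. The sketch takes the snoop hit flags as already set by the snoop detection and does not model the restart itself.

```python
from dataclasses import dataclass


@dataclass
class LoadBufferEntry:
    pdst: int                 # physical destination of the load micro-op
    snoop_hit: bool = False   # set earlier by the snoop detection


def memory_ordering_restart(load_buffer, retiring_pdsts):
    """True if any retiring load was hit by an external store snoop."""
    return any(
        entry.snoop_hit
        for entry in load_buffer
        if entry.pdst in retiring_pdsts
    )


load_buffer = [LoadBufferEntry(pdst=10), LoadBufferEntry(pdst=11, snoop_hit=True)]
clean = memory_ordering_restart(load_buffer, [10])       # no violation signalled
restart = memory_ordering_restart(load_buffer, [10, 11])  # possible violation
```

A snoop hit on a load that is not among the retiring physical destinations does not raise the restart signals; the check is made only as each load retires.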
Figure 9 illustrates the snoop detection circuitry in the memory ordering circuit 102. The snoop detection circuitry includes a snoop detect circuit corresponding to each load buffer entry LB0 through LBn in the memory ordering circuit 102.
For example, a snoop detect circuit 200 corresponds to the load buffer entry LB0. The snoop detect circuit 200 comprises a valid register 210, a complete register 214, a physical address register 216, a linear address register 218, a snoop enable register 212, and a snoop hit register 222.
The valid register 210 contains the "valid" status indicating whether the load buffer entry LB0 contains a valid load memory operation. The complete register 214 holds the "complete" status indicating whether the load memory operation for the corresponding load buffer entry LB0 has dispatched. The physical address register 216 holds the physical address bits 19-12 corresponding to the load buffer entry LB0. The linear address register 218 stores bits 11-5 of the linear address for the load memory operation corresponding to the load buffer entry LB0. The snoop enable register 212 holds a snoop enable flag that enables or disables external store snooping for the load buffer entry LB0.
The physical address register 216 receives a set of snoop address bits 230. The snoop address bits 230 comprise bits 19-12 of the snoop_addr received over the interface bus 74. The physical address register 216 asserts a physical address detect signal 236 if the snoop address bits 230 equal the physical address bits 19-12 corresponding to the load buffer entry LB0.
The linear address register 218 receives a set of physical address bits 232. The physical address bits 232 comprise bits 11-5 of the snoop_addr received over the interface bus 74. The linear address register 218 generates a linear address detect signal 237 if the physical address bits 232 equal bits 11-5 of the linear address corresponding to the load buffer entry LB0.
A snoop_addr_valid signal 234 is received over the interface bus 74. The snoop_addr_valid signal 234 indicates that the snoop_addr on the interface bus 74 corresponds to a valid external store operation.
The output of an AND gate 220 sets a snoop hit flag in the snoop hit register 222 by combining the physical address detect signal 236, the linear address detect signal 237, the "complete" and "valid" status, and the snoop enable flag.
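A minimal software sketch of one snoop detect circuit is given below, assuming the registers behave as simple stored fields. The class name `SnoopDetect` and its method are hypothetical. Two points are assumptions rather than statements from the text: the snoop_addr_valid signal is treated as an additional qualifier on the AND gate, and the snoop hit flag is modelled as sticky once set.

```python
from dataclasses import dataclass


@dataclass
class SnoopDetect:
    """Software stand-in for the per-entry registers of Figure 9."""
    valid: bool          # valid register 210
    complete: bool       # complete register 214
    snoop_enable: bool   # snoop enable register 212
    phys_19_12: int      # physical address register 216
    linear_11_5: int     # linear address register 218
    snoop_hit: bool = False  # snoop hit register 222

    def observe(self, snoop_addr, snoop_addr_valid):
        """Apply one bus address to the entry and return the snoop hit flag."""
        phys_detect = ((snoop_addr >> 12) & 0xFF) == self.phys_19_12   # signal 236
        lin_detect = ((snoop_addr >> 5) & 0x7F) == self.linear_11_5    # signal 237
        # AND gate 220; gating on snoop_addr_valid is an assumption.
        if (snoop_addr_valid and phys_detect and lin_detect
                and self.valid and self.complete and self.snoop_enable):
            self.snoop_hit = True  # assumed to stay set until the entry is reused
        return self.snoop_hit


# Entry for a load at linear 0x32100 / physical 0x42100.
enabled = SnoopDetect(True, True, True, phys_19_12=0x42, linear_11_5=0x08)
oldest = SnoopDetect(True, True, False, phys_19_12=0x42, linear_11_5=0x08)

hit_enabled = enabled.observe(0x42100, snoop_addr_valid=True)
hit_oldest = oldest.observe(0x42100, snoop_addr_valid=True)
```

With the snoop enable flag cleared, as the text describes for the oldest load in the load buffer, the same bus address leaves the flag unset.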
Figure 10 illustrates notification circuitry in the memory ordering circuit 102 that generates the memory ordering restart signals 76. The memory ordering circuit 102 contains a notification circuit for each of the load buffer entries LB0-LBn.
For example, the notification circuit 250 corresponds to the load buffer entry LB0. The snoop hit register 222 contains the snoop hit flag for the load buffer entry LB0. A physical destination (PDST) register 260 holds the physical destination corresponding to the load buffer entry LB0.
The PDST register 260 receives a set of retirement physical destinations 270-272 over the retire notification bus 70 indicating the next set of retiring physical micro-ops. The PDST register 260 generates a set of control signals 300-302. The control signals 300-302 indicate whether any of the retirement physical destinations 270-272 match the physical destination in the load buffer entry LB0.
For example, the PDST register 260 generates the control signal 300 to indicate that the retirement physical destination 270 matches the physical destination in load buffer entry LB0. Similarly, the PDST register 260 generates the control signal 301 to indicate that the retirement physical destination 271 matches the physical destination in load buffer entry LB0, and the control signal 302 to indicate that the retirement physical destination 272 matches the physical destination in load buffer entry LB0.
The memory ordering restart circuit 250 receives a set of retirement physical destination valid flags 280-282 over the retire notification bus 70. The retirement physical destination valid flags 280-282 indicate whether the retirement physical destinations 270-272 are valid.
For example, the retirement physical destination valid flag 280 indicates whether the retirement physical destination 270 is valid.
Similarly, the retirement physical destination valid flag 281 indicates whether the retirement physical destination 271 is valid, and the retirement physical destination valid flag 282 indicates whether the retirement physical destination 272 is valid.
The control signals 300-302 and the retirement physical destination valid flags 280-282 are combined with the snoop hit flag by a set of AND gates 310-312. The outputs of the AND gates 310-312 are stored in a register 262. The outputs of the register 262 are synchronized by a clock signal 350.
The register 262 stores the memory ordering restart flags for the load buffer entry LB0. The outputs of the AND gates 320-322 control a set of pull down transistors Q1, Q2 and Q3. The pull down transistors Q1, Q2 and Q3 are coupled to a set of memory ordering restart signal lines 290-292. The memory ordering restart signal lines 290-292 are also coupled to a set of pull up transistors Q4, Q5 and Q6 which are synchronized by the clock signal 350.
If the control signal 300 indicates that the retirement physical destination 270 matches the physical destination in load buffer entry LB0, and if the retirement physical destination valid flag 280 indicates that the retirement physical destination 270 is valid, and if the snoop hit flag for the load buffer entry LB0 is set, then the output of the AND gate 320 switches on the transistor Q1. The transistor Q1 pulls down the voltage on the memory ordering restart signal line 290 to indicate that the physical micro-op specified by the retirement physical destination 270 has caused a possible processor ordering violation.
Similarly, the memory ordering restart signal line 291 indicates that the physical micro-op specified by the retirement physical destination 271 has caused a possible processor ordering violation, and the memory ordering restart signal line 292 indicates that the physical micro-op specified by the retirement physical destination 272 has caused a possible processor ordering violation.
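The notification logic can be approximated in software as below. This is a sketch under stated assumptions: the function names `restart_lines` and `wired_or` are hypothetical, and the shared restart signal lines are modelled as a logical OR across load buffer entries, which is one reading of the pull-down arrangement rather than something the text states outright.

```python
def restart_lines(entry_pdst, snoop_hit, retire_pdsts, retire_valid):
    """Per-entry notification: one boolean per retirement slot.

    A slot is asserted when the retiring physical destination matches the
    entry's PDST, the retirement slot is valid, and the entry's snoop hit
    flag is set.
    """
    return [
        snoop_hit and valid and pdst == entry_pdst
        for pdst, valid in zip(retire_pdsts, retire_valid)
    ]


def wired_or(per_entry_lines):
    """Combine every entry's outputs onto the shared lines (assumed OR)."""
    return [any(column) for column in zip(*per_entry_lines)]


retiring = [40, 41, 42]
valid = [True, True, True]
lines = wired_or([
    restart_lines(41, False, retiring, valid),  # entry without a snoop hit
    restart_lines(42, True, retiring, valid),   # entry with a snoop hit
])
```

Here `lines` has one element per retirement slot, so a caller can tell which retiring micro-op triggered the possible violation.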
Figure 11 illustrates a load micro-op issue by the instruction fetch and micro-op issue circuit 32. The logical micro-op (ld 0x100, EBX, EAX) is transferred by the instruction fetch and micro-op issue circuit 32 over the logical micro-op bus 50. The logical micro-op ld 0x100, EBX, EAX specifies a load memory operation to the architectural register EAX from the memory subsystem 26. The address is specified by the contents of the architectural register EBX and offset 100 hex.
The allocator circuit 36 receives the logical micro-op ld 0x100, EBX, EAX over the logical micro-op bus 50, and generates a physical destination pdst equal to 42. The allocator circuit 36 transfers the pdst 42 to the register alias circuit 34 over the physical destination bus 56.
The register alias circuit 34 receives the physical destination pdst 42, and translates the logical micro-op ld 0x100, EBX, EAX into a physical micro-op ld 100, 35, 42. The argument 100 specifies that psrc1 is a constant data value equal to 100 hex. The argument 35 specifies that psrc2 is the RE35 entry in the reorder buffer 82 according to the ROB pointer and the RRFV flag for the EBX entry in the register alias table 80.
The register alias circuit 34 transfers the physical micro-op ld 100, 35, 42 to the reservation and dispatch circuit 38, the reorder circuit 42, and the real register circuit 44 over the physical micro-op bus 52.
The register alias circuit 34 stores the allocated pdst 42 for the physical micro-op ld 100, 35, 42 into the ROB pointer of the EAX entry in the register alias table 80. The register alias circuit 34 also clears the RRFV bit for the EAX entry in the register alias table 80 to indicate that the logical register EAX is mapped to the reorder buffer 82 in a speculative state.
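The renaming step for this load can be illustrated with a small table-driven sketch. The dictionary layout, the function name `rename_load`, and the initial contents of the EAX entry are assumptions made for the example; only the EBX mapping to RE35 and the resulting EAX mapping to pdst 42 come from the text.

```python
# Hypothetical register alias table: each entry has a ROB pointer and an
# RRFV flag (set when the value lives in the real register file).
rat = {
    "EBX": {"rob": 35, "rrfv": False},   # EBX is speculative, held in RE35
    "EAX": {"rob": None, "rrfv": True},  # assumed starting state for EAX
}


def rename_load(rat, offset, src_reg, dst_reg, pdst):
    """Translate a logical load into a physical micro-op and update the table."""
    src = rat[src_reg]
    # Read the source mapping before touching the destination entry, so a
    # load whose source and destination are the same register still works.
    psrc2 = ("RRF", src_reg) if src["rrfv"] else ("ROB", src["rob"])
    # Point the destination at the new ROB entry and clear RRFV, marking the
    # logical register as speculative.
    rat[dst_reg] = {"rob": pdst, "rrfv": False}
    return ("ld", offset, psrc2, pdst)


physical_uop = rename_load(rat, 0x100, "EBX", "EAX", pdst=42)
```

After the call, `physical_uop` carries the constant 0x100, the source mapping to ROB entry 35, and the destination 42, mirroring the physical micro-op ld 100, 35, 42.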
The reorder circuit 42 and the real register circuit 44 receive the physical micro-op ld 100, 35, 42 over the physical micro-op bus 52. The reorder circuit 42 reads source data for the physical source psrc2 35 by reading ROB entry RE35 of the reorder buffer 82. The ROB entry RE35 of the reorder buffer 82 contains a result data value equal to 2000 and a valid bit set for the current speculative state of the EBX architectural register.
The reorder circuit 42 transfers the result data value 2000 and the constant data value 100 along with corresponding valid bits to the reservation and dispatch circuit 38 over the source data bus 58 as a source data pair src1/src2 data.
The reorder circuit 42 receives the logical destination ldst EAX for the physical micro-op ld 100, 35, 42 over the logical destination bus 54. The reorder circuit 42 stores the logical destination ldst EAX into the LDST of the entry RE42 of the reorder buffer 82. The reorder circuit 42 clears the valid flag of the entry RE42 of the reorder buffer 82 to indicate that the corresponding result data is not valid.
The reservation and dispatch circuit 38 receives the physical micro-op ld 100, 35, 42 over the physical micro-op bus 52. The reservation and dispatch circuit 38 stores the opcode ld into the opcode field of the entry RS0 of the reservation station table 84 as specified by the allocator circuit 36. The reservation and dispatch circuit 38 stores the physical destination pdst 42 into the PDST of the reservation station table 84 entry RS0.
The reservation and dispatch circuit 38 stores the physical sources psrc1 xxx and psrc2 35 into the PSRC1/PSRC2 of the reservation station table 84 entry RS0. The reservation and dispatch circuit 38 also sets the entry valid flag of the reservation station table 84 entry RS0.
The reservation and dispatch circuit 38 receives the source data values src1/src2 data 100 and 2000 and corresponding valid flags over the source data bus 58. The reservation and dispatch circuit 38 stores the source data values src1/src2 data 100 and 2000 and corresponding valid flags into the SRC1/SRC2 and V fields of the reservation station table 84 entry RS0.
The reservation and dispatch circuit 38 receives a load buffer identifier lbid = 4 for the physical micro-op ld 100, 35, 42 from the allocator circuit 36 over the load buffer ID bus 72. The reservation and dispatch circuit 38 stores the load buffer identifier lbid = 4 into the LBID field of the reservation station table 84 entry RS0.
The reservation and dispatch circuit 38 dispatches the load memory physical micro-op ld 100, 2000, 42, lbid = 4 to the address generation circuit 100 over the micro-op dispatch bus 60. The address generation circuit 100 converts the source data values 100, 2000 into a linear address 32100 according to segment register values. The address generation circuit 100 then transfers a corresponding linear load memory micro-op ld 32100, 42, lbid = 4 to the memory ordering circuit 102 over the linear operation bus 90.
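The arithmetic behind the linear address 32100 can be checked with a short sketch. The text says only that the address is formed "according to segment register values"; the segment base of 0x30000 used here is an assumption chosen so that the numbers agree, and the 32-bit wrap of the result is likewise an assumption. The function name `linear_address` is hypothetical.

```python
def linear_address(displacement, base_value, segment_base):
    """Add segment base, register value and displacement (assumed 32-bit wrap)."""
    return (segment_base + base_value + displacement) & 0xFFFFFFFF


ASSUMED_SEGMENT_BASE = 0x30000  # not given in the text; picked to match 32100 hex

# src1 = constant 100 hex, src2 = contents of EBX (2000 hex).
addr = linear_address(0x100, 0x2000, ASSUMED_SEGMENT_BASE)
```

All values in the example are hexadecimal, as in the micro-op notation of the figures.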
The memory ordering circuit 102 receives the linear load memory micro-op ld 32100, 42, lbid = 4 over the linear operation bus 90. The memory ordering circuit 102 stores the linear load memory micro-op ld 32100, 42, lbid = 4 into entry LB4 of the load buffer 88 as specified by the corresponding load buffer identifier lbid = 4. The memory ordering circuit 102 sets "valid" load status for load buffer entry LB4.
The memory ordering circuit 102 also contains an older linear load memory micro-op ld 31000, 41 in entry LB3 of the load buffer 88 having a "complete" status. The "complete" status indicates that the linear load memory micro-op ld 31000, 41 has been dispatched to the DTLB circuit 104 for execution. The load buffer entry LB3 contains physical address bits 6-19 equal to 1040 hex, which corresponds to a physical address equal to 41000 generated by the DTLB circuit 104 for the linear address 31000.
Figure 12 illustrates the dispatch and retirement of the linear load memory micro-op ld 32100, 42, lbid = 4. The memory ordering circuit 102 dispatches the linear load memory micro-op ld 32100, 42 from the load buffer entry LB4 to the DTLB circuit 104 over the linear operation bus 90. The memory ordering circuit 102 then sets "complete" status for the load buffer entry LB4.
The DTLB circuit 104 generates a physical address equal to 42100 for the linear address 32100 of the linear load memory micro-op ld 32100, 42. The DTLB circuit 104 performs a read access of the data cache circuit 106 at physical address 42100 over the read bus 94.
The memory ordering circuit 102 receives the physical address bits 6-19 equal to 1084 corresponding to the linear load memory micro-op ld 32100, 42 over the physical address bus 96. The memory ordering circuit 102 stores the physical address bits 6-19 equal to 1084 into the physical address field of the load buffer entry LB4.
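The values 1084 and 1040 quoted for physical address bits 6-19 follow directly from the full physical addresses 42100 and 41000 (all hexadecimal). A brief check, using a hypothetical helper `phys_bits_6_19`:

```python
def phys_bits_6_19(addr):
    """Extract the 14-bit field formed by bits 6 through 19 of an address."""
    return (addr >> 6) & 0x3FFF


lb4_field = phys_bits_6_19(0x42100)  # load buffer entry LB4
lb3_field = phys_bits_6_19(0x41000)  # load buffer entry LB3
```

Shifting right by six bits discards the byte offset within a 64-byte block, so two addresses in the same block yield the same field.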
The data cache circuit 106 transfers the result data value equal to 225 for the read to physical address 42100, a corresponding valid bit, and fault data for the read access to the reorder circuit 42 and the reservation and dispatch circuit 38 over the result bus 62. The result bus 62 also carries the physical destination 42 corresponding to the result data for the linear load memory micro-op ld 32100, 42.
The reorder circuit 42 stores the result data equal to 225 and corresponding valid bit into entry RE42 of the reorder buffer 82 as specified by the physical destination 42 on the result bus 62.
The memory ordering circuit 102 senses a snoop hit on the multiprocessor bus 28 for an external store having the physical address bits 6-19 equal to 1084. The physical address bits 6-19 equal to 1084 correspond to entry LB4 of the load buffer 88. The memory ordering circuit 102 sets the snoop hit flag for entry LB4 to indicate the processor ordering snoop hit.
The memory ordering circuit 102 also senses a snoop hit on the multiprocessor bus 28 for an external store having the physical address bits 6-19 equal to 1040. The memory ordering circuit 102 does not set the snoop hit flag for entry LB3 because the linear load memory micro-op ld 31000, 41 is the oldest linear load memory micro-op in the load buffer 88.
The memory ordering circuit 102 then receives a set of retirement physical destinations 40, 41, 42 from the retire logic circuit 46 over the retire notification bus 70. In response, the memory ordering circuit 102 issues the memory ordering restart signals 76 to indicate a possible processor ordering violation for the linear load memory micro-op ld 32100, 42.
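The sequence of events in this example (two external stores followed by a retirement group) can be replayed with the following sketch. It is a simplified model, not the hardware: the names `LoadEntry`, `snoop` and `restarts` are hypothetical, and the match uses the single bits 6-19 field quoted in the example rather than the split comparison of Figure 9.

```python
from dataclasses import dataclass


@dataclass
class LoadEntry:
    pdst: int                  # physical destination of the load
    phys_6_19: int             # stored physical address bits 6-19
    complete: bool = True
    snoop_enable: bool = True  # cleared for the oldest load in the buffer
    snoop_hit: bool = False


def snoop(entries, store_addr):
    """Flag every enabled, complete entry whose address field matches the store."""
    tag = (store_addr >> 6) & 0x3FFF
    for entry in entries:
        if entry.complete and entry.snoop_enable and entry.phys_6_19 == tag:
            entry.snoop_hit = True


def restarts(entries, retiring):
    """Return the retiring physical destinations that carry a snoop hit."""
    flagged = {entry.pdst for entry in entries if entry.snoop_hit}
    return [pdst for pdst in retiring if pdst in flagged]


lb3 = LoadEntry(pdst=41, phys_6_19=0x1040, snoop_enable=False)  # oldest load
lb4 = LoadEntry(pdst=42, phys_6_19=0x1084)
load_buffer = [lb3, lb4]

snoop(load_buffer, 0x42100)  # external store matching LB4
snoop(load_buffer, 0x41000)  # external store matching LB3, but snooping is disabled
violations = restarts(load_buffer, [40, 41, 42])
```

Only physical destination 42 is reported, which matches the text: the store to LB3's address is ignored because that entry holds the oldest load.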
The memory ordering restart signals 76 cause the restart circuit 48 to issue the reorder clear signal 78. The reorder clear signal 78 causes the reorder circuit 42 to clear the speculative result data for the unretired physical micro-ops in the reorder buffer 82, and causes the reservation and dispatch circuit 38 to clear the pending physical micro-ops that await dispatch to the execution circuit 40. The reorder clear signal 78 also causes the allocator circuit 36 to reset the allocation pointer for allocating the physical registers in the reorder circuit 42, and causes the retire logic circuit 46 to reset the retirement pointer for retiring the physical registers.
The restart circuit 48 uses the macro instruction pointer delta values received over the macro instruction pointer offset bus 120 to calculate a restart instruction pointer value. The restart circuit 48 transfers the restart instruction pointer value to the instruction fetch and micro-op issue circuit 32 over the restart vector bus 122.
The reorder clear signal 78 causes the micro-instruction sequencer of the instruction fetch and micro-op issue circuit 32 to reissue the in-order stream of logical micro-ops that were cleared from the reorder circuit 42 before retirement.
In the foregoing specification the invention has been described with reference to specific exemplary embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims. The specification and drawings are accordingly to be regarded in an illustrative rather than a restrictive sense.
Claims (1)
1 A method for processor ordering in a multiprocessor computer system, comprising the steps of:
fetching a load memory instruction from an external memory in a sequential program order, the load memory instruction specifying a load memory operation from a memory address over a multiprocessor bus; executing the load memory instruction and snooping the multiprocessor bus for a processor ordering conflict at the memory address; committing the load memory instruction to an architectural state in the sequential program order if the external store operation to the memory address is not detected; reexecuting the load memory instruction if the external store operation to the memory address is detected.
2 The method of claim 1, wherein the step of fetching a load memory instruction comprises the step of fetching an instruction stream from the external memory in the sequential program order, the instruction stream comprising the load memory instruction, the load memory instruction specifying a load memory operation from the memory address over the multiprocessor bus.
3 The method of claim 2, wherein the step of executing the load memory instruction comprises the steps of:
assembling at least one source data value for the load memory instruction, such that the source data value specifies the memory address for the load memory instruction; executing the load memory instruction after the source data value for the load memory instruction is assembled, the executed load memory instruction generating a result data value.
4 The method of claim 3, wherein the step of snooping the multiprocessor bus for a processor ordering conflict at the memory address comprises the step of snooping the multiprocessor bus for an external store operation to the memory address of the executed load memory instruction.
5 The method of claim 3, wherein the step of snooping the multiprocessor bus for a processor ordering conflict at the memory address comprises the step of snooping the multiprocessor bus for an external read for ownership operation to the memory address of the executed load memory instruction.
6 The method of claim 4, wherein the step of committing the load memory instruction to an architectural state comprises the step of committing the result data value to the architectural state in the sequential program order if the external store operation to the memory address of the executed load memory instruction is not detected.
7 The method of claim 6, wherein the step of reexecuting the load memory instruction if the external store operation to the memory address is detected comprises the steps of discarding the result data value, then reexecuting the instruction stream starting with the load memory instruction corresponding to the discarded result data value if the external store operation to the memory address of the executed load memory instruction is detected before the result data value is committed to the architectural state.
8 The method of claim 7, wherein the step of executing the load memory instruction after the source data value for the load memory instruction is assembled comprises the steps of:
determining a linear memory address for the load memory instruction from the source data value for the load memory instruction; storing the load memory instruction and the linear memory address in an available load buffer entry of a load buffer; converting the linear address of the load memory instruction into a physical address, and storing the physical address into the load buffer entry; performing a load memory operation from the physical address, and setting a complete status for the load buffer entry, the load memory operation generating the result data value.
9 The method of claim 8, wherein the step of snooping the multiprocessor bus for an external store operation to the memory address of the executed load memory instruction comprises the steps of:
snooping the multiprocessor bus for an external store operation to the physical address stored in the load buffer entry if the load buffer entry contains the complete status; setting a snoop hit flag in the load buffer entry if the external store operation to the physical address stored in the load buffer entry is detected.
10 The method of claim 9, wherein the step of committing the result data value to an architectural state in the sequential program order comprises the steps of:
generating a retirement pointer in the sequential program order, such that the retirement pointer specifies the result data value stored in a physical register of a reorder buffer, and specifies the load buffer entry in the load buffer; committing the result data value in the physical register to the architectural state if the snoop hit flag of the load buffer entry is not set.
11 The method of claim 9, wherein the step of discarding the result data value comprises the steps of: generating a retirement pointer in the sequential program order, such that the retirement pointer specifies the result data value stored in a physical register of a reorder buffer, and specifies the load buffer entry in the load buffer; clearing the result data value in the physical register if the snoop hit flag of the load buffer entry is set.
12 An apparatus for processor ordering in a multiprocessor computer system, comprising:
means for fetching a load memory instruction from an external memory in a sequential program order, the load memory instruction specifying a load memory operation from a memory address over a multiprocessor bus; means for executing the load memory instruction and snooping the multiprocessor bus for a processor ordering conflict at the memory address; means for committing the load memory instruction to an architectural state in the sequential program order if the external store operation to the memory address is not detected; means for reexecuting the load memory instruction if the external store operation to the memory address is detected.
13 The apparatus of claim 12, wherein the means for fetching a load memory instruction comprises means for fetching an instruction stream from the external memory in the sequential program order, the instruction stream comprising the load memory instruction, the load memory instruction specifying a load memory operation from the memory address over the multiprocessor bus.
14 The apparatus of claim 13, wherein the means for executing the load memory instruction comprises:
means for assembling at least one source data value for the load memory instruction, such that the source data value specifies the memory address for the load memory instruction; means for executing the load memory instruction after the source data value for the load memory instruction is assembled, the executed load memory instruction generating a result data value.
15 The apparatus of claim 14, wherein the means for snooping the multiprocessor bus for a processor ordering conflict at the memory address comprises means for snooping the multiprocessor bus for an external store operation to the memory address of the executed load memory instruction.
16 The apparatus of claim 14, wherein the means for snooping the multiprocessor bus for a processor ordering conflict at the memory address comprises means for snooping the multiprocessor bus for an external read for ownership operation to the memory address of the executed load memory instruction.
17 The apparatus of claim 15, wherein the means for committing the load memory instruction to an architectural state comprises means for committing the result data value to the architectural state in the sequential program order if the external store operation to the memory address of the executed load memory instruction is not detected.
18 The apparatus of claim 17, wherein the means for reexecuting the load memory instruction if the external store operation to the memory address is detected comprises means for discarding the result data value, and means for reexecuting the instruction stream starting with the load memory instruction corresponding to the discarded result data value if the external store operation to the memory address of the executed load memory instruction is detected before the result data value is committed to the architectural state.
19 The apparatus of claim 18, wherein the means for executing the load memory instruction after the source data value for the load memory instruction is assembled comprises:
means for determining a linear memory address for the load memory instruction from the source data value for the load memory instruction; means for storing the load memory instruction and the linear memory address in an available load buffer entry of a load buffer; means for converting the linear address of the load memory instruction into a physical address, and storing the physical address into the load buffer entry; means for performing a load memory operation from the physical address, and setting a complete status for the load buffer entry, the load memory operation generating the result data value.
20 The apparatus of claim 19, wherein the means for snooping the multiprocessor bus for an external store operation to the memory address of the executed load memory instruction comprises:
means for snooping the multiprocessor bus for an external store operation to the physical address stored in the load buffer entry if the load buffer entry contains the complete status; means for setting a snoop hit flag in the load buffer entry if the external store operation to the physical address stored in the load buffer entry is detected.
21 The apparatus of claim 20, wherein the means for committing the result data value to an architectural state in the sequential program order comprises:
means for generating a retirement pointer in the sequential program order, such that the retirement pointer specifies the result data value stored in a physical register of a reorder buffer, and specifies the load buffer entry in the load buffer; means for committing the result data value in the physical register to the architectural state if the snoop hit flag of the load buffer entry is not set.
22 The apparatus of claim 20, wherein the means for discarding the result data value comprises:
means for generating a retirement pointer in the sequential program order, such that the retirement pointer specifies the result data value stored in a physical register of a reorder buffer, and specifies the load buffer entry in the load buffer; means for clearing the result data value in the physical register if the snoop hit flag of the load buffer entry is set.
23 A method for processor ordering in a multiprocessor computer system, substantially as hereinbefore described.
24 An apparatus for processor ordering in a multiprocessor computer system, substantially as hereinbefore described, with reference to the accompanying drawings.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11266893A | 1993-08-26 | 1993-08-26 |
Publications (3)
Publication Number | Publication Date |
---|---|
GB9408016D0 GB9408016D0 (en) | 1994-06-15 |
GB2281422A true GB2281422A (en) | 1995-03-01 |
GB2281422B GB2281422B (en) | 1997-09-10 |
Family
ID=22345221
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
GB9408016A Expired - Fee Related GB2281422B (en) | 1993-08-26 | 1994-04-22 | Processor ordering consistency for a processor performing out-of-order instruction execution |
Country Status (5)
Country | Link |
---|---|
JP (1) | JPH0784965A (en) |
DE (1) | DE4429921A1 (en) |
GB (1) | GB2281422B (en) |
IE (1) | IE80854B1 (en) |
SG (1) | SG49220A1 (en) |
Cited By (167)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0679993A2 (en) * | 1994-04-28 | 1995-11-02 | Hewlett-Packard Company | A computer apparatus having special instructions to force ordered load and store operations |
US5559975A (en) * | 1994-06-01 | 1996-09-24 | Advanced Micro Devices, Inc. | Program counter update mechanism |
US5574928A (en) * | 1993-10-29 | 1996-11-12 | Advanced Micro Devices, Inc. | Mixed integer/floating point processor core for a superscalar microprocessor with a plurality of operand buses for transferring operand segments |
US5623619A (en) * | 1993-10-29 | 1997-04-22 | Advanced Micro Devices, Inc. | Linearly addressable microprocessor cache |
US5630082A (en) * | 1993-10-29 | 1997-05-13 | Advanced Micro Devices, Inc. | Apparatus and method for instruction queue scanning |
US5632023A (en) * | 1994-06-01 | 1997-05-20 | Advanced Micro Devices, Inc. | Superscalar microprocessor including flag operand renaming and forwarding apparatus |
US5649225A (en) * | 1994-06-01 | 1997-07-15 | Advanced Micro Devices, Inc. | Resynchronization of a superscalar processor |
US5651125A (en) * | 1993-10-29 | 1997-07-22 | Advanced Micro Devices, Inc. | High performance superscalar microprocessor including a common reorder buffer and common register file for both integer and floating point operations |
US5680578A (en) * | 1995-06-07 | 1997-10-21 | Advanced Micro Devices, Inc. | Microprocessor using an instruction field to specify expanded functionality and a computer system employing same |
US5687110A (en) * | 1996-02-20 | 1997-11-11 | Advanced Micro Devices, Inc. | Array having an update circuit for updating a storage location with a value stored in another storage location |
US5689693A (en) * | 1994-04-26 | 1997-11-18 | Advanced Micro Devices, Inc. | Range finding circuit for selecting a consecutive sequence of reorder buffer entries using circular carry lookahead |
US5689672A (en) * | 1993-10-29 | 1997-11-18 | Advanced Micro Devices, Inc. | Pre-decoded instruction cache and method therefor particularly suitable for variable byte-length instructions |
US5696955A (en) * | 1994-06-01 | 1997-12-09 | Advanced Micro Devices, Inc. | Floating point stack and exchange instruction |
US5737550A (en) * | 1995-03-28 | 1998-04-07 | Advanced Micro Devices, Inc. | Cache memory to processor bus interface and method thereof |
US5742791A (en) * | 1996-02-14 | 1998-04-21 | Advanced Micro Devices, Inc. | Apparatus for detecting updates to instructions which are within an instruction processing pipeline of a microprocessor |
US5748978A (en) * | 1996-05-17 | 1998-05-05 | Advanced Micro Devices, Inc. | Byte queue divided into multiple subqueues for optimizing instruction selection logic |
US5752069A (en) * | 1995-08-31 | 1998-05-12 | Advanced Micro Devices, Inc. | Superscalar microprocessor employing away prediction structure |
US5752259A (en) * | 1996-03-26 | 1998-05-12 | Advanced Micro Devices, Inc. | Instruction cache configured to provide instructions to a microprocessor having a clock cycle time less than a cache access time of said instruction cache |
US5758114A (en) * | 1995-04-12 | 1998-05-26 | Advanced Micro Devices, Inc. | High speed instruction alignment unit for aligning variable byte-length instructions according to predecode information in a superscalar microprocessor |
US5761712A (en) * | 1995-06-07 | 1998-06-02 | Advanced Micro Devices | Data memory unit and method for storing data into a lockable cache in one clock cycle by previewing the tag array |
US5764946A (en) * | 1995-04-12 | 1998-06-09 | Advanced Micro Devices | Superscalar microprocessor employing a way prediction unit to predict the way of an instruction fetch address and to concurrently provide a branch prediction address corresponding to the fetch address |
US5765016A (en) * | 1996-09-12 | 1998-06-09 | Advanced Micro Devices, Inc. | Reorder buffer configured to store both speculative and committed register states |
US5765035A (en) * | 1995-11-20 | 1998-06-09 | Advanced Micro Devices, Inc. | Recorder buffer capable of detecting dependencies between accesses to a pair of caches |
US5768610A (en) * | 1995-06-07 | 1998-06-16 | Advanced Micro Devices, Inc. | Lookahead register value generator and a superscalar microprocessor employing same |
US5768555A (en) * | 1997-02-20 | 1998-06-16 | Advanced Micro Devices, Inc. | Reorder buffer employing last in buffer and last in line bits |
US5768574A (en) * | 1995-06-07 | 1998-06-16 | Advanced Micro Devices, Inc. | Microprocessor using an instruction field to expand the condition flags and a computer system employing the microprocessor |
US5781789A (en) * | 1995-08-31 | 1998-07-14 | Advanced Micro Devices, Inc. | Superscaler microprocessor employing a parallel mask decoder |
US5787474A (en) * | 1995-11-20 | 1998-07-28 | Advanced Micro Devices, Inc. | Dependency checking structure for a pair of caches which are accessed from different pipeline stages of an instruction processing pipeline |
US5790821A (en) * | 1996-03-08 | 1998-08-04 | Advanced Micro Devices, Inc. | Control bit vector storage for storing control vectors corresponding to instruction operations in a microprocessor |
US5794028A (en) * | 1996-10-17 | 1998-08-11 | Advanced Micro Devices, Inc. | Shared branch prediction structure |
US5796973A (en) * | 1993-10-29 | 1998-08-18 | Advanced Micro Devices, Inc. | Method and apparatus for decoding one or more complex instructions into concurrently dispatched simple instructions |
US5802588A (en) * | 1995-04-12 | 1998-09-01 | Advanced Micro Devices, Inc. | Load/store unit implementing non-blocking loads for a superscalar microprocessor and method of selecting loads in a non-blocking fashion from a load/store buffer |
US5813033A (en) * | 1996-03-08 | 1998-09-22 | Advanced Micro Devices, Inc. | Superscalar microprocessor including a cache configured to detect dependencies between accesses to the cache and another cache |
US5813045A (en) * | 1996-07-24 | 1998-09-22 | Advanced Micro Devices, Inc. | Conditional early data address generation mechanism for a microprocessor |
US5819080A (en) * | 1996-01-02 | 1998-10-06 | Advanced Micro Devices, Inc. | Microprocessor using an instruction field to specify condition flags for use with branch instructions and a computer system employing the microprocessor |
US5819059A (en) * | 1995-04-12 | 1998-10-06 | Advanced Micro Devices, Inc. | Predecode unit adapted for variable byte-length instruction set processors and method of operating the same |
US5819057A (en) * | 1995-01-25 | 1998-10-06 | Advanced Micro Devices, Inc. | Superscalar microprocessor including an instruction alignment unit with limited dispatch to decode units |
US5822559A (en) * | 1996-01-02 | 1998-10-13 | Advanced Micro Devices, Inc. | Apparatus and method for aligning variable byte-length instructions to a plurality of issue positions |
US5822558A (en) * | 1995-04-12 | 1998-10-13 | Advanced Micro Devices, Inc. | Method and apparatus for predecoding variable byte-length instructions within a superscalar microprocessor |
US5822560A (en) * | 1996-05-23 | 1998-10-13 | Advanced Micro Devices, Inc. | Apparatus for efficient instruction execution via variable issue and variable control vectors per issue |
US5822778A (en) * | 1995-06-07 | 1998-10-13 | Advanced Micro Devices, Inc. | Microprocessor and method of using a segment override prefix instruction field to expand the register file |
US5822574A (en) * | 1995-04-12 | 1998-10-13 | Advanced Micro Devices, Inc. | Functional unit with a pointer for mispredicted resolution, and a superscalar microprocessor employing the same |
US5822575A (en) * | 1996-09-12 | 1998-10-13 | Advanced Micro Devices, Inc. | Branch prediction storage for storing branch prediction information such that a corresponding tag may be routed with the branch instruction |
US5826071A (en) * | 1995-08-31 | 1998-10-20 | Advanced Micro Devices, Inc. | Parallel mask decoder and method for generating said mask |
US5826053A (en) * | 1993-10-29 | 1998-10-20 | Advanced Micro Devices, Inc. | Speculative instruction queue and method therefor particularly suitable for variable byte-length instructions |
US5828873A (en) * | 1997-03-19 | 1998-10-27 | Advanced Micro Devices, Inc. | Assembly queue for a floating point unit |
US5832249A (en) * | 1995-01-25 | 1998-11-03 | Advanced Micro Devices, Inc. | High performance superscalar alignment unit |
US5832297A (en) * | 1995-04-12 | 1998-11-03 | Advanced Micro Devices, Inc. | Superscalar microprocessor load/store unit employing a unified buffer and separate pointers for load and store operations |
US5835968A (en) * | 1996-04-17 | 1998-11-10 | Advanced Micro Devices, Inc. | Apparatus for providing memory and register operands concurrently to functional units |
US5835753A (en) * | 1995-04-12 | 1998-11-10 | Advanced Micro Devices, Inc. | Microprocessor with dynamically extendable pipeline stages and a classifying circuit |
US5835511A (en) * | 1996-05-17 | 1998-11-10 | Advanced Micro Devices, Inc. | Method and mechanism for checking integrity of byte enable signals |
US5835744A (en) * | 1995-11-20 | 1998-11-10 | Advanced Micro Devices, Inc. | Microprocessor configured to swap operands in order to minimize dependency checking logic |
US5838943A (en) * | 1996-03-26 | 1998-11-17 | Advanced Micro Devices, Inc. | Apparatus for speculatively storing and restoring data to a cache memory |
US5845323A (en) * | 1995-08-31 | 1998-12-01 | Advanced Micro Devices, Inc. | Way prediction structure for predicting the way of a cache in which an access hits, thereby speeding cache access time |
US5845101A (en) * | 1997-05-13 | 1998-12-01 | Advanced Micro Devices, Inc. | Prefetch buffer for storing instructions prior to placing the instructions in an instruction cache |
US5848287A (en) * | 1996-02-20 | 1998-12-08 | Advanced Micro Devices, Inc. | Superscalar microprocessor including a reorder buffer which detects dependencies between accesses to a pair of caches |
US5850532A (en) * | 1997-03-10 | 1998-12-15 | Advanced Micro Devices, Inc. | Invalid instruction scan unit for detecting invalid predecode data corresponding to instructions being fetched |
US5852727A (en) * | 1997-03-10 | 1998-12-22 | Advanced Micro Devices, Inc. | Instruction scanning unit for locating instructions via parallel scanning of start and end byte information |
US5854921A (en) * | 1995-08-31 | 1998-12-29 | Advanced Micro Devices, Inc. | Stride-based data address prediction structure |
US5860104A (en) * | 1995-08-31 | 1999-01-12 | Advanced Micro Devices, Inc. | Data cache which speculatively updates a predicted data cache storage location with store data and subsequently corrects mispredicted updates |
US5859992A (en) * | 1997-03-12 | 1999-01-12 | Advanced Micro Devices, Inc. | Instruction alignment using a dispatch list and a latch list |
US5859991A (en) * | 1995-06-07 | 1999-01-12 | Advanced Micro Devices, Inc. | Parallel and scalable method for identifying valid instructions and a superscalar microprocessor including an instruction scanning unit employing the method |
US5859998A (en) * | 1997-03-19 | 1999-01-12 | Advanced Micro Devices, Inc. | Hierarchical microcode implementation of floating point instructions for a microprocessor |
US5862065A (en) * | 1997-02-13 | 1999-01-19 | Advanced Micro Devices, Inc. | Method and circuit for fast generation of zero flag condition code in a microprocessor-based computer |
US5864707A (en) * | 1995-12-11 | 1999-01-26 | Advanced Micro Devices, Inc. | Superscalar microprocessor configured to predict return addresses from a return stack storage |
US5867680A (en) * | 1996-07-24 | 1999-02-02 | Advanced Micro Devices, Inc. | Microprocessor configured to simultaneously dispatch microcode and directly-decoded instructions |
US5870579A (en) * | 1996-11-18 | 1999-02-09 | Advanced Micro Devices, Inc. | Reorder buffer including a circuit for selecting a designated mask corresponding to an instruction that results in an exception |
US5870580A (en) * | 1996-12-13 | 1999-02-09 | Advanced Micro Devices, Inc. | Decoupled forwarding reorder buffer configured to allocate storage in chunks for instructions having unresolved dependencies |
US5870578A (en) * | 1997-12-09 | 1999-02-09 | Advanced Micro Devices, Inc. | Workload balancing in a microprocessor for reduced instruction dispatch stalling |
US5872947A (en) * | 1995-10-24 | 1999-02-16 | Advanced Micro Devices, Inc. | Instruction classification circuit configured to classify instructions into a plurality of instruction types prior to decoding said instructions |
US5872951A (en) * | 1996-07-26 | 1999-02-16 | Advanced Micro Design, Inc. | Reorder buffer having a future file for storing speculative instruction execution results |
US5872946A (en) * | 1997-06-11 | 1999-02-16 | Advanced Micro Devices, Inc. | Instruction alignment unit employing dual instruction queues for high frequency instruction dispatch |
US5872943A (en) * | 1996-07-26 | 1999-02-16 | Advanced Micro Devices, Inc. | Apparatus for aligning instructions using predecoded shift amounts |
US5875315A (en) * | 1995-06-07 | 1999-02-23 | Advanced Micro Devices, Inc. | Parallel and scalable instruction scanning unit |
US5875324A (en) * | 1995-06-07 | 1999-02-23 | Advanced Micro Devices, Inc. | Superscalar microprocessor which delays update of branch prediction information in response to branch misprediction until a subsequent idle clock |
US5878244A (en) * | 1995-01-25 | 1999-03-02 | Advanced Micro Devices, Inc. | Reorder buffer configured to allocate storage capable of storing results corresponding to a maximum number of concurrently receivable instructions regardless of a number of instructions received |
US5878255A (en) * | 1995-06-07 | 1999-03-02 | Advanced Micro Devices, Inc. | Update unit for providing a delayed update to a branch prediction array |
US5881278A (en) * | 1995-10-30 | 1999-03-09 | Advanced Micro Devices, Inc. | Return address prediction system which adjusts the contents of return stack storage to enable continued prediction after a mispredicted branch |
US5881305A (en) * | 1996-12-13 | 1999-03-09 | Advanced Micro Devices, Inc. | Register rename stack for a microprocessor |
US5884058A (en) * | 1996-07-24 | 1999-03-16 | Advanced Micro Devices, Inc. | Method for concurrently dispatching microcode and directly-decoded instructions in a microprocessor |
US5887152A (en) * | 1995-04-12 | 1999-03-23 | Advanced Micro Devices, Inc. | Load/store unit with multiple oldest outstanding instruction pointers for completing store and load/store miss instructions |
US5887185A (en) * | 1997-03-19 | 1999-03-23 | Advanced Micro Devices, Inc. | Interface for coupling a floating point unit to a reorder buffer |
US5893146A (en) * | 1995-08-31 | 1999-04-06 | Advanced Micro Design, Inc. | Cache structure having a reduced tag comparison to enable data transfer from said cache |
US5892936A (en) * | 1995-10-30 | 1999-04-06 | Advanced Micro Devices, Inc. | Speculative register file for storing speculative register states and removing dependencies between instructions utilizing the register |
US5898865A (en) * | 1997-06-12 | 1999-04-27 | Advanced Micro Devices, Inc. | Apparatus and method for predicting an end of loop for string instructions |
US5900012A (en) * | 1995-05-10 | 1999-05-04 | Advanced Micro Devices, Inc. | Storage device having varying access times and a superscalar microprocessor employing the same |
US5901076A (en) * | 1997-04-16 | 1999-05-04 | Advanced Micro Designs, Inc. | Ripple carry shifter in a floating point arithmetic unit of a microprocessor |
US5900013A (en) * | 1996-07-26 | 1999-05-04 | Advanced Micro Devices, Inc. | Dual comparator scheme for detecting a wrap-around condition and generating a cancel signal for removing wrap-around buffer entries |
US5901302A (en) * | 1995-01-25 | 1999-05-04 | Advanced Micro Devices, Inc. | Superscalar microprocessor having symmetrical, fixed issue positions each configured to execute a particular subset of instructions |
US5903910A (en) * | 1995-11-20 | 1999-05-11 | Advanced Micro Devices, Inc. | Method for transferring data between a pair of caches configured to be accessed from different stages of an instruction processing pipeline |
US5903740A (en) * | 1996-07-24 | 1999-05-11 | Advanced Micro Devices, Inc. | Apparatus and method for retiring instructions in excess of the number of accessible write ports |
US5903741A (en) * | 1995-01-25 | 1999-05-11 | Advanced Micro Devices, Inc. | Method of allocating a fixed reorder buffer storage line for execution results regardless of a number of concurrently dispatched instructions |
US5915110A (en) * | 1996-07-26 | 1999-06-22 | Advanced Micro Devices, Inc. | Branch misprediction recovery in a reorder buffer having a future file |
US5918056A (en) * | 1996-05-17 | 1999-06-29 | Advanced Micro Devices, Inc. | Segmentation suspend mode for real-time interrupt support |
US5920710A (en) * | 1996-11-18 | 1999-07-06 | Advanced Micro Devices, Inc. | Apparatus and method for modifying status bits in a reorder buffer with a large speculative state |
US5930492A (en) * | 1997-03-19 | 1999-07-27 | Advanced Micro Devices, Inc. | Rapid pipeline control using a control word and a steering word |
US5933629A (en) * | 1997-06-12 | 1999-08-03 | Advanced Micro Devices, Inc. | Apparatus and method for detecting microbranches early |
US5933626A (en) * | 1997-06-12 | 1999-08-03 | Advanced Micro Devices, Inc. | Apparatus and method for tracing microprocessor instructions |
US5931943A (en) * | 1997-10-21 | 1999-08-03 | Advanced Micro Devices, Inc. | Floating point NaN comparison |
US5933618A (en) * | 1995-10-30 | 1999-08-03 | Advanced Micro Devices, Inc. | Speculative register storage for storing speculative results corresponding to register updated by a plurality of concurrently recorded instruction |
US5940602A (en) * | 1997-06-11 | 1999-08-17 | Advanced Micro Devices, Inc. | Method and apparatus for predecoding variable byte length instructions for scanning of a number of RISC operations |
US5946468A (en) * | 1996-07-26 | 1999-08-31 | Advanced Micro Devices, Inc. | Reorder buffer having an improved future file for storing speculative instruction execution results |
US5954816A (en) * | 1996-11-19 | 1999-09-21 | Advanced Micro Devices, Inc. | Branch selector prediction |
US5961638A (en) * | 1996-11-19 | 1999-10-05 | Advanced Micro Devices, Inc. | Branch prediction mechanism employing branch selectors to select a branch prediction |
US5968163A (en) * | 1997-03-10 | 1999-10-19 | Advanced Micro Devices, Inc. | Microcode scan unit for scanning microcode instructions using predecode data |
US5974542A (en) * | 1997-10-30 | 1999-10-26 | Advanced Micro Devices, Inc. | Branch prediction unit which approximates a larger number of branch predictions using a smaller number of branch predictions and an alternate target indication |
US5974432A (en) * | 1997-12-05 | 1999-10-26 | Advanced Micro Devices, Inc. | On-the-fly one-hot encoding of leading zero count |
US5978906A (en) * | 1996-11-19 | 1999-11-02 | Advanced Micro Devices, Inc. | Branch selectors associated with byte ranges within an instruction cache for rapidly identifying branch predictions |
US5978901A (en) * | 1997-08-21 | 1999-11-02 | Advanced Micro Devices, Inc. | Floating point and multimedia unit with data type reclassification capability |
US5983337A (en) * | 1997-06-12 | 1999-11-09 | Advanced Micro Devices, Inc. | Apparatus and method for patching an instruction by providing a substitute instruction or instructions from an external memory responsive to detecting an opcode of the instruction |
US5983321A (en) * | 1997-03-12 | 1999-11-09 | Advanced Micro Devices, Inc. | Cache holding register for receiving instruction packets and for providing the instruction packets to a predecode unit and instruction cache |
US5987235A (en) * | 1997-04-04 | 1999-11-16 | Advanced Micro Devices, Inc. | Method and apparatus for predecoding variable byte length instructions for fast scanning of instructions |
US5987561A (en) * | 1995-08-31 | 1999-11-16 | Advanced Micro Devices, Inc. | Superscalar microprocessor employing a data cache capable of performing store accesses in a single clock cycle |
US5991869A (en) * | 1995-04-12 | 1999-11-23 | Advanced Micro Devices, Inc. | Superscalar microprocessor including a high speed instruction alignment unit |
US6003128A (en) * | 1997-05-01 | 1999-12-14 | Advanced Micro Devices, Inc. | Number of pipeline stages and loop length related counter differential based end-loop prediction |
US6006324A (en) * | 1995-01-25 | 1999-12-21 | Advanced Micro Devices, Inc. | High performance superscalar alignment unit |
US6009511A (en) * | 1997-06-11 | 1999-12-28 | Advanced Micro Devices, Inc. | Apparatus and method for tagging floating point operands and results for rapid detection of special floating point numbers |
US6012125A (en) * | 1997-06-20 | 2000-01-04 | Advanced Micro Devices, Inc. | Superscalar microprocessor including a decoded instruction cache configured to receive partially decoded instructions |
US6016533A (en) * | 1997-12-16 | 2000-01-18 | Advanced Micro Devices, Inc. | Way prediction logic for cache array |
US6016545A (en) * | 1997-12-16 | 2000-01-18 | Advanced Micro Devices, Inc. | Reduced size storage apparatus for storing cache-line-related data in a high frequency microprocessor |
US6018798A (en) * | 1997-12-18 | 2000-01-25 | Advanced Micro Devices, Inc. | Floating point unit using a central window for storing instructions capable of executing multiple instructions in a single clock cycle |
US6032252A (en) * | 1997-10-28 | 2000-02-29 | Advanced Micro Devices, Inc. | Apparatus and method for efficient loop control in a superscalar microprocessor |
US6073230A (en) * | 1997-06-11 | 2000-06-06 | Advanced Micro Devices, Inc. | Instruction fetch unit configured to provide sequential way prediction for sequential instruction fetches |
US6079005A (en) * | 1997-11-20 | 2000-06-20 | Advanced Micro Devices, Inc. | Microprocessor including virtual address branch prediction and current page register to provide page portion of virtual and physical fetch address |
US6079003A (en) * | 1997-11-20 | 2000-06-20 | Advanced Micro Devices, Inc. | Reverse TLB for providing branch target address in a microprocessor having a physically-tagged cache |
US6085302A (en) * | 1996-04-17 | 2000-07-04 | Advanced Micro Devices, Inc. | Microprocessor having address generation units for efficient generation of memory operation addresses |
US6101577A (en) * | 1997-09-15 | 2000-08-08 | Advanced Micro Devices, Inc. | Pipelined instruction cache and branch prediction mechanism therefor |
US6108769A (en) * | 1996-05-17 | 2000-08-22 | Advanced Micro Devices, Inc. | Dependency table for reducing dependency checking hardware |
US6112296A (en) * | 1997-12-18 | 2000-08-29 | Advanced Micro Devices, Inc. | Floating point stack manipulation using a register map and speculative top of stack values |
US6112018A (en) * | 1997-12-18 | 2000-08-29 | Advanced Micro Devices, Inc. | Apparatus for exchanging two stack registers |
US6119223A (en) * | 1998-07-31 | 2000-09-12 | Advanced Micro Devices, Inc. | Map unit having rapid misprediction recovery |
US6122729A (en) * | 1997-05-13 | 2000-09-19 | Advanced Micro Devices, Inc. | Prefetch buffer which stores a pointer indicating an initial predecode position |
US6122656A (en) * | 1998-07-31 | 2000-09-19 | Advanced Micro Devices, Inc. | Processor configured to map logical register numbers to physical register numbers using virtual register numbers |
US6141745A (en) * | 1998-04-30 | 2000-10-31 | Advanced Micro Devices, Inc. | Functional bit identifying a prefix byte via a particular state regardless of type of instruction |
US6141740A (en) * | 1997-03-03 | 2000-10-31 | Advanced Micro Devices, Inc. | Apparatus and method for microcode patching for generating a next address |
US6154818A (en) * | 1997-11-20 | 2000-11-28 | Advanced Micro Devices, Inc. | System and method of controlling access to privilege partitioned address space for a model specific register file |
US6157986A (en) * | 1997-12-16 | 2000-12-05 | Advanced Micro Devices, Inc. | Fast linear tag validation unit for use in microprocessor |
US6157996A (en) * | 1997-11-13 | 2000-12-05 | Advanced Micro Devices, Inc. | Processor programably configurable to execute enhanced variable byte length instructions including predicated execution, three operand addressing, and increased register space |
US6175908B1 (en) | 1998-04-30 | 2001-01-16 | Advanced Micro Devices, Inc. | Variable byte-length instructions using state of function bit of second byte of plurality of instructions bytes as indicative of whether first byte is a prefix byte |
US6175906B1 (en) | 1996-12-06 | 2001-01-16 | Advanced Micro Devices, Inc. | Mechanism for fast revalidation of virtual tags |
US6199154B1 (en) | 1997-11-17 | 2001-03-06 | Advanced Micro Devices, Inc. | Selecting cache to fetch in multi-level cache system based on fetch address source and pre-fetching additional data to the cache for future access |
US6230259B1 (en) | 1997-10-31 | 2001-05-08 | Advanced Micro Devices, Inc. | Transparent extended state save |
US6230262B1 (en) | 1998-07-31 | 2001-05-08 | Advanced Micro Devices, Inc. | Processor configured to selectively free physical registers upon retirement of instructions |
US6233672B1 (en) | 1997-03-06 | 2001-05-15 | Advanced Micro Devices, Inc. | Piping rounding mode bits with floating point instructions to eliminate serialization |
US6237082B1 (en) | 1995-01-25 | 2001-05-22 | Advanced Micro Devices, Inc. | Reorder buffer configured to allocate storage for instruction results corresponding to predefined maximum number of concurrently receivable instructions independent of a number of instructions received |
US6266744B1 (en) | 1999-05-18 | 2001-07-24 | Advanced Micro Devices, Inc. | Store to load forwarding using a dependency link file |
US6298423B1 (en) | 1993-10-29 | 2001-10-02 | Advanced Micro Devices, Inc. | High performance load/store functional unit and data cache |
US6393536B1 (en) | 1999-05-18 | 2002-05-21 | Advanced Micro Devices, Inc. | Load/store unit employing last-in-buffer indication for rapid load-hit-store |
US6415360B1 (en) | 1999-05-18 | 2002-07-02 | Advanced Micro Devices, Inc. | Minimizing self-modifying code checks for uncacheable memory types |
US6427193B1 (en) | 1999-05-18 | 2002-07-30 | Advanced Micro Devices, Inc. | Deadlock avoidance using exponential backoff |
US6442707B1 (en) | 1999-10-29 | 2002-08-27 | Advanced Micro Devices, Inc. | Alternate fault handler |
US6473832B1 (en) | 1999-05-18 | 2002-10-29 | Advanced Micro Devices, Inc. | Load/store unit having pre-cache and post-cache queues for low latency load memory operations |
US6473837B1 (en) | 1999-05-18 | 2002-10-29 | Advanced Micro Devices, Inc. | Snoop resynchronization mechanism to preserve read ordering |
US6516395B1 (en) | 1997-11-20 | 2003-02-04 | Advanced Micro Devices, Inc. | System and method for controlling access to a privilege-partitioned address space with a fixed set of attributes |
US6604190B1 (en) | 1995-06-07 | 2003-08-05 | Advanced Micro Devices, Inc. | Data address prediction structure and a method for operating the same |
US6662280B1 (en) | 1999-11-10 | 2003-12-09 | Advanced Micro Devices, Inc. | Store buffer which forwards data based on index and optional way match |
US6732234B1 (en) | 2000-08-07 | 2004-05-04 | Broadcom Corporation | Direct access mode for a cache |
US6738792B1 (en) | 2001-03-09 | 2004-05-18 | Advanced Micro Devices, Inc. | Parallel mask generator |
US6748495B2 (en) | 2001-05-15 | 2004-06-08 | Broadcom Corporation | Random generator |
US6748492B1 (en) | 2000-08-07 | 2004-06-08 | Broadcom Corporation | Deterministic setting of replacement policy in a cache through way selection |
US6848024B1 (en) | 2000-08-07 | 2005-01-25 | Broadcom Corporation | Programmably disabling one or more cache entries |
US6877084B1 (en) | 2000-08-09 | 2005-04-05 | Advanced Micro Devices, Inc. | Central processing unit (CPU) accessing an extended register set in an extended register mode |
US6981132B2 (en) | 2000-08-09 | 2005-12-27 | Advanced Micro Devices, Inc. | Uniform register addressing using prefix byte |
US6988168B2 (en) | 2002-05-15 | 2006-01-17 | Broadcom Corporation | Cache programmable to partition ways to agents and/or local/remote blocks |
US7117290B2 (en) | 2003-09-03 | 2006-10-03 | Advanced Micro Devices, Inc. | MicroTLB and micro tag for reducing power in a processor |
US7321964B2 (en) | 2003-07-08 | 2008-01-22 | Advanced Micro Devices, Inc. | Store-to-load forwarding buffer using indexed lookup |
US10311191B2 (en) | 2017-01-26 | 2019-06-04 | Advanced Micro Devices, Inc. | Memory including side-car arrays with irregular sized entries |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6438664B1 (en) | 1999-10-27 | 2002-08-20 | Advanced Micro Devices, Inc. | Microcode patch device and method for patching microcode using match registers and patch routines |
JP4584124B2 (en) * | 2005-11-24 | 2010-11-17 | エヌイーシーコンピュータテクノ株式会社 | Information processing apparatus, error processing method thereof, and control program |
US8775777B2 (en) | 2007-08-15 | 2014-07-08 | Nvidia Corporation | Techniques for sourcing immediate values from a VLIW |
US8521800B1 (en) | 2007-08-15 | 2013-08-27 | Nvidia Corporation | Interconnected arithmetic logic units |
US8599208B2 (en) | 2007-08-15 | 2013-12-03 | Nvidia Corporation | Shared readable and writeable global values in a graphics processor unit pipeline |
US8314803B2 (en) | 2007-08-15 | 2012-11-20 | Nvidia Corporation | Buffering deserialized pixel data in a graphics processor unit pipeline |
US9317251B2 (en) | 2012-12-31 | 2016-04-19 | Nvidia Corporation | Efficient correction of normalizer shift amount errors in fused multiply add operations |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB1582815A (en) * | 1977-02-03 | 1981-01-14 | Siemens Ag | Data processing system |
1994
- 1994-04-12 IE IE940337A patent/IE80854B1/en not_active IP Right Cessation
- 1994-04-22 GB GB9408016A patent/GB2281422B/en not_active Expired - Fee Related
- 1994-04-22 SG SG1996007762A patent/SG49220A1/en unknown
- 1994-08-11 JP JP20941894A patent/JPH0784965A/en active Pending
- 1994-08-23 DE DE19944429921 patent/DE4429921A1/en not_active Withdrawn
Cited By (231)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6298423B1 (en) | 1993-10-29 | 2001-10-02 | Advanced Micro Devices, Inc. | High performance load/store functional unit and data cache |
US5651125A (en) * | 1993-10-29 | 1997-07-22 | Advanced Micro Devices, Inc. | High performance superscalar microprocessor including a common reorder buffer and common register file for both integer and floating point operations |
US5796973A (en) * | 1993-10-29 | 1998-08-18 | Advanced Micro Devices, Inc. | Method and apparatus for decoding one or more complex instructions into concurrently dispatched simple instructions |
US5574928A (en) * | 1993-10-29 | 1996-11-12 | Advanced Micro Devices, Inc. | Mixed integer/floating point processor core for a superscalar microprocessor with a plurality of operand buses for transferring operand segments |
US5623619A (en) * | 1993-10-29 | 1997-04-22 | Advanced Micro Devices, Inc. | Linearly addressable microprocessor cache |
US5630082A (en) * | 1993-10-29 | 1997-05-13 | Advanced Micro Devices, Inc. | Apparatus and method for instruction queue scanning |
US5826053A (en) * | 1993-10-29 | 1998-10-20 | Advanced Micro Devices, Inc. | Speculative instruction queue and method therefor particularly suitable for variable byte-length instructions |
US5970235A (en) * | 1993-10-29 | 1999-10-19 | Advanced Micro Devices, Inc. | Pre-decoded instruction cache and method therefor particularly suitable for variable byte-length instructions |
US6189087B1 (en) | 1993-10-29 | 2001-02-13 | Advanced Micro Devices, Inc. | Superscalar instruction decoder including an instruction queue |
US5896518A (en) * | 1993-10-29 | 1999-04-20 | Advanced Micro Devices, Inc. | Instruction queue scanning using opcode identification |
US6240484B1 (en) | 1993-10-29 | 2001-05-29 | Advanced Micro Devices, Inc. | Linearly addressable microprocessor cache |
US5689672A (en) * | 1993-10-29 | 1997-11-18 | Advanced Micro Devices, Inc. | Pre-decoded instruction cache and method therefor particularly suitable for variable byte-length instructions |
US5689693A (en) * | 1994-04-26 | 1997-11-18 | Advanced Micro Devices, Inc. | Range finding circuit for selecting a consecutive sequence of reorder buffer entries using circular carry lookahead |
US5996067A (en) * | 1994-04-26 | 1999-11-30 | Advanced Micro Devices, Inc. | Range finding circuit for selecting a consecutive sequence of reorder buffer entries using circular carry lookahead |
US6286095B1 (en) | 1994-04-28 | 2001-09-04 | Hewlett-Packard Company | Computer apparatus having special instructions to force ordered load and store operations |
EP0679993A2 (en) * | 1994-04-28 | 1995-11-02 | Hewlett-Packard Company | A computer apparatus having special instructions to force ordered load and store operations |
EP0679993A3 (en) * | 1994-04-28 | 1996-09-11 | Hewlett Packard Co | A computer apparatus having special instructions to force ordered load and store operations. |
US5857089A (en) * | 1994-06-01 | 1999-01-05 | Advanced Micro Devices, Inc. | Floating point stack and exchange instruction |
US5649225A (en) * | 1994-06-01 | 1997-07-15 | Advanced Micro Devices, Inc. | Resynchronization of a superscalar processor |
US5799162A (en) * | 1994-06-01 | 1998-08-25 | Advanced Micro Devices, Inc. | Program counter update mechanism |
US5559975A (en) * | 1994-06-01 | 1996-09-24 | Advanced Micro Devices, Inc. | Program counter update mechanism |
US6035386A (en) * | 1994-06-01 | 2000-03-07 | Advanced Micro Devices, Inc. | Program counter update mechanism |
US5696955A (en) * | 1994-06-01 | 1997-12-09 | Advanced Micro Devices, Inc. | Floating point stack and exchange instruction |
US6351801B1 (en) | 1994-06-01 | 2002-02-26 | Advanced Micro Devices, Inc. | Program counter update mechanism |
US5632023A (en) * | 1994-06-01 | 1997-05-20 | Advanced Micro Devices, Inc. | Superscalar microprocessor including flag operand renaming and forwarding apparatus |
US5901302A (en) * | 1995-01-25 | 1999-05-04 | Advanced Micro Devices, Inc. | Superscalar microprocessor having symmetrical, fixed issue positions each configured to execute a particular subset of instructions |
US5878244A (en) * | 1995-01-25 | 1999-03-02 | Advanced Micro Devices, Inc. | Reorder buffer configured to allocate storage capable of storing results corresponding to a maximum number of concurrently receivable instructions regardless of a number of instructions received |
US5819057A (en) * | 1995-01-25 | 1998-10-06 | Advanced Micro Devices, Inc. | Superscalar microprocessor including an instruction alignment unit with limited dispatch to decode units |
US6026482A (en) * | 1995-01-25 | 2000-02-15 | Advanced Micro Devices, Inc. | Recorder buffer and a method for allocating a fixed amount of storage for instruction results independent of a number of concurrently dispatched instructions |
US6237082B1 (en) | 1995-01-25 | 2001-05-22 | Advanced Micro Devices, Inc. | Reorder buffer configured to allocate storage for instruction results corresponding to predefined maximum number of concurrently receivable instructions independent of a number of instructions received |
US5903741A (en) * | 1995-01-25 | 1999-05-11 | Advanced Micro Devices, Inc. | Method of allocating a fixed reorder buffer storage line for execution results regardless of a number of concurrently dispatched instructions |
US6006324A (en) * | 1995-01-25 | 1999-12-21 | Advanced Micro Devices, Inc. | High performance superscalar alignment unit |
US6134651A (en) * | 1995-01-25 | 2000-10-17 | Advanced Micro Devices, Inc. | Reorder buffer employed in a microprocessor to store instruction results having a plurality of entries predetermined to correspond to a plurality of functional units |
US6381689B2 (en) | 1995-01-25 | 2002-04-30 | Advanced Micro Devices, Inc. | Line-oriented reorder buffer configured to selectively store a memory operation result in one of the plurality of reorder buffer storage locations corresponding to the executed instruction |
US5832249A (en) * | 1995-01-25 | 1998-11-03 | Advanced Micro Devices, Inc. | High performance superscalar alignment unit |
US6393549B1 (en) | 1995-01-25 | 2002-05-21 | Advanced Micro Devices, Inc. | Instruction alignment unit for routing variable byte-length instructions |
US5737550A (en) * | 1995-03-28 | 1998-04-07 | Advanced Micro Devices, Inc. | Cache memory to processor bus interface and method thereof |
US5802588A (en) * | 1995-04-12 | 1998-09-01 | Advanced Micro Devices, Inc. | Load/store unit implementing non-blocking loads for a superscalar microprocessor and method of selecting loads in a non-blocking fashion from a load/store buffer |
US5819059A (en) * | 1995-04-12 | 1998-10-06 | Advanced Micro Devices, Inc. | Predecode unit adapted for variable byte-length instruction set processors and method of operating the same |
US5887152A (en) * | 1995-04-12 | 1999-03-23 | Advanced Micro Devices, Inc. | Load/store unit with multiple oldest outstanding instruction pointers for completing store and load/store miss instructions |
US5822558A (en) * | 1995-04-12 | 1998-10-13 | Advanced Micro Devices, Inc. | Method and apparatus for predecoding variable byte-length instructions within a superscalar microprocessor |
US5991869A (en) * | 1995-04-12 | 1999-11-23 | Advanced Micro Devices, Inc. | Superscalar microprocessor including a high speed instruction alignment unit |
US5835753A (en) * | 1995-04-12 | 1998-11-10 | Advanced Micro Devices, Inc. | Microprocessor with dynamically extendable pipeline stages and a classifying circuit |
US5822574A (en) * | 1995-04-12 | 1998-10-13 | Advanced Micro Devices, Inc. | Functional unit with a pointer for mispredicted resolution, and a superscalar microprocessor employing the same |
US5758114A (en) * | 1995-04-12 | 1998-05-26 | Advanced Micro Devices, Inc. | High speed instruction alignment unit for aligning variable byte-length instructions according to predecode information in a superscalar microprocessor |
US5832297A (en) * | 1995-04-12 | 1998-11-03 | Advanced Micro Devices, Inc. | Superscalar microprocessor load/store unit employing a unified buffer and separate pointers for load and store operations |
US5764946A (en) * | 1995-04-12 | 1998-06-09 | Advanced Micro Devices, Inc. | Superscalar microprocessor employing a way prediction unit to predict the way of an instruction fetch address and to concurrently provide a branch prediction address corresponding to the fetch address |
US5900012A (en) * | 1995-05-10 | 1999-05-04 | Advanced Micro Devices, Inc. | Storage device having varying access times and a superscalar microprocessor employing the same |
US5768610A (en) * | 1995-06-07 | 1998-06-16 | Advanced Micro Devices, Inc. | Lookahead register value generator and a superscalar microprocessor employing same |
US5761712A (en) * | 1995-06-07 | 1998-06-02 | Advanced Micro Devices, Inc. | Data memory unit and method for storing data into a lockable cache in one clock cycle by previewing the tag array |
US6604190B1 (en) | 1995-06-07 | 2003-08-05 | Advanced Micro Devices, Inc. | Data address prediction structure and a method for operating the same |
US5822778A (en) * | 1995-06-07 | 1998-10-13 | Advanced Micro Devices, Inc. | Microprocessor and method of using a segment override prefix instruction field to expand the register file |
US5680578A (en) * | 1995-06-07 | 1997-10-21 | Advanced Micro Devices, Inc. | Microprocessor using an instruction field to specify expanded functionality and a computer system employing same |
US5878255A (en) * | 1995-06-07 | 1999-03-02 | Advanced Micro Devices, Inc. | Update unit for providing a delayed update to a branch prediction array |
US5768574A (en) * | 1995-06-07 | 1998-06-16 | Advanced Micro Devices, Inc. | Microprocessor using an instruction field to expand the condition flags and a computer system employing the microprocessor |
US5875324A (en) * | 1995-06-07 | 1999-02-23 | Advanced Micro Devices, Inc. | Superscalar microprocessor which delays update of branch prediction information in response to branch misprediction until a subsequent idle clock |
US5875315A (en) * | 1995-06-07 | 1999-02-23 | Advanced Micro Devices, Inc. | Parallel and scalable instruction scanning unit |
US5859991A (en) * | 1995-06-07 | 1999-01-12 | Advanced Micro Devices, Inc. | Parallel and scalable method for identifying valid instructions and a superscalar microprocessor including an instruction scanning unit employing the method |
US5860104A (en) * | 1995-08-31 | 1999-01-12 | Advanced Micro Devices, Inc. | Data cache which speculatively updates a predicted data cache storage location with store data and subsequently corrects mispredicted updates |
US5752069A (en) * | 1995-08-31 | 1998-05-12 | Advanced Micro Devices, Inc. | Superscalar microprocessor employing a way prediction structure |
US5854921A (en) * | 1995-08-31 | 1998-12-29 | Advanced Micro Devices, Inc. | Stride-based data address prediction structure |
US5935239A (en) * | 1995-08-31 | 1999-08-10 | Advanced Micro Devices, Inc. | Parallel mask decoder and method for generating said mask |
US5987561A (en) * | 1995-08-31 | 1999-11-16 | Advanced Micro Devices, Inc. | Superscalar microprocessor employing a data cache capable of performing store accesses in a single clock cycle |
US5845323A (en) * | 1995-08-31 | 1998-12-01 | Advanced Micro Devices, Inc. | Way prediction structure for predicting the way of a cache in which an access hits, thereby speeding cache access time |
US5826071A (en) * | 1995-08-31 | 1998-10-20 | Advanced Micro Devices, Inc. | Parallel mask decoder and method for generating said mask |
US5893146A (en) * | 1995-08-31 | 1999-04-06 | Advanced Micro Devices, Inc. | Cache structure having a reduced tag comparison to enable data transfer from said cache |
US5781789A (en) * | 1995-08-31 | 1998-07-14 | Advanced Micro Devices, Inc. | Superscalar microprocessor employing a parallel mask decoder |
US6079006A (en) * | 1995-08-31 | 2000-06-20 | Advanced Micro Devices, Inc. | Stride-based data address prediction structure |
US6189068B1 (en) | 1995-08-31 | 2001-02-13 | Advanced Micro Devices, Inc. | Superscalar microprocessor employing a data cache capable of performing store accesses in a single clock cycle |
US5872947A (en) * | 1995-10-24 | 1999-02-16 | Advanced Micro Devices, Inc. | Instruction classification circuit configured to classify instructions into a plurality of instruction types prior to decoding said instructions |
US5881278A (en) * | 1995-10-30 | 1999-03-09 | Advanced Micro Devices, Inc. | Return address prediction system which adjusts the contents of return stack storage to enable continued prediction after a mispredicted branch |
US5892936A (en) * | 1995-10-30 | 1999-04-06 | Advanced Micro Devices, Inc. | Speculative register file for storing speculative register states and removing dependencies between instructions utilizing the register |
US5933618A (en) * | 1995-10-30 | 1999-08-03 | Advanced Micro Devices, Inc. | Speculative register storage for storing speculative results corresponding to register updated by a plurality of concurrently recorded instruction |
US5903910A (en) * | 1995-11-20 | 1999-05-11 | Advanced Micro Devices, Inc. | Method for transferring data between a pair of caches configured to be accessed from different stages of an instruction processing pipeline |
US5765035A (en) * | 1995-11-20 | 1998-06-09 | Advanced Micro Devices, Inc. | Reorder buffer capable of detecting dependencies between accesses to a pair of caches |
US5787474A (en) * | 1995-11-20 | 1998-07-28 | Advanced Micro Devices, Inc. | Dependency checking structure for a pair of caches which are accessed from different pipeline stages of an instruction processing pipeline |
US5835744A (en) * | 1995-11-20 | 1998-11-10 | Advanced Micro Devices, Inc. | Microprocessor configured to swap operands in order to minimize dependency checking logic |
US6269436B1 (en) | 1995-12-11 | 2001-07-31 | Advanced Micro Devices, Inc. | Superscalar microprocessor configured to predict return addresses from a return stack storage |
US5864707A (en) * | 1995-12-11 | 1999-01-26 | Advanced Micro Devices, Inc. | Superscalar microprocessor configured to predict return addresses from a return stack storage |
US6014734A (en) * | 1995-12-11 | 2000-01-11 | Advanced Micro Devices, Inc. | Superscalar microprocessor configured to predict return addresses from a return stack storage |
US5819080A (en) * | 1996-01-02 | 1998-10-06 | Advanced Micro Devices, Inc. | Microprocessor using an instruction field to specify condition flags for use with branch instructions and a computer system employing the microprocessor |
US5822559A (en) * | 1996-01-02 | 1998-10-13 | Advanced Micro Devices, Inc. | Apparatus and method for aligning variable byte-length instructions to a plurality of issue positions |
US5742791A (en) * | 1996-02-14 | 1998-04-21 | Advanced Micro Devices, Inc. | Apparatus for detecting updates to instructions which are within an instruction processing pipeline of a microprocessor |
US6073217A (en) * | 1996-02-14 | 2000-06-06 | Advanced Micro Devices, Inc. | Method for detecting updates to instructions which are within an instruction processing pipeline of a microprocessor |
US6389512B1 (en) | 1996-02-14 | 2002-05-14 | Advanced Micro Devices, Inc. | Microprocessor configured to detect updates to instructions outstanding within an instruction processing pipeline and computer system including same |
US5848287A (en) * | 1996-02-20 | 1998-12-08 | Advanced Micro Devices, Inc. | Superscalar microprocessor including a reorder buffer which detects dependencies between accesses to a pair of caches |
US5687110A (en) * | 1996-02-20 | 1997-11-11 | Advanced Micro Devices, Inc. | Array having an update circuit for updating a storage location with a value stored in another storage location |
US6192462B1 (en) | 1996-02-20 | 2001-02-20 | Advanced Micro Devices, Inc. | Superscalar microprocessor including a load/store unit, decode units and a reorder buffer to detect dependencies between access to a stack cache and a data cache |
US5813033A (en) * | 1996-03-08 | 1998-09-22 | Advanced Micro Devices, Inc. | Superscalar microprocessor including a cache configured to detect dependencies between accesses to the cache and another cache |
US5790821A (en) * | 1996-03-08 | 1998-08-04 | Advanced Micro Devices, Inc. | Control bit vector storage for storing control vectors corresponding to instruction operations in a microprocessor |
US6351804B1 (en) | 1996-03-08 | 2002-02-26 | Advanced Micro Devices, Inc. | Control bit vector storage for a microprocessor |
US6157994A (en) * | 1996-03-08 | 2000-12-05 | Advanced Micro Devices, Inc. | Microprocessor employing and method of using a control bit vector storage for instruction execution |
US5838943A (en) * | 1996-03-26 | 1998-11-17 | Advanced Micro Devices, Inc. | Apparatus for speculatively storing and restoring data to a cache memory |
US6233657B1 (en) | 1996-03-26 | 2001-05-15 | Advanced Micro Devices, Inc. | Apparatus and method for performing speculative stores |
US6167510A (en) * | 1996-03-26 | 2000-12-26 | Advanced Micro Devices, Inc. | Instruction cache configured to provide instructions to a microprocessor having a clock cycle time less than a cache access time of said instruction cache |
US6006317A (en) * | 1996-03-26 | 1999-12-21 | Advanced Micro Devices, Inc. | Apparatus and method performing speculative stores |
US5752259A (en) * | 1996-03-26 | 1998-05-12 | Advanced Micro Devices, Inc. | Instruction cache configured to provide instructions to a microprocessor having a clock cycle time less than a cache access time of said instruction cache |
US6085302A (en) * | 1996-04-17 | 2000-07-04 | Advanced Micro Devices, Inc. | Microprocessor having address generation units for efficient generation of memory operation addresses |
US5960467A (en) * | 1996-04-17 | 1999-09-28 | Advanced Micro Devices, Inc. | Apparatus for efficiently providing memory operands for instructions |
US5835968A (en) * | 1996-04-17 | 1998-11-10 | Advanced Micro Devices, Inc. | Apparatus for providing memory and register operands concurrently to functional units |
US6249862B1 (en) | 1996-05-17 | 2001-06-19 | Advanced Micro Devices, Inc. | Dependency table for reducing dependency checking hardware |
US6209084B1 (en) | 1996-05-17 | 2001-03-27 | Advanced Micro Devices, Inc. | Dependency table for reducing dependency checking hardware |
US5918056A (en) * | 1996-05-17 | 1999-06-29 | Advanced Micro Devices, Inc. | Segmentation suspend mode for real-time interrupt support |
US5748978A (en) * | 1996-05-17 | 1998-05-05 | Advanced Micro Devices, Inc. | Byte queue divided into multiple subqueues for optimizing instruction selection logic |
US5835511A (en) * | 1996-05-17 | 1998-11-10 | Advanced Micro Devices, Inc. | Method and mechanism for checking integrity of byte enable signals |
US6108769A (en) * | 1996-05-17 | 2000-08-22 | Advanced Micro Devices, Inc. | Dependency table for reducing dependency checking hardware |
US5822560A (en) * | 1996-05-23 | 1998-10-13 | Advanced Micro Devices, Inc. | Apparatus for efficient instruction execution via variable issue and variable control vectors per issue |
US5903740A (en) * | 1996-07-24 | 1999-05-11 | Advanced Micro Devices, Inc. | Apparatus and method for retiring instructions in excess of the number of accessible write ports |
US6189089B1 (en) | 1996-07-24 | 2001-02-13 | Advanced Micro Devices, Inc. | Apparatus and method for retiring instructions in excess of the number of accessible write ports |
US6161172A (en) * | 1996-07-24 | 2000-12-12 | Advanced Micro Devices, Inc. | Method for concurrently dispatching microcode and directly-decoded instructions in a microprocessor |
US5867680A (en) * | 1996-07-24 | 1999-02-02 | Advanced Micro Devices, Inc. | Microprocessor configured to simultaneously dispatch microcode and directly-decoded instructions |
US6049863A (en) * | 1996-07-24 | 2000-04-11 | Advanced Micro Devices, Inc. | Predecoding technique for indicating locations of opcode bytes in variable byte-length instructions within a superscalar microprocessor |
US5884058A (en) * | 1996-07-24 | 1999-03-16 | Advanced Micro Devices, Inc. | Method for concurrently dispatching microcode and directly-decoded instructions in a microprocessor |
US5813045A (en) * | 1996-07-24 | 1998-09-22 | Advanced Micro Devices, Inc. | Conditional early data address generation mechanism for a microprocessor |
US5900013A (en) * | 1996-07-26 | 1999-05-04 | Advanced Micro Devices, Inc. | Dual comparator scheme for detecting a wrap-around condition and generating a cancel signal for removing wrap-around buffer entries |
US5946468A (en) * | 1996-07-26 | 1999-08-31 | Advanced Micro Devices, Inc. | Reorder buffer having an improved future file for storing speculative instruction execution results |
US5872951A (en) * | 1996-07-26 | 1999-02-16 | Advanced Micro Devices, Inc. | Reorder buffer having a future file for storing speculative instruction execution results |
US5915110A (en) * | 1996-07-26 | 1999-06-22 | Advanced Micro Devices, Inc. | Branch misprediction recovery in a reorder buffer having a future file |
US5872943A (en) * | 1996-07-26 | 1999-02-16 | Advanced Micro Devices, Inc. | Apparatus for aligning instructions using predecoded shift amounts |
US5961634A (en) * | 1996-07-26 | 1999-10-05 | Advanced Micro Devices, Inc. | Reorder buffer having a future file for storing speculative instruction execution results |
US5822575A (en) * | 1996-09-12 | 1998-10-13 | Advanced Micro Devices, Inc. | Branch prediction storage for storing branch prediction information such that a corresponding tag may be routed with the branch instruction |
US5765016A (en) * | 1996-09-12 | 1998-06-09 | Advanced Micro Devices, Inc. | Reorder buffer configured to store both speculative and committed register states |
US5794028A (en) * | 1996-10-17 | 1998-08-11 | Advanced Micro Devices, Inc. | Shared branch prediction structure |
US5920710A (en) * | 1996-11-18 | 1999-07-06 | Advanced Micro Devices, Inc. | Apparatus and method for modifying status bits in a reorder buffer with a large speculative state |
US5870579A (en) * | 1996-11-18 | 1999-02-09 | Advanced Micro Devices, Inc. | Reorder buffer including a circuit for selecting a designated mask corresponding to an instruction that results in an exception |
US5961638A (en) * | 1996-11-19 | 1999-10-05 | Advanced Micro Devices, Inc. | Branch prediction mechanism employing branch selectors to select a branch prediction |
US6247123B1 (en) | 1996-11-19 | 2001-06-12 | Advanced Micro Devices, Inc. | Branch prediction mechanism employing branch selectors to select a branch prediction |
US5978906A (en) * | 1996-11-19 | 1999-11-02 | Advanced Micro Devices, Inc. | Branch selectors associated with byte ranges within an instruction cache for rapidly identifying branch predictions |
US6141748A (en) * | 1996-11-19 | 2000-10-31 | Advanced Micro Devices, Inc. | Branch selectors associated with byte ranges within an instruction cache for rapidly identifying branch predictions |
US5995749A (en) * | 1996-11-19 | 1999-11-30 | Advanced Micro Devices, Inc. | Branch prediction mechanism employing branch selectors to select a branch prediction |
US6279107B1 (en) | 1996-11-19 | 2001-08-21 | Advanced Micro Devices, Inc. | Branch selectors associated with byte ranges within an instruction cache for rapidly identifying branch predictions |
US5954816A (en) * | 1996-11-19 | 1999-09-21 | Advanced Micro Devices, Inc. | Branch selector prediction |
US6175906B1 (en) | 1996-12-06 | 2001-01-16 | Advanced Micro Devices, Inc. | Mechanism for fast revalidation of virtual tags |
US5944812A (en) * | 1996-12-13 | 1999-08-31 | Advanced Micro Devices, Inc. | Register rename stack for a microprocessor |
US5881305A (en) * | 1996-12-13 | 1999-03-09 | Advanced Micro Devices, Inc. | Register rename stack for a microprocessor |
US5922069A (en) * | 1996-12-13 | 1999-07-13 | Advanced Micro Devices, Inc. | Reorder buffer which forwards operands independent of storing destination specifiers therein |
US5870580A (en) * | 1996-12-13 | 1999-02-09 | Advanced Micro Devices, Inc. | Decoupled forwarding reorder buffer configured to allocate storage in chunks for instructions having unresolved dependencies |
US5987596A (en) * | 1996-12-13 | 1999-11-16 | Advanced Micro Devices, Inc. | Register rename stack for a microprocessor |
US5862065A (en) * | 1997-02-13 | 1999-01-19 | Advanced Micro Devices, Inc. | Method and circuit for fast generation of zero flag condition code in a microprocessor-based computer |
US6292884B1 (en) | 1997-02-20 | 2001-09-18 | Advanced Micro Devices, Inc. | Reorder buffer employing last in line indication |
US5768555A (en) * | 1997-02-20 | 1998-06-16 | Advanced Micro Devices, Inc. | Reorder buffer employing last in buffer and last in line bits |
US6032251A (en) * | 1997-02-20 | 2000-02-29 | Advanced Micro Devices, Inc. | Computer system including a microprocessor having a reorder buffer employing last in buffer and last in line indications |
US6141740A (en) * | 1997-03-03 | 2000-10-31 | Advanced Micro Devices, Inc. | Apparatus and method for microcode patching for generating a next address |
US6233672B1 (en) | 1997-03-06 | 2001-05-15 | Advanced Micro Devices, Inc. | Piping rounding mode bits with floating point instructions to eliminate serialization |
US5968163A (en) * | 1997-03-10 | 1999-10-19 | Advanced Micro Devices, Inc. | Microcode scan unit for scanning microcode instructions using predecode data |
US5852727A (en) * | 1997-03-10 | 1998-12-22 | Advanced Micro Devices, Inc. | Instruction scanning unit for locating instructions via parallel scanning of start and end byte information |
US6202142B1 (en) | 1997-03-10 | 2001-03-13 | Advanced Micro Devices, Inc. | Microcode scan unit for scanning microcode instructions using predecode data |
US5850532A (en) * | 1997-03-10 | 1998-12-15 | Advanced Micro Devices, Inc. | Invalid instruction scan unit for detecting invalid predecode data corresponding to instructions being fetched |
US6076146A (en) * | 1997-03-12 | 2000-06-13 | Advanced Micro Devices, Inc. | Cache holding register for delayed update of a cache line into an instruction cache |
US5859992A (en) * | 1997-03-12 | 1999-01-12 | Advanced Micro Devices, Inc. | Instruction alignment using a dispatch list and a latch list |
US5983321A (en) * | 1997-03-12 | 1999-11-09 | Advanced Micro Devices, Inc. | Cache holding register for receiving instruction packets and for providing the instruction packets to a predecode unit and instruction cache |
US5859998A (en) * | 1997-03-19 | 1999-01-12 | Advanced Micro Devices, Inc. | Hierarchical microcode implementation of floating point instructions for a microprocessor |
US5887185A (en) * | 1997-03-19 | 1999-03-23 | Advanced Micro Devices, Inc. | Interface for coupling a floating point unit to a reorder buffer |
US5828873A (en) * | 1997-03-19 | 1998-10-27 | Advanced Micro Devices, Inc. | Assembly queue for a floating point unit |
US5930492A (en) * | 1997-03-19 | 1999-07-27 | Advanced Micro Devices, Inc. | Rapid pipeline control using a control word and a steering word |
US5987235A (en) * | 1997-04-04 | 1999-11-16 | Advanced Micro Devices, Inc. | Method and apparatus for predecoding variable byte length instructions for fast scanning of instructions |
US5901076A (en) * | 1997-04-16 | 1999-05-04 | Advanced Micro Devices, Inc. | Ripple carry shifter in a floating point arithmetic unit of a microprocessor |
US6003128A (en) * | 1997-05-01 | 1999-12-14 | Advanced Micro Devices, Inc. | Number of pipeline stages and loop length related counter differential based end-loop prediction |
US6122729A (en) * | 1997-05-13 | 2000-09-19 | Advanced Micro Devices, Inc. | Prefetch buffer which stores a pointer indicating an initial predecode position |
US6367006B1 (en) | 1997-05-13 | 2002-04-02 | Advanced Micro Devices, Inc. | Predecode buffer including buffer pointer indicating another buffer for predecoding |
US5845101A (en) * | 1997-05-13 | 1998-12-01 | Advanced Micro Devices, Inc. | Prefetch buffer for storing instructions prior to placing the instructions in an instruction cache |
US5951675A (en) * | 1997-06-11 | 1999-09-14 | Advanced Micro Devices, Inc. | Instruction alignment unit employing dual instruction queues for high frequency instruction dispatch |
US6009511A (en) * | 1997-06-11 | 1999-12-28 | Advanced Micro Devices, Inc. | Apparatus and method for tagging floating point operands and results for rapid detection of special floating point numbers |
US5872946A (en) * | 1997-06-11 | 1999-02-16 | Advanced Micro Devices, Inc. | Instruction alignment unit employing dual instruction queues for high frequency instruction dispatch |
US6073230A (en) * | 1997-06-11 | 2000-06-06 | Advanced Micro Devices, Inc. | Instruction fetch unit configured to provide sequential way prediction for sequential instruction fetches |
US5940602A (en) * | 1997-06-11 | 1999-08-17 | Advanced Micro Devices, Inc. | Method and apparatus for predecoding variable byte length instructions for scanning of a number of RISC operations |
US6085311A (en) * | 1997-06-11 | 2000-07-04 | Advanced Micro Devices, Inc. | Instruction alignment unit employing dual instruction queues for high frequency instruction dispatch |
US6101595A (en) * | 1997-06-11 | 2000-08-08 | Advanced Micro Devices, Inc. | Fetching instructions from an instruction cache using sequential way prediction |
US5983337A (en) * | 1997-06-12 | 1999-11-09 | Advanced Micro Devices, Inc. | Apparatus and method for patching an instruction by providing a substitute instruction or instructions from an external memory responsive to detecting an opcode of the instruction |
US5933629A (en) * | 1997-06-12 | 1999-08-03 | Advanced Micro Devices, Inc. | Apparatus and method for detecting microbranches early |
US6014741A (en) * | 1997-06-12 | 2000-01-11 | Advanced Micro Devices, Inc. | Apparatus and method for predicting an end of a microcode loop |
US5898865A (en) * | 1997-06-12 | 1999-04-27 | Advanced Micro Devices, Inc. | Apparatus and method for predicting an end of loop for string instructions |
US6009513A (en) * | 1997-06-12 | 1999-12-28 | Advanced Micro Devices, Inc. | Apparatus and method for detecting microbranches early |
US6192468B1 (en) | 1997-06-12 | 2001-02-20 | Advanced Micro Devices, Inc. | Apparatus and method for detecting microbranches early |
US5933626A (en) * | 1997-06-12 | 1999-08-03 | Advanced Micro Devices, Inc. | Apparatus and method for tracing microprocessor instructions |
US6012125A (en) * | 1997-06-20 | 2000-01-04 | Advanced Micro Devices, Inc. | Superscalar microprocessor including a decoded instruction cache configured to receive partially decoded instructions |
US5978901A (en) * | 1997-08-21 | 1999-11-02 | Advanced Micro Devices, Inc. | Floating point and multimedia unit with data type reclassification capability |
US6101577A (en) * | 1997-09-15 | 2000-08-08 | Advanced Micro Devices, Inc. | Pipelined instruction cache and branch prediction mechanism therefor |
US5931943A (en) * | 1997-10-21 | 1999-08-03 | Advanced Micro Devices, Inc. | Floating point NaN comparison |
US6032252A (en) * | 1997-10-28 | 2000-02-29 | Advanced Micro Devices, Inc. | Apparatus and method for efficient loop control in a superscalar microprocessor |
US5974542A (en) * | 1997-10-30 | 1999-10-26 | Advanced Micro Devices, Inc. | Branch prediction unit which approximates a larger number of branch predictions using a smaller number of branch predictions and an alternate target indication |
US6230259B1 (en) | 1997-10-31 | 2001-05-08 | Advanced Micro Devices, Inc. | Transparent extended state save |
US6157996A (en) * | 1997-11-13 | 2000-12-05 | Advanced Micro Devices, Inc. | Processor programably configurable to execute enhanced variable byte length instructions including predicated execution, three operand addressing, and increased register space |
US6199154B1 (en) | 1997-11-17 | 2001-03-06 | Advanced Micro Devices, Inc. | Selecting cache to fetch in multi-level cache system based on fetch address source and pre-fetching additional data to the cache for future access |
US6367001B1 (en) | 1997-11-17 | 2002-04-02 | Advanced Micro Devices, Inc. | Processor including efficient fetch mechanism for L0 and L1 caches |
US6505292B1 (en) | 1997-11-17 | 2003-01-07 | Advanced Micro Devices, Inc. | Processor including efficient fetch mechanism for L0 and L1 caches |
US6079005A (en) * | 1997-11-20 | 2000-06-20 | Advanced Micro Devices, Inc. | Microprocessor including virtual address branch prediction and current page register to provide page portion of virtual and physical fetch address |
US6154818A (en) * | 1997-11-20 | 2000-11-28 | Advanced Micro Devices, Inc. | System and method of controlling access to privilege partitioned address space for a model specific register file |
US6516395B1 (en) | 1997-11-20 | 2003-02-04 | Advanced Micro Devices, Inc. | System and method for controlling access to a privilege-partitioned address space with a fixed set of attributes |
US6266752B1 (en) | 1997-11-20 | 2001-07-24 | Advanced Micro Devices, Inc. | Reverse TLB for providing branch target address in a microprocessor having a physically-tagged cache |
US6079003A (en) * | 1997-11-20 | 2000-06-20 | Advanced Micro Devices, Inc. | Reverse TLB for providing branch target address in a microprocessor having a physically-tagged cache |
US5974432A (en) * | 1997-12-05 | 1999-10-26 | Advanced Micro Devices, Inc. | On-the-fly one-hot encoding of leading zero count |
US5870578A (en) * | 1997-12-09 | 1999-02-09 | Advanced Micro Devices, Inc. | Workload balancing in a microprocessor for reduced instruction dispatch stalling |
US6157986A (en) * | 1997-12-16 | 2000-12-05 | Advanced Micro Devices, Inc. | Fast linear tag validation unit for use in microprocessor |
US6115792A (en) * | 1997-12-16 | 2000-09-05 | Advanced Micro Devices, Inc. | Way prediction logic for cache array |
US6016545A (en) * | 1997-12-16 | 2000-01-18 | Advanced Micro Devices, Inc. | Reduced size storage apparatus for storing cache-line-related data in a high frequency microprocessor |
US6016533A (en) * | 1997-12-16 | 2000-01-18 | Advanced Micro Devices, Inc. | Way prediction logic for cache array |
US6112018A (en) * | 1997-12-18 | 2000-08-29 | Advanced Micro Devices, Inc. | Apparatus for exchanging two stack registers |
US6205541B1 (en) * | 1997-12-18 | 2001-03-20 | Advanced Micro Devices, Inc. | System and method using selection logic units to define stack orders |
US6018798A (en) * | 1997-12-18 | 2000-01-25 | Advanced Micro Devices, Inc. | Floating point unit using a central window for storing instructions capable of executing multiple instructions in a single clock cycle |
US6112296A (en) * | 1997-12-18 | 2000-08-29 | Advanced Micro Devices, Inc. | Floating point stack manipulation using a register map and speculative top of stack values |
US6175908B1 (en) | 1998-04-30 | 2001-01-16 | Advanced Micro Devices, Inc. | Variable byte-length instructions using state of function bit of second byte of plurality of instructions bytes as indicative of whether first byte is a prefix byte |
US6141745A (en) * | 1998-04-30 | 2000-10-31 | Advanced Micro Devices, Inc. | Functional bit identifying a prefix byte via a particular state regardless of type of instruction |
US6119223A (en) * | 1998-07-31 | 2000-09-12 | Advanced Micro Devices, Inc. | Map unit having rapid misprediction recovery |
US6247106B1 (en) | 1998-07-31 | 2001-06-12 | Advanced Micro Devices, Inc. | Processor configured to map logical register numbers to physical register numbers using virtual register numbers |
US6230262B1 (en) | 1998-07-31 | 2001-05-08 | Advanced Micro Devices, Inc. | Processor configured to selectively free physical registers upon retirement of instructions |
US6122656A (en) * | 1998-07-31 | 2000-09-19 | Advanced Micro Devices, Inc. | Processor configured to map logical register numbers to physical register numbers using virtual register numbers |
US6415360B1 (en) | 1999-05-18 | 2002-07-02 | Advanced Micro Devices, Inc. | Minimizing self-modifying code checks for uncacheable memory types |
US6393536B1 (en) | 1999-05-18 | 2002-05-21 | Advanced Micro Devices, Inc. | Load/store unit employing last-in-buffer indication for rapid load-hit-store |
US6473832B1 (en) | 1999-05-18 | 2002-10-29 | Advanced Micro Devices, Inc. | Load/store unit having pre-cache and post-cache queues for low latency load memory operations |
US6473837B1 (en) | 1999-05-18 | 2002-10-29 | Advanced Micro Devices, Inc. | Snoop resynchronization mechanism to preserve read ordering |
US6427193B1 (en) | 1999-05-18 | 2002-07-30 | Advanced Micro Devices, Inc. | Deadlock avoidance using exponential backoff |
US6266744B1 (en) | 1999-05-18 | 2001-07-24 | Advanced Micro Devices, Inc. | Store to load forwarding using a dependency link file |
US6549990B2 (en) | 1999-05-18 | 2003-04-15 | Advanced Micro Devices, Inc. | Store to load forwarding using a dependency link file |
US6442707B1 (en) | 1999-10-29 | 2002-08-27 | Advanced Micro Devices, Inc. | Alternate fault handler |
US6662280B1 (en) | 1999-11-10 | 2003-12-09 | Advanced Micro Devices, Inc. | Store buffer which forwards data based on index and optional way match |
US6961824B2 (en) | 2000-08-07 | 2005-11-01 | Broadcom Corporation | Deterministic setting of replacement policy in a cache |
US6732234B1 (en) | 2000-08-07 | 2004-05-04 | Broadcom Corporation | Direct access mode for a cache |
US7228386B2 (en) | 2000-08-07 | 2007-06-05 | Broadcom Corporation | Programmably disabling one or more cache entries |
US7177986B2 (en) | 2000-08-07 | 2007-02-13 | Broadcom Corporation | Direct access mode for a cache |
US6748492B1 (en) | 2000-08-07 | 2004-06-08 | Broadcom Corporation | Deterministic setting of replacement policy in a cache through way selection |
US6848024B1 (en) | 2000-08-07 | 2005-01-25 | Broadcom Corporation | Programmably disabling one or more cache entries |
US6981132B2 (en) | 2000-08-09 | 2005-12-27 | Advanced Micro Devices, Inc. | Uniform register addressing using prefix byte |
US6877084B1 (en) | 2000-08-09 | 2005-04-05 | Advanced Micro Devices, Inc. | Central processing unit (CPU) accessing an extended register set in an extended register mode |
US6738792B1 (en) | 2001-03-09 | 2004-05-18 | Advanced Micro Devices, Inc. | Parallel mask generator |
US7000076B2 (en) | 2001-05-15 | 2006-02-14 | Broadcom Corporation | Random generator |
US6748495B2 (en) | 2001-05-15 | 2004-06-08 | Broadcom Corporation | Random generator |
US6988168B2 (en) | 2002-05-15 | 2006-01-17 | Broadcom Corporation | Cache programmable to partition ways to agents and/or local/remote blocks |
US7321964B2 (en) | 2003-07-08 | 2008-01-22 | Advanced Micro Devices, Inc. | Store-to-load forwarding buffer using indexed lookup |
US7117290B2 (en) | 2003-09-03 | 2006-10-03 | Advanced Micro Devices, Inc. | MicroTLB and micro tag for reducing power in a processor |
US10311191B2 (en) | 2017-01-26 | 2019-06-04 | Advanced Micro Devices, Inc. | Memory including side-car arrays with irregular sized entries |
Also Published As
Publication number | Publication date |
---|---|
DE4429921A1 (en) | 1995-03-09 |
SG49220A1 (en) | 1998-05-18 |
GB9408016D0 (en) | 1994-06-15 |
GB2281422B (en) | 1997-09-10 |
IE80854B1 (en) | 1999-04-07 |
JPH0784965A (en) | 1995-03-31 |
IE940337A1 (en) | 1995-03-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
GB2281422A (en) | | Processor ordering consistency for a processor performing out-of-order instruction execution |
US5721855A (en) | | Method for pipeline processing of instructions by controlling access to a reorder buffer using a register file outside the reorder buffer |
US5742780A (en) | | Dual pipeline superscalar reduced instruction set computer system architecture |
US6021485A (en) | | Forwarding store instruction result to load instruction with reduced stall or flushing by effective/real data address bytes matching |
US7552318B2 (en) | | Branch lookahead prefetch for microprocessors |
US6079014A (en) | | Processor that redirects an instruction fetch pipeline immediately upon detection of a mispredicted branch while committing prior instructions to an architectural state |
US5630149A (en) | | Pipelined processor with register renaming hardware to accommodate multiple size registers |
US5751983A (en) | | Out-of-order processor with a memory subsystem which handles speculatively dispatched load operations |
US6341324B1 (en) | | Exception processing in superscalar microprocessor |
US6138230A (en) | | Processor with multiple execution pipelines using pipe stage state information to control independent movement of instructions between pipe stages of an execution pipeline |
US5931957A (en) | | Support for out-of-order execution of loads and stores in a processor |
US6721874B1 (en) | | Method and system for dynamically shared completion table supporting multiple threads in a processing system |
EP1296230B1 (en) | | Instruction issuing in the presence of load misses |
US6845442B1 (en) | | System and method of using speculative operand sources in order to speculatively bypass load-store operations |
EP0649085B1 (en) | | Microprocessor pipe control and register translation |
US5625788A (en) | | Microprocessor with novel instruction for signaling event occurrence and for providing event handling information in response thereto |
US5987600A (en) | | Exception handling in a processor that performs speculative out-of-order instruction execution |
US6728872B1 (en) | | Method and apparatus for verifying that instructions are pipelined in correct architectural sequence |
KR100284788B1 (en) | | Method and system for processing branches during emulation in data processing system |
US5748937A (en) | | Computer system that maintains processor ordering consistency by snooping an external bus for conflicts during out of order execution of memory access instructions |
US5721857A (en) | | Method and apparatus for saving the effective address of floating point memory operations in an out-of-order microprocessor |
KR19990029287A (en) | | Indirect unconditional branch in data processing system emulation mode |
US20040215936A1 (en) | | Method and circuit for using a single rename array in a simultaneous multithread system |
US6073231A (en) | | Pipelined processor with microcontrol of register translation hardware |
US7441109B2 (en) | | Computer system with a debug facility for a pipelined processor using predicated execution |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PCNP | Patent ceased through non-payment of renewal fee | Effective date: 20010422 |