Unit-4 Processor in DPCO

Cs3351 dpco

Uploaded by

SudhaRaman
Unit 4
Instruction Execution - Building a Data Path - Designing a Control Unit - Hardwired Control, Microprogrammed Control - Pipelining - Data Hazard - Control Hazards.

* Please watch the videos before referring to the notes.

I. Instruction Execution

Steps in detail:
• All instructions start by using the program counter to supply the instruction address to the instruction memory.
• After the instruction is fetched, the register operands used by the instruction are specified by fields of that instruction.
• Once the register operands have been fetched, all the instruction classes, except jump, use the ALU after reading the registers.
  - Memory reference instructions (load or store) use the ALU for an address calculation.
  - Arithmetic-logical instructions use the ALU for the operation execution.
  - Branches use the ALU for comparison.
  The second input to the ALU can come from a register or from the immediate field of the instruction.
• After using the ALU, the actions required to complete the various instruction classes are not the same.
  - If the instruction is a memory reference instruction (a load or store), the ALU result is used as an address to either store a value from the registers or load a value from memory into the registers. The result from the ALU or memory is written back into the register file.
  - If the instruction is an arithmetic-logical instruction, the result from the ALU must be written to a register.
  - Branches require the use of the ALU output to determine the next instruction address, which comes either from the ALU (where the PC and branch offset are summed) or from an adder that increments the current PC by 4.

Main 5 steps
1. Fetch an instruction and increment the program counter.
2. Decode the instruction and read registers from the register file.
3. Perform an ALU operation.
4. Read or write memory data if the instruction involves a memory operand.
5. Write the result into the destination register, if needed.

Load Instruction
Eg. Load R5, X(R7)
Steps are as follows:
1. Fetch the instruction from the memory.
2. Increment the program counter.
3. Decode the instruction to determine the operation to be performed.
4. Read register R7.
5. Add the immediate value X to the contents of R7.
6. Use the sum X + [R7] as the effective address of the source operand, and read the contents of that location in the memory.
7. Load the data received from the memory into the destination register, R5.
• Depending on how the hardware is organized, some of these actions can be performed at the same time.

Arithmetic and Logic Instruction
• There are either two source registers, or a source register and an immediate source operand.
• No access to memory operands is required.
Eg. Add R3, R4, R5
Steps as follows:
1. Fetch the instruction and increment the program counter.
2. Decode the instruction and read registers R4 and R5.
3. Compute the sum [R4] + [R5].
4. No action.
5. Load the result into the destination register, R3.

Store Instruction
Eg. Store R6, X(R8)
Steps as follows:
1. Fetch the instruction and increment the program counter.
2. Decode the instruction and read registers R6 and R8.
3. Compute the effective address X + [R8].
4. Store the contents of register R6 into memory location X + [R8].
5. No action.

II. Building a Datapath - Diagram is Mandatory
(Write individual blocks separately first, then at last draw this final diagram. Individual blocks I have mentioned in the video.)

Datapath
• A datapath is a collection of functional units, such as arithmetic logic units or multipliers that perform data processing operations, together with registers and buses. Along with the control unit, it composes the central processing unit (CPU).
• A larger datapath can be made by joining more than one datapath using multiplexers.

1. Program Counter (PC)
A program counter is a register in a computer processor that contains the address (location) of the instruction being executed at the current time.
As each instruction gets fetched, the program counter is incremented to point to the next instruction (by 4 in this datapath, since each instruction occupies 4 bytes).

2. Adder
Used to increment the PC to the address of the next instruction. It can be built from an ALU set to always perform addition.

3. Instruction Memory
A memory unit that stores the instructions of a program and supplies an instruction given an address.

4. Registers
• The processor's 32 general-purpose registers are stored in a structure called a register file. A register file is a collection of registers in which any register can be read or written by specifying the number of the register in the file.
• The register file contains the register state of the computer.
• An ALU is used to operate on the values read from the registers.

5. Processing of R-format instruction in ALU
Example: add $t1, $t2, $t3
• R-format instructions have three register operands, so we will need to read two data words from the register file and write one data word into the register file for each instruction.
• For each data word to be read from the registers, we need an input to the register file that specifies the register number to be read and an output from the register file that will carry the value that has been read from the registers. The two values read are added using an ALU.
• To write a data word, we will need two inputs: one to specify the register number to be written and one to supply the data to be written into the register.

6. Processing of Load/Store Instruction
Example:
lw $t1, offset_value($t2)
sw $t1, offset_value($t2)
1. Sign Extend - Converts the 16-bit offset field in the instruction to a 32-bit signed value.
2. Data Memory - The memory unit is a state element with inputs for the address and the write data, and a single output for the read result. There are separate read and write controls, although only one of these may be asserted on any given clock.

7. Processing of Branch Instructions
Eg. beq $t1, $t2, offset
Explanation of example: The beq instruction has three operands: two registers that are compared for equality, and an offset used to compute the branch target address.
If contents of $t1 = contents of $t2: compute the target using the offset and take the branch. (The ALU is used to check equality: if the zero flag is set, $t1 == $t2.)
Else: proceed with the next instruction.
1. Separate adder - Used for computing the branch target address.
2. Shift left 2 - Used to add two zeroes to the low-order end of the sign-extended offset field.
Multiplexer - Selects which of several inputs is passed on to a unit, depending on the nature of the instruction.

Control Signals - I have not covered this in video. But you can read if required.

Signal name: effect when deasserted / effect when asserted

RegDst
  Deasserted: The register destination number for the Write register comes from the rt field (bits 20:16).
  Asserted: The register destination number for the Write register comes from the rd field (bits 15:11).
RegWrite
  Deasserted: None.
  Asserted: The register on the Write register input is written with the value on the Write data input.
ALUSrc
  Deasserted: The second ALU operand comes from the second register file output (Read data 2).
  Asserted: The second ALU operand is the sign-extended, lower 16 bits of the instruction.
PCSrc
  Deasserted: The PC is replaced by the output of the adder that computes the value of PC + 4.
  Asserted: The PC is replaced by the output of the adder that computes the branch target.
MemRead
  Deasserted: None.
  Asserted: Data memory contents designated by the address input are put on the Read data output.
MemWrite
  Deasserted: None.
  Asserted: Data memory contents designated by the address input are replaced by the value on the Write data input.
MemtoReg
  Deasserted: The value fed to the register Write data input comes from the ALU.
  Asserted: The value fed to the register Write data input comes from the data memory.

III. Control Unit
The setting of the control signals depends on:
• Contents of the step counter
• Contents of the instruction register
• The result of a computation or a comparison operation
• External input signals, such as interrupt requests

Hardwired Control Unit
• It is a method of generating control signals with the help of Finite State Machines (FSM).
  It's made in the form of a sequential logic circuit by physically connecting components such as flip-flops, gates, and drums that result in the finished circuit. As a result, it's known as a hardwired controller.
• Instruction register is a type of processor register used to contain the instruction that is currently in execution. It supplies the opcode bits for the operation as well as the addressing mode of the operands.
• The instruction decoder decodes the opcode. On the basis of the operation and addressing mode held in the instruction register, the instruction decoder sets the corresponding instruction signal INSi to 1.
• Step Counter - Specifies the current step of instruction execution. It produces the signals T1, ..., T5. Depending on the step the instruction is in, one of the signals T1 to T5 is set to 1.
• Clock - One clock cycle is completed for each step. For example, if the step counter sets T3 to 1, then after completing one clock cycle, the step counter will set T4 to 1.
• Counter Enable - Can "disable" the step counter so that it holds until the current step of execution is complete, and then increments to the next step signal.
• Condition Signals - Signals generated from the results of comparisons, such as less than, greater than, less than or equal, greater than or equal, and many more.
• External input - Tells the control signal generator about interrupts, which affect the execution of an instruction.

Microprogrammed Control Unit
• A control unit whose binary control values are saved as words in memory is called a microprogrammed control unit.
1. Control Word: A control word is a word whose individual bits represent various control signals.
2. Micro-routine: A sequence of control words corresponding to the control sequence of a machine instruction constitutes the micro-routine for that instruction.
3. Micro-instruction: Individual control words in this micro-routine are referred to as microinstructions.
4. Micro-program: A sequence of micro-instructions is called a micro-program, which is stored in a ROM or RAM called a Control Memory (CM).
5. Control Store: The micro-routines for all instructions in the instruction set of a computer are stored in a special memory called the Control Store.

[Figure: one microinstruction, i.e. one control word.]

The control memory address register specifies the address of the microinstruction. The control memory is assumed to be a ROM, within which all control information is permanently stored. The control register holds the microinstruction fetched from the memory. The microinstruction contains a control word that specifies one or more micro-operations for the data processor. While the micro-operations are being executed, the next address is computed in the next address generator circuit and then transferred into the control address register to read the next microinstruction. The next address generator is often referred to as a micro-program sequencer, as it determines the address sequence that is read from control memory.

Attribute                                     Hardwired             Microprogrammed
Speed                                         Fast                  Slow
Cost of implementation                        More                  Cheaper
Flexibility                                   Difficult to modify   Flexible
Ability to handle complex instructions        Difficult             Easier
Decoding                                      Complex               Easy
Application                                   RISC                  CISC
Instruction set size                          Small                 Large
Control memory                                Absent                Present

Pipelining
• Pipelining is an implementation technique in which multiple instructions are overlapped in execution.
• Pipelining is an arrangement of the hardware elements of the CPU such that its overall performance is increased.
• Simultaneous execution of more than one instruction takes place in a pipelined processor.
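
To make the overlap concrete, here is a small Python sketch that counts steps with and without pipelining. It assumes an ideal pipeline in which every stage takes one cycle and nothing stalls. The function names are illustrative only and are not from the notes.

```python
def sequential_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles when each instruction finishes every stage before the next one starts."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles for an ideal pipeline: fill it once, then complete one instruction per cycle."""
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)


def stage_schedule(num_instructions: int, stages: list[str]) -> list[list[str]]:
    """Row i shows which stage instruction i occupies in each cycle ('' means not in the pipeline)."""
    total = pipelined_cycles(num_instructions, len(stages))
    rows = []
    for i in range(num_instructions):
        row = [""] * total
        # Instruction i enters the first stage in cycle i (0-based), one cycle behind instruction i-1.
        for s, name in enumerate(stages):
            row[i + s] = name
        rows.append(row)
    return rows
```

For 4 tasks and 4 stages this gives 16 steps sequentially and 7 steps pipelined. At 30 minutes per step that is 8 hours against 3.5 hours.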
Real Life Example - For explanation, refer to my video.

FIGURE 4.25 The laundry analogy for pipelining. Ann, Brian, Cathy, and Don each have dirty clothes to be washed, dried, folded, and put away. The washer, dryer, "folder," and "storer" each take 30 minutes for their task. Sequential laundry takes 8 hours for 4 loads of wash, while pipelined laundry takes just 3.5 hours. We show the pipeline stage of different loads over time by showing copies of the four resources on this two-dimensional time line, but we really have just one of each resource.

Design of a basic pipeline
• In a pipelined processor, a pipeline has two ends, the input end and the output end. Between these ends, there are multiple stages/segments such that the output of one stage is connected to the input of the next stage and each stage performs a specific operation.
• Interface registers are used to hold the intermediate output between two stages. These interface registers are also called latches or buffers.
• All the stages in the pipeline, along with the interface registers, are controlled by a common clock.

Diagrammatic Representation of No-Pipeline vs Pipeline
[Figure: non-pipelined and pipelined versions of a sequence of instructions.]

Main points:
• Instruction pipelining is a technique that implements a form of parallelism called instruction-level parallelism within a single processor.
• Multiple instructions are executed in parallel.
• Staging:
  - The hardware of the CPU is split up into several functional units.
  - Each functional unit performs a dedicated task.
  - The number of functional units may vary from processor to processor.
  - These functional units are called the stages of the pipeline.
• Control unit manages all the stages using control signals.
• There is a register associated with each stage that holds the data.
• There is a global clock that synchronizes the working of all the stages.
• At the beginning of each clock cycle, each stage takes the input from its register.
• Each stage then processes the data and feeds its output to the register of the next stage.

Advantages vs Drawbacks

Advantages of Pipelining
• Instruction throughput increases.
• Increase in the number of pipeline stages increases the number of instructions executed simultaneously.
• Faster ALU can be designed when pipelining is used.
• Pipelined CPUs work at higher clock frequencies than the RAM.
• Pipelining increases the overall performance of the CPU.

Disadvantages of Pipelining
• Designing of the pipelined processor is complex.
• Instruction latency increases in pipelined processors.
• The throughput of a pipelined processor is difficult to predict.
• The longer the pipeline, the worse the problem of hazards for branch instructions.

Hazards
There are situations in pipelining when the next instruction cannot execute in the following clock cycle. These events are called hazards, and there are three different types.
1) Structural Hazard
2) Data Hazard
3) Control Hazard

1. Structural Hazard
• It means that the hardware cannot support the combination of instructions that we want to execute in the same clock cycle.
• This dependency arises due to a resource conflict in the pipeline. A resource conflict is a situation in which more than one instruction tries to access the same resource in the same cycle. A resource can be a register, memory, or ALU.
[Figure: pipeline timing diagram in which the Mem stage of I1 and the IF(Mem) stage of I4 fall in the same cycle.]
• In the above scenario, in cycle 4, instructions I1 and I4 are trying to access the same resource (memory), which introduces a resource conflict.
• To avoid this problem, the instruction has to wait until the required resource (memory in our case) becomes available. This wait introduces stalls in the pipeline as shown below:
[Figure: pipeline timing diagram with stalls; each instruction passes through IF(Mem), ID, EX, Mem, WB.]

2. Data Hazard
Data hazards occur when instructions that exhibit data dependence modify data in different stages of a pipeline.
I1: ADD R1, R2, R3
I2: SUB R5, R1, R2
• When the above instructions are executed in a pipelined processor, a data dependency condition occurs: I2 tries to read R1 before I1 writes it, and therefore I2 incorrectly gets the old value.
• To minimize data dependency stalls in the pipeline, operand forwarding is used.

Operand Forwarding: In operand forwarding, we use the interface registers present between the stages to hold intermediate output so that the dependent instruction can access the new value from the interface register directly.

FIGURE 4.29 Graphical representation of forwarding. The connection shows the forwarding path from the output of the EX stage of add to the input of the EX stage for sub, replacing the value from register $s0 read in the second stage of sub.

3. Control Hazard
This type of dependency occurs during the transfer of control instructions such as BRANCH, CALL, JMP, etc. On many instruction architectures, the processor will not know the target address of these instructions when it needs to insert the new instruction into the pipeline. Due to this, unwanted instructions are fed to the pipeline.

Eg.
100: I1
101: I2 (JMP 250)   (jump address known after the ID stage only)
102: I3
...
250: BI1

Expected output: I1 -> I2 -> BI1
[Figure: pipeline timing diagram; I2 obtains its target (PC:250) in the ID stage, after I3 has already been fetched.]
The output which we get: I1 -> I2 -> I3 -> BI1
So, the output sequence is not equal to the expected output, which means the pipeline is not implemented correctly.

1. Using Stalls - To correct the above problem, we need to stop the instruction fetch until we get the target address of the branch instruction. This can be implemented by introducing a delay slot until we get the target address.
[Figure: pipeline timing diagram with a stall inserted after I2 (ID, PC:250).]
Output sequence: I1 -> I2 -> Delay (Stall) -> BI1

2. Branch Prediction - There are 2 different types of prediction.
  1. Static
    a. In this strategy, the branch is predicted statically, based on the type of branch code. The probability that a branch of a particular type is taken is used to predict the branch.
    b. This branch strategy may not produce accurate results every time. One improvement over branch stalling is to predict that the branch will not be taken and thus continue execution down the sequential instruction stream.
  2. Dynamic
    a. This strategy uses recent branch history during program execution to predict whether or not the branch will be taken the next time it occurs. This technique is called dynamic branch prediction.
    b. A branch prediction buffer or branch history table is a small memory indexed by the lower portion of the address of the branch instruction. The memory contains a bit that says whether the branch was recently taken or not.

3. Delayed Branching
  1) Branch delay slot: the slot directly after a delayed branch instruction, which in the MIPS architecture is filled by an instruction that does not affect the branch.
  2) The instruction placed in the branch delay slot always executes after the branch, whether or not the branch is taken.
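
The branch history table described under dynamic prediction can be sketched in Python as below. This is a minimal sketch under stated assumptions, not a description of any particular processor: the class and function names are made up for illustration, and the table size of 16 entries, the initial "not taken" state and the dropping of the two low-order address bits (4-byte instructions) are all assumptions.

```python
class BranchHistoryTable:
    """1-bit branch prediction buffer indexed by the low bits of the branch address."""

    def __init__(self, index_bits: int = 4) -> None:
        # Assumption: 2**index_bits entries, all starting as "not taken".
        self.size = 1 << index_bits
        self.taken_bits = [False] * self.size

    def index(self, branch_address: int) -> int:
        # Assumption: instructions are 4 bytes, so the two low-order bits are dropped
        # before taking the lower portion of the address as the table index.
        return (branch_address >> 2) & (self.size - 1)

    def predict(self, branch_address: int) -> bool:
        """Predict taken if the branch stored at this index was taken last time."""
        return self.taken_bits[self.index(branch_address)]

    def update(self, branch_address: int, taken: bool) -> None:
        """Record the actual outcome once the branch is resolved."""
        self.taken_bits[self.index(branch_address)] = taken


def count_mispredictions(table: BranchHistoryTable, branch_address: int,
                         outcomes: list[bool]) -> int:
    """Replay the actual outcomes of one branch and count the wrong predictions."""
    misses = 0
    for taken in outcomes:
        if table.predict(branch_address) != taken:
            misses += 1
        table.update(branch_address, taken)
    return misses
```

For a loop branch that is taken 9 times and then falls through, this 1-bit scheme mispredicts twice: once on the first iteration and once on the exit. Two branches whose addresses share the same low bits use the same entry, so they can disturb each other's prediction.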
