Unit-4 Processor in DPCO

Cs3351 dpco

Uploaded by

SudhaRaman

Unit 4: Instruction Execution, Building a Data Path, Designing a Control Unit, Hardwired Control, Microprogrammed Control, Pipelining, Data Hazard, Control Hazards.

* Please watch the videos before referring to the notes.

I. Instruction Execution

Steps in detail:
- All instructions start by using the program counter to supply the instruction address to the instruction memory.
- After the instruction is fetched, the register operands used by an instruction are specified by fields of that instruction.
- Once the register operands have been fetched, all instruction classes except jump use the ALU after reading the registers.
  > Memory-reference instructions (load or store) use the ALU for an address calculation.
  > Arithmetic-logical instructions use the ALU to execute the operation.
  > Branches use the ALU for comparison.
  The second input to the ALU can come from a register or from the immediate field of the instruction.
- After using the ALU, the actions required to complete the various instruction classes differ.
  > If the operation is a memory-reference instruction (a load or a store), the ALU result is used as an address, either to store a value from the registers or to load a value from memory into the registers. For a load, the value read from memory is then written back into the register file.
  > If the instruction is an arithmetic-logical instruction, the result from the ALU must be written to a register.
  > Branches require the use of the ALU output to determine the next instruction address, which comes either from the ALU (where the PC and branch offset are summed) or from an adder that increments the current PC by 4.

Main 5 steps
1. Fetch an instruction and increment the program counter.
2. Decode the instruction and read registers from the register file.
3. Perform an ALU operation.
4. Read or write memory data if the instruction involves a memory operand.
5. Write the result into the destination register, if needed.

Load Instruction
Eg. Load R5, X(R7)
Steps are as follows:
1. Fetch the instruction from the memory.
2. Increment the program counter.
3. Decode the instruction to determine the operation to be performed.
4. Read register R7.
5. Add the immediate value X to the contents of R7.
6. Use the sum X + [R7] as the effective address of the source operand, and read the contents of that location in the memory.
7. Load the data received from the memory into the destination register, R5.
- Depending on how the hardware is organized, some of these actions can be performed at the same time.

Arithmetic and Logic Instruction
- There are either two source registers, or a source register and an immediate source operand.
- No access to memory operands is required.
Eg. Add R3, R4, R5
Steps are as follows:
1. Fetch the instruction and increment the program counter.
2. Decode the instruction and read registers R4 and R5.
3. Compute the sum [R4] + [R5].
4. No action.
5. Load the result into the destination register, R3.

Store Instruction
Eg. Store R6, X(R8)
Steps are as follows:
1. Fetch the instruction and increment the program counter.
2. Decode the instruction and read registers R6 and R8.
3. Compute the effective address X + [R8].
4. Store the contents of register R6 into memory location X + [R8].
5. No action.

II. Building a Datapath - Diagram is mandatory
(Draw the individual blocks separately first, then draw the final diagram at the end. The individual blocks are covered in the video.)

Datapath
- A datapath is a collection of functional units, such as arithmetic logic units or multipliers, that perform data processing operations, together with registers and buses. Along with the control unit, it makes up the central processing unit (CPU).
- A larger datapath can be made by joining more than one datapath using multiplexers.

1. Program Counter (PC)
A program counter is a register in a computer processor that contains the address (location) of the instruction being executed at the current time.
As each instruction is fetched, the program counter is advanced to the address of the next instruction (PC + 4 in the datapath described here, since each instruction occupies 4 bytes).

2. Adder
Used to increment the PC to the address of the next instruction. It can be built from an ALU wired to always perform an add.

3. Instruction Memory
A memory unit that stores the instructions of a program and supplies an instruction when given an address.

4. Registers
- The processor's 32 general-purpose registers are stored in a structure called a register file. A register file is a collection of registers in which any register can be read or written by specifying the number of the register in the file.
- The register file contains the register state of the computer.
- An ALU is used to operate on the values read from the registers.

5. Processing of an R-format instruction in the ALU
Example: add $t1, $t2, $t3
- R-format instructions have three register operands, so we need to read two data words from the register file and write one data word into the register file for each instruction.
- For each data word to be read from the registers, we need an input to the register file that specifies the register number to be read and an output from the register file that carries the value that has been read. The two values read are added using an ALU.
- To write a data word, we need two inputs: one to specify the register number to be written and one to supply the data to be written into the register.

6. Processing of Load/Store Instructions
Example:
    lw $t1, offset_value($t2)
    sw $t1, offset_value($t2)
1. Sign extend - Converts the 16-bit offset field in the instruction to a 32-bit signed value.
2. Data memory - The memory unit is a state element with inputs for the address and the write data, and a single output for the read result. There are separate read and write controls, although only one of these may be asserted on any given clock.

7. Processing of Branch Instructions
Eg. beq $t1, $t2, offset
Explanation of example: The beq instruction has three operands: two registers that are compared for equality, and an offset used to compute the branch target address.
If the contents of $t1 equal the contents of $t2, compute the target using the offset and take the branch. (The ALU is used to check equality: if the Zero flag is set, then $t1 == $t2.) Otherwise, proceed with the next instruction.
1. Separate adder - Used for computing the branch target address.
2. Shift left - Shifts the sign-extended offset field left by 2 bits, that is, appends two zeros at its low-order end.

Multiplexer - Selects which of several inputs is passed on, depending on the type of instruction.

Control Signals - I have not covered this in the video, but you can read it if required.

Signal name | Effect when deasserted | Effect when asserted
RegDst | The register destination number for the Write register comes from the rt field (bits 20:16). | The register destination number for the Write register comes from the rd field (bits 15:11).
RegWrite | None. | The register on the Write register input is written with the value on the Write data input.
ALUSrc | The second ALU operand comes from the second register file output (Read data 2). | The second ALU operand is the sign-extended, lower 16 bits of the instruction.
PCSrc | The PC is replaced by the output of the adder that computes the value of PC + 4. | The PC is replaced by the output of the adder that computes the branch target.
MemRead | None. | Data memory contents designated by the address input are put on the Read data output.
MemWrite | None. | Data memory contents designated by the address input are replaced by the value on the Write data input.
MemtoReg | The value fed to the register Write data input comes from the ALU. | The value fed to the register Write data input comes from the data memory.

III. Control Unit

The setting of the control signals depends on:
- Contents of the step counter
- Contents of the instruction register
- The result of a computation or a comparison operation
- External input signals, such as interrupt requests

Hardwired Control Unit
- It is a method of generating control signals with the help of finite state machines (FSMs).
It is built as a sequential logic circuit by physically connecting components such as flip-flops, gates, and drums into the finished circuit, which is why it is known as a hardwired controller.
- Instruction register - A processor register that holds the instruction currently being executed. It supplies the opcode bits for the operation as well as the addressing mode of the operands.
- Instruction decoder - Decodes the opcode. Based on the operation and addressing mode held in the instruction register, the decoder sets the corresponding instruction signal INSi to 1.
- Step counter - Specifies the current step of instruction execution. It provides the signals T1, ..., T5; the one signal corresponding to the current step of the instruction is set to 1.
- Clock - Each step takes one clock cycle. For example, if the step counter has set T3 to 1, then after one clock cycle completes it sets T4 to 1.
- Counter Enable - Can disable the step counter so that it holds its value until the current step of execution is complete, and then lets it increment to the next step signal.
- Condition signals - Report the outcome of comparisons such as less than, greater than, less than or equal, greater than or equal, and so on; the control signals generated depend on them.
- External inputs - Tell the control signal generator about interrupts, which affect the execution of an instruction.

Microprogrammed Control Unit
- A control unit whose binary control values are stored as words in memory is called a microprogrammed control unit.
1. Control word: A word whose individual bits represent the various control signals.
2. Micro-routine: The sequence of control words corresponding to the control sequence of a machine instruction constitutes the micro-routine for that instruction.
3. Micro-instruction: The individual control words in a micro-routine are referred to as micro-instructions.
4. Micro-program: A sequence of micro-instructions is called a micro-program, which is stored in a ROM or RAM called the Control Memory (CM).
5. Control store: The micro-routines for all instructions in the instruction set of a computer are stored in a special memory called the control store.

[Figure: organization of a microprogrammed control unit; one micro-instruction holds one control word.]

- The control memory address register specifies the address of the micro-instruction.
- The control memory is assumed to be a ROM, within which all control information is permanently stored.
- The control register holds the micro-instruction fetched from the memory.
- The micro-instruction contains a control word that specifies one or more micro-operations for the data processor.
- While the micro-operations are being executed, the next address is computed in the next address generator circuit and then transferred into the control address register to read the next micro-instruction.
- The next address generator is often referred to as a micro-program sequencer, as it determines the address sequence that is read from control memory.

Attribute | Hardwired control | Microprogrammed control
Speed | Fast | Slow
Cost of implementation | More | Cheaper
Flexibility | Difficult to modify | Flexible
Ability to handle complex instructions | Difficult | Easier
Decoding | Complex | Easy
Application | RISC | CISC
Instruction set size | Small | Large
Control memory | Absent | Present

Pipelining
- Pipelining is an implementation technique in which multiple instructions are overlapped in execution.
- Pipelining arranges the hardware elements of the CPU so that its overall performance is increased.
- In a pipelined processor, more than one instruction is in execution at the same time.
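The overlap described above can be put into numbers with a short sketch. It assumes that every stage takes the same time and that no hazards cause stalls; the function names and parameters are illustrative and do not come from the notes. With n tasks and k stages, sequential execution needs n * k stage-times, while an ideal pipeline needs k stage-times for the first task and one more for each task after it.

```python
def sequential_time(num_tasks: int, num_stages: int, stage_time: float) -> float:
    """Total time when each task finishes every stage before the next one starts."""
    return num_tasks * num_stages * stage_time


def pipelined_time(num_tasks: int, num_stages: int, stage_time: float) -> float:
    """Total time for an ideal pipeline (equal-length stages, no stalls)."""
    if num_tasks == 0:
        return 0.0
    # The first task needs num_stages steps; every later task
    # completes one step after the task in front of it.
    return (num_stages + num_tasks - 1) * stage_time


if __name__ == "__main__":
    # Four tasks through four 30-minute stages.
    seq = sequential_time(4, 4, 30)
    pipe = pipelined_time(4, 4, 30)
    print(f"sequential: {seq / 60} hours")  # 8.0 hours
    print(f"pipelined:  {pipe / 60} hours")  # 3.5 hours
```

These values agree with the laundry analogy that follows: 8 hours sequentially versus 3.5 hours pipelined for 4 loads.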
Real Life Example - For the explanation, refer to my video.

FIGURE 4.25 The laundry analogy for pipelining. Ann, Brian, Cathy, and Don each have dirty clothes to be washed, dried, folded, and put away. The washer, dryer, "folder," and "storer" each take 30 minutes for their task. Sequential laundry takes 8 hours for 4 loads of wash, while pipelined laundry takes just 3.5 hours. We show the pipeline stage of different loads over time by showing copies of the four resources on this two-dimensional time line, but we really have just one of each resource.

Design of a basic pipeline
- In a pipelined processor, a pipeline has two ends, the input end and the output end. Between these ends there are multiple stages (segments), such that the output of one stage is connected to the input of the next stage and each stage performs a specific operation.
- Interface registers are used to hold the intermediate output between two stages. These interface registers are also called latches or buffers.
- All the stages in the pipeline, along with the interface registers, are controlled by a common clock.

Diagrammatic Representation of No-Pipeline vs Pipeline
[Figure: execution of successive instructions without pipelining and in the pipelined version.]

Main points:
- Instruction pipelining is a technique that implements a form of parallelism called instruction-level parallelism within a single processor.
- Multiple instructions are executed in parallel.
- Staging:
  - The hardware of the CPU is split up into several functional units.
  - Each functional unit performs a dedicated task.
  - The number of functional units may vary from processor to processor.
  - These functional units are called the stages of the pipeline.
  - The control unit manages all the stages using control signals.
  - There is a register associated with each stage that holds the data.
  - There is a global clock that synchronizes the working of all the stages.
  - At the beginning of each clock cycle, each stage takes the input from its register.
  - Each stage then processes the data and feeds its output to the register of the next stage.

Advantages vs Drawbacks

Advantages of Pipelining
- Instruction throughput increases.
- An increase in the number of pipeline stages increases the number of instructions executed simultaneously.
- A faster ALU can be designed when pipelining is used.
- Pipelined CPUs work at higher clock frequencies than the RAM.
- Pipelining increases the overall performance of the CPU.

Disadvantages of Pipelining
- Designing a pipelined processor is complex.
- Instruction latency increases in pipelined processors.
- The throughput of a pipelined processor is difficult to predict.
- The longer the pipeline, the worse the problem of hazards for branch instructions.

Hazards
There are situations in pipelining when the next instruction cannot execute in the following clock cycle. These events are called hazards, and there are three different types:
1) Structural Hazard
2) Data Hazard
3) Control Hazard

1. Structural Hazard
- It means that the hardware cannot support the combination of instructions that we want to execute in the same clock cycle.
- This dependency arises from a resource conflict in the pipeline. A resource conflict is a situation in which more than one instruction tries to access the same resource in the same cycle. A resource can be a register, memory, or ALU.

Cycle | 1 | 2 | 3 | 4
I1 | IF(Mem) | ID | EX | Mem
I2 | | IF(Mem) | ID | EX
I3 | | | IF(Mem) | ID
I4 | | | | IF(Mem)

- In the above scenario, in cycle 4, instructions I1 and I4 are trying to access the same resource (memory), which introduces a resource conflict.
- To avoid this problem, the instruction has to wait until the required resource (memory in our case) becomes available. This wait introduces stalls in the pipeline, as shown below; the fetch of I4 is held back until memory is free.

Cycle | 1 | 2 | 3 | 4 | 5 | 6 | 7
I1 | IF(Mem) | ID | EX | Mem | WB | |
I2 | | IF(Mem) | ID | EX | Mem | WB |
I3 | | | IF(Mem) | ID | EX | Mem | WB

2. Data Hazard
Data hazards occur when instructions that exhibit data dependence modify data in different stages of a pipeline.
    I1: ADD R1, R2, R3
    I2: SUB R5, R1, R2
- When the above instructions are executed in a pipelined processor, a data dependency arises: I2 tries to read R1 before I1 has written it, so I2 incorrectly gets the old value of R1.
- To minimize data dependency stalls in the pipeline, operand forwarding is used.

Operand Forwarding: In operand forwarding, the interface registers between the stages hold the intermediate output, so that a dependent instruction can read the new value directly from the interface register.

FIGURE 4.29 Graphical representation of forwarding. The connection shows the forwarding path from the output of the EX stage of add to the input of the EX stage for sub, replacing the value from register $s0 read in the second stage of sub.

3. Control Hazard
This type of dependency occurs during the transfer of control by instructions such as BRANCH, CALL, JMP, etc. On many instruction architectures, the processor does not know the target address of these instructions when it needs to insert the next instruction into the pipeline. Because of this, unwanted instructions are fed into the pipeline.

Eg.
    100: I1
    101: I2 (JMP 250)   (jump address known after the ID stage only)
    102: I3
    ...
    250: BI1
Expected output: I1 -> I2 -> BI1

Cycle | 1 | 2 | 3 | 4 | 5 | 6
I1 | IF | ID | EX | MEM | WB |
I2 | | IF | ID (PC:250) | EX | Mem | WB
I3 | | | IF | ID | EX | Mem
BI1 | | | | IF | ID | EX

The output which we get: I1 -> I2 -> I3 -> BI1
So the output sequence is not equal to the expected output, which means the pipeline is not implemented correctly.

1. Using Stalls - To correct the above problem, we need to stop the instruction fetch until we get the target address of the branch instruction. This can be implemented by introducing a delay slot until we get the target address.

Cycle | 1 | 2 | 3 | 4 | 5 | 6
I1 | IF | ID | EX | MEM | WB |
I2 | | IF | ID (PC:250) | EX | Mem | WB
Delay | | | - | - | - | -
BI1 | | | | IF | ID | EX

Output sequence: I1 -> I2 -> Delay (Stall) -> BI1

2. Branch Prediction - There are 2 different types of prediction.
   1. Static
      a. In this strategy the branch is predicted statically, based on the branch code type. That is, the probability of the branch being taken for a particular branch type is used to predict the branch.
      b. This strategy may not produce accurate results every time. One improvement over branch stalling is to predict that the branch will not be taken and thus continue execution down the sequential instruction stream.
   2. Dynamic
      a. This strategy uses recent branch history during program execution to predict whether or not the branch will be taken the next time it occurs. Because it uses recent branch information to predict the next branch, the technique is called dynamic branch prediction.
      b. A branch prediction buffer, or branch history table, is a small memory indexed by the lower portion of the address of the branch instruction. The memory contains a bit that says whether the branch was recently taken or not.

3. Delayed Branching
   1) The branch delay slot is the slot directly after a delayed branch instruction; in the MIPS architecture it is filled by an instruction that does not affect the branch.
   2) The instruction in the branch delay slot always executes after the branch, whether or not the branch is taken.
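The one-bit branch history table described under dynamic prediction can be sketched as follows. This is a minimal illustration rather than a model of any real processor: the class name, the table size of 16 entries, and the assumption of 4-byte instructions (so the two low address bits are dropped before indexing) are choices made for the example only.

```python
class OneBitPredictor:
    """Branch history table with one bit per entry (True = recently taken)."""

    def __init__(self, num_entries: int = 16):
        # num_entries is an assumed size; every entry starts as "not taken".
        self.num_entries = num_entries
        self.table = [False] * num_entries

    def _index(self, branch_address: int) -> int:
        # Index by the lower portion of the branch address.
        # Assumes 4-byte instructions, so the two low bits carry no information.
        return (branch_address >> 2) % self.num_entries

    def predict(self, branch_address: int) -> bool:
        return self.table[self._index(branch_address)]

    def update(self, branch_address: int, taken: bool) -> None:
        # Remember only the most recent outcome of this branch.
        self.table[self._index(branch_address)] = taken


def count_mispredictions(predictor, branch_address, outcomes):
    """Feed a sequence of actual outcomes to the predictor and count misses."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(branch_address) != taken:
            misses += 1
        predictor.update(branch_address, taken)
    return misses


if __name__ == "__main__":
    # A loop branch that is taken 4 times and then falls through, run twice.
    outcomes = [True, True, True, True, False] * 2
    print(count_mispredictions(OneBitPredictor(), 0x400, outcomes))  # prints 4
```

In this run the predictor misses twice per pass through the loop, once on the first iteration and once on the exit, which is the usual weakness of keeping only a single history bit.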
