Processor R

Processors

Embedded Systems CSC504


Processor technology
• The architecture of the computation engine used to implement
a system’s desired functionality
• Processor does not have to be programmable
• “Processor” does not necessarily mean a general-purpose processor
[Figure: controller and datapath of three processor types, each implementing "total = 0; for i = 1 to …". General-purpose (“software”): control logic and state register, IR, PC, register file, general ALU, program memory holding assembly code, data memory. Application-specific: control logic and state register, IR, PC, registers, custom ALU, program memory holding assembly code, data memory. Single-purpose (“hardware”): control logic, state register, index and total registers, adder, data memory.]
Processor technology
• Processors vary in their customization for the problem at hand

total = 0
for i = 1 to N loop
total += M[i]
end loop
[Figure: the desired functionality can be implemented on a general-purpose processor, an application-specific processor, or a single-purpose processor.]
General-purpose processors
• Programmable device used in a variety of applications
• Also known as “microprocessor”
• Features
• Program memory
• General datapath with large register file and general ALU
• User benefits
• Low time-to-market and NRE costs
• High flexibility
• “Pentium” is the most well-known, but there are hundreds of others
[Figure: general-purpose processor: controller (control logic and state register, IR, PC) and datapath (register file, general ALU), with program memory holding assembly code for "total = 0; for i = 1 to …" and a data memory.]


Single-purpose processors
• Digital circuit designed to execute exactly one program
• a.k.a. coprocessor, accelerator or peripheral
• Features
• Contains only the components needed to execute a single program
• No program memory
• Benefits
• Fast
• Low power
• Small size
[Figure: single-purpose processor: controller (control logic, state register) and datapath (index and total registers, adder), with a data memory.]
Application-specific processors
• Programmable processor optimized for a particular class of applications having common characteristics
• Compromise between general-purpose and single-purpose processors
• Features
• Program memory
• Optimized datapath
• Special functional units
• Benefits
• Some flexibility, good performance, size and power
[Figure: application-specific processor: controller (control logic and state register, IR, PC) and datapath (registers, custom ALU), with program memory holding assembly code for "total = 0; for i = 1 to …" and a data memory.]
Introduction
• General-Purpose Processor
• Processor designed for a variety of computation tasks
• Low unit cost, in part because manufacturer spreads NRE over large numbers
of units
• Motorola sold half a billion 68HC05 microcontrollers in 1996 alone
• Carefully designed since higher NRE is acceptable
• Can yield good performance, size and power
• Low NRE cost, short time-to-market/prototype, high flexibility
• User just writes software; no processor design
• a.k.a. “microprocessor” – “micro” used when they were implemented on one
or a few chips rather than entire rooms

Basic Architecture
• Control unit and datapath
• Note similarity to single-purpose processor
• Key differences
• Datapath is general
• Control unit doesn’t store the algorithm; the algorithm is “programmed” into the memory
[Figure: processor containing a control unit (controller, PC, IR) and a datapath (ALU, registers) linked by control/status signals, connected to I/O and memory.]

Datapath Operations
• Load
– Read memory location into register
• ALU operation
– Input certain registers through ALU, store back in register
• Store
– Write register to memory location
[Figure: processor datapath example: the value 10 is loaded from memory into a register, the ALU adds 1 to produce 11 in a second register, and 11 is stored back to memory.]

Control Unit
• Control unit: configures the datapath operations
• Sequence of desired operations (“instructions”) stored in memory: the “program”
• Instruction cycle: broken into several sub-operations, each one clock cycle, e.g.:
• Fetch: Get next instruction into IR
• Decode: Determine what the instruction means
• Fetch operands: Move data from memory to datapath register
• Execute: Move data through the ALU
• Store results: Write data from register to memory
[Figure: processor with control unit (controller, PC, IR) and datapath (ALU, registers R0 and R1), connected to I/O and memory.]
Example memory contents:
100 load R0, M[500]
101 inc R1, R0
102 store M[501], R1
500 10
501 ...

Control Unit Sub-Operations
• Fetch
• Get next instruction into IR
• PC: program counter, always points to next instruction
• IR: holds the fetched instruction
[Figure: PC = 100; IR = load R0, M[500]; memory holds the program at addresses 100 to 102 and the value 10 at address 500.]

Control Unit Sub-Operations
• Decode
• Determine what the instruction means
[Figure: PC = 100; IR = load R0, M[500]; R0 and R1 not yet loaded.]

Control Unit Sub-Operations
• Fetch operands
• Move data from memory to datapath register
[Figure: PC = 100; IR = load R0, M[500]; the value 10 moves from M[500] into R0.]

Control Unit Sub-Operations
• Execute
• Move data through the ALU
• This particular instruction does nothing during this sub-operation
[Figure: PC = 100; IR = load R0, M[500]; R0 = 10.]

Control Unit Sub-Operations
• Store results
• Write data from register to memory
• This particular instruction does nothing during this sub-operation
[Figure: PC = 100; IR = load R0, M[500]; R0 = 10.]

Instruction Cycles
PC=100: Fetch, Decode, Fetch ops, Exec., Store result (one clock cycle each)
[Figure: state after the first instruction cycle: IR = load R0, M[500]; R0 = 10; M[500] = 10.]

Instruction Cycles
PC=100: Fetch, Decode, Fetch ops, Exec., Store result
PC=101: Fetch, Decode, Fetch ops, Exec., Store result
[Figure: state after the second instruction cycle: IR = inc R1, R0; the ALU adds 1, so R0 = 10 and R1 = 11.]

Instruction Cycles
PC=100: Fetch, Decode, Fetch ops, Exec., Store result
PC=101: Fetch, Decode, Fetch ops, Exec., Store result
PC=102: Fetch, Decode, Fetch ops, Exec., Store result
[Figure: state after the third instruction cycle: IR = store M[501], R1; R0 = 10, R1 = 11, M[500] = 10, M[501] = 11.]

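The three instruction cycles above can be traced in software. The following is a minimal sketch, not the slides' own material: the struct layouts, the opcode names, and the split into a separate program array and data array are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical encoding of the three slide instructions; not a real ISA. */
enum opcode { OP_LOAD, OP_INC, OP_STORE };

struct instr {
    enum opcode op;
    int reg;   /* destination (load, inc) or source (store) register */
    int src;   /* source register for inc; unused otherwise */
    int addr;  /* memory address for load and store */
};

struct cpu {
    int pc;           /* address of the next instruction */
    struct instr ir;  /* instruction currently being executed */
    int r[2];         /* R0 and R1 */
};

#define PROGRAM_BASE 100   /* address of the first instruction */
#define DATA_WORDS   1024  /* assumed data memory size */

/* One instruction cycle. Sub-operations an instruction does not need
   are simply skipped. */
void instruction_cycle(struct cpu *c, const struct instr *program, int *memory)
{
    c->ir = program[c->pc - PROGRAM_BASE];   /* fetch */
    c->pc++;
    switch (c->ir.op) {                      /* decode */
    case OP_LOAD:                            /* fetch operands */
        c->r[c->ir.reg] = memory[c->ir.addr];
        break;
    case OP_INC:                             /* execute: ALU adds 1 */
        c->r[c->ir.reg] = c->r[c->ir.src] + 1;
        break;
    case OP_STORE:                           /* store results */
        memory[c->ir.addr] = c->r[c->ir.reg];
        break;
    }
}

/* Runs the slide program with M[500] = 10 and returns the final M[501]. */
int run_slide_program(void)
{
    static const struct instr program[] = {
        { OP_LOAD,  0, 0, 500 },   /* 100: load R0, M[500]  */
        { OP_INC,   1, 0, 0   },   /* 101: inc R1, R0       */
        { OP_STORE, 1, 0, 501 },   /* 102: store M[501], R1 */
    };
    int memory[DATA_WORDS] = { 0 };
    struct cpu c = { .pc = PROGRAM_BASE };

    memory[500] = 10;
    for (size_t i = 0; i < sizeof program / sizeof program[0]; i++)
        instruction_cycle(&c, program, memory);
    return memory[501];
}
```

Calling run_slide_program() leaves 11 in M[501], matching the last figure.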
Architectural Considerations
• N-bit processor
• N-bit ALU, registers, buses, memory data interface
• Embedded: 8-bit, 16-bit, 32-bit common
• Desktop/servers: 32-bit, even 64
• PC size determines address space
[Figure: processor with control unit (controller, PC, IR) and datapath (ALU, registers), connected to I/O and memory.]

Two Memory Architectures
• Princeton
• Fewer memory wires
• Harvard
• Simultaneous program and data memory access
[Figure: Harvard: processor connected to a separate program memory and data memory. Princeton: processor connected to a single memory (program and data).]

Cache Memory
• Memory access may be slow
• Cache is small but fast memory close to processor
• Holds copy of part of memory
• Hits and misses
[Figure: processor and cache (fast/expensive technology, usually on the same chip) in front of memory (slower/cheaper technology, usually on a different chip).]

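To make “hits and misses” concrete, here is a minimal sketch of a direct-mapped cache holding one word per line. The slides do not specify an organization, so the direct-mapped layout, the four-line size, and all names are assumptions chosen for illustration.

```c
#include <stdbool.h>

#define CACHE_LINES 4   /* hypothetical size, tiny on purpose */

struct cache_line {
    bool valid;     /* line holds a copy of some memory word */
    unsigned tag;   /* which memory word it is a copy of */
    int data;
};

struct cache {
    struct cache_line line[CACHE_LINES];
    unsigned hits;
    unsigned misses;
};

/* Read one word through the cache, counting hits and misses. */
int cache_read(struct cache *c, const int *memory, unsigned addr)
{
    struct cache_line *l = &c->line[addr % CACHE_LINES];
    unsigned tag = addr / CACHE_LINES;

    if (l->valid && l->tag == tag) {   /* hit: the copy is already here */
        c->hits++;
        return l->data;
    }
    c->misses++;                       /* miss: go to the slower memory */
    l->valid = true;
    l->tag = tag;
    l->data = memory[addr];
    return l->data;
}

/* Reads addresses 3, 3, 7, 3 and returns the number of hits.
   Addresses 3 and 7 share a cache line, so 7 evicts 3. */
unsigned demo_hit_count(void)
{
    int memory[16];
    struct cache c = { 0 };
    const unsigned trace[] = { 3, 3, 7, 3 };

    for (int i = 0; i < 16; i++)
        memory[i] = i * 10;
    for (int i = 0; i < 4; i++)
        cache_read(&c, memory, trace[i]);
    return c.hits;
}
```

In this trace only the second access hits; the other three are misses.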
Programmer’s View
• Programmer doesn’t need detailed understanding of architecture
• Instead, needs to know what instructions can be executed
• Two levels of instructions:
• Assembly level
• Structured languages (C, C++, Java, etc.)
• Most development today done using structured languages
• But, some assembly level programming may still be necessary
• Drivers: portion of program that communicates with and/or controls (drives) another device
• Often have detailed timing considerations, extensive bit manipulation
• Assembly level may be best for these

Assembly-Level Instructions
Instruction 1 opcode operand1 operand2

Instruction 2 opcode operand1 operand2

Instruction 3 opcode operand1 operand2

Instruction 4 opcode operand1 operand2

...

• Instruction Set
• Defines the legal set of instructions for that processor
• Data transfer: memory/register, register/register, I/O, etc.
• Arithmetic/logical: move register through ALU and back
• Branches: determine next PC value when not just PC+1

A Simple (Trivial) Instruction Set
Assembly instruct. | First byte | Second byte | Operation
MOV Rn, direct | 0000 Rn | direct | Rn = M(direct)
MOV direct, Rn | 0001 Rn | direct | M(direct) = Rn
MOV @Rn, Rm | 0010 Rn | Rm | M(Rn) = Rm
MOV Rn, #immed. | 0011 Rn | immediate | Rn = immediate
ADD Rn, Rm | 0100 Rn | Rm | Rn = Rn + Rm
SUB Rn, Rm | 0101 Rn | Rm | Rn = Rn - Rm
JZ Rn, relative | 0110 Rn | relative | PC = PC + relative (only if Rn is 0)
The first four bits of the first byte are the opcode; the remaining fields are the operands.

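The table above is small enough to interpret in a few lines. The sketch below is one possible reading of it: the choices that the PC counts bytes, that the relative offset is signed and applied to the address of the following instruction, that Rm sits in the low four bits of the second byte, and that the program lives in its own array are assumptions, not facts from the slides.

```c
#include <stdint.h>

struct machine {
    uint8_t reg[16];    /* R0..R15, selected by the low nibble of byte 1 */
    uint8_t mem[256];   /* data memory, addressed by one byte */
    uint8_t pc;         /* byte offset into the program */
};

/* Execute the two-byte instruction at pc. Returns 0 on an unknown opcode. */
int step(struct machine *m, const uint8_t *program)
{
    uint8_t opcode = program[m->pc] >> 4;
    uint8_t rn = program[m->pc] & 0x0F;
    uint8_t arg = program[m->pc + 1];
    uint8_t rm = arg & 0x0F;

    m->pc = (uint8_t)(m->pc + 2);
    switch (opcode) {
    case 0x0: m->reg[rn] = m->mem[arg]; break;          /* MOV Rn, direct  */
    case 0x1: m->mem[arg] = m->reg[rn]; break;          /* MOV direct, Rn  */
    case 0x2: m->mem[m->reg[rn]] = m->reg[rm]; break;   /* MOV @Rn, Rm     */
    case 0x3: m->reg[rn] = arg; break;                  /* MOV Rn, #immed. */
    case 0x4: m->reg[rn] = (uint8_t)(m->reg[rn] + m->reg[rm]); break; /* ADD */
    case 0x5: m->reg[rn] = (uint8_t)(m->reg[rn] - m->reg[rm]); break; /* SUB */
    case 0x6:                                           /* JZ Rn, relative */
        if (m->reg[rn] == 0)
            m->pc = (uint8_t)(m->pc + (int8_t)arg);
        break;
    default:
        return 0;
    }
    return 1;
}

/* Sums 10 + 9 + ... + 1 into R0 and returns it. */
uint8_t run_sum_demo(void)
{
    static const uint8_t program[] = {
        0x30, 0x00,   /*  0: MOV R0, #0    total = 0            */
        0x31, 0x0A,   /*  2: MOV R1, #10   i = 10               */
        0x32, 0x01,   /*  4: MOV R2, #1    constant 1           */
        0x33, 0x00,   /*  6: MOV R3, #0    constant 0           */
        0x61, 0x06,   /*  8: JZ  R1, +6    leave loop when i==0 */
        0x40, 0x01,   /* 10: ADD R0, R1    total += i           */
        0x51, 0x02,   /* 12: SUB R1, R2    i--                  */
        0x63, 0xF8,   /* 14: JZ  R3, -8    always jump to 8     */
    };
    struct machine m = { { 0 }, { 0 }, 0 };

    while (m.pc < sizeof program && step(&m, program))
        ;
    return m.reg[0];
}
```

Because the set has no unconditional jump, the loop uses JZ on a register that is always zero (R3) to branch back.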
Addressing Modes
Addressing mode | Operand field | Register-file contents | Memory contents
Immediate | Data | - | -
Register-direct | Register address | Data | -
Register indirect | Register address | Memory address | Data
Direct | Memory address | - | Data
Indirect | Memory address | - | Memory address, which in turn locates the Data

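The five modes in the table differ only in how many lookups separate the operand field from the data. A minimal sketch, with hypothetical names and with the register file and memory modeled as plain int arrays:

```c
enum addr_mode {
    IMMEDIATE, REGISTER_DIRECT, REGISTER_INDIRECT, DIRECT, INDIRECT
};

/* Resolve an operand field to the data it designates. */
int fetch_operand(enum addr_mode mode, int field,
                  const int *regs, const int *mem)
{
    switch (mode) {
    case IMMEDIATE:         return field;             /* field is the data   */
    case REGISTER_DIRECT:   return regs[field];       /* register holds data */
    case REGISTER_INDIRECT: return mem[regs[field]];  /* register holds addr */
    case DIRECT:            return mem[field];        /* field is an address */
    case INDIRECT:          return mem[mem[field]];   /* memory holds addr   */
    }
    return 0;
}

/* Resolves operand field 1 against a small, made-up machine state. */
int addressing_demo(enum addr_mode mode)
{
    static const int regs[] = { 0, 3 };          /* R1 = 3 */
    static const int mem[]  = { 0, 2, 3, 42 };   /* M[1] = 2, M[2] = 3, M[3] = 42 */
    return fetch_operand(mode, 1, regs, mem);
}
```

With this state, the same field value 1 yields 1, 3, 42, 2, and 3 for the five modes in table order.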
Internal structure and basic operation of a microprocessor
[Figure: block diagram of a microprocessor: ALU, register section, and control and timing section, connected to the address bus, data bus, and control bus.]

Arithmetic and logic unit (ALU)
• The component that performs the arithmetic and logical operations
• One of the most important components in a microprocessor, and typically the part of the processor that is designed first
• Able to perform the basic logical operations (AND, OR) as well as addition

Control unit
• The circuitry that controls the flow of information through the processor and coordinates the activities of the other units within it.
• In a way, it is the "brain within the brain": it controls what happens inside the processor, which in turn controls the rest of the computer.
• On a regular processor, the control unit performs the tasks of fetching, decoding, managing execution and then storing results.

Register sets
• The register section/array consists entirely of circuitry used to temporarily store data or program codes until they are sent to the ALU, to the control section, or to memory.
• The number of registers differs from one CPU to another; the more registers a CPU has, the easier programming tasks generally become.
• Registers are normally measured by the number of bits they can hold, for example, an "8-bit register" or a "32-bit register".
Accumulator
• A register in which intermediate arithmetic and logic results are stored.
• A typical use of the accumulator is summing a list of numbers.
• The accumulator is initially set to zero, then each number in turn is added to the value in the accumulator.
• Only when all numbers have been added is the result held in the accumulator written to main memory or to another, non-accumulator, CPU register.

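The summing pattern described above looks like this in C. This is only a sketch: the local variable acc stands in for the accumulator register, and the function and array names are made up for the example.

```c
#include <stddef.h>

/* Sum a list the way an accumulator machine would: clear the
   accumulator, add each number in turn, and hand back the result
   only once all numbers have been added. */
int sum_list(const int *values, size_t count)
{
    int acc = 0;                        /* accumulator initially zero */

    for (size_t i = 0; i < count; i++)
        acc += values[i];               /* intermediate result stays in acc */
    return acc;                         /* written out once, at the end */
}

/* Sums a small, made-up list. */
int sum_demo(void)
{
    static const int m[] = { 3, 5, 7, 9 };
    return sum_list(m, sizeof m / sizeof m[0]);
}
```

A compiler will usually keep acc in a register for the whole loop, which is the same idea in a general-purpose register file.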
Program counter (PC)
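• A register (16 bits wide on many 8-bit microprocessors) that stores the address of the next operation code to be fetched by the CPU.
• Rarely manipulated directly by the programmer; it changes implicitly as instructions execute and on jumps, calls, and returns.
• Purpose of the PC in a microprocessor
• To store the address of the next instruction to be executed.
• It does not store the address of the top of the stack (that is the stack pointer's job), and it does not count the number of instructions executed.
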
Stack pointer (SP)
• The stack is configured as a data structure that grows downward from high memory to low memory.
• At any given time, the SP holds the 16-bit address of the next free location in the stack.
• On a subroutine call or an interrupt, the return address is pushed onto the stack; when the operation is complete, the address is popped so that execution resumes at the original location.

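The call and return behaviour above can be sketched with a 64K memory and a 16-bit SP that names the next free location. The byte order of the saved address and all identifiers here are assumptions; real processors differ in both.

```c
#include <stdint.h>

struct machine {
    uint8_t mem[65536];
    uint16_t sp;   /* address of the next free stack location */
    uint16_t pc;   /* address of the next instruction */
};

static void push_byte(struct machine *m, uint8_t value)
{
    m->mem[m->sp] = value;   /* fill the free location */
    m->sp--;                 /* stack grows toward low memory */
}

static uint8_t pop_byte(struct machine *m)
{
    m->sp++;                 /* step back to the last value pushed */
    return m->mem[m->sp];
}

/* Subroutine call: save the return address, then jump. */
void call(struct machine *m, uint16_t target)
{
    push_byte(m, (uint8_t)(m->pc & 0xFF));   /* low byte first (assumed) */
    push_byte(m, (uint8_t)(m->pc >> 8));
    m->pc = target;
}

/* Return: restore the saved address into the PC. */
void ret(struct machine *m)
{
    uint16_t high = pop_byte(m);
    uint16_t low = pop_byte(m);
    m->pc = (uint16_t)((high << 8) | low);
}

/* Returns 1 if a call followed by a return restores PC and SP. */
int call_return_demo(void)
{
    static struct machine m;

    m.sp = 0xFFFF;           /* empty stack at the top of memory */
    m.pc = 0x0103;           /* made-up address after a call instruction */
    call(&m, 0x2000);
    if (m.pc != 0x2000 || m.sp != 0xFFFD)
        return 0;
    ret(&m);
    return m.pc == 0x0103 && m.sp == 0xFFFF;
}
```

An interrupt follows the same pattern, except that the hardware initiates the push instead of a call instruction.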
Data bus
• The data bus is bidirectional:
• Data or instruction codes from memory or input/output are transferred into the microprocessor
• The result of an operation or computation is sent out from the microprocessor to the memory or input/output
• Depending on the particular microprocessor, the data bus can handle 8-bit or 16-bit data.

Address bus
• The address bus is 'unidirectional', over which
the microprocessor sends an address code to
the memory or input/output.
• The size (width) of the address bus is specified
by the number of bits it can handle.
• The more bits there are in the address bus, the
more memory locations a microprocessor can
access.
• A 16 bit address bus is capable of addressing
65,536 (64K) addresses.
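The relationship between bus width and address count is a power of two. A one-function sketch (the function name is made up for the example):

```c
#include <stdint.h>

/* Number of distinct addresses an n-bit address bus can select: 2^n.
   Valid for widths below 64 bits. */
uint64_t addressable_locations(unsigned address_bits)
{
    return (uint64_t)1 << address_bits;
}
```

For a 16-bit bus this gives 65,536, the 64K figure quoted above; each extra address line doubles the count.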

Control bus
• The control bus is used by the microprocessor to send out or receive timing and control signals in order to coordinate and regulate its operation and to communicate with other devices, e.g. memory or input/output.

Microprocessor clock
• The clock rate (clock speed) sets the pace at which a microprocessor executes instructions.
• Every computer contains an internal clock that regulates the rate at which instructions are executed and synchronizes all the various computer components.

Examples of microprocessors
• Intel 8086
• Motorola 6800
• Zilog Z80

Application-Specific Instruction-Set Processors (ASIPs)
• General-purpose processors
• Sometimes too general to be effective in demanding applications
• e.g., video processing: requires huge video buffers and operations on large arrays of data, inefficient on a GPP
• But a single-purpose processor has high NRE and is not programmable
• ASIPs: targeted to a particular domain
• Contain architectural features specific to that domain
• e.g., embedded control, digital signal processing, video processing, network processing, telecommunications, etc.
• Still programmable

A Common ASIP: Microcontroller
• For embedded control applications
• Reading sensors, setting actuators
• Mostly dealing with events (bits): data is present, but not in huge amounts
• e.g., VCR, disk drive, digital camera (assuming SPP for image compression), washing machine,
microwave oven
• Microcontroller features
• On-chip peripherals
• Timers, analog-digital converters, serial communication, etc.
• Tightly integrated for programmer, typically part of register space
• On-chip program and data memory
• Direct programmer access to many of the chip’s pins
• Specialized instructions for bit-manipulation and other low-level operations
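Where a microcontroller offers single bit-manipulation instructions, C code typically expresses the same operations with masks. A minimal sketch; the helper names are hypothetical, and on real hardware the value would usually be a memory-mapped peripheral register rather than a plain variable.

```c
#include <stdint.h>

/* Return reg with the given bit (0 = least significant) set to 1. */
uint8_t bit_set(uint8_t reg, unsigned bit)
{
    return (uint8_t)(reg | (1u << bit));
}

/* Return reg with the given bit cleared to 0. */
uint8_t bit_clear(uint8_t reg, unsigned bit)
{
    return (uint8_t)(reg & ~(1u << bit));
}

/* Return reg with the given bit inverted. */
uint8_t bit_toggle(uint8_t reg, unsigned bit)
{
    return (uint8_t)(reg ^ (1u << bit));
}

/* Return 1 if the given bit is set, 0 otherwise. */
int bit_test(uint8_t reg, unsigned bit)
{
    return (reg >> bit) & 1u;
}
```

For example, bit_set(0x00, 3) yields 0x08, which might correspond to switching on the actuator wired to pin 3 of an output port.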

Another Common ASIP: Digital Signal Processors (DSP)
• For signal processing applications
• Large amounts of digitized data, often streaming
• Data transformations must be applied fast
• e.g., cell-phone voice filter, digital TV, music synthesizer
• DSP features
• Several instruction execution units
• Multiply-accumulate single-cycle instruction, other instrs.
• Efficient vector operations, e.g., add two arrays
• Vector ALUs, loop buffers, etc.

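The multiply-accumulate (MAC) operation is the inner step of filters and dot products. The sketch below shows the loop a DSP would typically execute with one MAC instruction per iteration; the 16-bit sample and 32-bit accumulator widths are assumptions chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Dot product of two sample arrays. Each iteration is one
   multiply-accumulate: acc = acc + a[i] * b[i]. */
int32_t mac_dot(const int16_t *a, const int16_t *b, size_t n)
{
    int32_t acc = 0;

    for (size_t i = 0; i < n; i++)
        acc += (int32_t)a[i] * b[i];   /* widen before multiplying */
    return acc;
}

/* 1*4 + 2*5 + 3*6 on made-up data. */
int32_t mac_demo(void)
{
    static const int16_t samples[] = { 1, 2, 3 };
    static const int16_t coeffs[]  = { 4, 5, 6 };
    return mac_dot(samples, coeffs, 3);
}
```

On a general-purpose processor each iteration costs a separate multiply and add plus loop overhead, which is the inefficiency that single-cycle MAC and loop buffers target.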
Trend: Even More Customized ASIPs
• In the past, microprocessors were acquired as chips
• Today, we increasingly acquire a processor as Intellectual Property (IP)
• e.g., synthesizable VHDL model
• Opportunity to add custom datapath hardware and a few custom instructions, or delete a few instructions
• Can have significant performance, power and size impacts
• Problem: need compiler/debugger for customized ASIP
• Remember, most development uses structured languages
• One solution: automatic compiler/debugger generation
• e.g., www.tensillica.com
• Another solution: retargetable compilers
• e.g., www.improvsys.com (customized VLIW architectures)

Selecting a Microprocessor
• Issues
• Technical: speed, power, size, cost
• Other: development environment, prior expertise, licensing, etc.
• Speed: how to evaluate a processor’s speed?
• Clock speed: but instructions per cycle may differ
• Instructions per second: but work per instruction may differ
• Dhrystone: synthetic benchmark, developed in 1984. Measured in Dhrystones/sec.
• MIPS: 1 MIPS = 1757 Dhrystones per second (based on Digital’s VAX 11/780). A.k.a. Dhrystone MIPS. Commonly used today.
• So, 750 MIPS = 750*1757 = 1,317,750 Dhrystones per second
• SPEC: set of more realistic benchmarks, but oriented to desktops
• EEMBC: EDN Embedded Microprocessor Benchmark Consortium, www.eembc.org
• Suites of benchmarks: automotive, consumer electronics, networking, office automation, telecommunications

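The Dhrystone MIPS conversion above is a single multiplication by the VAX 11/780 baseline. A small sketch, with made-up function names:

```c
/* Dhrystones per second achieved by the VAX 11/780 reference machine,
   defined as 1 MIPS. */
#define DHRYSTONES_PER_MIPS 1757.0

/* Convert a Dhrystone MIPS rating to Dhrystones per second. */
double mips_to_dhrystones(double mips)
{
    return mips * DHRYSTONES_PER_MIPS;
}

/* Convert a measured Dhrystones-per-second score to Dhrystone MIPS. */
double dhrystones_to_mips(double dhrystones_per_second)
{
    return dhrystones_per_second / DHRYSTONES_PER_MIPS;
}
```

mips_to_dhrystones(750.0) reproduces the 1,317,750 figure from the slide.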
General Purpose Processors
Processor | Clock speed | Periph. | Bus width | MIPS | Power | Trans. | Price
General Purpose Processors
Intel PIII | 1 GHz | 2x16K L1, 256K L2, MMX | 32 | ~900 | 97 W | ~7M | $900
IBM PowerPC 750X | 550 MHz | 2x32K L1, 256K L2 | 32/64 | ~1300 | 5 W | ~7M | $900
MIPS R5000 | 250 MHz | 2x32K, 2 way set assoc. | 32/64 | NA | NA | 3.6M | NA
StrongARM SA-110 | 233 MHz | None | 32 | 268 | 1 W | 2.1M | NA
Microcontroller
Intel 8051 | 12 MHz | 4K ROM, 128 RAM, 32 I/O, Timer, UART | 8 | ~1 | ~0.2 W | ~10K | $7
Motorola 68HC811 | 3 MHz | 4K ROM, 192 RAM, 32 I/O, Timer, WDT, SPI | 8 | ~.5 | ~0.1 W | ~10K | $5
Digital Signal Processors
TI C5416 | 160 MHz | 128K SRAM, 3 T1 Ports, DMA, 13 ADC, 9 DAC | 16/32 | ~600 | NA | NA | $34
Lucent DSP32C | 80 MHz | 16K Inst., 2K Data, Serial Ports, DMA | 32 | 40 | NA | NA | $75

Sources: Intel, Motorola, MIPS, ARM, TI, and IBM Website/Datasheet; Embedded Systems Programming, Nov. 1998

Summary
• General-purpose processors
• Good performance, low NRE, flexible
• Controller, datapath, and memory
• Structured languages prevail
• But some assembly level programming still necessary
• Many tools available
• Including instruction-set simulators, and in-circuit emulators
• ASIPs
• Microcontrollers, DSPs, network processors, more customized ASIPs
• Choosing among processors is an important step
• Designing a general-purpose processor is conceptually the same as designing a single-purpose processor

Designing a Single Purpose Processor
and Optimization Issues
Embedded Systems
CS 504
Single-purpose processors
• Digital circuit designed to execute exactly one program
• a.k.a. coprocessor, accelerator or peripheral
• Features
• Contains only the components needed to execute a single program
• No program memory
• Benefits
• Fast
• Low power
• Small size
[Figure: single-purpose processor with a controller (control logic, state register) and a datapath (index and total registers, adder) connected to data memory, implementing:]
total = 0
for i = 1 to N loop
total += M[i]
end loop
Processors

[Figure: processors divide into general-purpose and single-purpose; single-purpose processors are either custom or standard.]
Single Purpose Processors
• A single-purpose processor is a digital system intended to solve a
specific computation task.
• The processor may be a standard one, intended for use in a wide
variety of applications in which the same task must be performed.
The manufacturer of such an off-the-shelf processor sells the device
in large quantities.
• On the other hand, the processor may be a custom one, built by a
designer to implement a task specific to a particular application.
• An embedded system designer who chooses a standard single-purpose processor, rather than a general-purpose processor, to implement part of a system’s functionality may achieve several benefits.
Standard single-purpose processors
• Known as peripherals (they exist on the periphery of the CPU)
• “Off-the-shelf”: pre-designed for a common task
• Embedded system designers use a standard single-purpose processor rather than a general-purpose processor to achieve the following benefits:
• Fast performance
• Fewer clock cycles
• Shorter cycles
• Small size
• No program memory
• Small instruction set
• Simple datapath and controller
• Low unit cost
Introduction
• Processor
• Digital circuit that performs computation tasks
• Controller and datapath
• General-purpose: variety of computation tasks
• Single-purpose: one particular computation task
• Custom single-purpose: non-standard task
• A custom single-purpose processor may be
• Fast, small, low power
• But, high NRE, longer time-to-market, less flexible
[Figure: digital camera chip containing lens, CCD, A2D, CCD preprocessor, pixel coprocessor, D2A, JPEG codec, microcontroller, multiplier/accumulator, DMA controller, display controller, memory controller, ISA bus interface, UART, and LCD controller.]

BENEFITS OF CUSTOM SINGLE PURPOSE PROCESSOR

• Performance may be faster, due to fewer clock cycles resulting from a customized datapath and shorter clock cycles resulting from simpler controller logic.
• Size may be smaller, due to a simpler datapath and no program memory.
• Power consumption may be lower, due to more efficient computation.
• However, cost could be higher because of high NRE cost, and time-to-market may be longer.
Basic logic gates
One-input gates:
x | Driver (F = x) | Inverter (F = x’)
0 | 0 | 1
1 | 1 | 0
Two-input gates:
x y | AND (F = x y) | OR (F = x + y) | XOR (F = x ⊕ y) | NAND (F = (x y)’) | NOR (F = (x + y)’) | XNOR (F = (x ⊕ y)’)
0 0 | 0 | 0 | 0 | 1 | 1 | 1
0 1 | 0 | 1 | 1 | 1 | 0 | 0
1 0 | 0 | 1 | 1 | 1 | 0 | 0
1 1 | 1 | 1 | 0 | 0 | 0 | 1
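The gate truth tables can be checked in C, where each gate maps onto a bitwise or logical operator. This is a sketch for one-bit inputs (0 or 1) only; the function names are made up for the example.

```c
/* Each function takes inputs that are 0 or 1 and returns 0 or 1. */
int driver_gate(int x)          { return x; }
int inverter_gate(int x)        { return !x; }
int and_gate(int x, int y)      { return x & y; }
int or_gate(int x, int y)       { return x | y; }
int xor_gate(int x, int y)      { return x ^ y; }
int nand_gate(int x, int y)     { return !(x & y); }
int nor_gate(int x, int y)      { return !(x | y); }
int xnor_gate(int x, int y)     { return !(x ^ y); }
```

Looping x and y over 0 and 1 and printing each result reproduces the truth-table rows.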
Combinational Logic Design
Combinational components
Sequential Components
Sequential Logic Design
Custom Single Purpose Processor basic model
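/* go, x and y are inputs; z is the output */
int p, q;
while (1) {
    while (!go);        /* wait for the go signal */
    p = x;
    q = y;
    while (p != q) {    /* subtract the smaller from the larger until equal */
        if (p < q)
            q = q - p;
        else
            p = p - q;
    }
    z = p;              /* greatest common divisor of x and y */
}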
Example: Greatest Common Divisor(GCD)
State Diagram templates
Creating the datapath
Creating the controller’s FSM
Splitting into a controller and datapath
Controller state table for the GCD example
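The state diagrams and state table for these slides did not survive extraction, so the following is only a sketch of how the GCD program might be written as an explicit state machine, one state per step. The state names and the assumption that go is already asserted are illustrative, and the inputs must be positive for the subtraction loop to terminate.

```c
/* One state per step of the GCD program; names are hypothetical. */
enum gcd_state {
    S_WAIT,      /* while (!go);            */
    S_LOAD,      /* p = x; q = y;           */
    S_TEST,      /* while (p != q)          */
    S_COMPARE,   /* if (p < q)              */
    S_SUB_Q,     /* q = q - p;              */
    S_SUB_P,     /* p = p - q;              */
    S_OUTPUT,    /* z = p;                  */
    S_DONE
};

/* Steps the state machine until the result is available.
   Requires x > 0 and y > 0. */
unsigned gcd_fsmd(unsigned x, unsigned y)
{
    enum gcd_state state = S_WAIT;
    unsigned p = 0, q = 0, z = 0;
    int go = 1;   /* assumed already asserted */

    while (state != S_DONE) {
        switch (state) {
        case S_WAIT:    state = go ? S_LOAD : S_WAIT;          break;
        case S_LOAD:    p = x; q = y; state = S_TEST;          break;
        case S_TEST:    state = (p != q) ? S_COMPARE : S_OUTPUT; break;
        case S_COMPARE: state = (p < q) ? S_SUB_Q : S_SUB_P;   break;
        case S_SUB_Q:   q = q - p; state = S_TEST;             break;
        case S_SUB_P:   p = p - q; state = S_TEST;             break;
        case S_OUTPUT:  z = p; state = S_DONE;                 break;
        case S_DONE:    break;
        }
    }
    return z;
}
```

In hardware, the switch corresponds to the controller's next-state logic and the assignments to datapath register loads.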
Completing the GCD custom single-purpose processor design
• We finished the datapath
• We have a state table for the next state and control logic
• All that’s left is combinational logic design
• This is not an optimized design, but we see the basic steps
[Figure: a view inside the controller and datapath: the controller contains next-state and control logic plus a state register; the datapath contains registers and functional units.]

Optimizing single purpose processors
Optimizing the original program
Optimizing the FSMD
Optimizing the FSMD(contd..)
Optimizing the data path
Optimizing the FSM
Summary
• Custom single-purpose processors
• Straightforward design techniques
• Can be built to execute algorithms
• Typically start with FSMD
• CAD tools can be of great assistance
