Microprocessor Study Materials
Course requirements
Pre-requisites to this course are
i) knowledge of
• Digital logic; principles and design.
• High level language programming.
ii) core concepts of
• Computer Organisation/Architecture
iii) and a rudimentary idea of
• Operating System
What is a Microprocessor?
• It is usually a single IC device that contains the
CPU of a typical computer. So, a
microprocessor is a CPU in a single chip.
• Major functional units of a microprocessor are:
– Control Unit (CU)
– Arithmetic and logic unit (ALU)
– Instruction Decoder (ID)
– Small high speed memory (Registers)
– Bus interface units (BIU)
– Buffers, Cache and different pipelines
These functional units are connected through
internal buses and exclusive data paths.
Why Microprocessors?
Microprocessor based design has practically outdone all
discrete IC based non-CPU oriented design due to its
many-faceted advantages.
• Accumulator based
– Old architecture with limited hardware resources.
– One register (Accumulator) is used to hold one of
the operands and the result (1-address machine).
• General Register
– All registers are equally powerful.
– All modern processors are of this type (3-address
machine).
Processor: Resource and action
• Irrespective of the architecture, the main
resource of any processor is the memory in
which instructions and data are stored. A
processor is connected to this resource
through buses: the address bus and the data bus.
• The processor performs very simple external
(i.e., bus) operations, namely bus read and
bus write, to fetch instructions and
data from the memory and to write the
computed results back into the memory.
Memory: The main resource
• Memory holds everything; data, instruction and
any other information.
• By the term memory we mean the external
semiconductor memory (RAM + ROM).
• The CPU usually has a small amount of very high
speed memory, known as registers.
• Registers hold data and addresses to reduce
CPU memory (external) interactions for faster
processing.
• Modern CPUs also have internal cache memory.
The Memory Hierarchy
[Figure: the memory hierarchy. Levels shown: Registers, On-chip Cache, External Cache. Annotations: capacity ranges from a few bytes to mega- and gigabytes; cost/bit is high.]
Processor & Memory
[Figure: CPU and MEMORY blocks connected by the data bus (n bits) and control lines.]
Microprocessor: Word size
• Arithmetic and logic operations are carried out
in the Arithmetic and Logic Unit (ALU).
• The size of the operands in bits is known as the
word size.
• If a processor can process an N-bit operand at a
time (for the majority of its instructions), the word
size is N and consequently the CPU is known
as an N-bit processor.
• These days, N is usually an integral
multiple of 8. So, 8, 16, 32 or 64 bit CPUs are
commonly available.
Word size vs. Bus size
• Address bus size dictates the memory address
space (2^m locations for m bits; e.g., 64K for 16 bits).
• Data bus size determines the number of bits
that can be read/written in a single bus cycle. It
is usually equal to the word size of the
processor; i.e., N bits for an N-bit CPU. However,
there are exceptions where the data bus is ½ the
word size (or less) and the CPU reads/writes a
word through multiple bus cycles. The performance
penalty is accepted for ease of backward
integration with the available peripherals as well
as for cost benefits.
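The two bus-size relations above can be checked with a little arithmetic. The following is a minimal sketch; the function names are hypothetical and only illustrate the 2^m address space and the number of bus cycles needed when the data bus is narrower than the word.

```python
def address_space(m_bits):
    """Number of addressable locations for an m-bit address bus (2**m)."""
    return 2 ** m_bits


def bus_cycles(word_bits, data_bus_bits):
    """Bus cycles needed to transfer one word (ceiling division)."""
    return -(-word_bits // data_bus_bits)


# 16 address bits give 64K locations.
print(address_space(16) // 1024, "K locations")
# A 16-bit word over an 8-bit data bus needs two bus cycles.
print(bus_cycles(16, 8), "bus cycles")
```

A CPU whose data bus is as wide as its word needs a single cycle per word; halving the bus width doubles the cycles, which is the performance penalty mentioned above.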
Should we start with 8-bit?
• 8-bit processors, though not very powerful, are
ideal for learning the basic principles and
mastering hardware design due to their
simplicity, low cost and high availability of
support systems.
• As such we cannot say that a particular
processor is the best one; there are so many
with different attributes and varying computing
power.
• A particular processor may be suitable for a
particular application.
• However, the underlying principles are the same
for all processors.
Hypothetical or real life CPU
• We will use INTEL 8085A 8-bit microprocessor
for learning the basic principles. It is a low cost
simple CPU with high availability of all sorts of
hardware and software support for learning and
development systems.
• Sometimes, a hypothetical, ideal, all powerful
CPU is considered for training. However, this
approach suffers from the non-availability of
real life development and testing
facilities; only simulation is possible.
Hardware and Software
• For small system design (most uses fall in this
category) hardware development is easy, as
off-the-shelf compatible components and
peripherals are available from the vendors and
the design is more of an assembly of functional
blocks.
• Software development calls for more effort
(70% or more) in the development cycle of a
product.
• S/W for small system is developed (usually) in
assembly language; a pre-requisite to this is to
know the processor programming model.
Processor Programming Model
• Processor programming model is the graphical
representation of different registers whose
contents can be manipulated through machine
instructions.
• For assembly language programming we also
need to know the instruction set and the
addressing modes available for a particular
CPU.
• The instruction set is the set of machine instructions
for computing and other operations, while the
addressing modes dictate the different ways of
getting the operands from memory to the CPU.
8085A: Programming Model
[Figure: 8085A programming model. 8-bit registers (b7 to b0), including the H and L pair and the interrupt mask register (I); 16-bit registers (b15 to b0).]
8-bit register pairs (drawn side by side) may be used to hold 16-bit operands.
Instruction Set
Instructions may be classified into 4 major groups
• Data Movement
– Any movement between registers, register to
memory or memory to register.
• Arithmetic and Logic
– Any processing involving the ALU (or Register).
• Branch
– All jumps, calls, returns & s/w interrupts; i.e.,
whenever we break the normal sequential flow of
execution.
• Special or Machine Control
– Any instruction other than the first three types.
Instruction Set: Examples
Here is a non-exhaustive list of instructions.
• Data Movement: MOV, MVI, LXI, PUSH, POP,
XCHG, XTHL, IN, OUT
• Arithmetic and Logic: ANA, ORA, XRA, CMP,
ADD, ADI, ADC, SUB, SUI, SBB, INR, DCR,
INX, DCX, DAD, DAA, CMC, STC
• Branch: JMP, JZ, JNZ, JC, JNC, JPO, JPE, JP,
JM, CALL, CZ, CNZ, CC, CNC, RET, RZ, RST
• Special or Machine control: HLT, NOP, EI, DI,
SIM, RIM
Data Movement
Some examples (these belong to the arithmetic and logic group):
ANA B ; A ← A AND B
XRA L ; A ← A XOR L
CMA ; complement accumulator
STC ; set carry
Branch Instructions
• All conditional and unconditional jumps,
subroutine calls and returns, as well as
software interrupts, deviate from the
sequential execution flow: the next
instruction in memory is not executed and
execution control is diverted to an instruction
stored elsewhere in the memory. e.g.,
RET ; PCHL ; RST 5
Special and Machine Control
These instructions usually have no memory
operands and are used for special purpose.
Notable examples are;
Direct Addressing Mode
Here the operand address is directly (fully) specified in
the instruction. For example;
STA 2050H ; encoding → 32 50 20
ADD B ; A ← A + B
INX H ; HL ← HL + 1
PUSH H ; (--SP) ← H, (--SP) ← L
DAD B ; HL ← HL + BC
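The encoding 32 50 20 shown for STA 2050H places the low-order address byte before the high-order byte. A minimal sketch of that byte ordering follows; the helper name is hypothetical, and only the op-code value 32H is taken from the text.

```python
STA_OPCODE = 0x32  # op-code of STA, from the encoding quoted in the text


def encode_direct(opcode, address):
    """Return op-code followed by the 16-bit address, low byte first."""
    return bytes([opcode, address & 0xFF, (address >> 8) & 0xFF])


encoded = encode_direct(STA_OPCODE, 0x2050)
print(" ".join(f"{b:02X}" for b in encoded))  # 32 50 20
```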
Register Indirect Mode
Here the address of the operand is available in
the register(s). This is advantageous for address
manipulation. The number of bits required to
specify the register(s) is also much lower than a
full m-bit address (in the 8085A only 3 bits are
necessary to specify any 8-bit register). e.g.,
MOV A, M ; Move the content of location M
; (address held in HL registers) to A
PUSH H ; Push H & L onto the stack top, whose
; address comes from the SP register.
Branch Instruction
Execution of instructions is usually sequential.
However after executing nth instruction we may
not execute the very next, i.e., the (n+1)th
instruction. This is done by branch instruction;
branch is either conditional or unconditional.
Examples:
Unconditional branch
JMP 508AH; go to location 508AH
PCHL ; Load PC with HL : basically jump to
; location stored in HL unconditionally
Conditional branch
JZ to_addr ; jump if Z flag = 1
Special or machine control
Some instructions do not really need any memory
or register operand and are used for special
purpose. Examples:
NOP ; No operation
HLT ; Halt
Addressing modes:
Register indirect mode is available only in its
crude form; offsets cannot be used. A negative
point is the absence of relative addressing mode.
Assembly Language Programming
• Assembly language is CPU specific
• It has a very simple form; a single line contains
a single instruction
• An instruction contains 3 parts
– label: a symbolic reference to a location
– op-code: part of the machine instruction
specifying the type of operation (movement,
branch etc.) to be performed.
– operand: one or more operands on which the
operation will be done.
• Comments (starting with a semi-colon, ;)
can also be added anywhere to improve
readability and are ignored by the assembler.
Instructions and addressing modes
• Note that an instruction class (say, data
movement) is available for many addressing
modes; e.g.;
MOV B, L ; register mode
MOV C, M ; register indirect mode
LDA 0B035H ; direct mode
LXI SP, 20FFH ; immediate mode
• Ideally each type of instruction should be
available for all addressing modes and all
registers should hold operands for all
instructions. However, this orthogonal property
may not be observed in all real life CPUs.
Assembly instructions: examples
LXI SP, 20FFH ; op-code and operand only
Assembler directives
• PLC (program location counter) control
– ORG (origin) is used to initialize the PLC; e.g.,
ORG 2000H ; Next byte of code/data will be
; assembled from 2000H
– $ gives the current value of the PLC; e.g.,
ORG $+16 ; PLC ← PLC + 16
($ appears in the operand field; a special use)
Constants
EQU (equate) is used to define a constant, e.g;
NANDB MACRO
ANA B
CMA
ENDM
• Name of the macro is NANDB.
• MACRO and ENDM are directives indicating the
beginning and the end of the macro.
Calling a Macro
Calling a macro is done by referring to its name as a
pseudo op-code in an instruction. e.g.,
NANDB
NOP
NANDB
Would be expanded by the macro-processor as:
ANA B
CMA
NOP
ANA B
CMA
So each call is replaced by the macro body. Note that
call and return overheads of the subroutine are absent.
Macro with parameter
NAND MACRO &R
ANA &R
CMA
ENDM
• This is a more powerful macro and can perform
NAND operation with Accumulator and any
other 8 bit register. e.g.,
NAND L ; NAND of A and L
NAND B ; NAND of A and B
• Multiple parameters can also be used if
necessary.
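The copy-code and parameter substitution performed by the macro-processor can be sketched as a simple text expansion. This is an illustrative model only, not how any particular 8085 macro assembler is implemented; the `MACROS` table and `expand` function are hypothetical names.

```python
# Macro table: name -> (formal parameters, body lines), taken from the slides.
MACROS = {
    "NANDB": ([], ["ANA B", "CMA"]),
    "NAND": (["&R"], ["ANA &R", "CMA"]),
}


def expand(source_lines):
    """Replace each macro call by its body, substituting actual parameters."""
    out = []
    for line in source_lines:
        parts = line.split(None, 1)
        name = parts[0] if parts else ""
        if name not in MACROS:
            out.append(line)  # ordinary instruction: copy unchanged
            continue
        formals, body = MACROS[name]
        actuals = [a.strip() for a in parts[1].split(",")] if len(parts) > 1 else []
        for body_line in body:
            for formal, actual in zip(formals, actuals):
                body_line = body_line.replace(formal, actual)
            out.append(body_line)
    return out


print(expand(["NANDB", "NOP", "NAND L"]))
```

Each call is replaced inline, so the expanded program is longer than the source but carries no call and return overhead.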
Conditional assembly
Other than copy-code and parameter passing the
third facility offered by macro assembler is
conditional assembly.
• It is similar to an IF statement in a high level
language: it tests a condition and either assembles
a block of instructions or bypasses it.
• This is useful to configure part(s) of a system
software to accommodate different h/w facilities
for the same platform. So, conditional assembly
is important and useful for the so called system
generation.
8085A Hardware
• 40 pin DIP package
• 8-bit data and 16-bit address lines
• Separate address space for Memory and I/O
• 5 interrupting inputs
• Multiplexed data and address (lower order) bus
• Special built-in serial ports
• Single +5 V power supply
• On-chip clock generator circuit
• DMA facility
• Ready input for interfacing with slow devices
CPU: Pins (Function-wise)
• Address lines
• Data lines
• Control lines
• Status lines
• Interrupt input and acknowledge lines
• DMA/Bus arbitration lines
• Interfacing with slow peripherals line
• Master reset line
• Clock out, power supply and ground lines etc.
• Special purpose lines
8085A pin diagram
CPU: Pins (Function-wise)
• Address lines (A8-A15)
• Data lines (AD0 – AD7)
• Control lines (RD’, WR’, INTA’)
• Status lines (IO/M’, S0, S1)
• Interrupt input and acknowledge lines (TRAP, RST7.5,
RST6.5, RST5.5, INTR, INTA’)
• DMA/Bus arbitration lines (HOLD, HLDA)
• Interfacing with slow peripherals line (READY)
• Master reset lines (RESETIN’, RESETOUT)
• Clock reference, clock out, power supply and ground lines
etc. (X1, X2, CLKOUT, VCC, VSS)
Memory Read
• It takes 3 T states.
• Low order address is latched during T1.
• Memory needs time to drive the data bus with the required data, during which the data bus is in the High Z state.
• Throughout the cycle the Write and Ready lines are high.
Memory Write
• It is similar to the MR cycle.
• Address and data are both supplied by the CPU.
• Data is available on the bus right from the beginning of T2, so there is no high Z state between T1 and T2 (a notable difference from MR, where the memory supplies the data and takes time to drive the data bus, hence the high Z state between T1 and T2).
Op-code Fetch machine cycle
• It takes 4 T states.
• Operation in the first 3 T states is similar to MR.
• One more T state is required to decode the instruction.
• Many instructions do not take extra time, as they call for internal operations that can be carried out just after the decoding. OF for a few instructions takes 6 T states because of complex internal operations.
Machine cycles/Instruction
• Each instruction consists of a number of
machine cycles (Min. 1 to Max. 5).
• Each machine cycle consists of a number of
clock cycles (3, 4 or 6); e.g.,
STA 2050H ; encoding 32 50 20
• This instruction consists of 4 machine
cycles (OF, MR, MR and MW) and 13 T states.
• OF fetches the op-code (32H); the CPU then knows that
it requires 3 more machine cycles: two MRs to
read the address (50H and then 20H) and
finally an MW to write the contents of
the A register to the direct address 2050H.
Machine cycles/Instruction
For instruction MOV A, B; (code 78H)
• We need only the OF (4T) machine cycle.
• The op-code is fetched within the first 3 T states.
• In the next T state the code is deciphered and
the contents of B are copied to A.
• As the registers are internal to the CPU it is
possible to decode and execute within 1 T.
However, MOV A, M; (code 7EH)
• Takes OF and MR (i.e., 4T + 3T or 7 T states)
• OF fetches the instruction, which calls for a
memory read (external operation) to transfer
the contents from memory to A.
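The T-state totals quoted for STA, MOV A, B and MOV A, M follow directly from summing the machine cycles. A minimal sketch is given below; the table names are hypothetical, the cycle lengths are the typical values stated in the text, and the 3 MHz clock used for the timing figure is an assumption chosen only for illustration.

```python
# Typical machine cycle lengths in T states, as described in the text.
T_STATES = {"OF": 4, "MR": 3, "MW": 3}

# Machine cycle sequences for the three instructions discussed.
MACHINE_CYCLES = {
    "STA 2050H": ["OF", "MR", "MR", "MW"],
    "MOV A, B": ["OF"],
    "MOV A, M": ["OF", "MR"],
}


def t_states(instruction):
    """Total T states: the sum over the instruction's machine cycles."""
    return sum(T_STATES[cycle] for cycle in MACHINE_CYCLES[instruction])


def exec_time_us(instruction, clock_mhz):
    """Execution time in microseconds at the given clock frequency."""
    return t_states(instruction) / clock_mhz


for name in MACHINE_CYCLES:
    # 3 MHz is an assumed clock, not a figure from the slides.
    print(name, t_states(name), "T states,", round(exec_time_us(name, 3.0), 2), "us")
```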
Instruction complexity
• Most OF cycles are 4T long; however some
take 6T. To be precise, fetching the op-code is
similar to a memory read and takes 3T. However,
a complex instruction takes more time to decode
and may need extra time to complete its internal
operations; e.g.,
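; Jump-table dispatch: A holds the routine index i (0 to n)
LXI H, JMPTAB ; HL ← start of jump table
MVI B, 0
ADD A ; A ← A*2 (2 bytes per table entry)
MOV C, A ; BC ← offset into the table
DAD B ; HL ← HL + BC
MOV E, M ; low byte of routine address
INX H
MOV D, M ; high byte of routine address
XCHG ; HL ← routine address
PCHL ; jump to the routine
:
RTN1: <instr>
:
RTNn: <instr>
:
JMPTAB: DW RTN0
DW RTN1
:
DW RTNn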
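The jump-table listing above doubles the index in A, adds it to the table base and loads a 16-bit routine address stored low byte first. The same address arithmetic can be sketched as follows; the memory layout (table at 2100H, routine addresses 2000H, 2010H, 2020H) is hypothetical.

```python
def dispatch_address(memory, table_base, index):
    """Return the routine address selected from a table of 16-bit entries."""
    offset = index * 2              # ADD A: two bytes per table entry
    entry = table_base + offset     # DAD B: HL <- table base + offset
    low = memory[entry]             # MOV E, M
    high = memory[entry + 1]        # INX H then MOV D, M
    return (high << 8) | low        # XCHG then PCHL jumps to this address


# Hypothetical layout: three routine addresses stored from 2100H.
routines = [0x2000, 0x2010, 0x2020]
memory = bytearray(0x10000)
for i, address in enumerate(routines):
    memory[0x2100 + 2 * i] = address & 0xFF
    memory[0x2100 + 2 * i + 1] = address >> 8

print(hex(dispatch_address(memory, 0x2100, 1)))  # 0x2010
```

Because the doubling is done in the 8-bit accumulator, the index in the assembly version has to stay below 128.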
Peripherals
• For any real life computing system we need to
connect peripheral devices to the basic CPU-Memory
computing backbone. These
peripherals are primarily I/O devices of different
types. In order to reduce the burden on the
CPU, off-the-shelf peripheral controllers are
used to interface them. Under instruction from
the CPU, these controllers increase the
throughput of the system by freeing the CPU
from routine work: they respond to the needs of
the devices in a structured and synchronised
manner and ensure fair and maximum
utilisation of them.
Peripheral controllers
• Practically for any I/O device we thus need
peripheral controllers.
• These controllers are programmable; i.e., the
CPU initializes them and sends basic operating
instructions.
• Peripheral controllers are logically the extended
arm of the CPU, relieving it from the daily chores
of managing the needs of the I/O devices
connected to the system.
• Off-the-shelf peripheral controllers include PIO, PIC,
DMA, SIO, CRT and keyboard controllers.
Interrupts
• Interrupts are requests made by the external
devices to get some service from the CPU.
• CPU after getting interrupt request suspends
the current task, identifies the interrupting
device and executes a selected Interrupt
Service Routine (ISR) to serve the device.
• The suspended task is resumed once the ISR is
done.
• Other than the external h/w interrupts, s/w
interrupts and internal h/w interrupts are also
possible.
External H/W Interrupts
• By the term interrupt we normally mean External
hardware interrupts from the I/O devices.
• These are asynchronous external events and are
used to increase the I/O throughput of the
system.
• In case of multiple simultaneous interrupts the
CPU applies some priority logic to decide which
to serve first.
• The CPU may decline or ignore interrupt
requests if it is busy doing something important.
• A non-maskable interrupt input is usually
available in the CPU.
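The priority logic and masking described above can be sketched for the 8085A interrupt inputs. The fixed order used here (TRAP highest, then RST7.5, RST6.5, RST5.5, INTR) is the standard 8085A priority and is not stated in these slides; the function and parameter names are hypothetical.

```python
# 8085A interrupt inputs, highest priority first (TRAP is non-maskable).
PRIORITY = ["TRAP", "RST7.5", "RST6.5", "RST5.5", "INTR"]


def select_interrupt(pending, masked=(), interrupts_enabled=True):
    """Return the pending interrupt to serve first, or None if none qualifies."""
    for line in PRIORITY:
        if line not in pending:
            continue
        if line == "TRAP":
            return line  # cannot be masked or disabled
        if interrupts_enabled and line not in masked:
            return line
    return None


print(select_interrupt({"INTR", "RST6.5"}))                     # RST6.5
print(select_interrupt({"INTR", "RST6.5"}, masked={"RST6.5"}))  # INTR
```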
Interrupt lines 8085A
Programming Model
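[Figure: MCS-48 programming model.
Program memory (up to 07FFH): location 0: Reset; location 3: Interrupt; location 7: Timer interrupt.
Data memory (locations 0 to 127): locations 0-7: Reg. Bank 0 (RB0), registers R0 to R7; locations 8-23: 16 byte Stack; locations 24-31: Reg. Bank 1 (RB1).
Each bank has 8 registers; R0 & R1 can act as address registers.]
Loc. no. 0, 3 & 7 are special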
Stack
MCS-48 uC’s have a limited data memory stack (16 bytes
only), allowing the user up to 8 levels of nesting. For all
practical purposes this is enough. However, implementing
a recursive routine is risky due to the small stack space.
PC bits and some flag bits (PSW7-4) are stored in the
stack automatically.
PSW (b7 to b0): CY AC F0 BS 1 S2 S1 S0
Addressing Modes
MCS-48 is equipped with all the standard
addressing modes as well as special modes
like paged mode. Here are examples through
instructions.
• Direct ; JMP address (12 bits)
• Immediate ; MOV A, #data
• Register ; ADD A, Ri (i=0,1,…,7)
• Register Indirect ; ADD A, @Rx (x=0,1)
• Paged ; MOVP A, @A
• Relative ; J(cond) address (8 bits)
– several conditions are possible; e.g.,
JC/JNC/JZ/JNZ/JBb etc.
Instructions
• In comparison to 8085, MCS-48 instructions are
short but powerful.
• Most of the instructions (over 90%) are 1 byte
and executed in a single cycle.
• A bit testing facility (e.g., JBb <addr>)
allows the user to test any bit (b = 0 to 7) of the
accumulator and branch to an address.
• Logic operations can be done directly at the I/O
ports, which facilitates control programs (e.g.,
the instruction ORL P0, #data does a logical OR
of the current data at port P0 with
the mask specified by #data).
Sample programs
pakdig: ; packs bits 0 – 3 of locations 50-51 into
; location 50.
mov R0, #50
mov R1, #51
XCHD A, @R0 ; exch. bits 0-3 of Acc
; with location 50
SWAP A ; exch. bits 0-3 & 4-7
; of Acc
XCHD A, @R1 ; exch. bits 0-3 of Acc
; with location 51
MOV @R0, A ; store packed result in location 50
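The effect of the pakdig routine can be traced with a small simulation of XCHD and SWAP. This is a sketch with hypothetical helper names; it models only the accumulator and a 64-byte data memory, and the digit values 7 and 9 are made-up test data.

```python
def xchd(acc, ram, addr):
    """XCHD A,@Rr: exchange the low nibbles of A and ram[addr]."""
    new_acc = (acc & 0xF0) | (ram[addr] & 0x0F)
    ram[addr] = (ram[addr] & 0xF0) | (acc & 0x0F)
    return new_acc


def swap(acc):
    """SWAP A: exchange bits 0-3 and 4-7 of the accumulator."""
    return ((acc << 4) | (acc >> 4)) & 0xFF


def pakdig(ram, acc=0):
    """Pack the low nibbles of locations 50 and 51 into location 50."""
    r0, r1 = 50, 51
    acc = xchd(acc, ram, r0)   # low nibble of location 50 into A
    acc = swap(acc)            # move it to the high nibble
    acc = xchd(acc, ram, r1)   # low nibble of location 51 into A
    ram[r0] = acc              # MOV @R0, A
    return acc


ram = bytearray(64)
ram[50], ram[51] = 0x07, 0x09
pakdig(ram)
print(hex(ram[50]))  # 0x79
```

The initial accumulator contents do not affect the packed result, since both of its nibbles are exchanged out before the final store.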
More examples
LOC3: JNI INIT ; Jump to routine INIT if
; interrupt input is 0
INIT: MOV R7, A
SEL RB1
MOV R7, #0FAH
:
SEL RB0
MOV A,R7
RETR ; RET FROM INTR
; RESTORE A & PC
More examples
This routine disables the interrupt, but jumps to the
interrupt routine after 8 overflows and stops the timer.
JMP MAIN
COUNT: INC R7
MOV A, R7
JB3 INT
JMP MAIN
INT: STOP TCNT
JMP 7H
More examples
mov128: mov A, #128
movp A, @A
These two instructions move the contents of
the 129th location (address 128) in the current page to the
accumulator.
page3: mov a, #0B8H
anl a, #7FH
movp3 a, @a
This transfers the contents of location no. 38H of
page 3 to the Accumulator. These two instructions (MOVP and MOVP3) are useful
for accessing data stored permanently in program
memory.
Logic operations at I/O ports
• Logic operations can be done directly at the ports. Let us
assume that bit 3 of an 8-bit port (say, an 8255 port) is controlling a process.
Now you would like to set bit 3 without disturbing the other bits.
; 8085 example
; MCS-48 example
CONWORD DS 1 ; allocate a byte
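Setting bit 3 without disturbing the other bits amounts to OR-ing the current port value with a mask. The sketch below shows only the bit manipulation, with hypothetical function names; it does not model any particular port hardware.

```python
BIT3 = 1 << 3  # mask selecting bit 3 (08H)


def set_bit(port_value, mask=BIT3):
    """OR with the mask: the selected bit becomes 1, the others are unchanged."""
    return port_value | mask


def clear_bit(port_value, mask=BIT3):
    """AND with the complemented mask: the selected bit becomes 0."""
    return port_value & ~mask & 0xFF


print(f"{set_bit(0b10100001):08b}")    # 10101001
print(f"{clear_bit(0b10101001):08b}")  # 10100001
```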
Powerful CPUs
• 8-bit processors are good for simple
applications. Natural extension to these 8-bit
CPUs are the 16-bit processors which may be
used to design general purpose low cost
computers.
• Now 32 or even 64 bit CPUs are available in
the market and are extensively used in
computing.
• It may be noted that the general principles are the
same for any processor; however, the higher
order processors are more powerful and have more
throughput due to various factors.
Powerful CPUs
• No practical limitation on address space and
data width.
• Superscalar performance exploiting pipelining and
other mechanisms for parallel operation.
• Use of multiple levels of cache to reduce the CPU
memory speed gap.
• Special Hardware/Software features to
implement multiprocessing/multitasking OS.
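[Figure: 8086 pin diagram (surviving labels: AD7, AD6, HOLD, HLDA, WR (LOCK)).]
[Figure: 8086 programming model. 16-bit registers (b15 to b0): AX (AH, AL), BX (BH, BL), CX (CH, CL), DX (DH, DL), SI, DI, BP, SP, CS, DS, SS, ES, F (flags), IP. Flag register MS byte: OF, DF, IF, TF; flag register LS byte: same as 8085A.]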
8086 Operand addressing modes
Mode Forms and alternatives Examples
Assembly Language Examples
• X86 assembly language is complex but powerful.
• A program consists of a number of logical segments, each
pointed to through a segment register.
• The segments define code, data and stack segment of
the program that are mapped to the memory for
execution.
• The assembler directives are very powerful and
support complex data structure as well as macro
operations.
We present next a hypothetical program showing
different aspects of X-86 assembly language.
; example program
main_data segment
packed_number db 4 dup (0)
main_data ends
other_data segment
unpacked_number db 8, 7, 6, 5, 4, 3, 2, 1
other_data ends
prog_code segment
assume cs:prog_code, ds:main_data, es:other_data
prog_start: mov ax, main_data
mov ds, ax
mov ax, other_data
mov es, ax
mov bx, offset packed_number
mov si, 0
mov di, si
mov cx, 4 ; four pairs of digits to pack
pack: mov ax, word ptr es: unpacked_number[si] ; al, ah = two digits
mov dx, cx ; save loop count (cl is needed for the shift)
mov cl, 4
shl ah, cl ; move second digit to the high nibble
add al, ah ; combine both digits in one byte
mov [bx][di], al
add si, 2
inc di
mov cx, dx ; restore loop count
loop pack
hlt ; a better option is a syscall to OS
prog_code ends
end prog_start
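The intended result of the packing loop can be checked with a short model of what each iteration does: two unpacked digits are read as one word (first digit in AL, second in AH), AH is shifted left by 4 and added to AL. The function name below is hypothetical.

```python
def pack_digits(unpacked):
    """Pack pairs of unpacked digits into single bytes, as the loop does."""
    packed = bytearray()
    for si in range(0, len(unpacked), 2):
        al, ah = unpacked[si], unpacked[si + 1]  # word read: AL low, AH high
        packed.append((al + (ah << 4)) & 0xFF)   # shl ah, 4 then add al, ah
    return bytes(packed)


print(pack_digits([8, 7, 6, 5, 4, 3, 2, 1]).hex())  # 78563412
```

Read as a little-endian 32-bit value, the four packed bytes hold the digits 1 to 8 in order (12345678H).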
More examples
One of the common uses of the stack is
parameter passing between functions. The
following program fragment shows an example.
push bx
push cx ; save caller’s
pushf ; reg & flags
sub sp, 6 ; allocate local storage
; end of prologue
RISC Processor
The processors we have discussed so far are
known as CISC (Complex instruction set
computer) characterized by:
• Large no. of instructions of varied types.
• Many addressing modes.
• Complex instruction format with varied length of
instruction
The motive of the designers was to reduce the
semantic gap between higher level languages and
the machine level (or assembly level) facilities.
RISC Processor contd.
In the late 70s, studies found
that complex instructions and addressing modes
are not frequently used by compilers, leading
to the question: why not go for a RISC (reduced
instruction set computer) processor with
– Reduced number of instructions (with fixed length
format for all instructions) and
– Limited addressing modes so as to get a much
simpler decoding and control circuitry.
– Load/Store machine for less memory access.
– More compile time effort to produce optimal code.
RISC Processor contd.
In real life also bricks are the building blocks of
• walls and
• subsequent bigger and complex structures; the
reverse is not true.
• Moreover, bigger and complex structures can be
built with the help of lower level structures.
RISC Processor contd.
The benefits of Reduced no. of instructions and
addressing modes are manifold.
• Fixed length simple instruction with less
addressing complexity
– Less complex decoder
• saves space in the IC.
– More on-chip registers.
– Bigger on-chip cache
– allows h/w control instead of microprogramming
– Single cycle instruction execution
• simplified pipeline
Berkeley RISC I : Features
One of the early projects endorsing the advantages of RISC
architecture was Berkeley RISC I. Features of RISC I are:
• 32 bit CPU
• 32 bit address
• 8, 16 or 32 bit data
• 32 bit instruction; three instruction formats only.
• 31 Instructions
• 3 addressing modes (Register, Immediate and PC relative)
• 138 CPU registers (32 are active at a time – called a window)
• load store machine
• Overlapped register window for faster subroutine call and return
RISC Processor contd.
One out of 8 instructions in HLL is a call/ret, and that has a
very high execution overhead due to stack (memory)
access. Moreover, the standard technique of stack based
parameter passing also calls for more memory access. Compared
to all other instruction types, this overhead is the highest in terms
of the number of instructions in the low level code and the
corresponding memory references.
Berkeley RISC I : Instruction Format
Instruction Formats:
OR Rs, S2, Rd    Rd ← Rs ∨ S2    OR
Berkeley RISC I : Instruction Set
Op-code operand Register Transfer Description
PROGRAM CONTROL INSTRUCTIONS:
CALLR Rd, Y    CWP--; Rd ← PC; Next PC ← PC + Y    CALL SUBROUTINE AND CHANGE WINDOW (REL)
Berkeley RISC I : Instruction Set
Most notable absence in the instruction set is the MOV
instructions (Register to Register). However, an innovative
idea of storing a 0 (zero) always in R0 solves the problem.
Berkeley RISC I : Register Window Scheme
• Calls (and corresponding returns) are frequent: about one
in every 7 or 8 instructions in HLL. They are also the highest in the no. of
instructions (#33) and the memory references (#45).
• Reducing this overhead improves the overall performance.
Berkeley RISC I : Register Window Scheme
• Pushing/Popping return address into/from stack during
call/ret and STACK based parameter passing can be
avoided by doing the same through CPU registers provided a
lot of registers can be made available right inside the CPU.
• Simpler instruction decoder and control lead to space saving
and in RISC I architecture a lot of CPU registers could be
provided.
• Nesting depth of call/ret rarely exceeds 8, so an 8-window
circular overlapped register file is made available in RISC
I to reduce memory references and speed up parameter passing.
Berkeley RISC I : Register Window Scheme
Salient points:
• RISC I provides 138 registers in the CPU and at any point of
time 32 of them are active.
– These are used by the function currently in execution:
» 10 Global registers are in use (always)
» 10 registers local to the function
» 6 registers (common to the ‘caller’ and the ‘called’ function at any
level)
Berkeley RISC I : Register Window Scheme
• The scheme is shown in the figure.
• For 8 windows the requirement is
10 + 10 x 8 + 8 x 6 = 138 registers.
[Figure: overlapped register windows for PROC A and PROC B. R0-R9: 10 global registers; R10-R15: registers common to both B & A; R16-R25: registers local to A.]
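The register count and the overlap between adjacent windows can be verified with a small model. This is a sketch under stated assumptions: the constants come from the slides, but the mapping function, its name, and the numbering of the second overlapped group as R26-R31 are assumptions made for illustration.

```python
GLOBALS, LOCALS, OVERLAP, WINDOWS = 10, 10, 6, 8
PER_WINDOW = LOCALS + OVERLAP  # registers each window adds to the file


def total_registers():
    """10 global + 8 windows x (10 local + 6 overlapped) registers."""
    return GLOBALS + WINDOWS * PER_WINDOW


def visible_registers():
    """Globals, locals and the two overlapped groups seen by one function."""
    return GLOBALS + LOCALS + 2 * OVERLAP


def physical(window, r):
    """Map logical register r (0-31) in the given window to a physical index."""
    if r < GLOBALS:
        return r  # globals are shared by every window
    return GLOBALS + (window * PER_WINDOW + (r - GLOBALS)) % (WINDOWS * PER_WINDOW)


print(total_registers(), visible_registers())  # 138 32
# A call decrements the window pointer: the caller's R10-R15 in window 3
# are the same physical registers as the callee's R26-R31 in window 2.
print(physical(3, 10) == physical(2, 26))
```

Since the windows are circular, the same overlap holds between window 0 and window 7.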
RISC
The ARM (Advanced RISC Machine) Processor
THE ARM PROCESSOR IS SPECIALLY DESIGNED TO BE SMALL TO
REDUCE POWER REQUIREMENT (TO BE SOURCED FROM THE
BATTERY) AND IS IDEAL FOR EMBEDDED AND MOBILE
APPLICATIONS.
• High code density
• Focus is not on raw processing speed but higher system performance
• Embed specialized peripherals right onto the chip
• Allow the use of low cost memory to reduce system cost
The ARM Instruction Set
The ARM instruction set differs from pure RISC instructions in certain ways as
depicted below.
The ARM Dataflow
[Figure: ARM core dataflow. Labels: DATA, Instruction Decoder, Sign Extension, buses A and B, registers Rn, Rm, Acc and r15, Barrel Shifter, Multiply & Accumulate, ALU, ADDRESS.]
• The instructions and data come from the memory. Instructions are passed to the
instruction decoder (ID) and the data is passed through the sign-extender. The processed
data is also sent back through the same bus.
• ARM, being a RISC machine, follows the LOAD/STORE principle and the
operations are carried out in the ALU (typical 3-address instructions are
used; <op> Rd, Rn, Rm; Rd ← Rn <op> Rm). The result goes back to
one of the registers (Rd). One of the operands (Rm) is passed through
a barrel shifter, which optionally shifts the operand before submitting the same
to the ALU.
• The ALU or MAC takes the operands and the result from the data
processing instruction is written back to Rd.
• The address bus is driven by R15 (the PC) for instruction fetch.
LOAD/STORE instructions use the ALU to generate an address to be held
by the address register and placed on the address bus.
The ARM Registers and Modes
• The ARM uses 16 registers r0 to r15. r0 to r13 are fully orthogonal and
available for all instructions. They hold both the data and address. r13 is
the sp, r14 is the link register holding the return address and r15 is the
PC.
• Two more registers, namely cpsr (current program status register) and
spsr (saved program status register) are also used by ARM.
• The psr is 32 bits wide, holding the NZCV flags (b31, b30, b29 & b28). Bits 0 to 4
specify the MODE and b7 and b6 are the interrupt masks for normal
interrupt and fast interrupt operations. Bit b5 indicates the Thumb state.
• The ARM processor acts in one of 7 modes: i) user; ii) fast interrupt
request; iii) interrupt request; iv) supervisor; v) undefined; vi) abort and
vii) system mode. Other than the user mode, the rest are privileged and can
change the psr.
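The psr bit layout described above can be illustrated by decoding a sample value. This is a minimal sketch: the function name is hypothetical, and the 5-bit mode encodings are the standard ARM values, which are not given in these slides.

```python
# Standard ARM mode encodings for psr bits 4..0 (not listed in the slides).
MODES = {
    0b10000: "user",
    0b10001: "fast interrupt request",
    0b10010: "interrupt request",
    0b10011: "supervisor",
    0b10111: "abort",
    0b11011: "undefined",
    0b11111: "system",
}


def decode_psr(psr):
    """Split a 32-bit psr value into flags, interrupt masks, state and mode."""
    return {
        "N": bool((psr >> 31) & 1),
        "Z": bool((psr >> 30) & 1),
        "C": bool((psr >> 29) & 1),
        "V": bool((psr >> 28) & 1),
        "I": bool((psr >> 7) & 1),   # normal interrupt mask
        "F": bool((psr >> 6) & 1),   # fast interrupt mask
        "T": bool((psr >> 5) & 1),   # Thumb state
        "mode": MODES.get(psr & 0x1F, "unknown"),
    }


# Sample value: Z and C set, both interrupts masked, supervisor mode.
print(decode_psr(0x600000D3))
```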
The ARM Registers and Modes
A processor mode determines which registers are active and the access rights
to cpsr. A privileged mode allows full r/w access to cpsr, while a non-privileged
mode allows read access to the cpsr and r/w to the flags only.
• User mode (non-privileged): used for programs and applications
• Supervisor: the processor enters this mode after power-up. The OS kernel runs in this
mode
• System: similar to user mode but has full r/w access to cpsr
• Interrupt: moves to this mode on interrupt request
• Fast Interrupt: moves to this mode on fast interrupt request
• Abort: enters this mode on a failed attempt to access memory
• Undefined: enters this mode when an unknown instruction is encountered.
The Banked Registers
ARM INSTRUCTIONS