EECS 373
Design of Microprocessor-Based Systems
Prabal Dutta
University of Michigan
Lecture 4: Review, Simulation, ABI, and Memory-Mapped I/O
September 15, 2011
Announcements
• Homework 1 to be posted
– ARM Cortex Simulator
– Will test low-level understanding
– Intentionally poorly specified, but see:
– http://www.eecs.umich.edu/~prabal/teaching/eecs373-f11/roadmap.html
– Use the class email list for questions, discussion
– Discuss with classmates
– Think through solutions before looking at others' code
– Get started early! Your classmates will be depending
on you!
Outline
• Minute quiz
• Announcements
• Review
• Assembly, C, and the ABI
• Memory
• Memory-mapped I/O
What happens after a power-on-reset (POR)?
• On the ARM Cortex-M3
• SP and PC are loaded from the code (.text) segment
• Initial stack pointer
– LOC: 0x00000000
– POR: SP ← mem(0x00000000)
• Interrupt vector table
– Initial base: 0x00000004
– Vector table is relocatable
– Entries: 32-bit values
– Each entry is an address
– Entry #1: reset vector
• LOC: 0x00000004
• POR: PC ← mem(0x00000004)
• Execution begins

    .equ STACK_TOP, 0x20000800
    .text
    .syntax unified
    .thumb
    .global _start
    .type start, %function
_start:
    .word STACK_TOP, start
start:
    movs r0, #10
    ...
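The two loads performed at reset can be mimicked on a host machine. The following is a minimal sketch, assuming the start of memory is available as a little-endian byte image; the names `reset_state`, `read_word_le`, and `power_on_reset` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two registers loaded at power-on reset. */
struct reset_state {
    uint32_t sp;  /* initial stack pointer, from mem(0x00000000) */
    uint32_t pc;  /* reset vector, from mem(0x00000004) */
};

/* Read one 32-bit little-endian word from a byte image. */
static uint32_t read_word_le(const uint8_t *image, size_t offset)
{
    return (uint32_t)image[offset]
         | ((uint32_t)image[offset + 1] << 8)
         | ((uint32_t)image[offset + 2] << 16)
         | ((uint32_t)image[offset + 3] << 24);
}

/* Model the POR loads: SP <- mem(0x00000000), PC <- mem(0x00000004). */
static struct reset_state power_on_reset(const uint8_t *image, size_t size)
{
    struct reset_state s;
    assert(size >= 8);  /* the image must hold both words */
    s.sp = read_word_le(image, 0);
    s.pc = read_word_le(image, 4);
    return s;
}
```

Feeding this an image whose first word is STACK_TOP (0x20000800) yields that value in `sp`. The second word is whatever address the linker placed in the reset vector.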
Major elements of an Instruction Set Architecture
(registers, memory, word size, endianness, conditions, instructions, addressing modes)
[Figure: registers and memory, each 32 bits wide, both labeled with endianness. Example instructions: mov r0, #1; ld r1, [r0,#5] (accesses mem((r0)+5)); bne loop; subs r2, #1]
Instruction encoding
• Instructions are encoded in machine language opcodes
• Sometimes
– Distinguish opcodes from each other
– Necessary to decode opcodes and itemize arch state impacts
• How?
Instructions     Register Value (msb ... lsb)     Memory Value (LSB first, MSB second)
movs r0, #10     001|00|000|00001010              0a 20
movs r1, #0      001|00|001|00000000              00 21
ARMv7 ARM
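The table rows can be reproduced by packing the bit fields by hand. This is a minimal sketch, assuming only the 16-bit `movs Rd, #imm8` form shown in the table (fields 001|00|Rd|imm8); the function names `encode_movs_imm8` and `store_halfword_le` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Encode the 16-bit Thumb "movs Rd, #imm8" form from the table:
 * 001 | 00 | Rd (3 bits) | imm8 (8 bits). */
static uint16_t encode_movs_imm8(unsigned rd, unsigned imm8)
{
    assert(rd < 8 && imm8 < 256);  /* fields are 3 and 8 bits wide */
    return (uint16_t)(0x2000u | (rd << 8) | imm8);
}

/* Store a half-word the way little-endian memory holds it: LSB first. */
static void store_halfword_le(uint16_t hw, uint8_t out[2])
{
    out[0] = (uint8_t)(hw & 0xFFu);  /* least significant byte */
    out[1] = (uint8_t)(hw >> 8);     /* most significant byte */
}
```

Encoding `movs r0, #10` this way gives the register value 0x200A, which lands in memory as the bytes 0a 20, matching the first row of the table.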
Instruction encoding/decoding
• Thumb instructions are a sequence of half-word-aligned half-words
• Each Thumb instruction is either
– a 16-bit half-word in that stream
– a 32-bit instruction consisting of two half-words in that stream
• If bits [15:11] of the half-word being decoded take on any of the following values
– 0b11101
– 0b11110
– 0b11111
– then the half-word is the first half-word of a 32-bit instruction
– otherwise the half-word is a 16-bit instruction
• See ARM ARM A5.1, A5.5, A5-13
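The length rule above is small enough to write down directly. The sketch below is illustrative only: `is_first_half_of_32bit` and `count_instructions` are hypothetical names, and the stream is assumed to be already split into half-words.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if bits [15:11] are 0b11101, 0b11110, or 0b11111, i.e. the
 * half-word is the first half of a 32-bit Thumb instruction. */
static bool is_first_half_of_32bit(uint16_t halfword)
{
    unsigned top5 = halfword >> 11;  /* bits [15:11] */
    return top5 == 0x1Du || top5 == 0x1Eu || top5 == 0x1Fu;
}

/* Count instructions in a half-word stream by stepping 1 or 2 half-words. */
static size_t count_instructions(const uint16_t *stream, size_t n_halfwords)
{
    size_t i = 0, count = 0;
    while (i < n_halfwords) {
        i += is_first_half_of_32bit(stream[i]) ? 2 : 1;
        count++;
    }
    assert(i == n_halfwords);  /* stream must not end mid-instruction */
    return count;
}
```

Note that the second half-word of a 32-bit instruction is never tested on its own: the decoder skips it, so its top bits do not matter for finding instruction boundaries.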
Instruction encoding/decoding (class-level)
Instruction encoding/decoding (instruction-level)
Linker script

    OUTPUT_FORMAT("elf32-littlearm")
    OUTPUT_ARCH(arm)
    ENTRY(main)

    MEMORY
    {
        /* SmartFusion internal eSRAM */
        ram (rwx) : ORIGIN = 0x20000000, LENGTH = 64k
    }

    SECTIONS
    {
        .text :
        {
            . = ALIGN(4);
            *(.text*)
            . = ALIGN(4);
            _etext = .;
        } >ram
    }
    end = .;

• Specifies little-endian arm in ELF format.
• Specifies ARM CPU
• Should start executing at label named “main”
• We have 64k of memory starting at 0x20000000. You can read (r), write (w) and execute (x) out of it. We’ve named it “ram”
• “.” is a reference to the current memory location
• First align to a word (4 byte) boundary
• Place all sections that include .text at the start (* here is a wildcard)
• Define a label named _etext to be the current address.
• Put it all in the memory location defined by the ram memory location.
Some things to think about (TTTA)
• What instruction set? Thumb!
• What is conditional execution (ARM ARM, A4.1.2)?
• What are the side effects of instruction execution?
How does an assembly language program
get turned into an executable program image?
[Figure: toolchain flow. Assembly files (.s) go through as (assembler) to produce object files (.o); ld (linker), guided by the linker script (.ld) and its memory layout, produces the executable image file; objcopy produces the binary program file (.bin); objdump produces the disassembled code (.lst).]
Cheap trick: use asm() or __asm() macros to
sprinkle simple assembly in standard C code!

    #include <stdio.h>

    /* factorial() is not shown on the slide; a plain iterative version is assumed. */
    int factorial(int n) {
        int result = 1;
        while (n > 1)
            result *= n--;
        return result;
    }

    int main() {
        int i;
        int n;
        unsigned int input = 40, output = 0;
        for (i = 0; i < 10; ++i) {
            n = factorial(i);
            printf("factorial(%d) = %d\n", i, n);
        }
        __asm("nop\n");
        /* Operands are numbered outputs first: %0 is output, %1 is input. */
        __asm("mov r0, %1\n"
              "mov r3, #5\n"
              "udiv r0, r0, r3\n"
              "mov %0, r0\n"
              : "=r" (output)
              : "r" (input)
              : "cc", "r0", "r3" );
        __asm("nop\n");
        printf("%d\n", output);
    }

    $ arm-none-eabi-gcc \
        -mcpu=cortex-m3 \
        -mthumb main.c \
        -T generic-hosted.ld \
        -o factorial
    $ qemu-arm -cpu cortex-m3 \
        ./factorial
    factorial(0) = 1
    factorial(1) = 1
    factorial(2) = 2
    factorial(3) = 6
    factorial(4) = 24
    factorial(5) = 120
    factorial(6) = 720
    factorial(7) = 5040
    factorial(8) = 40320
    factorial(9) = 362880
    8

Answer: 40/5 = 8
How does a mixed C/Assembly program
get turned into an executable program image?
[Figure: toolchain flow. C files (.c) go through gcc (compile + link); assembly files (.s) go through as (assembler) to produce object files (.o); ld (linker), guided by the linker script (.ld), its memory layout, and library object files, produces the executable image file; objcopy produces the binary program file (.bin); objdump produces the disassembled code (.lst).]
Passing parameters via the stack
• Benefits?
• Drawbacks?
Passing parameters via the registers/stack
ABI Basic Rules
1. A subroutine must preserve the contents of the
registers r4-r11 and SP
2. Arguments are passed through r0 to r3
– If we need more, we put a pointer into memory in one of
the registers.
• We’ll worry about that later.
3. Return value is placed in r0
– r0 and r1 if 64 bits.
4. Allocate space on stack as needed. Use it as
needed.
– Put it back when done…
– Keep word aligned.
Other useful facts
• Stack grows down.
– And pointed to by “SP”
• Address we need to go back to in “LR”
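To make "stack grows down" concrete, here is a minimal host-side sketch of a descending stack of 32-bit words. It assumes the full-descending convention (SP points at the last word pushed); `stack_model`, `stack_init`, `push`, and `pop` are hypothetical names, and the array index stands in for a real address.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16u

/* Hypothetical model of a full-descending stack of 32-bit words. */
struct stack_model {
    uint32_t mem[STACK_WORDS];
    uint32_t sp;  /* index of the last pushed word; STACK_WORDS when empty */
};

static void stack_init(struct stack_model *s)
{
    s->sp = STACK_WORDS;  /* start just past the top of the region */
}

static void push(struct stack_model *s, uint32_t value)
{
    assert(s->sp > 0);      /* no room left: stack overflow */
    s->sp -= 1;             /* stack grows down: decrement SP first */
    s->mem[s->sp] = value;  /* then store at the new SP */
}

static uint32_t pop(struct stack_model *s)
{
    assert(s->sp < STACK_WORDS);  /* nothing to pop: stack underflow */
    return s->mem[s->sp++];       /* read at SP, then move SP back up */
}
```

Because every push and pop moves SP by exactly one word, SP stays word aligned, and a routine that pops everything it pushed leaves SP where it found it.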
And useful things for the example
• Assembly instructions
– add adds two values
– mul multiplies two values
– bx branch to register
A simple ABI routine
• int bob(int a, int b)
– returns a^2 + b^2
• Instructions you might need
– add adds two values
– mul multiplies two values
– bx branch to register
Same thing, but for no good reason using the
stack
• int bob(int a, int b)
– returns a^2 + b^2
Some disassembly

    int bob(int a, int b)
    {
        int x, y;
        x=a*a;
        y=b*b;
        x=x+y;
        return(x);
    }

    0x20000490 <bob>:     push  {r7}
    0x20000492 <bob+2>:   sub   sp, #20
    0x20000494 <bob+4>:   add   r7, sp, #0
    0x20000496 <bob+6>:   str   r0, [r7, #4]
    0x20000498 <bob+8>:   str   r1, [r7, #0]
    x=a*a;
    0x2000049a <bob+10>:  ldr   r3, [r7, #4]
    0x2000049c <bob+12>:  ldr   r2, [r7, #4]
    0x2000049e <bob+14>:  mul.w r3, r2, r3
    0x200004a2 <bob+18>:  str   r3, [r7, #8]
    y=b*b;
    0x200004a4 <bob+20>:  ldr   r3, [r7, #0]
    0x200004a6 <bob+22>:  ldr   r2, [r7, #0]
    0x200004a8 <bob+24>:  mul.w r3, r2, r3
    0x200004ac <bob+28>:  str   r3, [r7, #12]
    x=x+y;
    0x200004ae <bob+30>:  ldr   r2, [r7, #8]
    0x200004b0 <bob+32>:  ldr   r3, [r7, #12]
    0x200004b2 <bob+34>:  add   r3, r2
    0x200004b4 <bob+36>:  str   r3, [r7, #8]
    return(x);
    0x200004b6 <bob+38>:  ldr   r3, [r7, #8]
    }
    0x200004b8 <bob+40>:  mov   r0, r3
    0x200004ba <bob+42>:  add.w r7, r7, #20
    0x200004be <bob+46>:  mov   sp, r7
    0x200004c0 <bob+48>:  pop   {r7}
    0x200004c2 <bob+50>:  bx    lr
System Memory Map
Memory-mapped I/O
• The idea is really simple
– Instead of real memory at a given memory
address, have an I/O device respond.
• Example:
– Let’s say we want to have an LED turn on if we
write a “1” to memory location 5.
– Further, let’s have a button we can read (pushed
or unpushed) by reading address 4.
• If pushed, it returns a 1.
• If not pushed, it returns a 0.
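The LED and button example can be sketched in software as an address decoder sitting in front of ordinary memory. This is a host-side model only, not driver code: `system_model`, `bus_read`, and `bus_write` are hypothetical names, and what happens when the button address is written or the LED address is read is not specified in the example, so the sketch simply falls through to RAM in those cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BUTTON_ADDR 0x04u  /* reading here returns the button state */
#define LED_ADDR    0x05u  /* writing a 1 here turns the LED on */
#define MEM_SIZE    256u

/* Hypothetical system: ordinary RAM plus two devices on the same bus. */
struct system_model {
    uint8_t ram[MEM_SIZE];
    bool button_pushed;
    bool led_on;
};

/* A read of BUTTON_ADDR is answered by the button, not by RAM. */
static uint8_t bus_read(const struct system_model *sys, uint8_t addr)
{
    assert(sys != NULL);
    if (addr == BUTTON_ADDR)
        return sys->button_pushed ? 1 : 0;
    return sys->ram[addr];
}

/* A write to LED_ADDR is captured by the LED, not by RAM. */
static void bus_write(struct system_model *sys, uint8_t addr, uint8_t value)
{
    assert(sys != NULL);
    if (addr == LED_ADDR) {
        sys->led_on = (value & 1u) != 0;  /* only the LSB matters */
        return;
    }
    sys->ram[addr] = value;
}
```

From the program's point of view nothing changes: it still issues loads and stores. On real hardware those accesses are usually made through a pointer to a volatile type so the compiler does not optimize them away.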
Now…
• How do you get that to happen?
– We could just say “magic” but that’s not very
helpful.
– Let’s start by detailing a simple bus and hooking
hardware up to it.
• We’ll work on a real bus next time!
Basic example
• Discuss a basic bus protocol
– Asynchronous (no clock)
– Initiator and Target
– REQ#, ACK#, Data[7:0], ADS[7:0], CMD
• CMD=0 is read, CMD=1 is write.
• REQ# low means initiator is requesting something.
• ACK# low means target has done its job.
A read transaction
• Say initiator wants to read location 0x24
– Initiator sets ADS=0x24, CMD=0.
– Initiator then sets REQ# to low. (Why do we need a delay?
How much of a delay?)
– Target sees read request.
– Target drives data onto data bus.
– Target then sets ACK# to low.
– Initiator grabs the data from the data bus.
– Initiator sets REQ# to high, stops driving ADS and CMD
– Target stops driving data, sets ACK# to high terminating
the transaction
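The read handshake can be stepped through in software. The sketch below is a sequential model under simplifying assumptions: signals are plain struct fields, the target is a hypothetical 256-byte memory, and the setup delay asked about above is not modeled. The names `bus`, `target`, `target_step`, and `initiator_read` are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct bus {
    uint8_t ads;   /* ADS[7:0] */
    uint8_t data;  /* Data[7:0] */
    bool cmd;      /* false = read (CMD=0), true = write (CMD=1) */
    bool req_n;    /* REQ#: false (low) = initiator is requesting */
    bool ack_n;    /* ACK#: false (low) = target has done its job */
};

/* Hypothetical target: a small memory that answers reads. */
struct target {
    uint8_t mem[256];
};

/* One reaction of the target to the current bus state. */
static void target_step(struct target *t, struct bus *b)
{
    if (!b->req_n && !b->cmd && b->ack_n) {
        b->data = t->mem[b->ads];  /* drive data onto the data bus */
        b->ack_n = false;          /* then set ACK# low */
    } else if (b->req_n && !b->ack_n) {
        b->ack_n = true;           /* REQ# went high: end the transaction */
    }
}

static uint8_t initiator_read(struct bus *b, struct target *t, uint8_t addr)
{
    uint8_t value;
    b->ads = addr;        /* set ADS and CMD first */
    b->cmd = false;
    b->req_n = false;     /* then set REQ# low */
    target_step(t, b);    /* target sees the request and responds */
    assert(!b->ack_n);    /* data is only valid once ACK# is low */
    value = b->data;      /* grab the data */
    b->req_n = true;      /* set REQ# high */
    target_step(t, b);    /* target sets ACK# high */
    assert(b->ack_n);
    return value;
}
```

With 0x55 stored at location 0x24, `initiator_read(&b, &t, 0x24)` returns 0x55 and leaves both REQ# and ACK# high, ready for the next transaction.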
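Read transaction
[Timing diagram: ADS[7:0] goes from ?? to 0x24 and back to ??; Data[7:0] goes from ?? to 0x55 and back to ??; CMD, REQ#, and ACK# are shown as waveforms; time points are labeled A through I.]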
A write transaction
(write 0xF4 to location 0x31)
– Initiator sets ADS=0x31, CMD=1, Data=0xF4
– Initiator then sets REQ# to low.
– Target sees write request.
– Target reads data from data bus. (Just has to store in a register, need not
write all the way to memory!)
– Target then sets ACK# to low.
– Initiator sets REQ# to high & stops driving other lines.
– Target sets ACK# to high terminating the transaction
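The write handshake follows the same pattern, with the initiator driving the data lines. As a minimal sketch under simplifying assumptions (sequential model, no timing), the target below only latches the address and data into registers, as the steps allow; `bus`, `target`, `target_step`, and `initiator_write` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct bus {
    uint8_t ads;   /* ADS[7:0] */
    uint8_t data;  /* Data[7:0] */
    bool cmd;      /* false = read (CMD=0), true = write (CMD=1) */
    bool req_n;    /* REQ#: false (low) = initiator is requesting */
    bool ack_n;    /* ACK#: false (low) = target has done its job */
};

/* Hypothetical target that latches a write into registers. */
struct target {
    uint8_t latched_addr;
    uint8_t latched_data;
    bool has_data;
};

/* One reaction of the target to the current bus state. */
static void target_step(struct target *t, struct bus *b)
{
    if (!b->req_n && b->cmd && b->ack_n) {
        t->latched_addr = b->ads;   /* read address and data off the bus */
        t->latched_data = b->data;  /* storing in a register is enough */
        t->has_data = true;
        b->ack_n = false;           /* then set ACK# low */
    } else if (b->req_n && !b->ack_n) {
        b->ack_n = true;            /* REQ# went high: end the transaction */
    }
}

static void initiator_write(struct bus *b, struct target *t,
                            uint8_t addr, uint8_t value)
{
    b->ads = addr;        /* set ADS, CMD, and Data first */
    b->cmd = true;
    b->data = value;
    b->req_n = false;     /* then set REQ# low */
    target_step(t, b);    /* target sees the request and latches the data */
    assert(!b->ack_n);    /* wait for ACK# low before releasing */
    b->req_n = true;      /* set REQ# high, stop driving the other lines */
    target_step(t, b);    /* target sets ACK# high */
    assert(b->ack_n);
}
```

Calling `initiator_write(&b, &t, 0x31, 0xF4)` leaves 0xF4 latched for location 0x31 in the target and both handshake lines high.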
The push-button
(if ADS=0x04 write 0 or 1 depending on button)
[Figure: ADS[7:0] and REQ# feed decode logic that drives ACK# through a delay; Data[7:1] are driven with 0 and Data[0] with the button value (0 or 1).]
What about CMD?
The LED
(1 bit reg written by LSB of address 0x05)
[Figure: ADS[7:0] and REQ# feed decode logic that drives ACK# through a delay and clocks a D flip-flop, which controls the LED; the bus provides DATA[7:0].]
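The decode logic for the two devices can be expressed as boolean functions of the bus signals. The sketch below is one possible answer to the CMD question, not necessarily the intended one: it assumes the button responds only to reads and the LED register is clocked only on writes. The names `bus_signals`, `button_selected`, `button_data`, `led_selected`, `led_reg`, and `led_clock` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct bus_signals {
    uint8_t ads;   /* ADS[7:0] */
    uint8_t data;  /* DATA[7:0] */
    bool cmd;      /* false = read (CMD=0), true = write (CMD=1) */
    bool req_n;    /* REQ#: false (low) = request active */
};

/* Button is selected when ADS=0x04, REQ# is low, and CMD is a read. */
static bool button_selected(const struct bus_signals *b)
{
    assert(b != NULL);
    return b->ads == 0x04 && !b->req_n && !b->cmd;
}

/* Data[7:1] are driven with 0, Data[0] with the button state. */
static uint8_t button_data(bool pushed)
{
    return pushed ? 0x01 : 0x00;
}

/* LED is selected when ADS=0x05, REQ# is low, and CMD is a write. */
static bool led_selected(const struct bus_signals *b)
{
    assert(b != NULL);
    return b->ads == 0x05 && !b->req_n && b->cmd;
}

/* D flip-flop: q controls the LED; prev_clk remembers the last clock level. */
struct led_reg {
    bool q;
    bool prev_clk;
};

/* Latch DATA[0] on the rising edge of the select signal. */
static void led_clock(struct led_reg *r, const struct bus_signals *b)
{
    bool clk;
    assert(r != NULL);
    clk = led_selected(b);
    if (clk && !r->prev_clk)
        r->q = (b->data & 1u) != 0;
    r->prev_clk = clk;
}
```

Without the CMD term, a read of address 0x05 would clock the LED register with whatever happens to be on the data lines, and a write to address 0x04 would have both the initiator and the button driving Data[7:0].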
Questions?
Comments?
Discussion?