EP1200 Introduction to Computing Systems Engineering
Assembler
Outlook: from Hack to a “real” computer
The Hack CPU and computer hardware are as simple as possible
In reality, design issues include
• Memory hierarchy – memory access costs time and energy
• Specialized processors for specific tasks (graphics, floating-point arithmetic)
• Pipelining
– Several consecutive instructions are processed at the same time, in different stages (e.g., instruction decode and computation)
• Parallel processing
– One instruction is processed by several processors, or
– Several instructions are processed at once, if their order does not matter
• Communication inside the computer – how processors, memory, and I/O devices interact
– Buses and switches – a small network in itself
[Figure: from Hack to a “real” computer. Contributed by Rajesh Kothandapani, http://www.laynetworks.com]
Where we are at:
[Figure: the chain of abstract interfaces from human thought down to physics; each arrow names the step that connects one level to the next]
Software hierarchy
• Human Thought → Abstract design (Chapters 9, 12)
• H.L. Language & Operating Sys. → Compiler (Chapters 10 - 11)
• Virtual Machine → VM Translator (Chapters 7 - 8)
• Assembly Language → Assembler (Chapter 6)
Hardware hierarchy
• Machine Language → Computer Architecture (Chapters 4 - 5)
• Hardware Platform → Gate Logic (Chapters 1 - 3)
• Chips & Logic Gates → Electrical Engineering
• Physics
Why care about assemblers?
Because …
Assemblers are the first rung of the software hierarchy ladder
An assembler is a translator for a simple language, so it needs only simple programming tools
Writing an assembler = practice for writing compilers
Assembler example
For now, ignore all details!
The assembler translates the source code (left) into the target code (right), which the computer then executes.
Source code (example)             Target code
// Computes 1+...+RAM[0]
// and stores the sum in RAM[1]
@i                                0000000000010000
M=1                               1110111111001000
@sum                              0000000000010001
M=0                               1110101010001000
(LOOP)
@i                                0000000000010000
D=M                               1111110000010000
@R0                               0000000000000000
D=D-M                             1111010011010000
@WRITE                            0000000000010010
D;JGT                             1110001100000001
... // Etc.                       0000000000010000
                                  1111110000010000
                                  0000000000010001
                                  ...
The program translation challenge
Extract the program’s semantics from the source program, using the syntax rules of the source language
Re-express the program’s semantics in the target language, using the syntax rules of the target language
Assembler = simple translator
Translates each assembly instruction into one binary machine instruction
Handles symbols (e.g. i, sum, LOOP, …) – maintains a Symbol table <symbol, address>
Translating / assembling A-instructions
Symbolic: @value // Where value is either a non-negative decimal number
                 // or a symbol referring to such a number.
Binary: 0 v v v v v v v v v v v v v v v // The 15 v bits hold value (each v = 0 or 1)
Translation to binary:
• If value is a non-negative decimal number: simple, write it as a 15-bit binary number
• If value is a symbol (label or variable): get its address from the symbol table
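The numeric case can be sketched in a few lines of Python (the language suggested for the project). This is only an illustration: the function name assemble_a_instruction is an assumption and is not taken from the provided project code, and symbols are deliberately not handled here.

```python
def assemble_a_instruction(line: str) -> str:
    """Translate a numeric A-instruction such as '@16' into 16 binary digits.

    Hypothetical helper for illustration; symbolic values are not handled.
    """
    value = int(line[1:])            # drop the leading '@'
    if not 0 <= value < 2 ** 15:     # only 15 bits are available for the value
        raise ValueError(f"value out of range: {value}")
    return format(value, "016b")     # the leading 0 marks an A-instruction


print(assemble_a_instruction("@16"))  # 0000000000010000
```

The format specifier "016b" pads the binary number with zeros to 16 digits, so the leftmost bit is 0 for every value that passes the range check.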
Translating / assembling C-instructions
Symbolic: dest=comp;jump // Either the dest or jump fields may be empty.
                         // If dest is empty, the "=" is omitted;
                         // If jump is empty, the ";" is omitted.
Binary: 1 1 1 a c1 c2 c3 c4 c5 c6 d1 d2 d3 j1 j2 j3
        // comp = a c1 ... c6, dest = d1 d2 d3, jump = j1 j2 j3
Translating / assembling C-instructions
Translate the a,c,d,j bits:
Use the definitions in the tables
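A table lookup of this kind might look as follows in Python. This is a sketch under stated assumptions: the function and dictionary names are hypothetical, and the COMP dictionary is only an excerpt covering the mnemonics that appear in this lecture's listings. The DEST and JUMP dictionaries follow the Hack machine language specification.

```python
# Excerpt of the comp table (a c1..c6); NOT the complete table.
COMP = {
    "0": "0101010", "1": "0111111", "-1": "0111010",
    "D": "0001100", "A": "0110000", "M": "1110000",
    "D+A": "0000010", "D+M": "1000010", "D-M": "1010011",
    "M+1": "1110111", "M-1": "1110010",
}
# dest bits d1 d2 d3; the empty string stands for an omitted dest field.
DEST = {"": "000", "M": "001", "D": "010", "MD": "011",
        "A": "100", "AM": "101", "AD": "110", "AMD": "111"}
# jump bits j1 j2 j3; the empty string stands for an omitted jump field.
JUMP = {"": "000", "JGT": "001", "JEQ": "010", "JGE": "011",
        "JLT": "100", "JNE": "101", "JLE": "110", "JMP": "111"}


def parse_c_instruction(line: str):
    """Split 'dest=comp;jump' into its three fields; dest and jump may be empty."""
    dest, comp, jump = "", line, ""
    if "=" in comp:
        dest, comp = comp.split("=", 1)
    if ";" in comp:
        comp, jump = comp.split(";", 1)
    return dest, comp, jump


def assemble_c_instruction(line: str) -> str:
    """Look up each field and concatenate the bits into one 16-bit word."""
    dest, comp, jump = parse_c_instruction(line)
    return "111" + COMP[comp] + DEST[dest] + JUMP[jump]


print(assemble_c_instruction("M=1"))    # 1110111111001000
print(assemble_c_instruction("D;JGT"))  # 1110001100000001
```

Both printed words match the target code of the lecture example. An unknown mnemonic raises a KeyError here; a real assembler would probably report a friendlier error.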
The overall assembler logic
For each (real) command
• Parsing
– Break the command into its underlying fields (mnemonics)
• Code generation
– A-instruction: replace the symbolic reference (if any) with the corresponding memory address, using the symbol table
– C-instruction: for each field in the instruction, generate the corresponding binary code
– Assemble the translated binary codes into a complete 16-bit machine instruction
– Write the 16-bit instruction to the output file
Note that comment lines and label declarations (pseudo commands) generate no code
Typical symbolic Hack assembly code:
    @R0
    D=M
    @END
    D;JLE
    @counter
    M=D
    @SCREEN
    D=A
    @x
    M=D
(LOOP)
    @x
    A=M
    M=-1
    @x
    D=M
    @32
    D=D+A
    @x
    M=D
    @counter
    MD=M-1
    @LOOP
    D;JGT
(END)
    @END
    0;JMP
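The first step, finding the real commands in a source file, could be sketched like this in Python. The function name real_commands and the (kind, text) tuples are assumptions made for illustration; the provided project code may organize this differently.

```python
def real_commands(source: str):
    """Yield (kind, text) for every non-empty, non-comment line.

    kind is 'A' for @value, 'L' for a (LABEL) declaration, 'C' otherwise.
    Hypothetical helper; the tuple format is an assumption.
    """
    for raw in source.splitlines():
        line = raw.split("//", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue                           # blank or comment-only line
        if line.startswith("(") and line.endswith(")"):
            yield ("L", line[1:-1])            # pseudo command, generates no code
        elif line.startswith("@"):
            yield ("A", line[1:])
        else:
            yield ("C", line)


example = """
// Computes 1+...+RAM[0]
@i
M=1   // i = 1
(LOOP)
"""
print(list(real_commands(example)))
# [('A', 'i'), ('C', 'M=1'), ('L', 'LOOP')]
```

Label declarations are still reported (as kind 'L') so that a later stage can record them in the symbol table while skipping them during code generation.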
Handling symbols (aka symbol resolution)
Assembly programs can have many different symbols
Labels that mark destinations of goto commands (ROM)
• LOOP, END
Pre-defined variables that mark special memory locations (RAM)
• R0, SCREEN
User defined variables (RAM)
• counter, x
Symbols are maintained with the help of a symbol table.
Handling symbols: symbol table
Source code (example)
    // Computes 1+...+RAM[0]
    // and stores the sum in RAM[1]
 0  @i
 1  M=1     // i = 1
 2  @sum
 3  M=0     // sum = 0
(LOOP)
 4  @i      // if i>RAM[0] goto WRITE
 5  D=M
 6  @R0
 7  D=D-M
 8  @WRITE
 9  D;JGT
10  @i      // sum += i
11  D=M
12  @sum
13  M=D+M
14  @i      // i++
15  M=M+1
16  @LOOP   // goto LOOP
17  0;JMP
(WRITE)
18  @sum
19  D=M
20  @R1
21  M=D     // RAM[1] = the sum
(END)
22  @END
23  0;JMP
Symbol table
Predefined RAM locations, filled in before the assembler process:
    R0      0
    R1      1
    R2      2
    ...     ...
    R15     15
    SCREEN  16384
    KBD     24576
    SP      0
    LCL     1
    ARG     2
    THIS    3
    THAT    4
Labels:
    LOOP    4
    WRITE   18
    END     22
Variables:
    i       16
    sum     17
This symbol table is generated by the assembler, and used to translate the symbolic code into binary code.
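The predefined part of the table maps naturally onto a Python dictionary. A minimal sketch, assuming a hypothetical function name new_symbol_table; the entries themselves are the ones listed in the symbol table above.

```python
def new_symbol_table() -> dict:
    """Return a symbol table that holds only the predefined RAM locations."""
    table = {f"R{n}": n for n in range(16)}   # R0 .. R15
    table.update({
        "SP": 0, "LCL": 1, "ARG": 2, "THIS": 3, "THAT": 4,
        "SCREEN": 16384, "KBD": 24576,
    })
    return table


table = new_symbol_table()
print(table["R15"], table["SCREEN"], table["KBD"])  # 15 16384 24576
```

Labels and variables are then added to the same dictionary while the assembler processes the source file.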
The assembly process (detailed)
Initialization:
• Create the symbol table, include pre-defined symbols
First pass:
• Go through the source code without generating any code.
• For each (LABEL) add the pair <LABEL, n> to the symbol table (what is n?)
Second pass: march again through the source code, and translate each line:
• If the line is a C-instruction, simple
• If the line is @xxx where xxx is a number, simple
• If the line is @xxx and xxx is a symbol, use the symbol table
– If the symbol is already in the table, get its n; otherwise add a new <symbol, n> pair.
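The two passes can be sketched in Python as follows. The sketch assumes that n for a label is the ROM address of the instruction that follows it, and that new variables receive consecutive RAM addresses starting at 16; both assumptions match the symbol table example (LOOP 4, WRITE 18, END 22, i 16, sum 17). The function name and the input format (a list of already cleaned lines) are hypothetical.

```python
def build_symbol_table(commands, predefined):
    """Resolve labels (first pass) and variables (second pass).

    commands: cleaned source lines such as '@i', 'M=1', '(LOOP)'.
    predefined: dict of predefined symbols. Both formats are assumptions.
    """
    table = dict(predefined)

    # First pass: a label gets the address of the next real instruction.
    address = 0
    for cmd in commands:
        if cmd.startswith("("):
            table[cmd[1:-1]] = address     # label declarations generate no code
        else:
            address += 1

    # Second pass: any symbol still unknown is a new variable.
    next_variable = 16
    for cmd in commands:
        if cmd.startswith("@"):
            symbol = cmd[1:]
            if not symbol.isdigit() and symbol not in table:
                table[symbol] = next_variable
                next_variable += 1
    return table


PROGRAM = [
    "@i", "M=1", "@sum", "M=0",
    "(LOOP)", "@i", "D=M", "@R0", "D=D-M", "@WRITE", "D;JGT",
    "@i", "D=M", "@sum", "M=D+M", "@i", "M=M+1", "@LOOP", "0;JMP",
    "(WRITE)", "@sum", "D=M", "@R1", "M=D",
    "(END)", "@END", "0;JMP",
]
table = build_symbol_table(PROGRAM, {"R0": 0, "R1": 1})
print(table["LOOP"], table["WRITE"], table["END"])  # 4 18 22
print(table["i"], table["sum"])                     # 16 17
```

Labels have to be collected before variables are allocated. Otherwise a forward reference such as @WRITE, which appears before (WRITE) is declared, would be mistaken for a new variable.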
The result ...
The assembler translates each instruction on the left into the 16-bit code on the right; comments and label declarations generate no code.
Source Assembly code              Target Machine Language code
// Computes 1+...+RAM[0]
// and stores the sum in RAM[1]
@i                                0000000000010000
M=1 // i = 1                      1110111111001000
@sum                              0000000000010001
M=0 // sum = 0                    1110101010001000
(LOOP)
@i // if i>RAM[0] goto WRITE      0000000000010000
D=M                               1111110000010000
@R0                               0000000000000000
D=D-M                             1111010011010000
@WRITE                            0000000000010010
D;JGT                             1110001100000001
@i // sum += i                    0000000000010000
D=M                               1111110000010000
@sum                              0000000000010001
M=D+M                             1111000010001000
@i // i++                         0000000000010000
M=M+1                             1111110111001000
@LOOP // goto LOOP                0000000000000100
0;JMP                             1110101010000111
(WRITE)
@sum                              0000000000010001
D=M                               1111110000010000
@R1                               0000000000000001
M=D // RAM[1] = the sum           1110001100001000
(END)
@END                              0000000000010110
0;JMP                             1110101010000111
Assembler implementation
(Chapter 6 project)
Read Chapter 6 Assembler
Set up a high level programming environment (suggested: Python)
Project: Complete the provided Assembler implementation
Understand the provided incomplete code
Add: Parsing and code generation for dest in C-instructions
Add: Symbol table handling: address resolution for LABELs
Hand in Project 5 by April 7, 8:00.