L02: Memory & Data I 2019
Memory, Data, & Addressing I
Roadmap
C:
    car *c = malloc(sizeof(car));
    c->miles = 100;
    c->gals = 17;
    float mpg = get_mpg(c);
    free(c);
Java:
    Car c = new Car();
    c.setMiles(100);
    c.setGals(17);
    float mpg = c.getMPG();
Assembly language:
    get_mpg:
        pushq %rbp
        movq %rsp, %rbp
        ...
        popq %rbp
        ret
Machine code:
    0111010000011000
    100011010000010000000010
    1000100111000010
    110000011111101000011111
OS:
Computer system:
Course topics:
▪ Memory & data
▪ Integers & floats
▪ x86 assembly
▪ Procedures & stacks
▪ Executables
▪ Arrays & structs
▪ Memory & caches
▪ Processes
▪ Virtual memory
▪ Memory allocation
▪ Java vs. C
Memory, Data, and Addressing
❖ Hardware - High Level Overview
❖ Representing information as bits and bytes
▪ Memory is a byte-addressable array
▪ Machine “word” size = address size = register size
❖ Organizing and addressing data in memory
▪ Endianness – ordering bytes in memory
❖ Manipulating data in memory using C
❖ Boolean algebra and bit-level manipulations
Hardware: Physical View
[Figure: photo of a motherboard with labels: CPU, (empty slot), Memory, I/O controller, Storage connections, USB…]
Hardware: Logical View
[Figure: CPU and Memory connected by a Bus, which also connects Disks, Net, USB, etc.]
Hardware: 351 View (version 0)
[Figure: CPU connected to Memory; the connection is marked “?”]
❖ The CPU executes instructions
❖ Memory stores data
How are data and instructions represented?
❖ Binary encoding!
▪ Instructions are just data
Aside: Why Base 2?
❖ Electronic implementation
▪ Easy to store with bi-stable elements
▪ Reliably transmitted on noisy and inaccurate wires
[Figure: voltage waveform for the bit sequence 0 1 0, with levels marked at 0.0V, 0.5V, 2.8V, and 3.3V]
❖ Other bases possible, but not yet viable:
▪ DNA data storage (base 4: A, C, G, T) is a hot topic
▪ Quantum computing
Binary Encoding Additional Details
❖ Because storage is finite in reality, everything is
stored as “fixed” length
▪ Data is moved and manipulated in fixed-length chunks
▪ Multiple fixed lengths (e.g. 1 byte, 4 bytes, 8 bytes)
▪ Leading zeros must now be included to “fill out” the fixed length
❖ Example: the “eight-bit” representation of the number 4 is 0b00000100
▪ Most Significant Bit (MSB): the leftmost bit
▪ Least Significant Bit (LSB): the rightmost bit
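The fixed-length idea above can be sketched in C. This is only an illustration: the helper name `to_binary8` is made up for this example and is not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write the 8-bit binary representation of value
   into out (8 digit characters plus a terminating NUL, 9 bytes total).
   Leading zeros are kept so the result always fills the fixed length. */
void to_binary8(uint8_t value, char out[9]) {
    assert(out != NULL);
    for (int bit = 7; bit >= 0; bit--) {
        /* bit 7 is the MSB (written first); bit 0 is the LSB (written last) */
        out[7 - bit] = ((value >> bit) & 1) ? '1' : '0';
    }
    out[8] = '\0';
}
```

Calling `to_binary8(4, buf)` fills `buf` with the digits 00000100, matching the example on the slide.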
Hardware: 351 View (version 0)
[Figure: CPU connected to Memory (connection marked “?”); instructions and data move between them]
❖ To execute an instruction, the CPU must:
1) Fetch the instruction
2) (if applicable) Fetch data needed by the instruction
3) Perform the specified computation
4) (if applicable) Write the result back to memory
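The four steps above can be sketched as a toy interpreter in C. Everything about this machine is an assumption made for illustration: the 256-byte memory, the 4-byte instruction layout, and the `OP_ADD` and `OP_HALT` opcodes are invented here and have nothing to do with real x86-64 encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy machine (NOT x86-64). Each instruction is 4 bytes:
   opcode, source address 1, source address 2, destination address. */
enum { OP_HALT = 0, OP_ADD = 1 };

/* Execute instructions starting at address pc until OP_HALT is fetched.
   Returns the number of OP_ADD instructions executed. */
int run(uint8_t memory[256], uint8_t pc) {
    int executed = 0;
    for (;;) {
        assert(pc <= 252);  /* a whole instruction must fit in memory */

        /* 1) Fetch the instruction (instructions are just bytes in memory) */
        uint8_t opcode = memory[pc];
        uint8_t src1 = memory[pc + 1];
        uint8_t src2 = memory[pc + 2];
        uint8_t dst = memory[pc + 3];
        if (opcode == OP_HALT) {
            return executed;
        }
        assert(opcode == OP_ADD);

        /* 2) Fetch the data needed by the instruction */
        uint8_t a = memory[src1];
        uint8_t b = memory[src2];

        /* 3) Perform the specified computation */
        uint8_t result = (uint8_t)(a + b);

        /* 4) Write the result back to memory */
        memory[dst] = result;

        pc = (uint8_t)(pc + 4);  /* advance to the next instruction */
        executed++;
    }
}
```

Note that the program and its operands live in the same `memory` array, which is the sense in which instructions are just data.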
Hardware: 351 View (version 1)
[Figure: CPU (containing registers and an i-cache) connected to Memory; instructions and data move between them; the connection is labeled “take 469”]
❖ More CPU details:
▪ Instructions are held temporarily in the instruction cache
▪ Other data are held temporarily in registers
❖ Instruction fetching is hardware-controlled
❖ Data movement is programmer-controlled (assembly)
❖ We will start by learning about Memory
How does a program find its data in memory?
An Address Refers to a Byte of Memory
❖ Conceptually, memory is a single, large array of bytes,
each with a unique address (index)
▪ Each address is just a number represented in fixed-length binary
❖ Programs refer to bytes in memory by their addresses
▪ Domain of possible addresses = address space
▪ We can store addresses as data to “remember” where other data is in
memory
❖ But not all values fit in a single byte… (e.g. 351)
▪ Many operations actually use multi-byte values
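A small C sketch of these points, under the assumption that a C array of `unsigned char` is a fair stand-in for a stretch of memory. The helper names `bytes_between` and `fits_in_one_byte` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes separating two addresses.
   Assumes both point into the same array and that hi is not below lo. */
size_t bytes_between(const void *lo, const void *hi) {
    assert((uintptr_t)hi >= (uintptr_t)lo);
    return (size_t)((const unsigned char *)hi - (const unsigned char *)lo);
}

/* Hypothetical helper: 1 if value fits in a single unsigned byte
   (0 through 255), otherwise 0. */
int fits_in_one_byte(unsigned int value) {
    return value <= UINT8_MAX;
}
```

With `unsigned char mem[16];`, the addresses `&mem[0]` and `&mem[1]` differ by exactly 1, while neighboring `int` elements differ by `sizeof(int)`. The value 351 does not fit in one byte, which is why multi-byte values are needed.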
Peer Instruction Question
❖ If we choose to use 4-bit addresses, how big is our
address space?
▪ i.e. How much space can we “refer to” using our addresses?
A. 16 bits
B. 16 bytes
C. 4 bits
D. 4 bytes
E. We’re lost…
Machine “Words”
❖ Instructions encoded into machine code (0’s and 1’s)
▪ Historically (still true in some assembly languages), all
instructions were exactly the size of a word
❖ We have chosen to tie word size to address size/width
▪ word size = address size = register size
▪ word size = 𝑤 bits → 2^𝑤 addresses
❖ Current x86 systems use 64-bit (8-byte) words
▪ Potential address space: 2^64 addresses
2^64 bytes ≈ 1.8 × 10^19 bytes
= 18 billion billion bytes = 18 EB (exabytes)
▪ Actual physical address space: 48 bits
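The relationship between address width and address-space size can be checked with a short C sketch. The function name `address_count` is an assumption for this example; it is restricted to widths below 64 because 2^64 itself does not fit in a 64-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of distinct addresses available with
   w-bit addresses, which is 2^w. Only valid for w < 64 here, since
   the value 2^64 cannot be stored in a uint64_t. */
uint64_t address_count(unsigned w) {
    assert(w < 64);
    return (uint64_t)1 << w;
}
```

For instance, `address_count(32)` is 4294967296 (4 GiB of byte addresses), and `address_count(48)` is 281474976710656, which is 256 TiB.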
Word-Oriented Memory Organization
❖ Addresses still specify locations of bytes in memory
▪ Addresses of successive words differ by word size (in bytes): e.g. 4 (32-bit) or 8 (64-bit)
▪ Address of word 0, 1, … 10?
[Figure: byte addresses 0x00 through 0x0F in a column, grouped into 32-bit words (4 bytes each) and 64-bit words (8 bytes each); each word is labeled “Addr = ??”]
Address of a Word = Address of First Byte in the Word
❖ Address of word = address of first byte in word
▪ The address of any chunk of memory is given by the address of the first byte
▪ Alignment

Bytes (hex addr.)    32-bit word    64-bit word
0x00 to 0x03         Addr = 0000    Addr = 0000
0x04 to 0x07         Addr = 0004
0x08 to 0x0B         Addr = 0008    Addr = 0008
0x0C to 0x0F         Addr = 0012
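The word-address rule can be written out as a minimal C sketch. It assumes word 0 starts at address 0 and that word sizes are powers of two; the names `word_address` and `containing_word` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: address of word number index, assuming word 0
   starts at address 0 and every word is word_size bytes long. */
uint64_t word_address(uint64_t index, uint64_t word_size) {
    return index * word_size;
}

/* Hypothetical helper: address of the aligned word that contains
   byte_addr. word_size must be a power of two so that clearing the
   low bits rounds down to a multiple of word_size. */
uint64_t containing_word(uint64_t byte_addr, uint64_t word_size) {
    assert(word_size != 0 && (word_size & (word_size - 1)) == 0);
    return byte_addr & ~(word_size - 1);
}
```

With 4-byte words, word 3 is at address 12; with 8-byte words, word 10 is at address 80. Byte 0x0B belongs to the word at 0x08 in both views.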
A Picture of Memory (64-bit word view)
❖ A “64-bit (8-byte) word-aligned” view of memory:
▪ In this type of picture, each row is composed of 8 bytes (one word)
▪ Each cell is a byte
▪ A 64-bit pointer will fit on one row

Address   0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07   (byte offset within the row)
0x00
0x08      0x08 0x09 0x0A 0x0B 0x0C 0x0D 0x0E 0x0F   (each cell labeled with its byte address)
0x10
0x18
0x20
0x28
0x30
0x38
0x40
0x48
Addresses and Pointers
(64-bit example: pointers are 64 bits wide; big-endian)
❖ An address refers to a location in memory
❖ A pointer is a data object that holds an address
▪ Address can point to any data
❖ Value 504 stored at address 0x08
▪ 504 (base 10) = 1F8 (base 16) = 0x 00 ... 00 01 F8
❖ Pointer stored at 0x38 points to address 0x08

Address   Data (8 bytes per row)
0x00
0x08      00 00 00 00 00 00 01 F8
0x10
0x18
0x20
0x28
0x30
0x38      00 00 00 00 00 00 00 08
0x40
0x48
Addresses and Pointers
(64-bit example: pointers are 64 bits wide; big-endian)
❖ Pointer stored at 0x48 points to address 0x38
▪ Pointer to a pointer!
❖ Is the data stored at 0x08 a pointer?
▪ Could be, depending on how you use it

Address   Data (8 bytes per row)
0x00
0x08      00 00 00 00 00 00 01 F8
0x10
0x18
0x20
0x28
0x30
0x38      00 00 00 00 00 00 00 08
0x40
0x48      00 00 00 00 00 00 00 38
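The pointer-to-a-pointer picture can be mirrored in C. This is a sketch only: the function name `follow_twice` is made up, and the real addresses on any machine will differ from the 0x08, 0x38, and 0x48 used on the slide.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: follow a pointer to a pointer.
   Two dereferences are needed to reach the underlying value. */
long follow_twice(long **pp) {
    assert(pp != NULL && *pp != NULL);
    long *p = *pp;  /* first dereference: read the stored address */
    return *p;      /* second dereference: read the data at that address */
}
```

Given `long value = 504; long *p = &value; long **pp = &p;`, the variable `p` plays the role of the pointer at 0x38, `pp` plays the role of the pointer at 0x48, and `follow_twice(pp)` yields 504.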
Data Representations
❖ Sizes of data types (in bytes)
Java Data Type   C Data Type   32-bit (old)   x86-64
boolean          bool          1              1
byte             char          1              1
char             (none)        2              2
short            short int     2              2
int              int           4              4
float            float         4              4
(none)           long int      4              8
double           double        8              8
long             long long     8              8
(none)           long double   8              16
(reference)      pointer *     4              8

▪ For pointers: address size = word size
▪ To use “bool” in C, you must #include <stdbool.h>
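One way to check the C column of this table on a particular machine is to print `sizeof` for each type. The function name `print_type_sizes` is hypothetical, and the numbers printed depend on the compiler and platform, so no expected output is claimed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: print the size in bytes of each C type from the
   table. Results depend on the compiler and target platform. */
void print_type_sizes(FILE *out) {
    assert(out != NULL);
    fprintf(out, "bool        %zu\n", sizeof(bool));
    fprintf(out, "char        %zu\n", sizeof(char));
    fprintf(out, "short int   %zu\n", sizeof(short int));
    fprintf(out, "int         %zu\n", sizeof(int));
    fprintf(out, "float       %zu\n", sizeof(float));
    fprintf(out, "long int    %zu\n", sizeof(long int));
    fprintf(out, "double      %zu\n", sizeof(double));
    fprintf(out, "long long   %zu\n", sizeof(long long));
    fprintf(out, "long double %zu\n", sizeof(long double));
    fprintf(out, "pointer     %zu\n", sizeof(void *));
}
```

Calling `print_type_sizes(stdout)` on the machine of interest gives the values to compare against the x86-64 column.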
Memory Alignment
❖ Aligned: Primitive object of 𝐾 bytes must have an
address that is a multiple of 𝐾
▪ More about alignment later in the course
𝐾 Type
1 char
2 short
4 int, float
8 long, double, pointers
❖ For good memory system performance, Intel (x86)
recommends data be aligned
▪ However, x86-64 hardware will still work correctly if data is not aligned
• Design choice: x86-64 instructions are a variable number of bytes long
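The alignment definition reduces to a divisibility check, sketched below in C. The name `is_aligned` is an assumption for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 1 if address is a multiple of k, where k is the
   size in bytes of the primitive object; otherwise 0. */
int is_aligned(uintptr_t address, uintptr_t k) {
    assert(k != 0);
    return address % k == 0;
}
```

For example, address 0x14 is aligned for a 4-byte `int` but not for an 8-byte `long`, and every address is aligned for a 1-byte `char`.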
Byte Ordering
❖ How should bytes within a word be ordered in
memory?
▪ Example: store the 4-byte (32-bit) int:
0x a1 b2 c3 d4
❖ By convention, the ordering of bytes is called endianness
▪ The two options are big-endian and little-endian
• In which address does the least significant byte go?
• Based on Gulliver’s Travels: tribes cut eggs on different ends (big, little)
Byte Ordering
❖ Big-endian (SPARC, z/Architecture)
▪ Least significant byte has highest address
❖ Little-endian (x86, x86-64)
▪ Least significant byte has lowest address
❖ Bi-endian (ARM, PowerPC)
▪ Endianness can be specified as big or little
❖ Example: 4-byte data 0xa1b2c3d4 at address 0x100
Address         0x100  0x101  0x102  0x103
Big-Endian      a1     b2     c3     d4
Little-Endian   d4     c3     b2     a1
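The two orderings can be produced explicitly with shifts, independent of the machine running the code. This is a minimal sketch; the function names `store_big_endian` and `store_little_endian` are hypothetical, and `out[0]` stands in for the lowest address (0x100 in the example).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: lay out value big-endian, so the MOST significant
   byte goes at the lowest address (out[0]). */
void store_big_endian(uint32_t value, unsigned char out[4]) {
    assert(out != NULL);
    for (int i = 0; i < 4; i++) {
        out[i] = (unsigned char)(value >> (8 * (3 - i)));
    }
}

/* Hypothetical helper: lay out value little-endian, so the LEAST
   significant byte goes at the lowest address (out[0]). */
void store_little_endian(uint32_t value, unsigned char out[4]) {
    assert(out != NULL);
    for (int i = 0; i < 4; i++) {
        out[i] = (unsigned char)(value >> (8 * i));
    }
}
```

For 0xa1b2c3d4, the first function fills the buffer with a1 b2 c3 d4 and the second with d4 c3 b2 a1.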
Byte Ordering Examples
Decimal: 12345
Binary: 0011 0000 0011 1001
Hex: 3 0 3 9

int x = 12345; // or x = 0x3039;

Address   IA32, x86-64 (little-endian)   SPARC (big-endian)
0x00      39                             00
0x01      30                             00
0x02      00                             30
0x03      00                             39

long int y = 12345; // or y = 0x3039;
(A long int is the size of a word)

Address   32-bit IA32   64-bit x86-64   32-bit SPARC   64-bit SPARC
0x00      39            39              00             00
0x01      30            30              00             00
0x02      00            00              30             00
0x03      00            00              39             00
0x04                    00                             00
0x05                    00                             00
0x06                    00                             30
0x07                    00                             39
Peer Instruction Question:
❖ We store the value 0x 01 02 03 04 as a word at
address 0x100 in a big-endian, 64-bit machine
❖ What is the byte of data stored at address 0x104?
A. 0x04
B. 0x40
C. 0x01
D. 0x10
E. We’re lost…
Endianness
❖ Endianness only applies to memory storage
❖ Often the programmer can ignore endianness because it is handled for you
▪ Bytes are wired into the correct place when reading from or storing to memory (hardware)
▪ Compiler and assembler generate correct behavior (software)
❖ Endianness still shows up:
▪ Logical issues: accessing a different amount of data than you stored (e.g. store an int, access a byte as a char)
▪ Need to know exact values to debug memory errors
▪ Manual translation to and from machine code (in 351)
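The "store an int, access a byte" case can be observed directly in C. In this sketch the helper names are hypothetical, and the result of `first_byte_in_memory` depends on the machine running it, which is exactly the point.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 32-bit value, then read back only the
   byte at its lowest address. The answer depends on host endianness. */
unsigned char first_byte_in_memory(uint32_t value) {
    unsigned char bytes[sizeof value];
    assert(sizeof value == 4);  /* holds wherever a byte is 8 bits */
    memcpy(bytes, &value, sizeof value);
    return bytes[0];
}

/* Hypothetical helper: 1 on a little-endian host, 0 otherwise.
   The value 1 has its only nonzero byte in the least significant
   position, so that byte sits at the lowest address only on a
   little-endian machine. */
int host_is_little_endian(void) {
    return first_byte_in_memory(1) == 1;
}
```

On a little-endian host such as x86-64, `first_byte_in_memory(0x3039)` is 0x39; on a big-endian host it is 0x00.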
Summary
❖ Memory is a long, byte-addressed array
▪ Word size bounds the size of the address space and memory
▪ Different data types use different numbers of bytes
▪ The address of a chunk of memory is given by the address of the lowest byte in the chunk
▪ Object of 𝐾 bytes is aligned if it has an address that is a
multiple of 𝐾
❖ Pointers are data objects that hold addresses
❖ Endianness determines memory storage order for
multi-byte data