
Lec13 Memory 1 Notes


CENG 3420

Computer Organization & Design


Lecture 13: Memory Organization-1
Bei Yu
CSE Department, CUHK
byu@cse.cuhk.edu.hk

(Textbook: Chapters 5.1–5.2 & A.8–A.9)

Spring 2022
Introduction
Review: Major Components of a Computer

[Figure: major components of a computer. Processor (Control, Datapath); Memory (Cache, Main Memory, Secondary Memory (Disk)); Devices (Input, Output).]

Why Do We Need Memory?

Combinational Circuit:
• Always gives the same output for a given set of inputs

• E.g., adders

Sequential Circuit:
• Stores information

• Output depends on stored information

• E.g., counter

• Needs a storage element

Who Cares About the Memory Hierarchy?

[Figure: Processor-DRAM Memory Performance Gap. Performance (log scale, 1 to 1000) vs. Time (1980 to 2000). Processor (CPU) curve: 60%/yr. (2x/1.5 yr), "Moore's Law". DRAM curve: 9%/yr. (2x/10 yrs). Processor-Memory Performance Gap: grows 50% / year.]

Memory System Revisited

• Maximum size of memory is determined by addressing scheme

E.g., 16-bit addresses can only address 2^16 = 65536 memory locations

• Most machines are byte-addressable


• each memory address location refers to a byte
• Most machines retrieve/store data in words
• Common abbreviations
• 1k ≈ 2^10 (kilo)
• 1M ≈ 2^20 (Mega)
• 1G ≈ 2^30 (Giga)
• 1T ≈ 2^40 (Tera)
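
The relation between address width and memory size can be checked with a few lines of Python. This is only an illustrative sketch: the function names and the 32-bit example are assumptions, not part of the lecture.

```python
# Powers of two behind the abbreviations used in the slides.
KILO, MEGA, GIGA, TERA = 2**10, 2**20, 2**30, 2**40


def addressable_locations(address_bits: int) -> int:
    """Number of distinct locations a k-bit address can select (2^k)."""
    return 2 ** address_bits


def memory_size_bytes(address_bits: int, bytes_per_location: int = 1) -> int:
    """Maximum memory size; bytes_per_location=1 models a byte-addressable machine."""
    return addressable_locations(address_bits) * bytes_per_location


if __name__ == "__main__":
    # 16-bit addresses: 65536 locations, i.e. 64K
    print(addressable_locations(16), addressable_locations(16) // KILO)
    # Hypothetical 32-bit byte-addressable machine: size in units of G bytes
    print(memory_size_bytes(32) // GIGA)
```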

Simplified View
Data transfer takes place through
• MAR: memory address register
• MDR: memory data register

[Figure: processor-memory connection. The processor's MAR drives a k-bit address bus and its MDR connects to an n-bit data bus; control lines carry R/W, MFC, etc. Memory has up to 2^k addressable locations; word length = n bits. Several addressable locations (bytes) are grouped into a word.]

Big Picture

Processor usually runs much faster than main memory:


• Small memories are fast, large memories are slow.
• Use a cache memory to store data in the processor that is likely to be used.

Main memory is limited:


• Use virtual memory to increase the apparent size of physical memory by moving
unused sections of memory to disk (automatically).
• A translation between virtual and physical addresses is done by a memory
management unit (MMU)
• To be discussed in later lectures

Characteristics of the Memory Hierarchy

[Figure: levels of the memory hierarchy, with distance from the processor (in access time) increasing downward: Processor, L1$, L2$, Main Memory, Secondary Memory. Transfer units between adjacent levels: 4-8 bytes (word); 8-32 bytes (block); 1 to 4 blocks; 1,024+ bytes (disk sector = page). Inclusive: what is in L1$ is a subset of what is in L2$, which is a subset of what is in MM, which is a subset of what is in SM. The figure also shows the (relative) size of the memory at each level.]

Memory Hierarchy: Why Does it Work?

Temporal Locality (locality in time)


If a memory location is referenced then it will tend to be referenced again soon

• Keep most recently accessed data items closer to the processor

Spatial Locality (locality in space)


If a memory location is referenced, the locations with nearby addresses will tend to be
referenced soon

• Move blocks consisting of contiguous words closer to the processor
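
The effect of both kinds of locality can be seen in a toy cache model. The sketch below is an assumption-laden illustration, not the lecture's design: it models a tiny fully associative cache with least-recently-used replacement, and the block size, block count, and access patterns are all hypothetical.

```python
from collections import OrderedDict


def hit_rate(addresses, block_size=8, num_blocks=4):
    """Fraction of accesses that hit in a toy LRU cache (assumed organization)."""
    cache = OrderedDict()  # keys are block numbers, least recently used first
    hits = 0
    for addr in addresses:
        block = addr // block_size  # nearby addresses share one block
        if block in cache:
            hits += 1
            cache.move_to_end(block)  # mark as most recently used
        else:
            cache[block] = None  # bring the whole block in on a miss
            if len(cache) > num_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(addresses)


if __name__ == "__main__":
    sequential = list(range(64))              # spatial locality
    repeated = [0, 100, 200] * 20             # temporal locality
    scattered = [i * 64 for i in range(64)]   # neither kind of locality
    for name, trace in [("sequential", sequential),
                        ("repeated", repeated),
                        ("scattered", scattered)]:
        print(name, hit_rate(trace))
```

With these hypothetical traces, the sequential and repeated patterns hit most of the time, while the scattered pattern never hits because every access touches a new block.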

Memory Hierarchy

Taking advantage of the principle of locality:


• Present the user with as much memory as is available in the cheapest technology.
• Provide access at the speed offered by the fastest technology

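[Figure: memory hierarchy, from the processor outward: Registers and On-Chip Cache (inside the processor, alongside Control and Datapath), Second Level Cache (SRAM), Main Memory (DRAM), Secondary Storage (Disk), Tertiary Storage (Tape). Speed, nearest to farthest: ~1 ns, tens of ns, hundreds of ns to 1 us, tens of ms, tens of sec. Size (bytes), nearest to farthest: Hundreds, Mega's, Giga's, Tera's.]
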
https://youtu.be/p3q5zWCw8J4

Terminology

Random Access Memory (RAM)


Property: comparable access time for any memory location

Block (or line)


the minimum unit of information that is present (or not) in a cache

Terminology

• Hit Rate: the fraction of memory accesses found in a level of the memory hierarchy
• Miss Rate: the fraction of memory accesses not found in a level of the memory
hierarchy, i.e. 1 - (Hit Rate)

Hit Time
Time to access the block + Time to determine hit/miss

Miss Penalty
Time to replace a block in that level with the corresponding block from a lower level

Hit Time << Miss Penalty
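
One common way to combine these terms is the average access time, hit time + miss rate × miss penalty. The slides do not state this formula, and the numbers below (a 1-cycle hit and a 100-cycle miss penalty) are hypothetical; the sketch only illustrates why a small miss rate matters when Hit Time << Miss Penalty.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average time per access: every access pays hit_time, misses also pay the penalty."""
    return hit_time + miss_rate * miss_penalty


if __name__ == "__main__":
    # Hypothetical values, in clock cycles.
    for miss_rate in (0.01, 0.05, 0.20):
        cycles = average_access_time(hit_time=1, miss_rate=miss_rate, miss_penalty=100)
        print(f"miss rate {miss_rate:.2f}: {cycles:.1f} cycles on average")
```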

Bandwidth vs. Latency

Example
• Mary acts FAST but she’s always LATE.
• Peter is always PUNCTUAL but he is SLOW.

Bandwidth:
• the number of bits or bytes transferred per second when a block of data is transferred steadily.
Latency:
• the amount of time to transfer the first word of a block after issuing the access signal.
• Usually measured in clock cycles or in ns/µs.

Question:
Suppose the clock rate is 500 MHz. What is the latency and what is the bandwidth,
assuming that each data item is 64 bits?

[Timing diagram: Clock, Row Access Strobe, and Data (d0, d1, d2).]

• clock period = 1 / (500 MHz) = 2.0 × 10^-9 second
• latency = 5 cycles = 10^-8 second
• bandwidth = 8 bytes / (2 × 10^-9 second) = 4 × 10^9 bytes / second
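
The arithmetic in this answer can be reproduced with a short script. It is a sketch under the example's own assumptions: a 500 MHz clock, a 5-cycle latency read from the timing diagram, and one 64-bit data item delivered per clock cycle. The function names are illustrative.

```python
def latency_seconds(latency_cycles, clock_hz):
    """Time until the first data item arrives, given a latency in clock cycles."""
    return latency_cycles / clock_hz


def bandwidth_bytes_per_second(bits_per_item, clock_hz, items_per_cycle=1):
    """Steady-state transfer rate, assuming items_per_cycle items arrive each cycle."""
    bytes_per_item = bits_per_item / 8
    return bytes_per_item * items_per_cycle * clock_hz


if __name__ == "__main__":
    clock_hz = 500e6  # 500 MHz, so one cycle takes 2 ns
    print("cycle time (s):", 1 / clock_hz)
    print("latency (s):", latency_seconds(5, clock_hz))
    print("bandwidth (bytes/s):", bandwidth_bytes_per_second(64, clock_hz))
```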

Information Storage
Storage based on Feedback

• What if we add feedback to a pair of inverters?


• Usually drawn as a ring of cross-coupled inverters


• Stable way to store one bit of information (as long as power is supplied)

How to change the value stored?

• Replace inverter with NOR gate


• SR-Latch

QUESTION:
What’s the Q value based on different R, S inputs?

• R=S=1:

• S=0, R=1:

• S=1, R=0:

• R=S=0:

How to remember?
• S: set
• R: reset

• R=S=1: not allowed; the output is not determined
• S=0, R=1: reset the stored value to 0
• S=1, R=0: set the stored value to 1
• R=S=0: latch holds its current value
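
The behaviour summarized above can be checked with a small simulation of two cross-coupled NOR gates. This is a minimal sketch, assuming the usual wiring in which Q = NOR(R, Q') and Q' = NOR(S, Q); the function names and the settling loop are illustrative and not taken from the slides.

```python
def nor(a, b):
    """Two-input NOR gate on 0/1 values."""
    return int(not (a or b))


def sr_latch(s, r, q, q_bar):
    """Apply inputs S and R to a NOR-based SR latch in state (q, q_bar)
    and return the state after the feedback loop settles."""
    for _ in range(4):  # a few passes are enough for this two-gate loop
        new_q = nor(r, q_bar)
        new_q_bar = nor(s, new_q)
        if (new_q, new_q_bar) == (q, q_bar):
            break  # outputs stopped changing
        q, q_bar = new_q, new_q_bar
    return q, q_bar


if __name__ == "__main__":
    state = (0, 1)  # start with a stored 0
    for s, r in [(1, 0), (0, 0), (0, 1), (0, 0)]:  # set, hold, reset, hold
        state = sr_latch(s, r, *state)
        print(f"S={s} R={r} -> Q={state[0]}")
    # R=S=1 forces both outputs to 0, so Q and Q' are no longer complementary.
    print("S=1 R=1 ->", sr_latch(1, 1, *state))
```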
