Review
How is this cache different if…
- the block is 4 words?
- the index field is 12 bits?
[Figure: direct-mapped cache datapath. A 32-bit address is split into a 20-bit tag, a 10-bit index, and a 2-bit block offset; the index selects one of 1024 (valid, tag, data) entries, the stored tag is compared with the address tag to produce Hit, and a mux uses the offset to pick one of the block's four bytes as the 8-bit Data output.]
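Here is a small C sketch, not from the slides, of how the address fields shift when the block size or index width changes; the example address and helper function are made up for illustration:

#include <stdio.h>

/* Field extraction for a direct-mapped cache: a 32-bit address is split into
   [ tag | index | block offset ].  Baseline from the figure: 2 offset bits
   (4-byte blocks) and 10 index bits (1024 blocks), leaving a 20-bit tag. */
static void split(unsigned addr, int offset_bits, int index_bits, const char *label)
{
    unsigned offset = addr & ((1u << offset_bits) - 1);
    unsigned index  = (addr >> offset_bits) & ((1u << index_bits) - 1);
    unsigned tag    = addr >> (offset_bits + index_bits);
    printf("%-14s tag=%x index=%x offset=%x\n", label, tag, index, offset);
}

int main(void)
{
    unsigned addr = 0x12345678;           /* arbitrary example address */
    split(addr, 2, 10, "baseline:");      /* 20-bit tag, 10-bit index        */
    split(addr, 4, 10, "4-word block:");  /* offset grows to 4 bits, 18-bit tag */
    split(addr, 2, 12, "12-bit index:");  /* 4096 blocks, 18-bit tag         */
    return 0;
}

Both changes shrink the tag field; the larger block also changes how many bytes each data entry holds and which bits the mux uses.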
2-way set associative implementation
Compare a 2-way set-associative cache with a fully-associative cache:
- Only 2 comparators needed
- Cache tags are a little shorter too
- … deciding replacement?
[Figure: 2-way set-associative lookup. An m-bit address is split into an (m-k-n)-bit tag, a k-bit index, and an n-bit block offset; the index selects one of 2^k sets, each holding two (valid, tag, data) blocks of 2^n bytes. Two comparators check the address tag against both ways, and a 2-to-1 mux selects the matching way's data to produce Hit and Data.]
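As a rough software analogue (not from the slides), a 2-way set lookup checks both ways of the selected set; the names and sizes below are illustrative:

#include <stdbool.h>
#include <stdint.h>

#define NUM_SETS 1024   /* 2^k sets; illustrative size */
#define WAYS     2

typedef struct {
    bool     valid;
    uint32_t tag;
    uint32_t data;      /* one word per block, for simplicity */
} Line;

static Line cache[NUM_SETS][WAYS];

/* Hardware compares the tag against both ways in parallel (two comparators);
   the 2-to-1 mux corresponds to choosing which way's data to return. */
bool lookup(uint32_t tag, uint32_t index, uint32_t *data_out)
{
    for (int way = 0; way < WAYS; way++) {
        Line *line = &cache[index][way];
        if (line->valid && line->tag == tag) {  /* comparator per way */
            *data_out = line->data;             /* mux picks this way */
            return true;                        /* hit */
        }
    }
    return false;                               /* miss in both ways */
}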
Set associative caches are a general idea
By now you have noticed that a 1-way set associative cache is the same as a direct-mapped cache
Similarly, if a cache has 2^k blocks, a 2^k-way set associative cache would be the same as a fully-associative cache
[Figure: the same 8-block cache organized four ways: 1-way (direct mapped, 8 sets of 1 block each), 2-way (4 sets of 2 blocks each), 4-way (2 sets of 4 blocks each), and 8-way (fully associative, 1 set of 8 blocks).]
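A small sketch, with made-up numbers, of how the set index changes with associativity for the 8-block cache in the figure:

#include <stdio.h>

/* With num_blocks blocks and a given associativity, the cache has
   num_sets = num_blocks / associativity sets, and a block address maps to
   set = block_address % num_sets.  1-way gives the direct-mapped rule;
   8-way (one set) is fully associative. */
int main(void)
{
    const int num_blocks = 8;             /* as in the figure */
    const int ways[] = { 1, 2, 4, 8 };
    const unsigned block_address = 13;    /* arbitrary example block address */

    for (int i = 0; i < 4; i++) {
        int num_sets = num_blocks / ways[i];
        printf("%d-way: %d sets, block %u maps to set %u\n",
               ways[i], num_sets, block_address, block_address % num_sets);
    }
    return 0;
}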
Summary
Larger block sizes can take advantage of spatial
locality by loading data from not just one address,
but also nearby addresses, into the cache
Associative caches assign each memory address to a
particular set within the cache, but not to any
specific block within that set
Set sizes range from 1 (direct-mapped) to 2^k (fully associative)
Larger sets and higher associativity lead to fewer cache
conflicts and lower miss rates, but they also increase the
hardware cost
In practice, 2-way through 16-way set-associative caches
strike a good balance between lower miss rates and higher
costs
Next, we’ll talk more about measuring cache
performance, and also discuss the issue of writing
data to a cache
Four important questions
1. When we copy a block of data from main memory to the cache, where exactly should we put it?
2. How can we tell if a word is already in the cache, or if it has to be fetched from main memory first?
3. Eventually, the small cache memory might fill up. To load a new block from main RAM, we'd have to replace one of the existing blocks in the cache... which one?
4. How can write operations be handled by the memory system?
Previous lectures answered the first 3. Today, we consider the 4th!
Writing to a cache
Writing to a cache raises several additional issues
First, let’s assume that the address we want to write to
is already loaded in the cache. We’ll assume a
simple direct-mapped cache
Cache:                              Memory:
Index  V  Tag    Data               Address    Data
...                                 ...
110    1  11010  42803              1101 0110  42803
...                                 ...
If we write a new value to that address, we can store
the new data in the cache, and avoid an expensive
main memory access
Mem[214] = 21763
Cache:                              Memory:
Index  V  Tag    Data               Address    Data
...                                 ...
110    1  11010  21763              1101 0110  42803
...                                 ...
Inconsistent memory
But now the cache and memory contain different,
inconsistent data!
First Rule of Data Management: No inconsistent data
Second Rule: Don’t Even Think About Violating 1st Rule
How can we ensure that subsequent loads will return
the right value?
This is also problematic if other devices are sharing the
main memory, as in I/O or a multiprocessor system
Cache:                              Memory:
Index  V  Tag    Data               Address    Data
...                                 ...
110    1  11010  21763              1101 0110  42803
...                                 ...
Write-through caches
A write-through cache solves the inconsistency
problem by forcing all writes to update both the
cache and the main memory.
Mem[214] = 21763
Cache:                              Memory:
Index  V  Tag    Data               Address    Data
...                                 ...
110    1  11010  21763              1101 0110  21763
...                                 ...
This is simple to implement and keeps the cache and
memory consistent
Why might this not be so good?
The bad thing is that by forcing every write to go to main memory, we use up bandwidth between the cache and the memory.
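A minimal sketch of that policy in C, assuming a toy direct-mapped, word-addressed cache (the sizes and names are made up): on a write hit, the store updates both the cache line and main memory.

#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     valid;
    uint32_t tag;
    uint32_t data;
} Line;

#define NUM_LINES 8                 /* tiny illustrative cache */
static Line     cache[NUM_LINES];
static uint32_t memory[256];        /* toy word-addressed main memory */

/* Write-through store: on a hit the new value goes into the cache line AND
   straight through to main memory, so the two copies never disagree - at
   the cost of a memory access on every single store. */
void store_write_through(uint32_t addr, uint32_t value)
{
    uint32_t index = addr % NUM_LINES;
    uint32_t tag   = addr / NUM_LINES;

    if (cache[index].valid && cache[index].tag == tag)
        cache[index].data = value;  /* write hit: update the cached copy */

    memory[addr] = value;           /* always update main memory too */
}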
Write buffers
Write-through caches can result in slow writes, so processors
typically include a write buffer, which queues pending writes to
main memory and permits the CPU to continue …
[Figure: a buffer sits between a producer and a consumer.]
Buffers are commonly used when two devices run at different
speeds
If a producer generates data too quickly for a consumer to handle,
the extra data is stored in a buffer and the producer can continue
on with other tasks, without waiting for the consumer
Conversely, if the producer slows down, the consumer can
continue running at full speed as long as there is excess data in
the buffer
For us, the producer is the CPU and the consumer is the main
memory
Write buffers
Write-through caches can result in slow writes, so
processors typically include a write buffer, which
queues pending writes to main memory and permits
the CPU to continue …
[Figure: the CPU sends stores to a write buffer, which forwards them to main memory.]
Notice that the write buffer allows the CPU to continue before the write is complete, but write-through still has the problem that it uses up memory bandwidth
/* Example access pattern: row-major traversal of a[][]; the loop mostly
   reads memory, writing only the local sum. */
int sum_array_rows(int a[M][N])
{
    int i, j, sum = 0;
    for (i = 0; i < M; i++)
        for (j = 0; j < N; j++)
            sum += a[i][j];
    return sum;
}
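A minimal sketch of a write buffer as a small FIFO between the CPU and memory; the sizes, names, and interfaces below are assumptions for illustration, not a real design:

#include <stdbool.h>
#include <stdint.h>

#define BUF_SIZE 4              /* real write buffers hold only a few entries */

typedef struct { uint32_t addr, value; } PendingWrite;

static PendingWrite buffer[BUF_SIZE];
static int head, tail, count;

/* CPU side: enqueue the store and keep executing.  If the buffer is full,
   the CPU must stall until the memory system drains an entry. */
bool cpu_store(uint32_t addr, uint32_t value)
{
    if (count == BUF_SIZE)
        return false;                       /* buffer full: stall */
    buffer[tail] = (PendingWrite){ addr, value };
    tail = (tail + 1) % BUF_SIZE;
    count++;
    return true;
}

/* Memory side: drain one pending write when the memory bus is free. */
bool memory_drain(uint32_t *mem)
{
    if (count == 0)
        return false;                       /* nothing pending */
    mem[buffer[head].addr] = buffer[head].value;
    head = (head + 1) % BUF_SIZE;
    count--;
    return true;
}

The CPU only stalls when it produces writes faster than memory can drain them.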
Write-back caches
In a write-back cache, the memory is not updated until the
cache block needs to be replaced (e.g., when loading
data into a full cache set)
For example, we might write some data to the cache at
first, leaving it inconsistent with the main memory as
shown before
The cache block is marked “dirty” to indicate this inconsistency
Mem[214] = 21763
Cache:                                     Memory:
Index  V  Dirty  Tag    Data               Address    Data
...                                        1000 1110  1225
110    1  1      11010  21763              1101 0110  42803
...                                        ...
Subsequent reads to the same memory address will be
serviced by the cache, which contains the correct,
updated data
Finishing the write back
We don’t need to store the new value back to main
memory unless the cache block gets replaced
For example, on a read from Mem[142], which maps to
the same cache block, the modified cache contents
will first be written to main memory
Cache:                                     Memory:
Index  V  Dirty  Tag    Data               Address    Data
...                                        1000 1110  1225
110    1  1      11010  21763              1101 0110  21763
...                                        ...
Only then can the cache block be replaced with data
from address 142
Cache:                                     Memory:
Index  V  Dirty  Tag    Data               Address    Data
...                                        1000 1110  1225
110    1  0      10001  1225               1101 0110  21763
...                                        ...
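Putting the last two slides together, a rough C sketch (toy sizes and names, assuming a direct-mapped, word-addressed cache) of how the dirty bit drives the write-back:

#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     valid, dirty;
    uint32_t tag;
    uint32_t data;
} Line;

#define NUM_LINES 8
static Line     cache[NUM_LINES];
static uint32_t memory[256];        /* toy word-addressed main memory */

/* Write-back store, assuming the address already hits in the cache:
   only the cached copy is updated, and the line is marked dirty. */
void store_write_back(uint32_t index, uint32_t value)
{
    cache[index].data  = value;
    cache[index].dirty = true;      /* cache and memory now disagree */
}

/* Replacement: before a new block is loaded into the line, the old contents
   are written back to memory - but only if the dirty bit says they changed. */
void replace_line(uint32_t index, uint32_t new_tag, uint32_t new_data)
{
    Line *line = &cache[index];
    if (line->valid && line->dirty) {
        uint32_t old_addr = line->tag * NUM_LINES + index;  /* rebuild old address */
        memory[old_addr] = line->data;                      /* write back old data */
    }
    line->tag   = new_tag;          /* then bring in the new block */
    line->data  = new_data;
    line->valid = true;
    line->dirty = false;            /* the new copy matches memory */
}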
Write-back cache discussion
The advantage of write-back caches is that not every write operation needs to access main memory, as it does with a write-through cache
If a single address is frequently written to, then it
doesn’t pay to keep writing that data through to
main memory
If several bytes within the same cache block are
modified, they will only force one memory write
operation at write-back time
Write-back cache discussion
Each block in a write-back cache needs a dirty bit to
indicate whether or not it must be saved to main
memory before being replaced—otherwise we might
perform unnecessary writebacks
Notice that the penalty for the main memory access is not paid until some instruction following the write executes
In our example, the write to Mem[214] affected only the
cache
But the load from Mem[142] resulted in two memory
accesses: one to save data to address 214, and one to load
data from address 142
• The write can be “buffered” as was shown in write-through
Write misses
A second scenario is if we try to write to an address
that is not already contained in the cache; this is
called a write miss.
Let’s say we want to store 21763 into Mem[1101 0110]
but we find that address is not currently in the cache.
Cache:                              Memory:
Index  V  Tag    Data               Address    Data
...                                 ...
110    1  00010  123456             1101 0110  6378
...                                 ...
When we update Mem[1101 0110], should we also load
it into the cache?
Write-around caches == write-no-allocate
With a write-around policy, the write operation goes directly to main memory without affecting the cache
Mem[214] = 21763
Cache:                              Memory:
Index  V  Tag    Data               Address    Data
...                                 ...
110    1  00010  123456             1101 0110  21763
...                                 ...
This is good when data is written but not immediately used again, in which case there's no point in loading it into the cache yet, as in this initialization loop:
for (int i = 0; i < SIZE; i++)
    a[i] = i;
Allocate on write
An allocate on write strategy would instead load the
newly written data into the cache
Mem[214] = 21763
Cache:                              Memory:
Index  V  Tag    Data               Address    Data
...                                 ...
110    1  11010  21763              1101 0110  21763
...                                 ...
If that data is needed again soon, it will be available in
the cache
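A rough side-by-side sketch of the two write-miss policies, with made-up sizes and names (the allocate version is shown write-through just to keep the example short):

#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     valid;
    uint32_t tag;
    uint32_t data;
} Line;

#define NUM_LINES 8
static Line     cache[NUM_LINES];
static uint32_t memory[256];        /* toy word-addressed main memory */

/* Write-around (write-no-allocate): a write miss goes straight to memory
   and the cache is left untouched. */
void store_no_allocate(uint32_t addr, uint32_t value)
{
    memory[addr] = value;
}

/* Allocate-on-write (write-allocate): a write miss also brings the block
   into the cache, so a soon-following read of the same address will hit. */
void store_allocate(uint32_t addr, uint32_t value)
{
    uint32_t index = addr % NUM_LINES;

    memory[addr]       = value;
    cache[index].valid = true;
    cache[index].tag   = addr / NUM_LINES;
    cache[index].data  = value;
}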
Which is it?
Given the following trace of accesses, can you
determine whether the cache is write-allocate or
write-no-allocate?
Assume A and B are distinct, and can be in the cache
simultaneously.
Access     Result
Load A     Miss
Store B    Miss
Store A    Hit
Load A     Hit
Load B     Miss
Load B     Hit
Load A     Hit
Answer: write-no-allocate. On a write-allocate cache, the Store B miss would have brought B into the cache, so the Load B that follows would have been a hit rather than a miss.
First Observations
Split Instruction/Data caches:
Pro: No structural hazard between IF & MEM stages
• A single-ported unified cache stalls fetch during load or store
Con: Static partitioning of cache between instructions &
data
• Bad if working sets unequal: e.g., code/DATA or CODE/data
Cache Hierarchies:
Trade-off between access time & hit rate
• L1 cache can focus on fast access time (okay hit rate)
• L2 cache can focus on good hit rate (okay access time)
Such hierarchical design is another “big idea”
[Figure: CPU → L1 cache → L2 cache → Main Memory.]
Opteron Vital Statistics
[Figure: CPU → L1 cache → L2 cache → Main Memory.]
L1 Caches: Instruction & Data
64 kB
64 byte blocks
2-way set associative
2 cycle access time
L2 Cache:
1 MB
64 byte blocks
4-way set associative
16 cycle access time (total, not just miss penalty)
Memory
200+ cycle access time
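As a rough worked example (the 2 / 16 / 200 cycle figures come from this slide, but the miss rates are assumptions for illustration, not measured Opteron numbers), the average memory access time of such a hierarchy can be estimated as:

#include <stdio.h>

/* AMAT = L1 hit time + L1 miss rate * (L2 access time + L2 miss rate * memory time) */
int main(void)
{
    double l1_time = 2.0, l2_time = 16.0, mem_time = 200.0;   /* cycles, from the slide */
    double l1_miss = 0.05, l2_miss = 0.20;                    /* assumed miss rates     */

    double amat = l1_time + l1_miss * (l2_time + l2_miss * mem_time);
    printf("AMAT = %.2f cycles\n", amat);   /* 2 + 0.05 * (16 + 0.2 * 200) = 4.8 */
    return 0;
}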