CO: Computer Organization
Cache Memory
Dr. Praveen Kumar Alapati
Dr.Praveen (Mahindra University) CO Memory 1 / 31
Dr.Praveen (Mahindra University) CO Memory 2 / 31
I Locality of reference: If a processor accesses some data now, the
same data or neighbouring data will be accessed in near feature.
I Temporal Locality: Accesses to the same memory location that
occur close together in time.
I Spacial Locality: Accesses to the memory locations that are close
together in space.
I 90% of Execution time is spent on 10% of the code.
Dr.Praveen (Mahindra University) CO Memory 3 / 31
Cache Memory
Cache memory is a small-sized volatile memory that provides high-speed
access to a processor and stores frequently used instructions and data.
Dr.Praveen (Mahindra University) CO Memory 4 / 31
Memory Access
Dr.Praveen (Mahindra University) CO Memory 5 / 31
Direct Mapping
Main Memory
Cache Memory
Size of a Block is : 64 bytes
MM Block ’i’ maps to Block (i mod 4) of CM.
Dr.Praveen (Mahindra University) CO Memory 6 / 31
Physical Addresses
Dr.Praveen (Mahindra University) CO Memory 7 / 31
Try to answer the following
I If word length is 64-bits, how many bytes in a word.
I If block size is 64 bytes, how many words in a block.
I If a cache size is 256 bytes, how many blocks in the cache.
I If main memory size is 64KB, how many blocks in the main memory.
I How many MMBs are mapped to single CMB?.
Dr.Praveen (Mahindra University) CO Memory 8 / 31
Direct Mapping (1-Way Set Associative Mapping)
Dr.Praveen (Mahindra University) CO Memory 9 / 31
Try to answer the following
I If word length is 32-bits, how many bytes in a word.
I If block size is 32 bytes, how many words in a block.
I If a cache size is 256KB, how many blocks in the cache.
I If main memory size is 4GB( Physical Address Space is 4GB) , how
many blocks in the main memory.
I How many MMBs are mapped to single CMB?.
Dr.Praveen (Mahindra University) CO Memory 10 / 31
Average Memory Access Time (AMAT)
1 Let ’h’ be the hit ratio of a cache, ’M’ be the time to access
information from main memory(MM), and ’C’ be the time to access
information in the cache.
AMAT = Tavg = hC + (1 − h)M
2 Let us assume that there are two computers A and B.
A has no cache, the processor takes 100ns (nano seconds) for each
memory access.
B has a cache with hit rate of 90% and its access time is 10ns. Each
memory access from MM takes 110ns.
3 Q1: Then How much time A and B take, to execute 100 instructions
(assume that no instruction requires a read or a write operation).
4 Q1: Then How much time A and B take, to execute 100 instructions
(assume that 30 instructions require a read or a write operation).
Dr.Praveen (Mahindra University) CO Memory 11 / 31
Average Memory Access Time (AMAT)
1 Let ’h’ be the hit ratio of a cache, ’M’ be the time to access
information from main memory(MM), and ’C’ be the time to access
information in the cache.
AMAT = Tavg = hC + (1 − h)M
2 Let us assume that there are two computers A and B.
A has no cache, the processor takes 100ns (nano seconds) for each
memory access.
B has a cache with hit rate of 90% and its access time is 10ns. Each
memory access from MM takes 110ns.
3 Q1: Then How much time A and B take, to execute 100 instructions
(assume that no instruction requires a read or a write operation).
4 Q1: Then How much time A and B take, to execute 100 instructions
(assume that 30 instructions require a read or a write operation).
Dr.Praveen (Mahindra University) CO Memory 11 / 31
Average Memory Access Time (AMAT)
1 Let ’h’ be the hit ratio of a cache, ’M’ be the time to access
information from main memory(MM), and ’C’ be the time to access
information in the cache.
AMAT = Tavg = hC + (1 − h)M
2 Let us assume that there are two computers A and B.
A has no cache, the processor takes 100ns (nano seconds) for each
memory access.
B has a cache with hit rate of 90% and its access time is 10ns. Each
memory access from MM takes 110ns.
3 Q1: Then How much time A and B take, to execute 100 instructions
(assume that no instruction requires a read or a write operation).
4 Q1: Then How much time A and B take, to execute 100 instructions
(assume that 30 instructions require a read or a write operation).
Dr.Praveen (Mahindra University) CO Memory 11 / 31
Average Memory Access Time (AMAT)
1 Let ’h’ be the hit ratio of a cache, ’M’ be the time to access
information from main memory(MM), and ’C’ be the time to access
information in the cache.
AMAT = Tavg = hC + (1 − h)M
2 Let us assume that there are two computers A and B.
A has no cache, the processor takes 100ns (nano seconds) for each
memory access.
B has a cache with hit rate of 90% and its access time is 10ns. Each
memory access from MM takes 110ns.
3 Q1: Then How much time A and B take, to execute 100 instructions
(assume that no instruction requires a read or a write operation).
4 Q1: Then How much time A and B take, to execute 100 instructions
(assume that 30 instructions require a read or a write operation).
Dr.Praveen (Mahindra University) CO Memory 11 / 31
Average Memory Access Time (AMAT)
1 Let ’h1’ be the hit ratio of L1 cache, ’h2’ be the hit ratio of L2 cache,
’M’ be a MM access time, ’C1’ be L1-CM access time, and ’C2’ be
L2-CM access time.
AMAT = Tavg = h1.C 1 + (1 − h1).h2.C 2 + (1 − h1).(1 − h2).M
2 Let us assume that there are three computers A,B, and C.
A has no cache, the processor takes 100ns (nano seconds) for each
memory access.
B has L1 cache with hit rate of 90% and its access time is 10ns. Each
memory access from MM takes 110ns.
C has L1 and L2 caches. L1 cache with hit rate of 90% and its access
time is 10ns. L2 cache with hit rate of 99% and its access time is
30ns Each memory access from MM takes 130ns.
3 Q1: Then How much time A,B and C take, to execute 100
instructions (assume that no instruction requires a read or a write
operation).
Dr.Praveen (Mahindra University) CO Memory 12 / 31
Average Memory Access Time (AMAT)
1 Let ’h1’ be the hit ratio of L1 cache, ’h2’ be the hit ratio of L2 cache,
’M’ be a MM access time, ’C1’ be L1-CM access time, and ’C2’ be
L2-CM access time.
AMAT = Tavg = h1.C 1 + (1 − h1).h2.C 2 + (1 − h1).(1 − h2).M
2 Let us assume that there are three computers A,B, and C.
A has no cache, the processor takes 100ns (nano seconds) for each
memory access.
B has L1 cache with hit rate of 90% and its access time is 10ns. Each
memory access from MM takes 110ns.
C has L1 and L2 caches. L1 cache with hit rate of 90% and its access
time is 10ns. L2 cache with hit rate of 99% and its access time is
30ns Each memory access from MM takes 130ns.
3 Q1: Then How much time A,B and C take, to execute 100
instructions (assume that no instruction requires a read or a write
operation).
Dr.Praveen (Mahindra University) CO Memory 12 / 31
Average Memory Access Time (AMAT)
1 Let ’h1’ be the hit ratio of L1 cache, ’h2’ be the hit ratio of L2 cache,
’M’ be a MM access time, ’C1’ be L1-CM access time, and ’C2’ be
L2-CM access time.
AMAT = Tavg = h1.C 1 + (1 − h1).h2.C 2 + (1 − h1).(1 − h2).M
2 Let us assume that there are three computers A,B, and C.
A has no cache, the processor takes 100ns (nano seconds) for each
memory access.
B has L1 cache with hit rate of 90% and its access time is 10ns. Each
memory access from MM takes 110ns.
C has L1 and L2 caches. L1 cache with hit rate of 90% and its access
time is 10ns. L2 cache with hit rate of 99% and its access time is
30ns Each memory access from MM takes 130ns.
3 Q1: Then How much time A,B and C take, to execute 100
instructions (assume that no instruction requires a read or a write
operation).
Dr.Praveen (Mahindra University) CO Memory 12 / 31
Average Memory Access Time (AMAT)
1 Let ’h1’ be the hit ratio of L1 cache, ’h2’ be the hit ratio of L2 cache,
’M’ be a MM access time, ’C1’ be L1-CM access time, and ’C2’ be
L2-CM access time.
AMAT = Tavg = h1.C 1 + (1 − h1).h2.C 2 + (1 − h1).(1 − h2).M
2 Let us assume that there are three computers A,B, and C.
A has no cache, the processor takes 100ns (nano seconds) for each
memory access.
B has L1 cache with hit rate of 90% and its access time is 10ns. Each
memory access from MM takes 110ns.
C has L1 and L2 caches. L1 cache with hit rate of 90% and its access
time is 10ns. L2 cache with hit rate of 99% and its access time is
30ns Each memory access from MM takes 130ns.
3 Q1: Then How much time A,B and C take, to execute 100
instructions (assume that no instruction requires a read or a write
operation).
Dr.Praveen (Mahindra University) CO Memory 12 / 31
Difference between Direct and 2-Way Set Associative Mappings
Dr.Praveen (Mahindra University) CO Memory 13 / 31
2-Way Set Associative Mapping
Dr.Praveen (Mahindra University) CO Memory 14 / 31
Try to answer the following
I If word length is 32-bits, how many bytes in a word.
I If block size is 32 bytes, how many words in a block.
I If main memory size is 4GB( Physical Address Space is 4GB) , how
many blocks in the main memory.
I If a cache size is 256KB, how many blocks in the cache.
I If a cache supports 2-way set associative, how many MMBs are
mapped to a single CMB?.
I If a cache supports 4-way set associative, how many MMBs are
mapped to a single CMB?.
Dr.Praveen (Mahindra University) CO Memory 15 / 31
4-way Mapping
Dr.Praveen (Mahindra University) CO Memory 16 / 31
Cache miss is a state where the data requested for processing is not
found in the cache memory.
Types of Cache Misses
1 Compulsory or Cold Misses: The first reference to a block of
memory, starting with an empty cache.
2 Capacity Misses: The cache is not big enough to hold every block
you want to use.
3 Conflict Misses: Two blocks are mapped to the same location and
there is not enough room to hold both.
Dr.Praveen (Mahindra University) CO Memory 17 / 31
FIFO Replacement Algorithm
Dr.Praveen (Mahindra University) CO Memory 18 / 31
Least Recently Used (LRU) Replacement Algorithm
Dr.Praveen (Mahindra University) CO Memory 19 / 31
Important Points
I Valid bit says whether the cache block has a valid data or not.
I Dirty bit (modify bit) says whether the contents of the cache
line/block are different to what are there in main memory.
I Inclusive Cache: L1 ⊂ L2 ⊂ L3
I Exclusive Cache: L1 ∩ L2 ∩ L3 = ∅
I Non-inclusive Cache : (L1 ∩ L2 = ∅) and ((L1 ∪ L2) ∩ L3 = L1 ∪ L2)
Dr.Praveen (Mahindra University) CO Memory 20 / 31
All Mappings
Dr.Praveen (Mahindra University) CO Memory 21 / 31
Interleaving
Interleaving is a technique for improving the speed of access.
Our objective - transfer data blocks from MM to CM.
Dr.Praveen (Mahindra University) CO Memory 22 / 31
Calculate the time needed to transfer a block of data from
the main memory to the cache when a read miss occurs.
Properties of H/W:
I Cache with 8-word blocks.
I On a read miss, the block contains the desired word must be copied
from the MM into the CM.
I One clock cycle to send a address to MM.
I First word is accessed in 8 clock cycles and subsequent words are
accessed in 4 clock cycles.
I Once clock cycle to send one word to the CM.
If consecutive words in a module, then how much time it
takes.
If consecutive words in consecutive modules, then how much
time it takes.
Dr.Praveen (Mahindra University) CO Memory 23 / 31
Calculate the time needed to transfer a block of data from
the main memory to the cache when a read miss occurs.
Properties of H/W:
I Cache with 8-word blocks.
I On a read miss, the block contains the desired word must be copied
from the MM into the CM.
I One clock cycle to send a address to MM.
I First word is accessed in 8 clock cycles and subsequent words are
accessed in 4 clock cycles.
I Once clock cycle to send one word to the CM.
If consecutive words in a module, then how much time it
takes.
If consecutive words in consecutive modules, then how much
time it takes.
Dr.Praveen (Mahindra University) CO Memory 23 / 31
Calculate the time needed to transfer a block of data from
the main memory to the cache when a read miss occurs.
Properties of H/W:
I Cache with 8-word blocks.
I On a read miss, the block contains the desired word must be copied
from the MM into the CM.
I One clock cycle to send a address to MM.
I First word is accessed in 8 clock cycles and subsequent words are
accessed in 4 clock cycles.
I Once clock cycle to send one word to the CM.
If consecutive words in a module, then how much time it
takes.
If consecutive words in consecutive modules, then how much
time it takes.
Dr.Praveen (Mahindra University) CO Memory 23 / 31
Calculate the time needed to transfer a block of data from
the main memory to the cache when a read miss occurs.
Properties of H/W:
I Cache with 8-word blocks.
I On a read miss, the block contains the desired word must be copied
from the MM into the CM.
I One clock cycle to send a address to MM.
I First word is accessed in 8 clock cycles and subsequent words are
accessed in 4 clock cycles.
I Once clock cycle to send one word to the CM.
If consecutive words in a module, then how much time it
takes.
If consecutive words in consecutive modules, then how much
time it takes.
Dr.Praveen (Mahindra University) CO Memory 23 / 31
Calculate the time needed to transfer a block of data from
the main memory to the cache when a read miss occurs.
Properties of H/W:
I Cache with 8-word blocks.
I On a read miss, the block contains the desired word must be copied
from the MM into the CM.
I One clock cycle to send a address to MM.
I First word is accessed in 8 clock cycles and subsequent words are
accessed in 4 clock cycles.
I Once clock cycle to send one word to the CM.
If consecutive words in a module, then how much time it
takes.
If consecutive words in consecutive modules, then how much
time it takes.
Dr.Praveen (Mahindra University) CO Memory 23 / 31
Calculate the time needed to transfer a block of data from
the main memory to the cache when a read miss occurs.
Properties of H/W:
I Cache with 8-word blocks.
I On a read miss, the block contains the desired word must be copied
from the MM into the CM.
I One clock cycle to send a address to MM.
I First word is accessed in 8 clock cycles and subsequent words are
accessed in 4 clock cycles.
I Once clock cycle to send one word to the CM.
If consecutive words in a module, then how much time it
takes.
If consecutive words in consecutive modules, then how much
time it takes.
Dr.Praveen (Mahindra University) CO Memory 23 / 31
Calculate the time needed to transfer a block of data from
the main memory to the cache when a read miss occurs.
Properties of H/W:
I Cache with 8-word blocks.
I On a read miss, the block contains the desired word must be copied
from the MM into the CM.
I One clock cycle to send a address to MM.
I First word is accessed in 8 clock cycles and subsequent words are
accessed in 4 clock cycles.
I Once clock cycle to send one word to the CM.
If consecutive words in a module, then how much time it
takes.
If consecutive words in consecutive modules, then how much
time it takes.
Dr.Praveen (Mahindra University) CO Memory 23 / 31
Calculate the time needed to transfer a block of data from
the main memory to the cache when a read miss occurs.
Properties of H/W:
I Cache with 8-word blocks.
I On a read miss, the block contains the desired word must be copied
from the MM into the CM.
I One clock cycle to send a address to MM.
I First word is accessed in 8 clock cycles and subsequent words are
accessed in 4 clock cycles.
I Once clock cycle to send one word to the CM.
If consecutive words in a module, then how much time it
takes.
If consecutive words in consecutive modules, then how much
time it takes.
Dr.Praveen (Mahindra University) CO Memory 23 / 31
Calculate the time needed to transfer a block of data from
the main memory to the cache when a read miss occurs.
Properties of H/W:
I Cache with 8-word blocks.
I On a read miss, the block contains the desired word must be copied
from the MM into the CM.
I One clock cycle to send a address to MM.
I First word is accessed in 8 clock cycles and subsequent words are
accessed in 4 clock cycles.
I Once clock cycle to send one word to the CM.
If consecutive words in a module, then how much time it
takes.
If consecutive words in consecutive modules, then how much
time it takes.
Dr.Praveen (Mahindra University) CO Memory 23 / 31
Interleaving
Interleaving is a technique for improving the speed of access.
Our objective - transfer data blocks from MM to CM.
Dr.Praveen (Mahindra University) CO Memory 24 / 31
Addressing Processor References
Dr.Praveen (Mahindra University) CO Memory 25 / 31
Virtual Memory
A technique for moving data between MM and Secondary storage
device.
Dr.Praveen (Mahindra University) CO Memory 26 / 31
Mapping from a VM-Page to a PM-Page or MM-Page
(PM-Page is also called a Page Frame)
Dr.Praveen (Mahindra University) CO Memory 27 / 31
Try to answer the following
I If size of a page is 4KB, how many bits are required to identify a byte
in the page.
I If size of a program is 2GB, how many pages are required to store the
program.
I If the size of MM is 512MB, how many pages it can accommodate.
I If the size of virtual address space is 4GB, how the VA is converted to
a physical address.
Dr.Praveen (Mahindra University) CO Memory 28 / 31
Mapping Table
index Valid Bit Physical Page Number
0 1 3
1 1 0
2 1 1
3 1 2
Table 1: Page Table.
Assume that base address of page table is available in PTBR (Page
Table Base Register).
Dr.Praveen (Mahindra University) CO Memory 29 / 31
Translation of VA to PA
Dr.Praveen (Mahindra University) CO Memory 30 / 31
Translation Lookaside Buffer(TLB)
Small Cache for a Page Table (available in MMU).
It has information about most recently accessed pages.
Dr.Praveen (Mahindra University) CO Memory 31 / 31