
Lecture 5

Memory hierarchy
Dr. Jael Wekesa
jael.wekesa@jkuat.ac.ke
Overview
 Cache memory and associative memory
Objective
 To study the various types of memory according to size, cost
and speed
 Describe cache memory and calculate how much it speeds up
processing times using various cache structures;
 Calculate miss rate for various cache configurations; and
 Explain the importance of memory hierarchy in computer
design, and explain how memory design impacts overall
hardware performance.
What is memory hierarchy?
 The computer memory is divided into a hierarchy based on speed and use.
 Data can move from one level to another based on the processor's requirements.
 The five levels in the hierarchy are registers, cache, main memory, magnetic disks, and
magnetic tapes.
 The memories are classified as either volatile or non-volatile
i) Volatile memory: a type of storage whose contents are erased when the system's power is
turned off or interrupted e.g. random access memory (RAM), cache
ii) Non-volatile memory: stores data permanently e.g. read-only memory (ROM), flash memory,
magnetic media (hard disks, floppy disks, magnetic tape) and optical discs.

NOTE:
– Fact: large memories are slow; fast memories are small
– Hierarchy and parallelism are used to achieve a large, fast, cheap memory
Why memory hierarchy is important?
 Memory devices of a computer system are organized using
memory hierarchy.
 The different levels of memory have different performance
rates. The memory hierarchy was developed based on the
behavior of the program known as locality of references.
 Organizing the memory as a hierarchy is an enhancement that minimizes the
average access time.
Memory hierarchy
Levels of memory hierarchy
 Capacity in terms of storage.
 Cost per bit of storage.
 Frequency of access of the memory.
 Access time.
Memory Hierarchy: Principle
 At any given time, data is copied between only two adjacent levels:
– Upper level: the one closer to the processor
o Smaller, faster, uses more expensive technology
– Lower level: the one away from the processor
o Bigger, slower, uses less expensive technology
Characteristics of memory hierarchy
 Capacity: the capacity in terms of storage increases from top to
bottom.
 Cost per bit: the cost per bit of storage decreases from top to
bottom i.e. internal memory is costlier than external memory.
 Performance: the frequency of access of the memory by the CPU
decreases from top to bottom.
 Access Time: the access time by the CPU increases from top to
bottom.
How is the hierarchy managed?
 Registers ↔ memory
– by the compiler
 Cache ↔ main memory
– by the hardware
 Main memory ↔ disks
– by the hardware and operating system (virtual memory)
– by the programmer (files)
Performance of various levels of storage
Level                1                       2                        3                  4
Name                 Register                Cache                    Main memory        Disk storage
Typical size         < 1 KB                  > 16 MB                  > 16 GB            > 100 GB
Implementation       Custom memory with      On-chip or off-chip      CMOS DRAM          Magnetic disk
technology           multiple ports, CMOS    CMOS SRAM
Access time (ns)     0.25 – 0.5              0.5 – 25                 80 – 250           5,000
Bandwidth (MB/sec)   20,000 – 100,000        5,000 – 10,000           1,000 – 5,000      20 – 150
Managed by           Compiler                Hardware                 Operating system   Operating system
Backed by            Cache                   Main memory              Disk               CD or tape
Memory hierarchy technology
 Random access:
– Access time is the same for all locations
– DRAM/SRAM are “volatile”: contents are lost if power is lost; disks are “non-volatile”: contents survive power outages
– DRAM: Dynamic Random Access Memory
o High density, low power, cheap, slow
o Dynamic: needs to be refreshed regularly
o Addresses in 2 halves (memory as a 2D matrix):
– RAS/CAS (Row/Column Access Strobe)
o Used by the main memory
– SRAM: Static Random Access Memory
o Low density, high power, expensive, fast
o Static: contents last as long as power is applied
o Address not divided
o Used by the caches
Memory hierarchy design
 The memory hierarchy is divided into 2 main types:
1. External Memory or Secondary Memory
– Comprising Magnetic Disk, Optical Disk and Magnetic Tape i.e. peripheral storage devices which are
accessible by the processor via an I/O module.

2. Internal Memory or Primary Memory


– Comprising Main Memory, Cache Memory & CPU registers. These are directly accessible by the
processor.
Registers
 The data transfer between main memory and the CPU takes place through two CPU
registers. A register may hold a piece of data, such as a storage address or a computer
instruction.
1. MAR : Memory Address Register
2. MDR : Memory Data Register
Function of MAR & MDR
 If the MAR is k bits long, then the total number of addressable memory locations will be 2^k.
 If the MDR is n bits long, then n bits of data are transferred in one memory cycle.
 The transfer of data takes place through the memory bus, which consists of an address bus and a
data bus.
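A minimal Python sketch of these two relationships; the 16-bit MAR and 32-bit MDR widths below are assumed example values, not a particular machine.

# Sketch of the MAR/MDR relationships above; the 16-bit MAR and 32-bit MDR
# widths are assumed example values.

def addressable_locations(mar_bits: int) -> int:
    # A k-bit MAR can address 2**k distinct memory locations.
    return 2 ** mar_bits

def bits_per_cycle(mdr_bits: int) -> int:
    # An n-bit MDR transfers n bits in one memory cycle.
    return mdr_bits

if __name__ == "__main__":
    k, n = 16, 32
    print(f"{k}-bit MAR -> {addressable_locations(k):,} addressable locations")   # 65,536
    print(f"{n}-bit MDR -> {bits_per_cycle(n)} bits transferred per memory cycle")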
Registers
There are many types of registers, such as the accumulator register, data register and program counter.
Cache
 Cache memory is a very high speed semiconductor memory which can speed
up the CPU.
 It acts as a buffer between the CPU and the main memory.
 It is used to hold those parts of data and program which are most frequently
used by the CPU.
 The parts of data and programs are transferred from the disk to cache memory
by the operating system, from where the CPU can access them.
 Two types of cache are:
o Memory cache - helps speed the processes of the computer because it
stores frequently used instructions and data e.g. L1 and L2 cache
o Disk cache - a portion of memory (on the disk controller or reserved in RAM) that holds
recently read disk data
Levels of cache memory
 L1 Cache – It is also called primary cache and is extremely fast but has a relatively small
capacity, ranging from 8 KB to 128 KB. It is usually embedded in the processor chip as CPU
cache.
 L2 Cache – It is also called secondary cache and has a larger capacity than L1, ranging from 64
KB to 16 MB. It may be embedded on the CPU or placed on a separate chip.
 L3 Cache – It is a specialized memory developed to improve the performance of L1 and L2. It
is slower than L1 and L2, though it is usually double the speed of DRAM.
Cont…
 To access a particular piece of data, the CPU first sends a
request to its nearest memory, usually cache.
 If the data is not in cache, then main memory is queried. If
the data is not in main memory, then the request goes to
disk.
 Once the data is located, it and a number of its nearby data
elements are fetched into cache memory.

The Memory Hierarchy
This leads us to some definitions.
◦ A hit is when data is found at a given memory level.
◦ A miss is when it is not found.
◦ The hit rate is the percentage of time data is found at a given memory level.
◦ The miss rate is the percentage of time it is not.
◦ Miss rate = 1 - hit rate.
◦ The hit time is the time required to access data at a given memory level.
◦ The miss penalty is the time required to process a miss, including the time
that it takes to replace a block of memory plus the time it takes to deliver the
data to the processor.
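These definitions combine into the standard average memory access time (AMAT) formula, AMAT = hit time + miss rate × miss penalty. A small sketch with assumed, illustrative timings:

# Average memory access time from the definitions above.
# The timings (1 ns hit, 5% miss rate, 100 ns penalty) are assumed examples.

def amat(hit_time_ns: float, miss_rate: float, miss_penalty_ns: float) -> float:
    # Every access pays the hit time; misses additionally pay the miss penalty.
    return hit_time_ns + miss_rate * miss_penalty_ns

if __name__ == "__main__":
    print(f"AMAT = {amat(1.0, 0.05, 100.0):.1f} ns")  # 6.0 ns
    # Without a cache, every access would cost roughly the full 100 ns memory latency.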

Cache Performance
 If the processor finds that the memory location is in the cache, a cache hit has
occurred and data is read from cache
 If the processor does not find the memory location in the cache, a cache miss
has occurred.
 For a cache miss, the cache allocates a new entry and copies in data from main
memory, then the request is fulfilled from the contents of the cache.
 The performance of cache memory is frequently measured in terms of a
quantity called Hit ratio.
Hit ratio = hit / (hit + miss) = no. of hits/total accesses
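A short sketch of the hit ratio calculation; the hit and miss counts are assumed example values.

# Hit ratio = hits / (hits + misses), as defined above; counts are illustrative.

def hit_ratio(hits: int, misses: int) -> float:
    return hits / (hits + misses)

if __name__ == "__main__":
    hits, misses = 950, 50
    print(f"Hit ratio = {hit_ratio(hits, misses):.2%}")      # 95.00%
    print(f"Miss rate = {1 - hit_ratio(hits, misses):.2%}")  # 5.00%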
Advantages and disadvantages of cache
Advantages
• Cache memory is faster than main memory.
• It has a shorter access time than main memory.
• It stores programs that can be executed within a short period of time.
• It stores data for temporary use.

Disadvantages
• Cache memory has limited capacity.
• It is very expensive.
Buffer
 A buffer is a segment of memory or storage in which items are placed while
waiting to be transferred from an input device or to an output device.
Example:
 The operating system commonly uses buffers with printed documents. In this
process, called spooling, documents to be printed are sent to a buffer instead of
being sent immediately to the printer.
 If a printer does not have its own internal memory or if its memory is full, the
operating system’s buffer holds the information waiting to print while the printer
prints from the buffer at its own rate of speed.
 Spooling documents to a buffer allows the processor to continue interpreting and
executing instructions while the printer prints.
Primary Memory (Main Memory)
 The main memory holds only those data and instructions which the computer is
currently working on.
 It has a limited capacity and data is lost when power is switched off.
 It is generally made up of semiconductor devices.
 These memories are not as fast as registers.
 The data and instructions required to be processed reside in the main memory.
 It is divided into two subcategories RAM and ROM.
Random Access Memory(RAM)
 RAM is referred to as random access memory because it is possible to
access any memory location directly, at random.
 Depending on the technology used to construct a RAM, there are two types of
RAM -
1. SRAM: Static Random Access Memory.
2. DRAM: Dynamic Random Access Memory.
RAM
Dynamic Ram (DRAM):
 A DRAM is made with cells that store data as charge on capacitors. The
presence or absence of charge in a capacitor is interpreted as binary 1 or 0.
 Because capacitors have a natural tendency to discharge due to leakage current,
dynamic RAM requires periodic charge refreshing to maintain data storage. The
term dynamic refers to this tendency of the stored charge to leak away, even with
power continuously applied.
Static RAM (SRAM):
In an SRAM, binary values are stored using traditional flip-flops constructed from
transistors. A static RAM will hold its data as long as power is
supplied to it.
SRAM Versus DRAM
            SRAM                                        DRAM
Cost        Expensive                                   Cheap: a dynamic memory cell is simpler and
                                                        smaller than a static memory cell
Speed       SRAM cells are generally faster; therefore, DRAM cells are slower, hence used in slower
            SRAM is used to construct faster memory     memory modules (like main memory)
            (like cache memory)
Design      A single bit of memory requires 6           Needs just one transistor for a single bit
            transistors                                 of memory
Size        Smaller in size                             Available in larger storage capacities
Density     Less dense                                  Highly dense
Refresh     Does not need periodic refreshing to        Needs periodic refreshing to maintain data
            maintain data
Read-only memory (ROM)
 ROM is a form of data storage in computers and other electronic devices that
cannot be easily altered or reprogrammed.
 RAM is referred to as volatile memory and data is lost when the power is turned
off whereas ROM is non-volatile and the contents are retained even after the
power is switched off.
Types of ROM
1. Programmable read-only memory (PROM)/one-time programmable ROM (OTP) – can be written
to or programmed via a special device called a PROM programmer. PROM uses high voltages to
permanently destroy or create internal links within the chip. Consequently, a PROM can only be
programmed once.
2. Erasable programmable read-only memory (EPROM) - can be erased by exposure to strong
ultraviolet light (typically for 10 minutes or longer), then rewritten with a process that again needs
higher than usual voltage applied.
3. Electrically erasable programmable read-only memory (EEPROM) - is based on a similar
semiconductor structure to EPROM, but allows its entire contents (or selected banks) to be electrically
erased, then rewritten electrically, thus, they need not be removed from the computer.
4. Electrically alterable read-only memory (EAROM) - is a type of EEPROM that can be modified one
bit at a time.
5. Flash memory - is a modern type of EEPROM. It can be erased and rewritten faster than ordinary
EEPROM, and newer designs feature very high endurance
Characteristics of main memory
 These are semiconductor memories.
 Usually volatile memory.
 Data is lost in case power is switched off.
 It is the working memory of the computer.
 Faster than secondary memories.
 A computer cannot run without the primary memory.
Secondary memory
 This type of memory is also known as external memory or non-volatile.
 It is slower than the main memory.
 These are used for storing data/information permanently.
 The CPU does not access these memories directly; instead, they are accessed via input-output
routines. The contents of secondary memories are first transferred to the main memory, and then
the CPU can access them.
 Example: disk, CD-ROM, DVD, etc.
Magnetic Disk
 Data are represented physically as magnetization on a circular platter
 Platters are stacked atop each other on a single rotating spindle
Removable media
 Different applications use different data, and it may not be possible to keep all of the information
on a magnetic disk. Whichever data is not currently in use can be kept on removable media.
 Examples: magnetic tape, CD (an optical device).
Characteristics of secondary memory
 These are magnetic and optical memories.
 It is known as the backup memory.
 It is a non-volatile memory.
 Data is permanently stored even if power is switched off.
 It is used for storage of data in a computer.
 Computer may run without the secondary memory.
 Slower than primary memories.
 Has more capacity
 They are much cheaper per bit than main memory

How disks are connected to system
 Hard drives can use different interfaces to connect to the
motherboard e.g.
 IDE (PATA)
 SATA – Serial ATA
 SCSI – Small Computer Systems Interface
 Fibre Channel
 SAS – Serial Attached SCSI

Disk Technology Trends
 Packing density is increasing
o Linear density (bits/inch) is increasing exponentially
o Track density (tracks/inch) is increasing exponentially
o Areal density (the product of track and linear density) increases exponentially (doubles
every 18 months?)
 Increasing transfer speed
o Higher packing density
o New interconnect technologies
o Better buffering
o Some increase in rotation speed
 Decreasing form factors
o Less power/GB
o New applications (iPods, cameras?)
o Tighter packaging

Current technology developments - SSD
 Solid State Disks (SSDs) have gained maturity, and are replacing
magnetic disks for some application areas
o SSDs use semiconductor memory for the physical
representation of data
o SSDs are “packaged” to function as an HDD replacement
o Failure modes are different

Current technology trends – Helium
 Helium is replacing air as internal atmosphere in HDDs
 Helium is significantly lighter than air and therefore causes less
drag and less turbulence for the moving mechanics of HDDs.
 The reduction in drag and turbulence allows tighter stacking of
platters, tighter packing on platters, and faster movement by
motors that consume less power
 Helium-filled HDDs are yet another example of refinements that
extend the lifetime of established technologies

Shopping Criteria for HDDs
 Use - the intended use and the device with which you are going to use it
 Storage capacity - For the average user, a 1 TB disk is more than enough. If you
are going to use your computer for video games, a combination of a 250 GB
SSD and a 2+ TB HDD may suit you better.
 Read/write speed - the HDD's performance
 Rotation speed - read/write speed depends on the disk's rotation speed, which is
measured in revolutions per minute (RPM)
 Cache – a buffer where data is temporarily stored as it moves to and from the
disk; the larger it is, the faster the disk can serve frequently used data
 Connection bus - the interface used to connect the drive to the system (e.g. SATA, SAS)
 Brand – different brands offer different levels of quality e.g. Toshiba, HGST, TrekStor,
Seagate, Western Digital, and ExcelStor

Disk Access Time
 How do we retrieve data from disk?
o Position the head over the cylinder (track) on which the block (consisting of one or more sectors) is
located
o Read or write the data block as the sectors move under the head when the platters rotate
 The time between the moment issuing a disk request and the time the block is resident in memory is
called disk latency or disk access time
Disk access time = Seek time + Rotational delay + Transfer time + Other delays
> Seek time - the time to position the head over the target cylinder; the heads require a minimum
amount of time to start and stop moving
> Rotational delay - the time for the disk platters to rotate so that the first of the required sectors is
under the disk head
> Transfer time – the time for the data to be read by the disk head
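A sketch of the access-time formula above; the drive parameters (4 ms average seek, 7,200 RPM, 100 MB/s transfer rate) are assumed example values, not a specific drive.

# Disk access time = seek time + rotational delay + transfer time (other delays ignored).
# Drive parameters below are assumed, illustrative values.

def rotational_delay_ms(rpm: int) -> float:
    # On average the platter must rotate half a revolution.
    return 0.5 * 60_000 / rpm

def transfer_time_ms(block_kb: float, rate_mb_s: float) -> float:
    return block_kb / 1024 / rate_mb_s * 1000

def access_time_ms(seek_ms: float, rpm: int, block_kb: float, rate_mb_s: float) -> float:
    return seek_ms + rotational_delay_ms(rpm) + transfer_time_ms(block_kb, rate_mb_s)

if __name__ == "__main__":
    t = access_time_ms(seek_ms=4.0, rpm=7200, block_kb=64, rate_mb_s=100)
    print(f"64 KB block: {t:.2f} ms")  # ≈ 4 + 4.17 + 0.63 = 8.79 ms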

Disk Throughput
 How much data can we retrieve per second?
Throughput = data size / transfer time
Example:
 Assume each operation incurs an average seek, an average rotational delay and the
transfer time (no gaps, etc.):
1. Cheetah X15 (max 77.15 MB/s)
- 4 KB blocks: 0.71 MB/s
- 64 KB blocks: 11.42 MB/s
2. Barracuda 180 (max 47.58 MB/s)
- 4 KB blocks: 0.35 MB/s
- 64 KB blocks: 5.53 MB/s
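A sketch of the throughput calculation under the same assumptions (average seek + average rotational delay + transfer time per operation). The 3.6 ms seek and 15,000 RPM loosely follow the Cheetah X15 example but are assumed, order-of-magnitude values.

# Throughput = data size / total time per operation.
# Drive parameters are assumed values for a Cheetah X15-class disk.

def throughput_mb_s(block_kb: float, seek_ms: float, rpm: int, max_rate_mb_s: float) -> float:
    rot_ms = 0.5 * 60_000 / rpm                        # average rotational delay
    xfer_ms = block_kb / 1024 / max_rate_mb_s * 1000   # time to transfer the block itself
    total_s = (seek_ms + rot_ms + xfer_ms) / 1000
    return (block_kb / 1024) / total_s

if __name__ == "__main__":
    for kb in (4, 64):
        print(f"{kb:>3} KB blocks: {throughput_mb_s(kb, 3.6, 15_000, 77.15):.2f} MB/s")
    # ~0.69 MB/s and ~9.8 MB/s: small blocks waste most of each operation on
    # seek + rotation, which is why larger blocks give much higher throughput.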
Efficient secondary storage usage
 There may be huge performance improvements if we reduce the number of disk
accesses.
 Several ways to optimize:
o Block size
o Disk scheduling
o Multiple disks
o Prefetching
o File management / data placement
o Memory caching / replacement algorithms

Disk Scheduling
 Let operating system or disk controller choose which request to serve next
depending on the head’s current position and requested block’s position on
disk (disk scheduling)
 Note: disk scheduling ≠ CPU scheduling
o A mechanical device – hard to determine (accurate) access times
o Disk accesses cannot be preempted – runs until it finishes
o Disk I/O often the main performance bottleneck
 General goals
o Short response time
o High overall throughput
o Fairness (all blocks should have an equal probability of being accessed within the same time frame)
 Tradeoff: seek and rotational delay vs. maximum response time
Disk Scheduling
 Several traditional algorithms
 First-Come-First-Serve (FCFS) - serves the first arriving request first
 Shortest Seek Time First (SSTF) - serves closest request first
 SCAN (and variations) - moves head edge to edge and serves requests on the
way:
o Bi-directional
o Compromise between response time and seek time optimizations
 LOOK (and variations) - a variation of SCAN:
o Same schedule as SCAN
o Does not run to the edges
o Stops and returns at outer- and innermost request
o Increased efficiency
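A small simulation sketch comparing total head movement under FCFS, SSTF and a simplified SCAN (sweeping toward the high end first); the request queue, start cylinder and disk size are assumed example values.

# Compare total head movement for FCFS, SSTF and SCAN on one hypothetical queue.

def fcfs(start, requests):
    dist, pos = 0, start
    for r in requests:               # serve strictly in arrival order
        dist += abs(r - pos)
        pos = r
    return dist

def sstf(start, requests):
    pending, dist, pos = list(requests), 0, start
    while pending:
        nxt = min(pending, key=lambda r: abs(r - pos))  # closest request first
        dist += abs(nxt - pos)
        pos = nxt
        pending.remove(nxt)
    return dist

def scan(start, requests, max_cyl=199):
    # Simplified elevator: sweep up to the edge, then back down to the lowest request.
    up = sorted(r for r in requests if r >= start)
    down = sorted((r for r in requests if r < start), reverse=True)
    dist = 0
    if up:
        dist += (max_cyl - start) if down else (up[-1] - start)
        if down:
            dist += max_cyl - down[-1]
    elif down:
        dist += start - down[-1]
    return dist

if __name__ == "__main__":
    queue, head = [98, 183, 37, 122, 14, 124, 65, 67], 53
    print("FCFS:", fcfs(head, queue))   # 640 cylinders
    print("SSTF:", sstf(head, queue))   # 236 cylinders
    print("SCAN:", scan(head, queue))   # 331 cylinders (up to 199, back down to 14)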
Prefetching and Buffering
 Prefetching is a technique for speeding up fetch operations by beginning a fetch
operation whose result is expected to be needed soon
How to do a prefetch:
1. Read-ahead:
o Read more than the requested block into memory
o Serve next read requests from buffer cache
2. Double (multiple) buffering:
o Read data into first buffer
o Process data in first buffer and at the same time read data into second buffer
o Process data in second buffer and at the same time read data into first buffer
etc.

Multiple Buffering
Example:
 A file with block sequence B1, B2, ...
 The program processes data sequentially, i.e., B1, B2, ...
1. Single buffer solution:
o read B1 → buffer
o process data in buffer
o read B2 → buffer
o process data in buffer
if P = time to process a block
R = time to read in 1 block
n = # blocks
single buffer time = n (P+R)

Multiple Buffering
2. Double buffer solution:
o read B1 → buffer1
o process data in buffer1, read B2 → buffer2
o process data in buffer2, read B3 → buffer1
o process data in buffer1, read B4 → buffer2

if P = time to process a block


R = time to read in 1 block
n = # blocks
double buffer time = R + nP
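A small sketch applying the two formulas above; the per-block timings are assumed example values, and the double-buffer formula assumes processing a block takes at least as long as reading one (P ≥ R).

# Single buffer: reading and processing alternate -> n * (P + R).
# Double buffer: reads overlap with processing -> R + n * P (assuming P >= R).
# The timings below are assumed example values.

def single_buffer_time(n: int, p: float, r: float) -> float:
    return n * (p + r)

def double_buffer_time(n: int, p: float, r: float) -> float:
    return r + n * p

if __name__ == "__main__":
    n, P, R = 1000, 2.0, 1.0     # blocks, ms to process a block, ms to read a block
    print(f"Single buffer: {single_buffer_time(n, P, R):,.0f} ms")  # 3,000 ms
    print(f"Double buffer: {double_buffer_time(n, P, R):,.0f} ms")  # 2,001 ms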

Memory Caching
Performance of a cache can be improved in the following
ways:
1. Reducing Miss Rate
2. Reducing Miss Penalty
3. Reducing Time to Hit

Cache mapping
 Cache mapping defines how a block from the main memory is mapped to
the cache memory in case of a cache miss OR
 Cache mapping is a technique by which the contents of main memory are
brought into the cache memory.

Cache Mapping Techniques
1. Direct Mapping: A particular block of main memory can map only to a
particular line of the cache
2. Fully Associative Mapping: A block of main memory can map to any line of
the cache that is freely available at that moment. This makes fully associative
mapping more flexible than direct mapping
3. K-way Set Associative Mapping: Cache lines are grouped into sets where
each set contains k number of lines.
• A particular block of main memory can map to only one particular set of
the cache.
• However, within that set, the memory block can map to any cache line that is
freely available
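A sketch of where a given main-memory block may be placed under each technique described above; the cache geometry (16 lines, 4-way sets) is an assumed example.

# Candidate placement of a main-memory block under the three mapping techniques.
# Cache geometry (16 lines, 4-way set associative) is an assumed example.

NUM_LINES = 16
K = 4
NUM_SETS = NUM_LINES // K

def direct_mapped_line(block: int) -> int:
    # Exactly one candidate line.
    return block % NUM_LINES

def set_associative_set(block: int) -> int:
    # One candidate set; any of the K lines within it may be used.
    return block % NUM_SETS

# Fully associative: any of the NUM_LINES lines may hold the block, so no index is computed.

if __name__ == "__main__":
    for block in (5, 21, 37):
        print(f"block {block:2d}: direct-mapped -> line {direct_mapped_line(block)}, "
              f"{K}-way -> set {set_associative_set(block)}")
    # Blocks 5, 21 and 37 all compete for line 5 under direct mapping,
    # but can coexist inside set 1 under 4-way set-associative mapping.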

Cache write mechanism
1. Write Through - data is updated in the cache and in main
memory simultaneously
2. Write Back - data is updated only in the cache and written
back to main memory at a later time
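A toy sketch contrasting the two write mechanisms; the dictionary-based cache and memory stand-ins are illustrative only, not a real cache controller.

# Write-through vs. write-back, modelled with plain dictionaries (illustration only).

cache: dict[int, int] = {}
memory: dict[int, int] = {}
dirty: set[int] = set()

def write_through(addr: int, value: int) -> None:
    cache[addr] = value
    memory[addr] = value            # memory is updated on every write

def write_back(addr: int, value: int) -> None:
    cache[addr] = value
    dirty.add(addr)                 # memory update is deferred until eviction

def evict(addr: int) -> None:
    if addr in dirty:               # only dirty blocks need to be written back
        memory[addr] = cache[addr]
        dirty.discard(addr)
    cache.pop(addr, None)

if __name__ == "__main__":
    write_through(0x10, 1)
    write_back(0x20, 2)
    print(memory)                   # {16: 1} -> the write-back block is not yet in memory
    evict(0x20)
    print(memory)                   # {16: 1, 32: 2}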

Modern systems
Example: CPU Performance Boost via Intel Turbo Boost Technology:
 The applications that are run on multicore processors can be single-threaded or multi-threaded.
 Intel introduced a new feature called Intel Turbo Boost to provide a performance boost for lightly
threaded applications and to optimize the processor power consumption
 Intel Turbo Boost features offer processing performance gains for all applications regardless of the
number of execution threads created.
 It automatically allows active processor cores to run faster than the base operating frequency when
certain conditions are met. This mode is activated when the OS requests the highest processor
performance state.
Number of active cores   Mode          Base frequency   Maximum turbo boost frequency
4                        Quad-Core     1.73 GHz         2.0 GHz
2                        Dual-Core     1.73 GHz         2.8 GHz
1                        Single-Core   1.73 GHz         3.06 GHz
