Lecture3
Lecture3
Lecture3
Unix V6 Filesystem
This document is copyright (C) Stanford Computer Science and Nick Troccoli, licensed under
Creative Commons Attribution 2.5 License. All rights reserved.
Based on slides and notes created by John Ousterhout, Jerry Cain, Chris Gregg, and others.
NOTICE RE UPLOADING TO WEBSITES: This content is protected and may not be shared, 1
uploaded, or distributed. (without expressed written permission)
Announcements
• Lecture credit starts with today’s lecture – you can submit PollEV polls in person during live
lecture, or complete the corresponding Canvas quiz by Monday at 1PM to get credit.
• Assign0 due Monday at 11:59PM, no late submissions accepted (except for OAE/Head TA
extensions)
• Assignments preview: assign1 and assign3 are longer than 0 and 2, 4-6 similar to assign1 & 3
2
CS111 Topic 1: Filesystems
Key Question: How can we design filesystems to manage files on disk, and what are
the tradeoffs inherent in designing them? How can we interact with the filesystem in
our programs?
Filesystems Filesystem
Case study: Unix
introduction and System calls and Crash recovery
V6 Filesystem
design file descriptors
3
Learning Goals
• Explore the design of the Unix V6 filesystem
• Understand how we can use inodes to store and access file data
• Learn about how inodes can accommodate small and large files
4
Plan For Today
• Recap: filesystems so far
• The Unix V6 Filesystem and Inodes
• Practice: reading file data
• Large files and Singly-Indirect Addressing
• Practice: singly-indirect addressing
• Large files and Doubly-Indirect Addressing
• Assignment 1
5
Plan For Today
• Recap: filesystems so far
• The Unix V6 Filesystem and Inodes
• Practice: reading file data
• Large files and Singly-Indirect Addressing
• Practice: singly-indirect addressing
• Large files and Doubly-Indirect Addressing
• Assignment 1
6
Recap: Filesystems So Far
A filesystem is the portion of the OS that manages the disk (persistent storage).
The disk only knows how to read a sector and write a sector.
• Blocks are the storage unit used by the filesystem, can be 1 or more sectors.
Designs we’ve discussed so far:
• Contiguous allocation allocates a file in one contiguous space
• Linked files allocates files by splitting them into blocks and having each block
store the location of the next block.
• Windows FAT is like linked files but stores the links in a “file allocation table” in
memory for faster access.
• Multi-level indexes instead store all block numbers for a file together so we can
quickly jump to any point in the file (but how?). Example: Unix v6 Filesystem
• Many other designs possible – many use a tree-like structure 7
Filesystem Designs
• Internal Fragmentation: space allocated for a file is larger than what is
needed. A file may not take up all the space in the blocks it’s using. E.g. block
= 512 bytes, but file is only 300 bytes. (you could share blocks between
multiple files, but this gets complex)
• External Fragmentation (issue with contiguous allocation): no single space is
large enough to satisfy an allocation request, even though enough aggregate
free disk space is available
• Wait, how do we look up / find files? (we’ll talk more about this!)
8
Plan For Today
• Recap: filesystems so far
• The Unix V6 Filesystem and Inodes
• Practice: reading file data
• Large files and Singly-Indirect Addressing
• Practice: singly-indirect addressing
• Large files and Doubly-Indirect Addressing
• Assignment 1
9
Unix V6 Inodes
This won’t work – inodes (like file data) are stored on disk, not in memory. To
access them, we must read them into memory first, sector by sector.
13
Reading Second 16 Inodes From Disk
Let's imagine that the hard disk creators provide software to let us interface
with the disk. These can operate on 1 sector, regardless of what type of data it
holds.
15
Reading Second 16 Inodes From Disk
We can pass in an array of any type to store the read-in data, but it’s easiest to
use an array of the same type as the data being read in from disk.
16
Plan For Today
• Recap: filesystems so far
• The Unix V6 Filesystem and Inodes
• Practice: reading file data
• Large files and Singly-Indirect Addressing
• Practice: singly-indirect addressing
• Large files and Doubly-Indirect Addressing
17
File Data
(For now) i_addr stores the numbers of blocks that contain payload data.
index 0 1 2 3 4 5 6 7
Inode:
size = 600 bytes
i_addr = [122, 56, …. ]
19
Practice #1: File Data
Let's say we have an inode that looks like the following (remember 1 block = 1
sector = 512 bytes):
Inode:
size = 600 bytes
i_addr = [122, 56, …. ]
• How many bytes of block 122 store file payload data? 512 (bytes 0-511)
• How many bytes of block 56 store file payload data? 88 (bytes 512-599)
Key Idea: we must read a full sector with readSector, but can ignore unused data20
Practice #2: File Data
Let's say we have an inode that looks like the following (remember 1 block = 1
sector = 512 bytes):
Inode:
size = 2000 bytes
i_addr = [56, 122, 45, 22, … ]
Which block number stores the index-1500th byte (0-indexed) of the file?
Inode:
size = 2000 bytes
i_addr = [56, 122, 45, 22, … ]
Which block number stores the index-1500th byte (0-indexed) of the file?
Bytes 0-511 reside within block 56, bytes 512-1023 within block 122, bytes
1024-1535 within block 45, and bytes 1536-1999 at the front of block 22.
23
Plan For Today
• Recap: filesystems so far
• The Unix V6 Filesystem and Inodes
• Practice: reading file data
• Large files and Singly-Indirect Addressing
• Practice: singly-indirect addressing
• Large files and Doubly-Indirect Addressing
• Assignment 1
24
File Size
Problem: with 8 block numbers per inode, the largest a file can be is 512 * 8 =
4096 bytes (~4KB). That definitely isn't realistic!
Solution: Unix V6 has two inode “modes”: small and large, that dictate how it
uses i_addr.
Key idea: the inode only fits 8. So let's store each group of 256 (512 / 2-byte-
numbers) block numbers in a block, and then store that block's number in the
inode! This approach is called singly-indirect addressing.
26
Singly-Indirect Addressing
If the inode is “large mode”, i_addr stores 7 numbers of blocks that contain
block numbers, and those block numbers are of blocks that contain payload
data. 8th block number? we'll get to that :)
0 1 2 3 4 5 6 7
28
Indirect Addressing
Let's assume for now that an inode for a large file uses all 8 block numbers for
singly-indirect addressing. What is the largest file size this supports? Each block
number is 2 bytes big.
29
Indirect Addressing: Design Decisions
We could use singly-indirect addressing in other ways, too – all design decisions!
• We could make just some of the block numbers in an inode use singly-indirect
addressing, and others still be direct addressing
• We could have just one “mode” for files, instead of 2
One argument against indirect addressing: it takes more steps to get to the data
and uses more blocks. Block 422 Block 451
Which singly-indirect block stores the block number holding the index-
150,000th byte of the file?
Bytes 0-131,071 reside within blocks whose block numbers are in block 56. Bytes
131,072 (256*512) - 199,999 reside within blocks whose block numbers are in
block 122. 31
Even Larger Files
Problem: even with singly-indirect addressing, the largest a file can be is 8 * 256
* 512 = 1,048,576 bytes (~1MB). That still isn't realistic!
Solution: let’s have the 8th entry in i_addr use doubly-indirect addressing; store
a block number for a block that contains singly-indirect block numbers.
32
Plan For Today
• Recap: filesystems so far
• The Unix V6 Filesystem and Inodes
• Practice: reading file data
• Large files and Singly-Indirect Addressing
• Practice: singly-indirect addressing
• Large files and Doubly-Indirect Addressing
• Assignment 1
33
Large File Scheme
If the file is large, the first 7 entries in i_addr are singly-indirect block numbers
(block numbers of blocks that contain direct block numbers). The 8 th entry (if
needed) is a doubly-indirect block number (the number of a block that contains
singly-indirect block numbers).
index 0 1 2 3 4 5 6 7
Block 444 Block 555 Block 1352 Block 126 Block 897
126, 98, 70, 127, 1352, 567, … 897, 4356, 6791,
1252, … …
File Part File Part
0 … 1,792 …
34
Large File Scheme
Another way to think about it: a file can be represented using at most 7 + 256 =
263 singly-indirect blocks. The numbers of the first seven are stored in the
inode. The numbers of the remaining 256 are stored in a block whose block
number is stored in the inode.
index 0 1 2 3 4 5 6 7
Block 444 Block 555 Block 1352 Block 126 Block 897
126, 98, 70, 127, 1352, 567, … 897, 4356, 6791,
1252, … …
File Part File Part
0 … 1,792 …
35
Large File Scheme
An inode for a large file stores 7 singly-indirect block numbers and 1 doubly-
indirect block number. What is the largest file size this supports? Each block
number is 2 bytes big.
36
Large File Scheme
An inode for a large file stores 7 singly-indirect block numbers and 1 doubly-
indirect block number. What is the largest file size this supports? Each block
number is 2 bytes big.
OR:
(7 * 256 * 512) + (256 * 256 * 512) = ~34MB
(singly indirect) + (doubly indirect)
Better! still not sufficient for today's standards, but perhaps in 1975. Moreover,
since block numbers are 2 bytes, we can number at most 2 16 = 65,536 blocks,
meaning the entire filesystem can be at most 65,536 * 512 ~ 32MB.
37
“Large Mode” Inodes
The Unix V6 filesystem uses indirect addressing (blocks that store payload block
numbers) just for “large mode” files.
• check flag in inode to know whether it is a “small mode” file (direct
addressing) or “large mode” one (indirect addressing)
• If small: all 8 block numbers are direct block numbers (block numbers of
blocks that store file data)
• If large: first 7 block numbers are singly-indirect block numbers (block
numbers of blocks that store direct block numbers), 8th block number (if
needed) is doubly-indirect (it refers to a block that stores singly-indirect
block numbers)
• Files only use the block numbers they need (depending on their size)
• Note: doubly-indirect is useful (and there are many other possible designs!),
but it means even more steps to access data. 38
Plan For Today
• Recap: filesystems so far
• The Unix V6 Filesystem and Inodes
• Practice: reading file data
• Large files and Singly-Indirect Addressing
• Practice: singly-indirect addressing
• Large files and Doubly-Indirect Addressing
• Assignment 1
39
Assignment 1
Implement core functions to read from a Unix v6 filesystem disk!
• inode_iget -> fetch a specific inode
• inode_indexlookup -> fetch a specific payload block number
• file_getblock -> fetch a specified payload block
• directory_findname -> fetch directory entry with the given name
• pathname_lookup -> fetch inumber for the file with the given path
40
Assignment 1
Implement core functions to read from a Unix v6 filesystem disk!
• inode_iget -> fetch a specific inode
• inode_indexlookup -> fetch a specific payload block number
can start
immediately
• file_getblock -> fetch a specified payload block
• directory_findname -> fetch directory entry with the given name will discuss
• pathname_lookup -> fetch inumber for the file with the given path next time!
41
Plan For Today
• Recap: filesystems so far Lecture 3 takeaway: The
• The Unix V6 Filesystem and Inodes Unix v6 filesystem
• Practice: reading file data represents small files by
• Large files and Singly-Indirect Addressing storing direct block
• Practice: singly-indirect addressing numbers, and larger files by
• Large files and Doubly-Indirect Addressing using indirect addressing -
• Assignment 1 storing 7 singly-indirect and
1 doubly-indirect block
number.
Next time: directories, file lookup and links
42