[go: up one dir, main page]

0% found this document useful (0 votes)
39 views78 pages

Week 11

This document discusses file systems and file operations. It covers key concepts such as file attributes, file structures like sequential access and direct access, directory structures like tree structures and access control lists. Common file operations like create, read, write and delete are also described. The document also discusses different types of file systems and provides examples of file system organization and structures.

Uploaded by

manyagarg787
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
39 views78 pages

Week 11

This document discusses file systems and file operations. It covers key concepts such as file attributes, file structures like sequential access and direct access, directory structures like tree structures and access control lists. Common file operations like create, read, write and delete are also described. The document also discusses different types of file systems and provides examples of file system organization and structures.

Uploaded by

manyagarg787
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 78

File and I/O System

The Content is prepared with the help of existing text


books mentioned below:

References:
1.Silberschatz, Abraham, Peter B. Galvin, and Greg Gagne. Operating
system concepts. Wiley Publishing, 2013.
2.Stallings, William. Operating Systems 5th Edition. Pearson Education
India, 2006.
3.Tannenbaum, Andrew S. "Modern Operating Systems, 2009."
File Concept
• Contiguous logical address space
• Types:
• Data
• numeric
• character
• binary
• Program
• Contents defined by file’s creator
• Many types
• Consider text file, source file, executable file
File Attributes
• Name – only information kept in human-readable form
• Identifier – unique tag (number) identifies file within file system
• Type – needed for systems that support different types
• Location – pointer to file location on device
• Size – current file size
• Protection – controls who can do reading, writing, executing
• Time, date, and user identification – data for protection, security,
and usage monitoring
• Information about files are kept in the directory structure, which is
maintained on the disk
• Many variations, including extended file attributes such as file
checksum
• Information kept in the directory structure
File info Window on Mac OS X
File Operations
• File is an abstract data type
• Create
• Write – at write pointer location
• Read – at read pointer location
• Reposition within file - seek
• Delete
• Truncate
• Open(Fi) – search the directory structure on disk for entry
Fi, and move the content of entry to memory
• Close (Fi) – move the content of entry Fi in memory to
directory structure on disk
Open Files
• Several pieces of data are needed to manage
open files:
• Open-file table: tracks open files
• File pointer: pointer to last read/write location, per
process that has the file open
• File-open count: counter of number of times a file is
open – to allow removal of data from open-file table
when last processes closes it
• Disk location of the file: cache of data access
information
• Access rights: per-process access mode information
Open File Locking
• Provided by some operating systems and file
systems
• Similar to reader-writer locks
• Shared lock similar to reader lock – several
processes can acquire concurrently
• Exclusive lock similar to writer lock
• Mediates access to a file
• Mandatory or advisory:
• Mandatory – access is denied depending on locks
held and requested
• Advisory – processes can find status of locks and
decide what to do
File Types – Name, Extension
File Structure
• None - sequence of words, bytes
• Simple record structure
• Lines
• Fixed length
• Variable length
• Complex Structures
• Formatted document
• Relocatable load file
• Can simulate last two with first method by inserting
appropriate control characters
• Who decides:
• Operating system
• Program
Sequential-access File
Access• Methods
Sequential Access
read next
write next
reset
no read after last write
(rewrite)
• Direct Access – file is fixed length logical records
read n
write n
position to n
read next
write next
rewrite n
n = relative block number

• Relative block numbers allow OS to decide where file should be placed


• See allocation problem in Ch 12
Simulation of Sequential Access on Direct-access File
Other Access Methods
• Can be built on top of base methods
• General involve creation of an index for the file
• Keep index in memory for fast determination of location of
data to be operated on (consider UPC code plus record of data
about that item)
• If too large, index (in memory) of the index (on disk)
• IBM indexed sequential-access method (ISAM)
• Small master index, points to disk blocks of secondary index
• File kept sorted on a defined key
• All done by the OS
• VMS operating system provides index and relative files as
another example (see next slide)
Example of Index and Relative Files
Directory Structure

• A collection of nodes containing information about all files

Directory

Files
F1 F2 F4
F3
Fn

Both the directory structure and the files reside on disk


Disk Structure
• Disk can be subdivided into partitions
• Disks or partitions can be RAID protected against failure
• Disk or partition can be used raw – without a file system, or
formatted with a file system
• Partitions also known as minidisks, slices
• Entity containing file system known as a volume
• Each volume containing file system also tracks that file
system’s info in device directory or volume table of
contents
• As well as general-purpose file systems there are many
special-purpose file systems, frequently all within the same
operating system or computer
A Typical File-system Organization
Types of File Systems

• We mostly talk of general-purpose file systems


• But systems frequently have may file systems, some
general- and some special- purpose
• Consider Solaris has
• tmpfs – memory-based volatile FS for fast, temporary I/O
• objfs – interface into kernel memory to get kernel symbols
for debugging
• ctfs – contract file system for managing daemons
• lofs – loopback file system allows one FS to be accessed in
place of another
• procfs – kernel interface to process structures
• ufs, zfs – general purpose file systems
Operations Performed on Directory

• Search for a file

• Create a file

• Delete a file

• List a directory

• Rename a file

• Traverse the file system


Directory Organization

The directory is organized logically to obtain

• Efficiency – locating a file quickly


• Naming – convenient to users
• Two users can have same name for different files
• The same file can have several different names
• Grouping – logical grouping of files by
properties, (e.g., all Java programs, all games,
…)
Single-Level Directory
• A single directory for all users

• Naming problem
• Grouping problem
Two-Level Directory
• Separate directory for each user

 Path name
 Can have the same file name for different user
 Efficient searching
 No grouping capability
Tree-Structured Directories
Tree-Structured Directories (Cont.)

• Efficient searching

• Grouping Capability

• Current directory (working directory)


• cd /spell/mail/prog
• type list
Tree-Structured Directories (Cont.)
• Absolute or relative path name
• Creating a new file is done in current directory
• Delete a file
rm <file-name>
• Creating a new subdirectory is done in current directory
mkdir <dir-name>
Example: if in current directory /mail
mkdir count

Deleting “mail”  deleting the entire subtree rooted


by “mail”
Acyclic-Graph Directories
• Have shared subdirectories and files
Acyclic-Graph Directories (Cont.)
• Two different names (aliasing)
• If dict deletes list  dangling pointer
Solutions:
• Backpointers, so we can delete all pointers
Variable size records a problem
• Backpointers using a daisy chain organization
• Entry-hold-count solution
• New directory entry type
• Link – another name (pointer) to an existing file
• Resolve the link – follow pointer to locate the file
General Graph Directory
General Graph Directory (Cont.)
• How do we guarantee no cycles?
• Allow only links to file not subdirectories
• Garbage collection
• Every time a new link is added use a cycle detection
algorithm to determine whether it is OK
Protection
• File owner/creator should be able to control:
• what can be done
• by whom
• Types of access
• Read
• Write
• Execute
• Append
• Delete
• List
Access Lists and Groups
• Mode of access: read, write, execute
• Three classes of users on Unix / Linux
RWX
a) owner access 7  111
RWX
b) group access 6  110
RWX
c) public access 1  001

• Ask manager to create a group (unique name), say G,


and add some users to the group.
• For a particular file (say game) or subdirectory, define
an appropriate access.

Attach a group to a file


chgrp G game
Windows 7 Access-Control List Management
A Sample UNIX Directory Listing
File-System Structure
• File structure
• Logical storage unit
• Collection of related information
• File system resides on secondary storage (disks)
• Provided user interface to storage, mapping logical to physical
• Provides efficient and convenient access to disk by allowing data
to be stored, located retrieved easily
• Disk provides in-place rewrite and random access
• I/O transfers performed in blocks of sectors (usually 512 bytes)
• File control block – storage structure consisting of
information about a file
• Device driver controls the physical device
• File system organized into layers
Layered File System
File System Layers
• Device drivers manage I/O devices at the I/O control layer
• Given commands like “read drive1, cylinder 72, track 2, sector
10, into memory location 1060” outputs low-level hardware
specific commands to hardware controller
• Basic file system given command like “retrieve block 123”
translates to device driver
• Also manages memory buffers and caches (allocation,
freeing, replacement)
• Buffers hold data in transit
• Caches hold frequently used data
• File organization module understands files, logical address,
and physical blocks
 Translates logical block # to physical block #
 Manages free space, disk allocation
File System Layers (Cont.)
• Logical file system manages metadata
information
• Translates file name into file number, file handle,
location by maintaining file control blocks (inodes
in UNIX)
• Directory management
• Protection
• Layering useful for reducing complexity and
redundancy, but adds overhead and can
decrease performanceTranslates file name into
file number, file handle, location by maintaining
file control blocks (inodes in UNIX)
• Logical layers can be implemented by any coding
method according to OS designer
File System Layers (Cont.)
• Many file systems, sometimes many within
an operating system
• Each with its own format (CD-ROM is ISO 9660;
Unix has UFS, FFS; Windows has FAT, FAT32,
NTFS as well as floppy, CD, DVD Blu-ray, Linux
has more than 40 types, with extended file
system ext2 and ext3 leading; plus distributed
file systems, etc.)
• New ones still arriving – ZFS, GoogleFS, Oracle
ASM, FUSE
Allocation Methods - Contiguous

• An allocation method refers to how disk


blocks are allocated for files:
• Contiguous allocation – each file occupies set
of contiguous blocks
• Best performance in most cases
• Simple – only starting location (block #) and
length (number of blocks) are required
• Problems include finding space for file, knowing
file size, external fragmentation, need for
compaction off-line (downtime) or on-line
Contiguous Allocation

• Mapping from logical


to physical
Q

LA/512

Block to be accessed = Q
+ starting address
Displacement into block =
R
Extent-Based Systems

• Many newer file systems (i.e., Veritas File


System) use a modified contiguous allocation
scheme

• Extent-based file systems allocate disk blocks


in extents

• An extent is a contiguous block of disks


• Extents are allocated for file allocation
• A file consists of one or more extents
Allocation Methods - Linked
• Linked allocation – each file a linked list of
blocks
• File ends at nil pointer
• No external fragmentation
• Each block contains pointer to next block
• No compaction, external fragmentation
• Free space management system called when new
block needed
• Improve efficiency by clustering blocks into groups
but increases internal fragmentation
• Reliability can be a problem
• Locating a block can take many I/Os and disk seeks
Allocation Methods – Linked (Cont.)
• FAT (File Allocation Table) variation
• Beginning of volume has table, indexed by block number
• Much like a linked list, but faster on disk and cacheable
• New block allocation simple
Linked Allocation
• Each file is a linked list of disk blocks: blocks may be
scattered anywhere on the disk
block = pointer

 Mapping

Q
LA/511
R
Block to be accessed is the Qth block in the linked chain
of blocks representing the file.

Displacement into block = R + 1


Linked Allocation
File-Allocation Table
Allocation Methods - Indexed

• Indexed allocation
• Each file has its own index block(s) of pointers to its data blocks

• Logical view

index table
Example of Indexed Allocation
Indexed Allocation (Cont.)
• Need index table

• Random access

• Dynamic access without external fragmentation, but


have overhead of index block

• Mapping from logical to physical in a file of maximum


Q size of 512 bytes. We need
size of 256K bytes and block
only 1 block for LA/512
index table
R

Q = displacement into index table


R = displacement into block
Indexed Allocation – Mapping (Cont.)
• Mapping from logical to physical in a file of unbounded length (block size of
512 words)

• Linked scheme – Link blocks of index table (no limit on size)

Q1
LA / (512 x 511)
R1
Q1 = block of index table
R1 is used as follows:
Q2
R1 / 512
R2

Q2 = displacement into block of index table


R2 displacement into block of file:
Indexed Allocation – Mapping (Cont.)
• Two-level index (4K blocks could store 1,024 four-byte pointers in outer index ->
1,048,567 data blocks and file size of up to 4GB)

Q1
LA / (512 x 512)
R1

Q1 = displacement into outer-index


R1 is used as follows:
Q2
R1 / 512
R2

Q2 = displacement into block of index table


R2 displacement into block of file:
Indexed Allocation – Mapping (Cont.)
Combined Scheme: UNIX UFS
4K bytes per block, 32-bit addresses

More index blocks than can be addressed with 32-bit file pointer
Performance
• Best method depends on file access type
• Contiguous great for sequential and random
• Linked good for sequential, not random
• Declare access type at creation -> select either
contiguous or linked
• Indexed more complex
• Single block access could require 2 index block
reads then data block read
• Clustering can help improve throughput, reduce
CPU overhead
Performance (Cont.)
• Adding instructions to the execution path to save
one disk I/O is reasonable
• Intel Core i7 Extreme Edition 990x (2011) at 3.46Ghz
= 159,000 MIPS
• http://en.wikipedia.org/wiki/Instructions_per_second
• Typical disk drive at 250 I/Os per second
• 159,000 MIPS / 250 = 630 million instructions during one
disk I/O
• Fast SSD drives provide 60,000 IOPS
• 159,000 MIPS / 60,000 = 2.65 millions instructions during
one disk I/O
Free-Space Management
• File system maintains free-space list to track available blocks/clusters
• (Using term “block” for simplicity)
• Bit vector or bit map (n blocks)

0 1 2 n-1

1  block[i] free


bit[i] =
0  block[i] occupied

Block number calculation

(number of bits per word) *


(number of 0-value words) +
offset of first 1 bit
CPUs have instructions to return offset within word of
first “1” bit
Free-Space Management (Cont.)
• Bit map requires extra space
• Example:
block size = 4KB = 212 bytes
disk size = 240 bytes (1 terabyte)
n = 240/212 = 228 bits (or 32MB)
if clusters of 4 blocks -> 8MB of
memory

• Easy to get contiguous files


Linked Free Space List on Disk
 Linked list (free list)
 Cannot get
contiguous space
easily
 No waste of space
 No need to traverse
the entire list (if #
free blocks
recorded)
Free-Space Management (Cont.)

• Grouping
• Modify linked list to store address of next n-1 free
blocks in first free block, plus a pointer to next block
that contains free-block-pointers (like this one)

• Counting
• Because space is frequently contiguously used and
freed, with contiguous-allocation allocation, extents,
or clustering
• Keep address of first free block and count of following free
blocks
• Free space list then has entries containing addresses and
counts
Free-Space Management (Cont.)
• Space Maps
• Used in ZFS
• Consider meta-data I/O on very large file systems
• Full data structures like bit maps couldn’t fit in memory -> thousands
of I/Os
• Divides device space into metaslab units and manages
metaslabs
• Given volume can contain hundreds of metaslabs
• Each metaslab has associated space map
• Uses counting algorithm
• But records to log file rather than file system
• Log of all block activity, in time order, in counting format
• Metaslab activity -> load space map into memory in balanced-
tree structure, indexed by offset
• Replay log into that structure
• Combine contiguous free blocks into single entry
I/O System
Overview
• I/O management is a major component of operating
system design and operation
• Important aspect of computer operation
• I/O devices vary greatly
• Various methods to control them
• Performance management
• New types of devices frequent
• Ports, busses, device controllers connect to various
devices
• Device drivers encapsulate device details
• Present uniform device-access interface to I/O subsystem
I/O Hardware
• Incredible variety of I/O devices
• Storage
• Transmission
• Human-interface
• Common concepts – signals from I/O
devices interface with computer
• Port – connection point for device
• Bus - daisy chain or shared direct access
• PCI bus common in PCs and servers, PCI Express
(PCIe)
• expansion bus connects relatively slow devices
• Serial-attached SCSI (SAS) common disk
interface
I/O Hardware (Cont.)
• Controller (host adapter) – electronics that
operate port, bus, device
• Sometimes integrated
• Sometimes separate circuit board (host adapter)
• Contains processor, microcode, private memory, bus
controller, etc.
• Some talk to per-device controller with bus controller,
microcode, memory, etc.
A Typical PC Bus Structure
I/O Hardware (Cont.)
• Fibre channel (FC) is complex controller,
usually separate circuit board (host-bus
adapter, HBA) plugging into bus
• I/O instructions control devices
• Devices usually have registers where device
driver places commands, addresses, and data
to write, or read data from registers after
command execution
• Data-in register, data-out register, status register,
control register
• Typically 1-4 bytes, or FIFO buffer
I/O Hardware (Cont.)
• Devices have addresses, used by
• Direct I/O instructions
• Memory-mapped I/O
• Device data and command registers mapped to
processor address space
• Especially for large address spaces (graphics)
Device I/O Port Locations on PCs (partial)
Polling
• For each byte of I/O
1. Read busy bit from status register until 0
2. Host sets read or write bit and if write copies data into
data-out register
3. Host sets command-ready bit
4. Controller sets busy bit, executes transfer
5. Controller clears busy bit, error bit, command-ready bit
when transfer done
• Step 1 is busy-wait cycle to wait for I/O from
device
• Reasonable if device is fast
• But inefficient if device slow
• CPU switches to other tasks?
• But if miss a cycle data overwritten / lost
Interrupts
• Polling can happen in 3 instruction cycles
• Read status, logical-and to extract status bit, branch if not zero
• How to be more efficient if non-zero infrequently?
• CPU Interrupt-request line triggered by I/O device
• Checked by processor after each instruction
• Interrupt handler receives interrupts
• Maskable to ignore or delay some interrupts
• Interrupt vector to dispatch interrupt to correct handler
• Context switch at start and end
• Based on priority
• Some nonmaskable
• Interrupt chaining if more than one device at same interrupt
number
Interrupt-Driven I/O Cycle
Interrupts (Cont.)
• Interrupt mechanism also used for exceptions
• Terminate process, crash system due to hardware
error
• Page fault executes when memory access error
• System call executes via trap to trigger kernel
to execute request
• Multi-CPU systems can process interrupts
concurrently
• If operating system designed to handle it
• Used for time-sensitive processing, frequent,
must be fast
Latency
• Stressing interrupt management because even
single-user systems manage hundreds or
interrupts per second and servers hundreds of
thousands
• For example, a quiet macOS desktop generated
23,000 interrupts over 10 seconds
Intel Pentium Processor Event-Vector Table
Direct Memory Access
• Used to avoid programmed I/O (one byte at a time) for
large data movement
• Requires DMA controller
• Bypasses CPU to transfer data directly between I/O
device and memory
• OS writes DMA command block into memory
• Source and destination addresses
• Read or write mode
• Count of bytes
• Writes location of command block to DMA controller
• Bus mastering of DMA controller – grabs bus from CPU
• Cycle stealing from CPU but still much more efficient
• When done, interrupts to signal completion

• Version that is aware of virtual addresses can be even


more efficient - DVMA
Six Step Process to Perform DMA Transfer
Application I/O Interface
• I/O system calls encapsulate device behaviors in generic classes
• Device-driver layer hides differences among I/O controllers from
kernel
• New devices talking already-implemented protocols need no extra
work
• Each OS has its own I/O subsystem structures and device driver
frameworks
• Devices vary in many dimensions
• Character-stream or block
• Sequential or random-access
• Synchronous or asynchronous (or both)
• Sharable or dedicated
• Speed of operation
• read-write, read only, or write only

You might also like