Chapter 17
Disk Storage, Basic
File Structures, and
Hashing
Chapter 18
Indexing Structures
for Files
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Disk Storage Devices
◼ Preferred secondary storage device for high
storage capacity and low cost.
◼ Data stored as magnetized areas on magnetic
disk surfaces.
◼ A disk pack contains several magnetic disks
connected to a rotating spindle.
◼ Disks are divided into concentric circular tracks
on each disk surface.
◼ Track capacities vary typically from 4 to 50 Kbytes
or more
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Disk Storage Devices (cont.)
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Disk Storage Devices (cont.)
◼ A track is divided into smaller blocks or sectors
◼ because it usually contains a large amount of information
◼ The division of a track into sectors is hard-coded on the
disk surface and cannot be changed.
◼ One type of sector organization calls a portion of a track
that subtends a fixed angle at the center as a sector.
◼ A track is divided into blocks.
◼ The block size B is fixed for each system.
◼ Typical block sizes range from B=512 bytes to B=4096 bytes.
◼ Whole blocks are transferred between disk and main
memory for processing.
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Disk Storage Devices (cont.)
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Disk Storage Devices (cont.)
◼ A read-write head moves to the track that contains the
block to be transferred.
◼ Disk rotation moves the block under the read-write head for
reading or writing.
◼ A physical disk block (hardware) address consists of:
◼ a cylinder number (imaginary collection of tracks of same
radius from all recorded surfaces)
◼ the track number or surface number (within the cylinder)
◼ and block number (within track).
◼ Reading or writing a disk block is time consuming
because of the seek time s and rotational delay (latency)
rd.
◼ Double buffering can be used to speed up the transfer of
contiguous disk blocks.
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Records
◼ Fixed and variable length records
◼ Records contain fields which have values of a
particular type
◼ E.g., amount, date, time, age
◼ Fields themselves may be fixed length or variable
length
◼ Variable length fields can be mixed into one
record:
◼ Separator characters or length fields are needed
so that the record can be “parsed.”
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Blocking
◼ Blocking:
◼ Refers to storing a number of records in one block
on the disk.
◼ Blocking factor (bfr) refers to the number of
records per block.
◼ There may be empty space in a block if an
integral number of records do not fit in one block.
◼ Spanned Records:
◼ Refers to records that exceed the size of one or
more blocks and hence span a number of blocks.
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Spanned and Unspanned Records
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Files of Records
◼ A file is a sequence of records, where each record is a
collection of data values (or data items).
◼ A file descriptor (or file header) includes information that
describes the file, such as the field names and their data
types, and the addresses of the file blocks on disk.
◼ Records are stored on disk blocks.
◼ The blocking factor bfr for a file is the (average) number
of file records stored in a disk block.
◼ A file can have fixed-length records or variable-length
records.
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Files of Records (cont.)
◼ File records can be unspanned or spanned
◼ Unspanned: no record can span two blocks
◼ Spanned: a record can be stored in more than one block
◼ The physical disk blocks that are allocated to hold the
records of a file can be contiguous, linked, or indexed.
◼ In a file of fixed-length records, all records have the same
format. Usually, unspanned blocking is used with such
files.
◼ Files of variable-length records require additional
information to be stored in each record, such as
separator characters and field types.
◼ Usually spanned blocking is used with such files.
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Unordered Files
◼ Also called a heap or a pile file.
◼ New records are inserted at the end of the file.
◼ A linear search through the file records is
necessary to search for a record.
◼ This requires reading and searching half the file
blocks on the average, and is hence quite
expensive.
◼ Record insertion is quite efficient.
◼ Reading the records in order of a particular field
requires sorting the file records.
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Ordered Files
◼ Also called a sequential file.
◼ File records are kept sorted by the values of an ordering field.
◼ Insertion is expensive: records must be inserted in the correct order.
◼ It is common to keep a separate unordered overflow (or
transaction) file for new records to improve insertion efficiency;
this is periodically merged with the main ordered file.
◼ A binary search can be used to search for a record on its ordering
field value.
◼ This requires reading and searching log2 of the file blocks on the
average, an improvement over linear search.
◼ Reading the records in order of the ordering field is quite efficient.
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Ordered Files (cont.)
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Average Access Times
◼ The following table shows the average access
time to access a specific record for a given type
of file
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Hashed Files
◼ Hashing for disk files is called External Hashing
◼ The file blocks are divided into M equal-sized buckets, numbered
bucket0, bucket1, ..., bucketM-1.
◼ Typically, a bucket corresponds to one (or a fixed number of) disk
block.
◼ One of the file fields is designated to be the hash key of the file.
◼ The record with hash key value K is stored in bucket i, where i=h(K),
and h is the hashing function.
◼ Search is very efficient on the hash key.
◼ Collisions occur when a new record hashes to a bucket that is already
full.
◼ An overflow file is kept for storing such records.
◼ Overflow records that hash to each bucket can be linked together.
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Hashed Files (cont.)
◼ There are numerous methods for collision resolution, including the
following:
◼ Open addressing: Proceeding from the occupied position
specified by the hash address, the program checks the
subsequent positions in order until an unused (empty) position is
found.
◼ Chaining: For this method, various overflow locations are kept,
usually by extending the array with a number of overflow
positions. In addition, a pointer field is added to each record
location. A collision is resolved by placing the new record in an
unused overflow location and setting the pointer of the occupied
hash address location to the address of that overflow location.
◼ Multiple hashing: The program applies a second hash function if
the first results in a collision. If another collision results, the
program uses open addressing or applies a third hash function
and then uses open addressing if necessary.
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Hashed Files - Overflow Handling
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Hashed Files (cont.)
◼ To reduce overflow records, a hash file is typically
kept 70-80% full.
◼ The hash function h should distribute the records
uniformly among the buckets
◼ Otherwise, search time will be increased because
many overflow records will exist.
◼ Main disadvantages of static external hashing:
◼ Fixed number of buckets M is a problem if the
number of records in the file grows or shrinks.
◼ Ordered access on the hash key is quite inefficient
(requires sorting the records).
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Hashed Files – External Hashing (cont.)
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Dynamic And Extendible Hashed
Files
◼ Dynamic and Extendible Hashing Techniques
◼ Hashing techniques are adapted to allow the dynamic
growth and shrinking of the number of file records.
◼ These techniques include the following: dynamic hashing,
extendible hashing, and linear hashing.
◼ Both dynamic and extendible hashing use the binary
representation of the hash value h(K) in order to access
a directory.
◼ In dynamic hashing the directory is a binary tree.
◼ In extendible hashing the directory is an array of size 2d
where d is called the global depth.
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Dynamic And Extendible Hashing
(cont.)
◼ The directories can be stored on disk, and they expand or
shrink dynamically.
◼ Directory entries point to the disk blocks that contain the
stored records.
◼ An insertion in a disk block that is full causes the block to
split into two blocks and the records are redistributed
among the two blocks.
◼ The directory is updated appropriately.
◼ Dynamic and extendible hashing do not require an
overflow area.
◼ Linear hashing does require an overflow area but does
not use a directory.
◼ Blocks are split in linear order as the file expands.
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Extendible
Hashing
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Indexes as Access Paths
◼ A single-level index is an auxiliary file that makes
it more efficient to search for a record in the data
file.
◼ The index is usually specified on one field of the
file (although it could be specified on several
fields)
◼ One form of an index is a file of entries <field
value, pointer to record>, which is ordered by
field value
◼ The index is called an access path on the field.
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Indexes as Access Paths (cont.)
◼ The index file usually occupies considerably less disk
blocks than the data file because its entries are much
smaller
◼ A binary search on the index yields a pointer to the file
record
◼ Indexes can also be characterized as dense or sparse
◼ A dense index has an index entry for every search key
value (and hence every record) in the data file.
◼ A sparse (or nondense) index, on the other hand, has
index entries for only some of the search values
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Indexes as Access Paths (cont.)
◼ Example: Given the following data file EMPLOYEE(NAME, SSN,
ADDRESS, JOB, SAL, ... )
◼ Suppose that:
◼ record size R=150 bytes block size B=512 bytes r=30000 records
◼ Then, we get:
◼ blocking factor Bfr= B div R= 512 div 150= 3 records/block
◼ number of file blocks b= (r/Bfr)= (30000/3)= 10000 blocks
◼ For an index on the SSN field, assume the field size VSSN=9 bytes,
assume the record pointer size PR=7 bytes. Then:
◼ index entry size RI=(VSSN+ PR)=(9+7)=16 bytes
◼ index blocking factor BfrI= B div RI= 512 div 16= 32 entries/block
◼ number of index blocks b= (r/ BfrI)= (30000/32)= 938 blocks
◼ binary search needs log2bI= log2938= 10 block accesses
◼ This is compared to an average linear search cost of:
◼ (b/2)= 10000/2= 5000 block accesses
◼ If the file records are ordered, the binary search cost would be:
◼ log2b= log210000= 14 block accesses
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Types of Single-Level Indexes
◼ Primary Index
◼ Defined on an ordered data file
◼ The data file is ordered on a key field
◼ Includes one index entry for each block in the data file; the
index entry has the key field value for the first record in the
block, which is called the block anchor
◼ A similar scheme can use the last record in a block.
◼ A primary index is a nondense (sparse) index, since it
includes an entry for each disk block of the data file and the
keys of its anchor record rather than for every search value.
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Primary Index
on the Ordering
Key Field
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Types of Single-Level Indexes
◼ Clustering Index
◼ Defined on an ordered data file
◼ The data file is ordered on a non-key field unlike primary
index, which requires that the ordering field of the data file
have a distinct value for each record.
◼ Includes one index entry for each distinct value of the field;
the index entry points to the first data block that contains
records with that field value.
◼ It is another example of nondense index where Insertion
and Deletion is relatively straightforward with a clustering
index.
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
A Clustering
Index
Example
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Another
Clustering
Index
Example
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Types of Single-Level Indexes
◼ Secondary Index
◼ A secondary index provides a secondary means of
accessing a file for which some primary access already
exists.
◼ The secondary index may be on a field which is a candidate
key and has a unique value in every record, or a non-key
with duplicate values.
◼ The index is an ordered file with two fields.
◼ The first field is of the same data type as some non-ordering
field of the data file that is an indexing field.
◼ The second field is either a block pointer or a record pointer.
◼ There can be many secondary indexes (and hence, indexing
fields) for the same file.
◼ Includes one entry for each record in the data file; hence, it
is a dense index
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Example of
a Dense
Secondary
Index
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Example of
a Secondary
Index
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Properties of Index Types
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Summary of Single-Level Indexes
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Multi-Level Indexes
◼ Because a single-level index is an ordered file, we can
create a primary index to the index itself;
◼ In this case, the original index file is called the first-level
index and the index to the index is called the second-level
index.
◼ We can repeat the process, creating a third, fourth, ..., top
level until all entries of the top level fit in one disk block
◼ A multi-level index can be created for any type of first-
level index (primary, secondary, clustering) as long as the
first-level index consists of more than one disk block
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
A Two-Level
Primary Index
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Multi-Level Indexes
◼ Such a multi-level index is a form of search tree
◼ However, insertion and deletion of new index
entries is a severe problem because every level of
the index is an ordered file.
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Dynamic Multilevel Indexes
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
A Node in a Search Tree with Pointers to
Subtrees Below It
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Dynamic Multilevel Indexes Using B-
Trees and B+-Trees
◼ Most multi-level indexes use B-tree or B+-tree data
structures because of the insertion and deletion problem
◼ This leaves space in each tree node (disk block) to allow for
new index entries
◼ These data structures are variations of search trees that
allow efficient insertion and deletion of new search values.
◼ In B-Tree and B+-Tree data structures, each node
corresponds to a disk block
◼ Each node is kept between half-full and completely full
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Dynamic Multilevel Indexes Using B-
Trees and B+-Trees (cont.)
◼ An insertion into a node that is not full is quite
efficient
◼ If a node is full the insertion causes a split into two
nodes
◼ Splitting may propagate to other tree levels
◼ A deletion is quite efficient if a node does not
become less than half full
◼ If a deletion causes a node to become less than
half full, it must be merged with neighboring
nodes
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Difference between B-tree and B+-tree
◼ In a B-tree, pointers to data records exist at all
levels of the tree
◼ In a B+-tree, all pointers to data records exists at
the leaf-level nodes
◼ A B+-tree can have less levels (or higher capacity
of search values) than the corresponding B-tree
◼ See the animations
◼ https://www.cs.usfca.edu/~galles/visualization/A
lgorithms.html
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
B-tree Structures
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
The Nodes of a B+-tree
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Example of
an Insertion
in a B+-tree
Copyright © 2011 Ramez Elmasri and Shamkant Navathe
Example of
a Deletion in
a B+-tree
Copyright © 2011 Ramez Elmasri and Shamkant Navathe