Unit III College Notes
Unit III College Notes
Tree: Definitions -Height, Depth, Order, Degree. Binary Search Tree -Operations, Traversal/Search.
AVL Tree, Heap, Applications and Comparison of Various Types of Tree. Introduction to Forest,
Multi-Way Tree, B Tree, B+ Tree, B* Tree and Red-Black Tree.
Tree: Definitions
Tree is a non-linear data structure in which data is organized in random order. A tree data structure
can be defined as follows: -
Tree is a non-linear data structure which organizes data in hierarchical structure.
In tree data structure, every individual element is called as Node. Node in a tree data structure,
stores the actual data of that particular element and link to next element in hierarchical structure.
In a tree data structure, if we have N number of nodes then we can have a maximum of N-1 number
of links.
Example
AVL Tree
AVL tree is a self-balanced binary search tree. That means, an AVL tree is also a binary search tree but
it is a balanced tree. A binary tree is said to be balanced, if the difference between the height of left
subtree and right subtree of every node in the tree is either -1, 0 or +1. In other words, a binary tree
is said to be balanced if for every node, height of its children differs by at most one. In an AVL tree,
every node maintains an extra information known as balance factor.
The balance factor of a node is calculated either height of left subtree - height of right subtree.
There are four rotations and they are classified into two types.
Heap
A heap is a complete binary tree and the complete binary tree is a binary tree in which all the levels
except the last level, i.e., leaf node should be completely filled and all the nodes should be left-
justified.
Heap application
Heap Sort: Heap Sort uses Binary Heap to sort an array in O(nlogn).
Priority Queue: It uses binary heap to efficient implement operations insert(), delete(),
extract_max(), update() in O(logn) time.
Graph Algorithms: Some of the Graph Algorithms also use heaps to reduce its time complexity like
Dijkstra’s Algorithm and Minimum Spanning Tree.
K-way merge: A heap data structure is useful to merge many already-sorted input streams into a
single sorted output stream.
Order statistics: The Heap data structure can be used to efficiently find the kth smallest (or largest)
element in an array.
Introduction to Forest
Forest is a collection of disjoint trees. In other words, we can also say that forest is a collection of an
acyclic graph which is not connected. Here is a pictorial representation of a forest.
A multiway tree is a tree that can have more than two children.
A multiway tree of order m (or an **m-way tree) is the one in which a tree can have m children.
An m-way search tree is a m-way tree in which:
Each node has m children and m-1 key fields
The keys in each node are in ascending order.
The keys in the first i children are smaller than the j-th key
The keys in the last m-i children are larger than the j-th key
In a binary search tree, m=2. So, it has one value and two sub trees.
M-way search trees give the same advantages to m-way trees that binary search trees gave to binary
trees - they provide fast information retrieval and update.
However, they also have the same problems that binary search trees had - they can become
unbalanced, which means that the construction of the tree becomes of vital importance.
In m-way search tree, each sub-tree is also a m-way search tree and follows the same rules.
An extension of a multiway search tree of order m is a B-tree of order m.
This type of tree will be used when the data to be accessed/stored is located on secondary storage
devices because they allow for large amounts of data to be stored in a node.
Introduction to B Tree
B Tree is a specialized m-way tree that can be widely used for disk access. A B-Tree of order m can
have at most m-1 keys and m children. One of the main reasons of using B tree is its capability to
store large number of keys in a single node and large key values by keeping the height of the tree
relatively small.
A B tree of order m contains all the properties of an M way tree. In addition, it contains the following
properties.
Every node in a B-Tree contains at most m children.
Every node in a B-Tree except the root node and the leaf node contain at least m/2 children.
The root nodes must have at least 2 nodes.
All leaf nodes must be at the same level.
It is not necessary that, all the nodes contain the same number of children but, each node must
have m/2 number of nodes.
1,12,8,2,25,6,14,28,17,7,52,16,48,68,3,26,29,53,55,45,67.
1) insert 1
2) insert 12
4) insert 6,14,28,17
5) insert 7,52,16,48,68,3,26,29,53,55,45.
Introduction to B+ tree
A B+ tree is a data structure often used in the implementation of database indexes. Each node of the
tree contains an ordered list of keys and pointers to lower level nodes in the tree. These pointers can
be thought of as being between each of the keys. To search for or insert an element into the tree, one
loads up the root node, finds the adjacent keys that the searched-for value is between and follows
the corresponding pointer to the next node in the tree. Recursing eventually leads to the desired
value or the conclusion that the value is not present.
B+ trees use some clever balancing techniques to make sure that all of the leaves are always on the
same level of the tree, that each node is always at least half full (rounded) of keys, and (therefore)
that the height of the tree is always at most ceiling(log(k)) of base ceiling(n/2) where k is the number
of values in the tree and n is the maximum number of pointers ( maximum number of nodes + 1) in
each block
Properties of B+ Tree
The root node points to at least two nodes.
All non-root nodes are at least half full.
For a tree of order m, all internal nodes have m-1 keys and m pointers.
B+ Tree grows upwards.
B+ Tree is balanced.
Sibling pointers allow sequential searching.
B* tree of order m is a search tree that is either empty or that satisfies three properties:
The root node has minimum two and maximum 2 floor ((2m-2)/3) +1 children.
Other internal nodes have the minimum floor ((2m-1)/3) and maximum m children.
All external nodes are on the same level.
The advantage of using B* trees over B-trees is a unique feature called the ‘two-to-three’ split. By
this, the minimum number of keys in each node is not half the maximum number, but two-thirds of it,
making data far more compact. However, the disadvantage of this is a complex deletion operation.
Red - Black Tree is another variant of Binary Search Tree in which every node is colored either red or
black. We can define a Red Black Tree as follows: -Red Black Tree is a Binary Search Tree in which
every node is colored either red or black.
In a Red Black Tree the color of a node is decided based on the Red Black Tree properties.
Properties of Red Black Tree
1. Red - Black Tree must be a Binary Search Tree.
2. The Root node must be colored black.
3. The children of Red colored node must be colored black. (There should not be two consecutive Red
nodes).
4. In all the paths of the tree there must be same number of black colored nodes.
5. Every new node must insert with red color.
6. Every leaf (i.e. NULL node) must be colored black