[go: up one dir, main page]

0% found this document useful (0 votes)
44 views50 pages

Understanding Binary Trees and Graphs

The document provides an overview of data structures, focusing on binary trees, binary search trees, graphs, and hashing. It explains the characteristics and traversal methods of binary trees, the operations of binary search trees, and the definitions and representations of graphs. Additionally, it discusses the importance of hashing for efficient data lookups, including collision handling strategies such as separate chaining and open addressing.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
44 views50 pages

Understanding Binary Trees and Graphs

The document provides an overview of data structures, focusing on binary trees, binary search trees, graphs, and hashing. It explains the characteristics and traversal methods of binary trees, the operations of binary search trees, and the definitions and representations of graphs. Additionally, it discusses the importance of hashing for efficient data lookups, including collision handling strategies such as separate chaining and open addressing.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Data Structures

Binary Trees
Binary Trees
§ binary tree, T, is either empty or such that
§ T has a special node called the root node
§ T has two sets of nodes, LT and RT, called the left subtree and right
subtree of T, respectively, LT and RT are binary trees

§ Can be shown as Parent, left child, right child

§ Node represented as a circle


§ Circle labeled by the node

§ Every node in a binary tree


§ Has at most two children
2
Binary Trees (cont’d.)
§ Root node drawn at the top
§ Level of a node
§ Number of branches on the path
§ Height of a binary tree
§ Number of nodes on the longest path from the root to a leaf

Leaves : G, E, H Root: (A) level 0 A is parent for B, C


Tree Height = 4
D is on level 2
Traversing a Binary Search 4

Tree
§ Inorder tree walk:
§ Root is printed between the values of its left and right subtrees: left,
root, right
§ Keys are printed in sorted order
§ Preorder tree walk:
§ root printed first: root, left, right
§ Postorder tree walk:
§ root printed last: left, right, root

5 Inorder: 2 3 5 5 7 9
Preorder: 5 3 2 5 7 9
3 7
2 5 9 Postorder: 2 5 3 9 7 5
Binary Tree Traversal

§ Inorder traversal
§ Traverse the left subtree
§ Visit the node
§ Traverse the right subtree
D G B E A C H F

§ Preorder traversal
§ Visit the node
§ Traverse the left subtree
§ Traverse the right subtree

FIGURE 11-1 Binary tree


A B D G E C F H

5
Binary Tree Traversal (cont’d.)

§ Postorder traversal
§ Traverse the left subtree
§ Traverse the right subtree
§ Visit the node
G D E B H F C A

§ Each traversal algorithm: recursive


§ Listing of nodes
§ Inorder sequence
FIGURE 11-1 Binary tree
§ Preorder sequence
§ Postorder sequence

6
Binary Search Trees
§ Data in each node
§ Larger than the data in its right child
§ Smaller than the data in its left child

FIGURE 11-7 Binary search tree


7
50 60 77 33 26 45 80 65
50

33 60

26 45 77

65 80
Binary Search Trees (cont’d.)
§ class bSearchTreeType
§ Illustrates basic operations to implement a binary search tree
§ Function search
§ Function insert
§ Function delete

9
How to search a binary search 10

tree? 5

Alg: TREE-SEARCH(x, k) 3 7

2 4 9
1. if x = NIL or k = T [x]
2. then return x
3. if k < T [x]
4. then return TREE-SEARCH(left [x], k )
5. else return TREE-SEARCH(right [x], k )

Running Time: O (h),


h – the height of the tree
How to search a binary search
tree?
(1) Start at the root
(2) Compare the value of
the item you are
searching for with
the value stored at
the root
(3) If the values are
equal, then item
found; otherwise, if it
is a leaf node, then
not found
How to search a binary search
tree?
(4) If it is less than the value
stored at the root, then
search the left subtree
(5) If it is greater than the
value stored at the root,
then search the right
subtree
(6) Repeat steps 2-6 for the
root of the subtree chosen
in the previous step 4 or 5
Binary Tree Insertion
50
50

33 60

26 45 77

80
20
How to insert 48

50

33 60

26 45 77

80
20 48
How to Insert 46

50

33 60

26 45 77

80
20 48

46
How to Delete 80

50

33 60

26 45 77

80
20 48

46
How to Delete 77

50

33 60

26 45 77

80
20 48

46
How to Delete 48

50

33 60

26 45 77

80
20 48

46
How to Delete 50
Search for the previous
number that precedes 50 50

33 60

26 45 77

80
20 48

46
How to Delete 50
Replace 50 with 48
48

33 60

26 45 77

80
20 48

46
How to Delete 50
Replace 50 with 48
48
Now delete the original
48 33 60

26 45 77

80
20 48

46
Data Structures

Graphs
Graph Definitions and Notations
§ Graph G pair: G = (V, E), where V are the set of vertices and
E edges (Pairs of elements of V).
§ Directed graph (digraph): Elements in set of edges of graph
G: ordered
§ Undirected graph: Elements in set of edges of graph G: not
ordered

directed graph undirected graphs

V(G1) ={1,2,3,4,5}
V(G1) ={1,2,3,4,5} E(G1) ={(1,2),(1,4),(1,3),(2,1),(2,5),(3,1)
E(G1) ={(1,2),(1,4),(2,5),(3,1),(3,4),(4,5)} ,(3,4),(3,5),(4,1),(4,3)}
24

Graph Definitions and Notations


§ Let u and v be two vertices in G
§ u and v adjacent: If edge from one to the other exists: (u, v) Î E
§ Path from u to v
§ If sequence of vertices u1, u2, . . ., un exists
§ Vertices u and v called connected
§ If path from u to v exists
§ Loop
§ Edge incident on a single vertex
§ e1 and e2 called parallel edges
§ If two edges e1 and e2 associate with same pair of vertices {u, v}
25

Graph Definitions and Notations

§ Simple path
§ All vertices distinct (except possibly first, last)
§ No loops, no parallel edges
§ Cycle in G
§ Simple path in which first and last vertices are the same
§ Number of edges in undirected graph G is twice the
directed graph
§ If edge from u to v exists: (u, v) Î E
§ u is adjacent to v
§ v is adjacent from u
26

Graph Representation
§ Graphs represented in computer memory in Adjacency
matrices
§ Let G be a graph with n vertices where n > zero
§ Let V(G) = {v1, v2, ..., vn}
§ Adjacency matrix
27

Graph Representation
§ Graphs also can be represented in computer memory in Adjacency
Lists
§ For each vertex v: linked list exists
§ Linked list node contains vertex u: (v, u) Î E(G)

FIGURE 12-5 Adjacency list


of graph G2
28

Graph Traversals
§ Traversing a graph
§ Similar to traversing a binary tree
§ A bit more complicated

§ Two most common graph traversal algorithms


§ Depth first traversal
§ Breadth first traversal
29

Depth First Traversal


§ Similar to binary tree preorder traversal of a rooted tree
30

Depth First Traversal (cont’d.)


§ General algorithm for depth first traversal at a given node v
§ Recursive algorithm

§ We use a recursive function, dft, to implement this algorithm. The


vertex at which the depth-first traversal is to be started, and the bool
array visited, are passed as parameters to this function.
31

Breadth First Traversal


§ Similar to traversing binary tree level-by-level
§ Nodes at each level
§ Visited from left to right
§ All nodes at any level i
§ Visited before visiting nodes at level i + one
Data Structures

Hashing
Why do we need hashing?
▪ Many applications deal with lots of data
➢Search engines and web pages
▪ There are a lot of look ups.
▪ The look ups are time critical.
▪ Typical data structures like arrays and
lists, may not be sufficient to handle
efficient lookups
▪ In general: When look-ups need to
occur in near constant time. O(1)
Why do we need hashing?

▪ We need something that can do better


than a binary search, O(log N).
▪ We want, O(1).

Solution: Hashing
In fact hashing is used in:

Web searches Spell checkers Databases


Compilers passwords Many
others
Building an index using 35

Hash Maps

DOCID OCCUR POS 1 POS 2 ...

WORD NDOCS PTR


34 6 1 118 2087 3922 3981 5002
jezebel 20
44 3 215 2291 3010
jezer 3 56 4 5 22 134 992
jezerit 1
jeziah 1 566 3 203 245 287
jeziel 1
jezliah 1 67 1 132

jezoar 1 ...
jezrahliah 1
jezreel 39
jezoar
The concept

▪ Suppose we need to find a better way to


maintain a table (Example: a Dictionary)
that is easy to insert and search in O(1).
Finding a hash Function

▪ Assume that N = 5 and the values


we need to insert are: cab, bea, bad
etc.
▪ Let a=0, b=1, c=2, etc
▪ Define H such that
➢H[data] = (∑ characters) Mod N
▪ H[cab] = (2+0+1) Mod 5 = 3
▪ H[bea] = (1+4+0) Mod 5 = 0
▪ H[bad] = (1+0+3) Mod 5 = 4
Collisions

▪ What if the values we need to insert


are “abc”, “cba”, “bca” etc…
➢They all map to the same location
based on our map H (obviously H is not a good
hash map)
▪ This is called “Collision”
▪ When collisions occur, we need to
“handle” them
▪ Collisions can be reduced with a selection
of a good hash function
Choosing a Hash
Function
▪ A good hash function must
➢Be easy to compute
➢Avoid collisions
▪ How do we find a good hash function?
▪ A bad hash function
➢Let S be a string and H(S) = Σ Si where Si is the ith
character of S
➢Why is this bad?
Choosing a Hash Function?

▪ Question
➢Think of hashing 10000, 5-letter words into a table of
size 10000using the map Hdefined as follows.
➢H(a0a1a2a3a4) = Σ ai (i=0,1….4)
➢If we use H, what would be the key
distribution like?
Collisions

▪ Hash functions can be many-to-1


➢They can map different search keys to
the same hash key.
hash1(`a`) == 9 == hash1(`w`)

▪ Must compare the search key with


the record found
➢If the match fails, there is a collision
Collision Resolving strategies

▪ Separate chaining
▪ Open addressing
➢Linear Probing
➢Quadratic Probing
Separate Chaining Vs. Open Addressing
Separate Chaining Open Addressing

Keys are stored inside the hash table as well as All the keys are stored only inside the hash table.
outside the hash table. No key is present outside the hash table.

The number of keys to be stored in the hash


The number of keys to be stored in the hash table can never exceed the size of the hash
table can even exceed the size of the hash table.
table.

Deletion is easier. Deletion is difficult.

Extra space is required for the pointers to store


No extra space is required.
the keys outside the hash table.

Cache performance is poor.


Cache performance is better.
This is because of linked lists which store the
This is because here no linked lists are used.
keys outside the hash table.

Some buckets of the hash table are never used Buckets may be used even if no key maps to
which leads to wastage of space. those particular buckets.
Separate Chaining

• Collisions can be resolved by creating a list of keys that map to the same value.
• Deletion is easier in separate chaining.
• This is because deleting a key from the hash table does not affect the other keys stored
in the hash table.
Open Addressing
§ Unlike separate chaining, all the keys are stored inside the
hash table.
§ No key is stored outside the hash table.
§ Techniques used for open addressing are-
§ Linear Probing
§ Quadratic Probing
Linear Probing

▪ The idea:
➢Table remains a simple array of size N
➢On insert(x), compute f(x) mod N, if
the cell is full, find another by
sequentially searching for the next
available slot
• Go to f(x)+1, f(x)+2 etc..
➢On find(x), compute f(x) mod N, if the
cell doesn’t match, look elsewhere.
➢Linear probing function can be given by
• h(x, i) = (f(x) + i) mod N (i=1,2,….)
Figure
20.4
Linear
probing hash
table after
each
insertion

Data Structures & Problem Solving using JAVA/2E Mark Allen Weiss © 2002 Addison Wesley
Quadratic probing

▪ Resolve collisions by examining certain cells


(1,4,9,…) away from the original probe point
▪ Collision policy:
➢ Define h0(k), h1(k), h2(k), h3(k), … where
hi(k) = (hash(k) + i2) mod size

➢(Linear probing always finds a cell.)


Quadratic probing

▪ Another issue
➢Suppose the table size is 16.
➢Probe offsets that will be tried:
1 mod 16 = 1
4 mod 16 = 4
9 mod 16 = 9
16 mod 16 = 0
25 mod 16 = 9 only four different values!
36 mod 16 = 4
49 mod 16 = 1
64 mod 16 = 0
81 mod 16 = 1
Figure 20.6
A quadratic
probing hash table
after each
insertion (note that
the table size was
poorly chosen
because it is not a
prime number).

Data Structures & Problem Solving using JAVA/2E Mark Allen Weiss © 2002 Addison Wesley

You might also like