Greedy Technique
A greedy algorithm solves problems by making the choice that seems best at the moment. Many
optimization problems can be solved with a greedy algorithm. For some problems that have no
known efficient solution, a greedy algorithm may still provide a solution that is close to
optimal. A greedy algorithm works if a problem has the following two properties:
1. Greedy choice property: By making locally optimal choices we can reach a globally
optimal solution. In other words, by making “greedy” choices we can obtain an optimal
solution.
2. Optimal substructure: Optimal solutions always contain optimal subsolutions. In other
words, the answers to the subproblems of an optimal solution are themselves optimal.
Minimum-cost spanning trees
• Suppose you have a connected undirected graph with a weight (or cost)
associated with each edge
• The cost of a spanning tree is the sum of the costs of its edges
• A minimum-cost spanning tree is a spanning tree whose total cost is lowest among all spanning trees
[Figure: a connected, undirected weighted graph on vertices A–F (left) and a minimum-cost spanning tree of it (right)]
Minimum Spanning Tree (MST)
◼ The total cost associated with the tree edges is the minimum among all possible spanning trees
◼ An MST is not necessarily unique
https://www.youtube.com/watch?v=ZtZaR7EcI5Y
1. Prim’s Algorithm
DEFINITION A spanning tree of an undirected connected graph is its connected acyclic subgraph (i.e., a
tree) that contains all the vertices of the graph. If such a graph has weights assigned to its edges, a
minimum spanning tree is its spanning tree of the smallest weight, where the weight of a tree is defined as
the sum of the weights on all its edges. The minimum spanning tree problem is the problem of finding a
minimum spanning tree for a given weighted connected graph.
Example:
[Figure: Prim's algorithm traced on a weighted connected graph; at each step the fringe vertex with the cheapest edge to the tree is added. A label such as e(f, 2) means vertex e can be attached to tree vertex f by an edge of weight 2; d(f, 5) means d can be attached to f at cost 5.]
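As an illustration (not from the text), here is a minimal Python sketch of Prim's algorithm: it keeps a heap of candidate edges from tree vertices to fringe vertices and repeatedly adds the cheapest edge that reaches a new vertex. The adjacency-dict representation and the sample graph are assumptions made for the example.

import heapq

def prim_mst(graph, start):
    # graph: {vertex: {neighbor: weight}}; grow the tree from `start`,
    # always taking the cheapest edge to a vertex not yet in the tree.
    visited = {start}
    fringe = [(w, start, v) for v, w in graph[start].items()]
    heapq.heapify(fringe)
    mst_edges = []
    while fringe and len(visited) < len(graph):
        w, u, v = heapq.heappop(fringe)
        if v in visited:
            continue  # stale entry: v was already reached more cheaply
        visited.add(v)
        mst_edges.append((u, v, w))
        for x, wx in graph[v].items():
            if x not in visited:
                heapq.heappush(fringe, (wx, v, x))
    return mst_edges

# Hypothetical 3-vertex graph: the MST keeps edges a-b (3) and b-c (1).
g = {'a': {'b': 3, 'c': 5}, 'b': {'a': 3, 'c': 1}, 'c': {'a': 5, 'b': 1}}
print(prim_mst(g, 'a'))  # [('a', 'b', 3), ('b', 'c', 1)]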
2. Kruskal’s Algorithm
• Kruskal’s algorithm looks at a minimum spanning tree of a weighted connected graph
G = (V, E) as an acyclic subgraph with |V| − 1 edges for which the sum of the edge weights is the smallest.
Consequently, the algorithm constructs a minimum spanning tree as an expanding sequence of subgraphs that
are always acyclic but are not necessarily connected at the intermediate stages of the algorithm. The algorithm
begins by sorting the graph’s edges in nondecreasing order of their weights. Then, starting with the empty
subgraph, it scans this sorted list, adding the next edge on the list to the current subgraph if such an inclusion
does not create a cycle and simply skipping the edge otherwise.
Tracing Kruskal's algorithm on the sorted edge list:

Edge:    bc  ef  ab  bf  cf  af  df  ae  cd  de
Weight:   1   2   3   4   4   5   5   6   6   8

Selected, in order: bc (1), ef (2), ab (3), bf (4), df (5); cf and af are skipped because each would create a cycle.
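A possible Python sketch of Kruskal's algorithm (an illustration, not the text's pseudocode): edges are scanned in nondecreasing weight order, and a simple union-find structure with path halving detects cycles. Run on the edge list above, it reproduces the selection bc, ef, ab, bf, df.

def kruskal_mst(vertices, edges):
    # edges: list of (weight, u, v); returns |V| - 1 acyclic edges
    # of smallest total weight.
    parent = {v: v for v in vertices}

    def find(v):  # root (representative) of v's subset
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    mst = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:               # no cycle created: accept the edge
            parent[ru] = rv        # union the two subsets
            mst.append((u, v, w))
            if len(mst) == len(vertices) - 1:
                break              # spanning tree complete
    return mst

edges = [(1, 'b', 'c'), (2, 'e', 'f'), (3, 'a', 'b'), (4, 'b', 'f'),
         (4, 'c', 'f'), (5, 'a', 'f'), (5, 'd', 'f'), (6, 'a', 'e'),
         (6, 'c', 'd'), (8, 'd', 'e')]
print(kruskal_mst('abcdef', edges))
# [('b', 'c', 1), ('e', 'f', 2), ('a', 'b', 3), ('b', 'f', 4), ('d', 'f', 5)]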
Disjoint Subsets and Union-Find Algorithms https://www.youtube.com/watch?v=h57f8HZHF7Y
• Kruskal’s algorithm is one of a number of applications that require a dynamic partition of some n-element set
S into a collection of disjoint subsets S1, S2, ..., Sk. After being initialized as a collection of n one-element
subsets, each containing a different element of S, the collection is subjected to a sequence of intermixed union
and find operations. (Note that the number of union operations in any such sequence must be bounded above
by n − 1 because each union increases a subset’s size by at least 1 and there are only n elements in the entire
set S.)
• A collection of disjoint subsets of a finite set with the operations:
• makeset(x) creates a one-element set {x}. It is assumed that this operation can be applied to each of the
elements of set S only once.
• find(x) returns a subset containing x.
• union(x, y) constructs the union of the disjoint subsets Sx and Sy containing x and y, respectively, and
adds it to the collection to replace Sx and Sy, which are deleted from it.
For example, let S = {1, 2, 3, 4, 5, 6}.
Then makeset(i) creates the set {i}, and applying this operation six times initializes the structure to the collection of
six singleton sets: {1}, {2}, {3}, {4}, {5}, {6}.
Performing union(1, 4) and union(5, 2) yields {1, 4}, {5, 2}, {3}, {6}, and, if followed by union(4, 5) and then
by union(3, 6), we end up with the disjoint subsets {1, 4, 5, 2}, {3, 6}.
• The implementation of makeset(x) requires assigning the corresponding element in the representative
array to x and initializing the corresponding linked list to a single node with the x value.
• The time efficiency of this operation is obviously in Θ(1), and hence the initialization of n singleton
subsets is in Θ(n).
• The efficiency of find(x) is also in Θ(1).
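A minimal Python sketch of the quick-find scheme just described, using a dict as the representative array and plain lists in place of the linked lists (an assumption for the example); union relabels the members of the smaller subset (union by size). It replays the S = {1, 2, 3, 4, 5, 6} example from above.

class QuickFind:
    def __init__(self, elements):
        # makeset(x) for every x: n singleton subsets in Θ(n) total
        self.rep = {x: x for x in elements}        # representative array
        self.members = {x: [x] for x in elements}  # subset member lists

    def find(self, x):
        return self.rep[x]  # Θ(1): one array lookup

    def union(self, x, y):
        rx, ry = self.rep[x], self.rep[y]
        if rx == ry:
            return
        if len(self.members[rx]) < len(self.members[ry]):
            rx, ry = ry, rx  # relabel the smaller subset's members
        for z in self.members[ry]:
            self.rep[z] = rx
        self.members[rx] += self.members[ry]
        del self.members[ry]

uf = QuickFind(range(1, 7))
uf.union(1, 4); uf.union(5, 2); uf.union(4, 5); uf.union(3, 6)
print(sorted(uf.members.values()))  # [[1, 4, 5, 2], [3, 6]]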
Note: 1. Prim's or Kruskal's algorithm is used to find the minimum spanning tree (MST) of an undirected
graph.
2. Remove self-loops and parallel edges from the given graph before finding the MST.
3. Dijkstra’s Algorithm (single-source shortest-paths problem)
• Dijkstra’s algorithm finds the shortest paths from a given source vertex to all the other vertices of a
weighted graph with nonnegative edge weights. Like Prim’s algorithm, it grows a tree greedily: at each
step it adds the nearest unprocessed vertex, labeling each vertex with its parent and its distance from the
source, e.g., c(b, 7) means c is reached through b at distance 7, and e(d, 9) means e is reached through d
at distance 9.
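A short Python sketch of Dijkstra's algorithm with a binary-heap priority queue (illustrative; the sample graph is an assumption). It requires nonnegative edge weights and uses the same adjacency-dict representation as the Prim sketch above.

import heapq

def dijkstra(graph, source):
    # Settle vertices in order of their distance from `source`.
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue  # stale entry superseded by a shorter path
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd  # relax edge (u, v)
                heapq.heappush(heap, (nd, v))
    return dist

g = {'a': {'b': 3, 'd': 7}, 'b': {'a': 3, 'c': 4},
     'c': {'b': 4, 'd': 1}, 'd': {'a': 7, 'c': 1}}
print(dijkstra(g, 'a'))  # distances: a 0, b 3, c 7, d 7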
4. Huffman Trees.
• If we want to create a binary prefix code for some alphabet, it is natural to associate the
alphabet’s symbols with leaves of a binary tree in which all the left edges are labeled by
0 and all the right edges are labeled by 1. The codeword of a symbol can then be
obtained by recording the labels on the simple path from the root to the symbol’s leaf.
Since there is no simple path to a leaf that continues to another leaf, no codeword can be
a prefix of another codeword; hence, any such tree yields a prefix code.
• How to construct a tree that would assign shorter bit strings to high-frequency symbols
and longer ones to low-frequency symbols?
Huffman’s algorithm
Step 1 Initialize n one-node trees and label them with the symbols of the alphabet given.
Record the frequency of each symbol in its tree’s root to indicate the tree’s weight. (More
generally, the weight of a tree will be equal to the sum of the frequencies in the tree’s
leaves.)
Step 2 Repeat the following operation until a single tree is obtained. Find the two trees
with the smallest weights. Make them the left and right subtrees of a new tree and record
the sum of their weights in the root of the new tree as its weight.
Example of constructing a Huffman coding tree.
With the occurrence frequencies given and the codeword lengths obtained, the average number of bits per
symbol in this code is 2 · 0.35 + 3 · 0.1 + 2 · 0.2 + 2 · 0.2 + 3 · 0.15 = 2.25.
• Fixed-length encoding: with five symbols, we would have to use at least ⌈log2 5⌉ = 3 bits per symbol.
Thus, for the above example, Huffman’s code achieves a compression ratio (a standard measure of a
compression algorithm’s effectiveness) of (3 − 2.25)/3 · 100% = 25%. In other words, Huffman’s encoding
of the text will use 25% less memory than its fixed-length encoding.
Time complexity: O(n log n), where n is the number of unique characters. If there are n nodes, extractMin() is
called 2(n − 1) times, and extractMin() takes O(log n) time as it calls minHeapify(). So the overall complexity is
O(n log n). If the input frequencies are already sorted, there exists a linear-time algorithm.
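A compact Python sketch of Huffman's algorithm using heapq (an illustration of the two steps above, not code from the text): each heap entry carries a tree's weight and the partial codewords of its leaves, and merging two trees prepends 0 to every codeword in the left subtree and 1 to every codeword in the right. The integer tie-breaker only keeps heap comparisons well defined; exact codewords can vary with tie-breaking, but the codeword lengths are optimal.

import heapq

def huffman_codes(freq):
    # freq: {symbol: frequency}; returns {symbol: codeword}.
    heap = [(w, i, {s: ''}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # two lightest trees
        w2, _, right = heapq.heappop(heap)
        merged = {s: '0' + c for s, c in left.items()}         # left edges: 0
        merged.update({s: '1' + c for s, c in right.items()})  # right edges: 1
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

# Frequencies from exercise 2 below; one optimal result assigns
# 1 bit to f and 3-4 bits to the rarer symbols.
print(huffman_codes({'a': 5, 'b': 9, 'c': 12, 'd': 13, 'e': 16, 'f': 45}))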
Examples:
Construct a Huffman code for the following data:
1.
HCode:
Symbol:    A    B    C    D    -
Codeword:  0    100  111  101  110
2.
Symbol:    a    b    c    d    e    f
Frequency: 5    9    12   13   16   45