[go: up one dir, main page]

0% found this document useful (0 votes)
17 views52 pages

Lec07 Greedy Algorithms (Part 2)

This document discusses Minimum Spanning Trees (MST) and two algorithms for finding them: Prim-Jarnik and Kruskal's algorithms. It covers definitions, properties, applications, and the implementation details of both algorithms, including time complexity and data structures used. The document also provides examples and exercises to illustrate the concepts.

Uploaded by

yijoebackup
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
17 views52 pages

Lec07 Greedy Algorithms (Part 2)

This document discusses Minimum Spanning Trees (MST) and two algorithms for finding them: Prim-Jarnik and Kruskal's algorithms. It covers definitions, properties, applications, and the implementation details of both algorithms, including time complexity and data structures used. The document also provides examples and exercises to illustrate the concepts.

Uploaded by

yijoebackup
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 52

LECTURE 07

GREEDY ALGORITHMS (PART 2)


MINIMUM SPANNING TREES
Outline

 Minimum Spanning Trees


 Definitions
 A crucial fact
 The Prim-Jarnik Algorithm
 Kruskal's Algorithm
Minimum Spanning Tree (MST)

 Spanning subgraph
 Subgraph of a graph G ORD 10
containing all the 1
PIT
vertices of G
DEN 6 7
 Spanning tree 9
 Spanning subgraph 3 DCA
that is itself a (free) 4 STL
tree
8 5 2
 Minimum spanning tree
(MST)
DFW ATL
 Spanning tree of a
weighted graph with
minimum total edge
weight (without cycle)
Applications

 Designing Physical networks


 Communications networks
 Transportation networks
 Computer networks
 Cluster analysis
 Delete long edges leaves
 Finding clusters
 Approximate solutions to NP-hard problems
Example: Optimal Message Passing

 Distribute message to N agents


 Each agent can communicate with some of the
other agents, but their communication is
independently detected with probability
 Group leader wants to transmit message to all
agents so as to minimize the total probability that
message is detected
 MST can be used to minimize the probability
General Concept of MST
 Let A be a set of edges, T  Let A = EMPTY
is a MST, and A T
 While A does not form
 An edge (u, v) is a safe a spanning tree
edge if A  {(u, v)} is also  Find an edge (u, v)
a subset of some MST that is safe for A
 The initial values of A, is and
empty.  Add (u, v) to A
 Return A
Cycle Property
f 8
 Let T be a MST of a weighted 4
graph G C 9
2 6
 Let e be an edge of G that is not in 3 e
8 7
T and let C be the cycle formed by
e with T 7
 For every edge f of C, weight(f)  Replacing f with e yields
weight(e) a better spanning tree

 Proof: By contradiction f 8
4
 If weight(f) > weight(e) we can get C 9
2 6
a spanning tree of smaller weight 3 e
by replacing e with f 8 7

7
Partition Property
f 7
 Consider a partition of the vertices of G 4
into subsets U and V 9
 Let e be an edge of minimum weight 2 5
8
across the partition
8 e 3
 There is a minimum spanning tree of G
containing edge e 7

 Proof: Replacing f with e yields


another MST
 Let T be an MST of G. If T does not
contain e, consider the cycle C formed U V
f 7
by e with T and let f be an edge of C 4
across the partition 9
 By the cycle property, 2 5
8
weight(f)  weight(e)
8 e 3
 Thus, weight(f) = weight(e)
7
 We obtain another MST by replacing f
Various Version of Prim’s Algorithm
 In terms of time complexity:
 If adjacency matrix (classic), the time complexity is O(v2)
 If adjacency list, the time complexity is O(e log v)
 If min heap, then it is O(log v)
 Based on graph, whether it is sparse or dense
 2000 vertices, 1 million edges – heap is 2 – 3 times
slower
 100,000 vertices, 1 million edges – heap is 500 times
faster
 1 million vertices, 2 million edges – heap is 10,000 times
faster
 Classic Prim is optimal for dense graphs
 Heap is better for sparse graphs
Prim-Jarnik’s Algorithm
 Similar to Dijkstra’s algorithm (for a connected graph)
 Developed in 1930 by Czech mathematician Vojtech Jarnik and
rediscovered by Robert Prim in 1957
 We pick an arbitrary vertex s and we grow the MST as a cloud of
vertices, starting from s
 S store with each vertex, v a label d(v) = the smallest weight of an edge
connecting v to a vertex in the cloud
 At each step:
 We add to the cloud the vertex u outside the cloud with the smallest
distance label
 We update the labels of the vertices adjacent to u
 In other words:
 Starts with empty spanning tree
 Maintain 2 sets of vertices:
 Contains vertices in MST
 Contains vertices not yet included
 Consider the edges connect these 2 sets, and always pick the
Prim-Jarnik’s Algorithm (cont.)

 A priority queue stores the vertices outside the cloud


 Key: distance
 Element: vertex
 Locator-based methods
 insert(k,e) returns a locator
 replaceKey(l,k) changes the key of an item
 We store three labels with each vertex:
 Distance
 Parent edge in MST
 Locator in priority queue
Priority Queue

 It is a queue, but the arrangement/sequence of


elements are put into the queue according to certain
orders, i.e. priority
 In C++, by defaults, numbers are sorted in the order
that maximum element print first
 Support by 2 operations: remove the maximum
priority task and insert new task
 For strings, it follows lexicographic (dictionary) order.
For human understanding, it is known as alphabetical
order, but start in decreasing order, from z, y, x and so
on to a.
Heap
 Heap: is a binary tree
 Shape properties:
 Tree is perfectly balanced
 Elements (leaves) at the bottom level are pushed as far
to the left as possible
 By default, heap is a max-heap where:
 Every element of the tree is larger than any of its
descendants if they exists
 Means that largest element is the root
 Another type is min heap: parent’s key is less than its
children
 To solve the MST, we can use min heap as the binary tree
has time complexity that is O(log n)
Insert

 When a new element is inserted, it is inserted in the


left most if the position is available

5 4

2 1 3
Left
Insert

 Let say for e.g., we want to insert 8, but the left most
is full, so put in the right most

5 4

2 1 3 8
 However, 8 is larger than 4, violate the max heap
principle, we swim the element up
Insert

 Let say for e.g., we want to insert 8, but the left most
is full, so put in the right most

5 4

2 1 3 8
 However, 8 is larger than 4, violate the max heap
principle, we swim the element up, by swapping with
the precedence root, i.e. 4
Insert

 After we do that, 8 is > 6, violate the rule, so swap


again.
 Now, the rules are hold.
6 8

5 8 5 6

2 1 3 4 2 1 3 4
Remove

 To remove, the top element will be removed and


replace with the right-most element in the tree

8 4

5 6 5 6

2 1 3 4 2 1 3
Remove

 The new top element violate the max heap property.


Choose the largest child, i.e. 6 and swap.
 Check the property, if it fulfils, then done
4 6

5 6 5 4

2 1 3 2 1 3
Priority Queue by using Heap in array
[0] [1] [2] [3] [4] [5] [6]

- 6 5 4 2 1 3

The use of array and


6 its indexes fulfils the
property principle of
5 heap
4

2 1 3

Assume start from array[0]:


If a node is in array[k], its If we start from Array[1], the
left child is in array[k*2+1], formula is left child = A[k*2]
and its right child is in and right child = A[k*2+1],
array[k*2+2] parent node remains as
A[floor (k/2)]
If a node is in array[k], its
Priority Queue by using Heap
[0] [1] [2] [3] [4] [5] [6]

- 6 5 4 2 1 3

6  Assume we have the above array


and those elements
2 4  N = number of elements, we have
N=6
5 1 3  Check all the left/right children by
start from k=N/2, i.e. 3
 When k = 2
 If Array[2k=4] > Array[k=2],
swap
Priority Queue by using Heap
[0] [1] [2] [3] [4] [5] [6]

- 6 5 4 2 1 3

6  When k = 1
 Array[2] > Array[1], swap
5 4

2 1 3
Priority Queue by using Heap
maxheap(Array[])
for k = N/2 till k >= 1 Max_Swap(Array[], k, N)
then left= 2*k
max_Swap(Array, right = 2*k+1
k)
if(left<=N and Array[left]>Array[k])
largest = left
else largest = k
if(right<=N and
Array[right]>Array[largest])
largest = right
if(largest!=i)
swap(Array[k], Array[largest])
max_Swap(Array[], largest, N)
Algorithm PrimJarnikMST(G)
Q  new heap-based priority queue
s  a vertex of G
for all v  G.vertices()
if v = s
setDistance(v, 0)
else
setDistance(v, )
setParent(v, )
l  Q.insert(getDistance(v),
v)
setLocator(v,l)
while Q.isEmpty()
u  Q.removeMin()
for all e  G.incidentEdges(u)
z  G.opposite(u,e)
r  weight(e)
if r < getDistance(z)
setDistance(z,r)
setParent(z,e)
Q.replaceKey(getLocator(z),r)
Analysis
 Graph operations
 Method incidentEdges is called once for each vertex
 Label operations
 We set/get the distance, parent and locator labels of vertex z
O(deg(z)) times
 Setting/getting a label takes O(1) time
 Priority queue operations
 Each vertex is inserted once into and removed once from the priority
queue, where each insertion or removal takes O(log n) time
 The key of a vertex w in the priority queue is modified at most deg(w)
times, where each key change takes O(log n) time
 Prim-Jarnik’s algorithm runs in O((n + m) log (n)) time provided the graph
is represented by the adjacency list structure
 Recall that ∑v deg(v) = 2m
 The running time is O(m log n) since the graph is connected
Example

 Find MST starting from A


 7
7 D 7 D
2 2
B 4 B 4
8 9  5 9 
2 5 F 2 5 F
C C
8 8
8 3 8 3
E E
A 7 A 7
0 7 0 7

7 7
7 D D
2 2 7
B 4 B 4
5 9  9 4
2 5 F 5 5
C 2 F
8 C
3 8
8 8 3
E E
A 7 A
0 7 7 7
0
7
7 D
2
B 4
5 9 4
2 5 F
C
8
8 3
E
A 3 7
0 7
7 D
2
B 4
5 9 4
2 5 F
C
8
8 3
E
A 3
0 7

Total weight = 21
Exercise

 Use Prim Jarnik’s algorithm to find the minimum


spanning tree for the following graph. The start vertex
is a. Show the intermediate steps in your answer by
using the graph. What is the total weight for this
MST?
Kruskal’s Algorithm

 A priority queue stores the edges outside the cloud


 Key: weight
 Element: edge
 At the end of the algorithm
 We are left with one cloud that encompasses the
MST
 A tree T which is our MST
Algorithm KruskalMST(G)
for each vertex V in G do
define a Cloud(v) of  {v}
let Q be a priority queue.
Insert all edges into Q using their weights as the key
T
while T has fewer than n-1 edges do edge e =
Q.removeMin()
Let u, v be the endpoints of e
if Cloud(v)  Cloud(u) then
Add edge e to T
Merge Cloud(v) and Cloud(u)
return T
Data Structure for Kruskal’s Algorithm

 The algorithm maintains a forest of trees


 An edge is accepted it if connects distinct trees
 We need a data structure that maintains a partition,
i.e., a collection of disjoint sets, with the operations:
 find(u): return the set storing u
 union(u,v): replace the sets storing u and v with
their union
Representation of a Partition

 Each set is stored in a sequence


 Each element has a reference back to the set
 operation find(u) takes O(1) time, and returns the
set of which u is a member.
 in operation union(u,v), we move the elements of
the smaller set to the sequence of the larger set
and update their references
 the time for operation union(u,v) is min(nu,nv),
where nu and nv are the sizes of the sets storing u
and v
 Whenever an element is processed, it goes into a set
of size at least double, hence each element is
processed at most log n times
Partition-Based Implementation

 A partition-based version of Kruskal’s Algorithm


performs cloud merges as unions and tests as finds.
Algorithm Kruskal(G):
Input: A weighted graph G.
Output: An MST T for G.
Let P be a partition of the vertices of G, where each vertex forms a separate set.
Let Q be a priority queue storing the edges of G, sorted by their weights
Let T be an initially-empty tree
while Q is not empty do
(u,v)  Q.removeMinElement() Running time:
if P.find(u) != P.find(v) then O((n+m)log n)
Add (u,v) to T
P.union(u,v)
return T
Kruskal’s Algorithm Example

 Find the MST between the cities shown in the flight


plan
2704
867 BOS

849 PVD
ORD 187
740 144
1846 621 JFK
184 1258
802
SFO BWI
1391
1464
337 1090
DFW 946
LAX 1235
1121
MIA
2342
2704
867 BOS

849 PVD
ORD 187
740 144
1846 621 JFK
184 1258
802
SFO BWI
1391
1464
337 1090
DFW 946
LAX 1235
1121
MIA
2342
2704
867 BOS

849 PVD
ORD 187
740 144
1846 621 JFK
184 1258
802
SFO BWI
1391
1464
337 1090
DFW 946
LAX 1235
1121
MIA
2342
2704
867 BOS

849 PVD
ORD 187
740 144
1846 621 JFK
184 1258
802
SFO BWI
1391
1464
337 1090
DFW 946
LAX 1235
1121
MIA
2342
2704
867 BOS

849 PVD
ORD 187
740 144
1846 621 JFK
184 1258
802
SFO BWI
1391
1464
337 1090
DFW 946
LAX 1235
1121
MIA
2342
2704
867 BOS

849 PVD
ORD 187
740 144
1846 621 JFK
184 1258
802
SFO BWI
1391
1464
337 1090
DFW 946
LAX 1235
1121
MIA
2342
2704
867 BOS

849 PVD
ORD 187
740 144
1846 621 JFK
184 1258
802
SFO BWI
1391
1464
337 1090
DFW 946
LAX 1235
1121
MIA
2342
2704
867 BOS

849 PVD
ORD 187
740 144
1846 621 JFK
184 1258
802
SFO BWI
1391
1464
337 1090
DFW 946
LAX 1235
1121
MIA
2342
2704
867 BOS

849 PVD
ORD 187
740 144
1846 621 JFK
184 1258
802
SFO BWI
1391
1464
337 1090
DFW 946
LAX 1235
1121
MIA
2342
2704
867 BOS

849 PVD
ORD 187
740 144
1846 621 JFK
184 1258
802
SFO BWI
1391
1464
337 1090
DFW 946
LAX 1235
1121
MIA
2342
2704
867 BOS

849 PVD
ORD 187
740 144
1846 621 JFK
184 1258
802
SFO BWI
1391
1464
337 1090
DFW 946
LAX 1235
1121
MIA
2342
2704
867 BOS

849 PVD
ORD 187
740 144
1846 621 JFK
184 1258
802
SFO BWI
1391
1464
337 1090
DFW 946
LAX 1235
1121
MIA
2342
2704
867 BOS

849 PVD
ORD 187
740 144
1846 621 JFK
184 1258
802
SFO BWI
1391
1464
337 1090
DFW 946
LAX 1235
1121
MIA
2342
2704
867 BOS

849 PVD
ORD 187
740 144
1846 621 JFK
184 1258
802
SFO BWI
1391
1464
337 1090
DFW 946
LAX 1235
1121
MIA
2342
Kruskal Algorithm

Tree Edges Sorted List of Edges

1. JFK-PVD JFK-PVD JFK-BWI JFK-BOS SFO-LAX ORD-BWI ORD-JFK ORD-DFW


(144)
144 184 187 337 621 740 802
2. JFK-BWI ORD-PVD ORD-BOS BWI-MIA JFK-MIA DFW-MIA DFW- BOS-MIA
(184) LAX
849 867 946 1090 1121 1235 1258
3. JFK-BOS DFW-BWI SFO-DFW SFO-ORD LAX-MIA SFO-BOS
(187)
1391 1464 1846 2342 2704
4. SFO-LAX
(337)
5. ORD-BWI
(621)
6. ORD-DFW
(802)
7. BWI-MIA
(946)
8. DFW-LAX
(1235)
Exercise
 Use Kruskal's algorithm to find the minimum spanning
tree of the following graph. Show the intermediate
steps in your answer by using a table. What is the
total weight for this minimum spanning tree?
Greedy Approaches

 Helps us to survive and achieve goal


 Useful in various optimisation
 Not always give the best solution
 Different solution suitable for different situation

You might also like