AA Exam 2022 Answers
AA Exam 2022 Answers
January/February 2022
Faculty of Engineering
COMS30041
Advanced Algorithms
TIME ALLOWED:
2 Hours
root
) 0) (2 (7,1
,10 ,2
(3
)
0)
(10 ,1
,9
(0 )
,3
(9
)
10 0 7
0)
(7
0)
(4,1
0)
(2,1
(6,6)
0,1
,1
(5,1
(3,1
0)
(1
0)
0)
9 6 4 3 2 1
10)
(7,1
(10,
0)
Solution: 8 5
(b) (2 points) How many leaves, internal nodes and edges does the compact suffix tree of the string
“bookkeepee” have?
(c) (4 points) A repeat in string S is a substring of S that occurs at least twice in S. A repeat is
maximal if it occurs in at least two locations where it is not extensible on either side. That is if you
were to extend it to the right or left it would no longer be a repeat. Which parts of its compact
suffix tree do the maximal repeats of the string “bookkeepee” correspond to and what are those
maximal repeated substrings?
Solution: The four non-root branching internal nodes corresponding to “e, o, k, ee”.
(d) (2 points) What is the suffix array of “mississippis”? Recall that the string “ab” is lexicographically
earlier than the string “abc”.
Solution: 7,10,4,1,0,9,8,11,6,3,5,2
(e) (2 points) Consider the linear time construction algorithm for suffix arrays called the DC3 method.
In this algorithm we first look at the indices that are non-zero mod 3. For the string “mississippis”,
what is the suffix array of the suffixes corresponding to indices that are non-zero mod 3?
(f) (4 points) For the string S =“mississippis”, assume that we are given the suffix array of the suffixes
corresponding to indices that are non-zero mod 3. Show how in the DC3 algorithm we can in linear
time find the ordering of the suffixes whose indices are 0 mod 3.
Solution: For each index i where i mod 3 = 0 we make a pair whose first value is the symbol
at index i and second value is that rank of that suffix i + 1. This rank information is given to
use by the suffix array of the suffixes corresponding to indices that are non-zero mod 3. We
can now sort this list of pairs in linear time using radix sort.
2. (a) Consider a set H = {h1 , h2 } of hash functions. Each function in H maps a key from the set
{1, 2, 3, 4} to a hash value in {1, 2, 3}. The explicit definitions of the hash functions are given below:
h1 (1) = 4, h2 (1) = 2
h1 (2) = 3, h2 (2) = 1
h1 (3) = 2, h2 (3) = 3
h1 (4) = 2, h2 (4) = 3
Page 2
i. (2 points) What is the probability that h(1) = h(4), i.e. Pr(h(1) = h(4)), when h is picked
uniformly at random from H?
ii. (2 points) Is H a weakly universal set of hash functions? Justify your answer.
(b) (4 points) Suppose that we have a set of n fixed keys from a universe U . We want to be able
to look up these keys in constant time and O(n) space using the static perfect hashing scheme of
Fredman, Komlós & Szemerédi. Describe how to achieve this. You may assume that we have access
to a weakly universal set of hash functions h : U → {0, . . . , n − 1}.
(c) Consider the following variant on cuckoo hashing where we use two tables T1 and T2 of the same
size and hash functions h1 (x) = x mod 4 and h2 (x) = 2x mod 4.
• Both tables have size 4.
• The hash function h1 is used to insert elements into T1 and hash function h2 is used to insert
elements into T2 .
• When inserting a new key x, we first try to put x at position h1 (x) in T1 . If this leads to a
collision, then the previously stored key y is moved to position h2 (y) in T2 .
• If this leads to another collision, then the next key is again inserted at the appropriate position
in T1 , and so on.
At the start T1 = (4, , 14, ) and T2 is empty (underscore indicates that entry is currently empty).
i. (2 points) What is the result of inserting 12 and 10 into the hash table.
Solution: The insertion will loop forever. First 12 replaces 4 and 4 is placed at position
8 mod 4 = 0 in T2 . Then 10 replaces 14 and h2 (14) = 28 mod 4 = 0 so it replaces 4 in T2 .
4 is then inserted in T1 where it replaces 12 which then replaces 14 in T2 and we are in a
loop.
ii. (2 points) If 12 is inserted first into the initial tables as given in this question, what is the
smallest non-negative key that causes a potentially infinite loop if inserted after 12?
Solution: Key 0.
(d) (2 points) What are the space requirements for a van Emde Boas tree in terms of the size of the
universe u?
Solution: O(u)
(e) (2 points) If one were to really create a van Emde Boas tree that used this much space, what is
the smallest number n of integers the tree would have to contain so that insert, lookup and delete
operations took Θ(log log u) time (where u is the universe size).
Page 3
(f) (4 points) Construct a Cartesian tree for the input 1, 7, 3, 0, 8, 2.
Solution:
0
/ \
/ \
/ \
/ \
/ 2
1 /
\ 8
3
/
7
3. (a) (8 points) For this question we will use only weighted, undirected graphs. In graph theory, the
Hamiltonian Cycle problem for a graph G asks if there exists a cycle in G that passes through each
vertex in G exactly once. A cycle is a path that start and ends at the same vertex. This problem
is known to be NP-complete and we therefore assume for the purposes of this question that there
is no polynomial time solution for it.
The Travelling Salesman Problem asks us to find the shortest distance tour (or cycle) that passes
through each vertex of the graph exactly once.
Given a weighted undirected graph G, we say that G satisfies the triangle inequality if for all
connected vertices i, j, k, wi,j ≤ wi,k + wk,j .
This question asks you to give a proof of the following statement by showing a reduction from
Hamiltonian Cycle. “Assuming P̸= NP, then for any constant ρ ≥ 1, there is no polynomial time
ρ-approximation for the Travelling Salesman Problem that applies to all graphs without the triangle
inequality.”
To start you off in the reduction, let G′ = (V, E ′ ) be the complete graph of V , so that every vertex
is connected to every other vertex. We will assign weights to the edges in E ′ as follows:
(
1 if (u, v) ∈ E
w(u, v) =
otherwise
APPROX-TSP (G′ )
By choosing a suitable value for the blank box above together with this new graph, give a proof of
the claim.
Solution: Set the blank values to ρ|V | + 1. If had a Hamiltonian cycle, then due to the high
cost of any edge which was not in E the minimum weight Hamiltonian cycle of G′ would be the
same cycle, and would use only the edges with cost 1. If G did not have a Hamiltonian cycle,
then the minimum weight Hamiltonian cycle of G′ would include at least one of the new edges
with cost ρ|V | + 1.
So, if G did contain a Hamiltonian cycle, there would exist an (optimal) Hamiltonian cycle in
G′ with weight |V |. If G did not contain a Hamiltonian cycle, then a Hamiltonian cycle in G′
would have at least 1 edge with cost ρ|V | + 1, and |V | − 1 edges with cost 1. Therefore, a
Hamiltonian cycle in G′ would have a minimum cost of:
Page 4
(ρ|V | + 1) + (|V | − 1) = ρ|V | + |V |
Note that there is a gap of at least size ρ|V | between the cost of a Hamiltonian cycle which
was in G, and the cost of a Hamiltonian cycle which was not in G. We stated at the start of
the proof that there is a polynomial-time ρ-approximation for the travelling salesman problem.
Consider applying this algorithm to (G′ , w). By definition, the ρ-approximation means that
the algorithm will return a path which is no more than a factor of ρ away from the optimal
solution. If G did have a Hamiltonian cycle, then the optimal solution has a cost of |V |, which
means that the algorithm returns a solution less than or equal to ρ|V |. But as stated above,
all non-optimal paths will have a cost of at least ρ|V | + |V |, which means that the algorithm
can only return the optimal path in order for it to be a ρ approximation. On the other hand,
if G has no Hamiltonian cycle, we know from above that the algorithm will return a solution
with a cost of at least ρ|V | + |V |. This distinction means that we can use the approximation to
solve the Hamiltonian cycle problem in polynomial time simply by applying it and determining
whether the result is greater than ρ|V | or not. Solving the Hamiltonian cycle problem in
polynomial time, which is known to be NP-complete, gives that P= NP. This contradicts our
original assumption, thereby proving that a polynomial-time ρ-approximation for the travelling
salesman problem without the triangle inequality cannot exist unless P= NP.
(b) (6 points) Show how in polynomial time we can transform any instance of the travelling salesman
problem on a graph which does not satisfy the triangle inequality into another instance whose edge
weights do satisfy the triangle inequality. The two instances must have the same set of optimal
tours.
Solution: Let k be the sum of the weights of all the edges in the graph. Add k to all the edge
weights. The graph now satisfies the triangle inequality as k dominates the sum. The tours
remain the same as they all contain the same number of edges and so all add nk to their value.
(c) (6 points) Explain why such a polynomial time transformation does not contradict the statement
in the first part of this question, assuming P̸= NP.
Page 5