f22 Midterm1 Sol
f22 Midterm1 Sol
f22 Midterm1 Sol
Suppose we are given a sorted array A[1 .. n] of n distinct integers, which are not necessarily
positive.
(a) Describe a fast algorithm that either computes an index i such that A[i] = i or correctly
reports that no such index exists.
(b) Suppose we know in advance that A[1] > 0. Describe an even faster algorithm that either
computes an index i such that A[i] = i or correctly reports that no such index exists.
Solution (part (a)): We can solve the problem in O(log n) time using a variant of (or a
reduction to) binary search. Here are two pseudocode descriptions of the algorithm, one
recursive and one iterative.
〈〈Find index j such that i ≤ j ≤ k and A[ j] = j〉〉 FindIndex(A[1 .. n]):
FindIndex(i, k): lo ← 1; hi ← n
if i > k while lo ≤ hi
return None mid ← ⌈(lo + hi)/2⌉
j ← ⌈(i + k)/2⌉ if A[mid] = mid
if A[ j] = j return mid
return j else if A[mid] > mid
else if A[ j] > j hi ← mid − 1
return FindIndex(i, j − 1) else
else lo ← mid + 1
return FindIndex( j + 1, k) return None
The key observation is that because A is a sorted array of distinct integers, we have
A[ j] ≥ A[i] + ( j − i) for all indices i < j. In particular, if A[i] > i, then A[ j] > j for all j > i.
Equivalently, suppose we (implicitly) define a new array B[1 .. n] by setting B[i] = A[i]−i
for all i. Then the elements of B are sorted in non-decreasing order (but they are not
necessarily distinct), and we are looking for an index i such that B[i] = 0. ■
Rubric: 8 points max. For an explicit algorithm: 1 for binary search idea + 1 for base case + 4 for recursive
cases + 2 for time analysis. −1 for each off-by-one error. −1 for returning True/False instead of index.
−1 for stating running time as a recurrence without solving it. A reduction to binary search is worth full
credit. Max 3 points for a Θ(n)-time algorithm; max 2 points for anything slower; scale partial credit.
Solution (part (b)): If A[1] = 1, we can clearly return 1 immediately. On the other hand,
if A[1] > 1 then A[i] > i for all i, so we can return None immediately. So we can solve this
problem in O(1) time! ■
1
CS/Math 473 = Fall 2022 Name:
Midterm 1 Solutions Problem 2
Describe and analyze an algorithm to find a matching with maximum total weight, in a binary
tree with weighted edges.
Solution: Let T be the input tree, and let r denote its root node. For any vertex v, we
define two functions:
• MWM∗ (v) is the weight of a maximum weight matching in the subtree rooted at v,
where v is not incident to a matching edge.
We need to compute MWM(r). These functions satisfy the following mutual recurrences.
Here v.l and v.r denote the left and right children of v, respectively.
if v = Null
0
if v is a leaf
0
MWM(v) = ∗
w(v, v.l) + MWM (v.l) + MWM(v.r) if v.l ̸= Null
max w(v, v.r) + MWM(v.l) + MWM∗ (v.r) if v.r ≠ Null otherwise
∗
MWM (v)
¨
0 if v is a leaf
MWM∗ (v) =
MWM(v.l) + MWM(v.r) otherwise
We can memoize these functions into two new fields at each node of T , and we can evaluate
the functions in postorder.
The resulting dynamic programming algorithm runs in O(n) time, where n is the number
of vertices in T . ■
Rubric: 10 points: standard dynamic programming rubric. These are not the only correct solution.
−1 for assuming every interior node has both left and right children.
2
CS/Math 473 = Fall 2022 Name:
Midterm 1 Solutions Problem 3
Suppose we are given two bit strings P[1 .. m] (the “pattern”) and T [1 .. n] (the “text”), where
m ≤ n. Describe and analyze an algorithm to find the minimum Hamming distance between P
and a substring of T of length m. For full credit, your algorithm should run in O(n log n) time.
We can evaluate Both1(s) for every index s using convolution as follows. Define two
sequences p and t by setting pi = P[m − i] and t i = T [i] for each index i. Then for all s we
have X
Both1(s) = pm−i · t s+i = (p ⋆ t)m+s
i
We can construct the sequences p and t in O(n) time, and then compute their convolution
in O(n log n) time using fast Fourier transforms.
We can similarly express Both0(s) as a convolution by defining pi′ = 1 − P[m − i] and
t i′ = 1 − T [i] for each index i. Then for all s we have
X
′ ′
Both0(s) = pm−i · t s+i = (p′ ⋆ t ′ )m+s
i
We can construct the sequences p′ and t ′ in O(n) time, and then compute their convolution
in O(n log n) time using fast Fourier transforms.
After computing both convolutions, we can compute mins (m − Both0(s) − Both1(s)) in
O(n) time. The entire algorithm runs in O(n log n) time. ■
Solution (powers of −1): For any integer 0 ≤ s ≤ n − m (“shift”), let H D(s) denote the
Hamming distance between P[1 .. m] and T [s + 1 .. s + m]; we need to compute mins H D(s).
First we modify the arrays to make Hamming distance behave more like a vector dot-
product. For any index i, define
¨ ¨
′ 1 if P[i] = 1 ′ 1 if T [i] = 1
P [i] = T [i] =
−1 if P[i] = 0 −1 if T [i] = 0
3
Then for any shift 0 ≤ s ≤ n − m, we have
m
X m
X
P ′ [i] · T ′ [s + i] =
1 − 2 P[i] ̸= T [s + i] = m − 2 · H D(s)
i=1 i=1
Now define two sequences p and t by setting pi = P ′ [m − i] and t i = T ′ [i] for each index i.
Then for all s we have
m
X X
P ′ [i] · T ′ [s + i] = pm−i · t s+i = (p ⋆ t)m+s
i=1 i
Solution (squared differences): For any integer 0 ≤ s ≤ n − m (“shift”), let H D(s) de-
note the Hamming distance between P[1 .. m] and T [s + 1 .. s + m]; we need to compute
mins H D(s).
We can express the Hamming distance H D(s) as follows:
m
X
H D(s) = (P[i] − T [i + s])2
i=1
m
X m
X s
X
2
= P[i] − 2· P[i] · T [i + s] + T [i + s]2
|i=1 {z } |i=1 {z } |i=1 {z }
SumP Both1(s) SumT(s)
(We can remove the squaring because 02 = 0 and 12 = 1!) We compute each of the terms
SumP, Both1(s), and SumT(s) for all s as follows:
• We can compute SumP in O(m) time by brute force, once for all s.
• We can compute the third term SumpT(s) for all s in O(n) time using the recurrence
SumpT(s) = SumpT(s − 1) − T [s] + T [s + m].
• Finally, we compute Both1(s) for all s using convolution. Specifically, we define two
sequences p and t by setting pi = P[m − i] and t i = T [i] for each index i. Then for
all s we have
m
X X
Both1(s) = P[i] · T [s + i] = pm−i · t s+i = (p ⋆ t)m+s
i=1 i
We can construct the sequences p and t in O(n) time, and then compute their convo-
lution in O(n log n) time using fast Fourier transforms.
Finally, we compute maxs (SumP − 2 · Both1(s) + SumT(s)) in O(n) time by brute force. The
entire algorithm runs in O(n log n) time. ■
3 (continued)
Rubric: 10 points. These are not the only correct solutions.
• 1 for using FFT/convolution at all
• 2 for correctly dealing with both 0s and 1s (separately considering 00 and 11 matches, separately
considering 01 and 10 matches, mapping (0, 1) to (−1, 1), squared differences, etc.)
• 3 for correctly setting up convolutions (reversing either T or P )
• 3 for correctly reading the minimum Hamming distance from the convolution(s)
• 1 for time analysis (if the algorithm is mostly correct)
A correct algorithm that runs in O(mn) or O(mn log n) time is worth at most 4 points.
3 (continued)
CS/Math 473 = Fall 2022 Name:
Midterm 1 Solutions Problem 4
Solution: Let A[1 .. n] be the input array. For any indices i ≤ k, let MinAve(i, k) denote the
smallest possible value that can be obtained from the interval A[i .. k] by adding parentheses.
We need to compute MinAve(1, n). This function satisfies the following recurrence:
¨
A[i] if i = k
MinAve(i, k) =
min MinAve(i, j) @ MinAve( j + 1, k) i ≤ j < k
otherwise
Rubric: 10 points: standard dynamic programming rubric. This is not the only correct evaluation order.
The memoization arrays don’t appear to have the right structure for a monotonicity speedup via
SMAWK. As far as I know, this is the fastest algorithm for this problem (up to logarithmic factors).
Non-solution: Consider the following greedy algorithm: Merge the adjacent pair of numbers
with the largest average (breaking ties arbitrarily), replace them with their average, and
recurse. For example:
8@6@7@5@3@0@9
7@7@5@3@0@9
7@5@3@0@9
6@3@0@9
6 @ 3 @ 4.5
4.5 @ 4.5
4.5
With the right data structures, this algorithm can be implemented to run in O(n log n) time;
the only real bottleneck is maintaining a priority queue of adjacent pairs.
Unfortunately, this greedy algorithm does not always compute the optimal expression.
Consider the input 2 @ 5 @ 0 @ 6. The greedy algorithm outputs (2 @ 5) @ (0 @ 6) = 3.25,
but the optimal expression is 2 @ (5 @ (0 @ 6)) = 3. ♣