Assignment 2
Assignment 2
Assignment 2
29, 11:59pm
INSTRUCTIONS:
2. Submit your solution to Gradescope: make sure only one of your group members submits. After
submitting, make sure to add your group members.
Here you are required to design a DP algorithm to compute the edit penalty of the optimal alignment of two
give strings, but with a higher gap penalty. The biological implication is that mutations (i.e. mismatches)
happens much more frequently than insertions/deletions (i.e. gaps) in a DNA sequence. Also, extending
a gap (i.e. a longer gap) is more favored than opening a new gap (i.e. two short gaps). More specifically,
in the edit distance problem, the penalty (i.e., the edit distance) is incremented by 1 for a mismatch or a
gap between S1 and S2 . In this problem with refined gap model, the penalty is still incremented by 1 for a
mismatch, but by a when opening a new gap, and by b when extending an existing gap (a > b ≥ 3). For
example, the total penalty for alignment AA CC × AAGGCT is (a + b + 1) where a is for opening the gap at
position 3, b for extending the gap at position 4, and 1 for a mismatch at position 6.
Design a dynamic programming algorithm for above problem. Show correctness and complexity analysis
of your algorithm. Your algorithm should run in O(mn) time.
Hint: Use three DP tables. One for ending with a gap in S1 , one for ending with a gap in S2 , one for
match/mismatch only.
1
Assignment 2 CSE 565 Fall 2021 (Instructor: Mingfu Shao) Due: Sep. 29, 11:59pm
Here are some hints: we introduce a new array D, where D[i] represents the minimum value of the last
element in the increasing subsequence of length i. For example, if we have S = [2, 1, 4, 5], then we’ll have
D[1] = 1, D[2] = 4 and D[3] = 5. (There are four increasing subsequences of length 1, and 1 is the smallest;
among the four increasing subsequences of length 2, i.e., [2, 4], [2, 5], [1, 4] and [1, 5], 4 is the minimum
last element; and both the two increasing subsequences of length 3 end with 5.) In your algorithm, you can
enumerate the index i from 1 to n, and use the value of S[i] to update the array D. Meanwhile, we also need
to keep updating the length of LIS. Further hint: binary search might be needed.
For example, in the convex polygon given below, one possible way to cut it into triangles is illustrated with
red dashed lines, and in this case the total length of cutting trace is BG + BF +CF + DF.
2
Assignment 2 CSE 565 Fall 2021 (Instructor: Mingfu Shao) Due: Sep. 29, 11:59pm
ri, j , and two currencies s and t, find the most advantageous sequence of currency exchanges for converting
currency s into currency t. (Assume that, there does not exists sequence of currencies c1 , c2 , · · · , ck such that
r1,2 · r2,3 · r3,4 · · · rk−1,k · rk,1 > 1.)
Occasionally the exchange rates satisfy the following property: there is a sequence of currencies c1 , c2 , · · · , ck
such that r1,2 · r2,3 · r3,4 · · · rk−1,k · rk,1 > 1. This means that by starting with a unit of currency c1 and then
successively converting it to currencies c2 , c3 , · · ·, ck , and finally back to c1 , you would end up with more
than one unit of currency c1 . Give an efficient algorithm for detecting the presence of such an anomaly.