Dsa2023 HW3 0518
Dsa2023 HW3 0518
Homework 3
Purple Clarification Date: 05/18/2023 10:00
Blue Clarification Date: 05/14/2023 16:00
Red Correction Date: 05/06/2023 11:00
Due: 13:00:00, Thursday, May 25, 2023
TA E-mail: dsa_ta@csie.ntu.edu.tw
• Any form of cheating, lying, or plagiarism will not be tolerated. Students can get zero
scores and/or fail the class and/or be kicked out of school and/or receive other punish-
ments for those kinds of misconducts.
• In Homework 3, the problem set contains a special Problem 0 (see below) and 5 other
problems. The set is divided into two parts, the non-programming part (Problems 0, 1,
2, 3) and the programming part (Problems 4, 5).
• For problems in the non-programming part, you should combine your solutions in one
PDF file. Your file should generally be legible with a white/light background—using
white/light text on a dark/black background is prohibited. Your solution must be as
simple as possible. At the TAs’ discretion, solutions that are too complicated can be
penalized or even regarded as incorrect. If you would like to use any theorem which
is not mentioned in the classes, please include its proof in your solution. If you use
pseudocode to describe your algorithm, you need to briefly explain how your
pseudocode works.
• The PDF file for the non-programming part should be submitted to Gradescope as in-
structed, and you should use Gradescope to tag the pages that correspond to each sub-
problem to facilitate the TAs’ grading. Failure to tag the correct pages of the subproblem
can cost you a 20% penalty.
• For the programming part, you should have visited the DSA Judge (https://dsa2023.
csie.org/) and familiarized yourself with how to submit your code via the judge system
in Homework 0.
• For problems in the programming part, you should write your code in C programming
language, and then submit the code via the judge system. Each day, you can submit
up to 5 times for each problem. To encourage you to start early, we allow 10 times of
1
submissions per day in the first week (from 2023/05/04 to 2023/05/10). The judge
system will compile your code with
• For debugging in the programming part, we strongly suggest you produce your own test-
cases with a program. You can also use tools like gdb or compile with flag -fsanitize=address
to help you solve issues like illegal memory usage. For gdb, you are strongly encouraged to
use compile flag -g to preserve symbols after compiling. Note: For students who use IDE
(VScode, DevC++...) you can modify the compile command in the settings/preference
section. Below are some examples:
• Discussions on course materials and homework solutions are encouraged. But you should
write the final solutions alone and understand them fully. Books, notes, and Internet
resources can be consulted, but not copied from.
• Since everyone needs to write the final solutions alone, there is absolutely no need
to lend your homework solutions and/or source codes to your classmates at
any time. In order to maximize the level of fairness in this class, lending and borrow-
ing homework solutions are both regarded as dishonest behaviors and will be punished
according to the honesty policy.
• The score of the part that is submitted after the deadline will get some penalties according
to the following rule:
(( ) )
5 × 86400 − Delay T ime(sec.)
Late Score = max × Original Score, 0
5 × 86400
• If you have questions about HW2, please go to the discord channel and discuss (strongly
preferred, which will provide everyone with a more interactive learning experience). If you
really need an email answer, please follow the rules outlined below to get a fast response:
– The subject should contain two tags, "[hw2]" and "[Px]", specifying the problem
where you have questions. For example, "[hw2][P3] Is k in subproblem 3 an
integer". Adding these tags allows the TAs to track the status of each email and
to provide faster responses to you.
2
– If you want to provide your code segments to us as part of your question, please
upload it to Gist or similar platforms and provide the link. Please remember to pro-
tect your uploaded materials with proper permission settings. Screenshots or code
segments directly included in the email are discouraged and may not be reviewed.
3
Non-Programming Part
• ...
Listing the references in this problem does not mean you could copy from them. We cannot
stress this enough: you should always write the final solutions alone and understand them fully.
4
Problem 1 - Alexander loves sweet dumplings I (100 pts)
In the following problems, assume the size of the character to be k, which is not constant in
the following problems. You can assume k < n and no collision in the Rabin-Karp algorithm.
1. (15 pts) Given a string s of length m. Explain how you can construct a string t of length
n, such that s appears the most number of times in t, utilizing the prefix function of
string s.
2. (15 pts) Given a string s of length n and corresponding q queries. Each query contains
a set of four numbers: {l1 , r1 , l2 , r2 }. Derive an algorithm to determine whether the
substring s[l1 ..r1 ] is equal to the substring s[l2 ..r2 ] with time complexity O(n + q). Write
down your algorithm in detail (pseudo code is not necessary), and analyze the time
complexity of the algorithm. Please modify the original Rabin-Karp algorithm to solve
this problem.
3. (30 pts) Given a string s of length n. Please design an algorithm with time complexity
O(n) to find all integers c, such that s can be divided into c equal substrings. Write down
your algorithm in detail (pseudo code is not necessary), prove the correctness, and
analyze the time complexity of the algorithm.
4. (30 pts) Given two strings s and key with lengths n and m respectively, find the smallest
index i such that the substring s[i..i + m − 1] has a lexicographic order strictly smaller
than key, with time complexity O(n + m). Write down your algorithm in detail (pseudo
code is not necessary), and analyze the time complexity of the algorithm. Your algorithm
should be based on the original KMP algorithm, but with a modified definition of the
prefix function.
5. (10 pts) Please provide a set of strings s and key that would produce incorrect result from
the GPT’s_Algorithm. Besides, explain how and where in GPT’s_Algorithm they
would introduce incorrect results.
5
GPT’s_Algorithm
1 int f ind_string(char *s, char *key) {
2 int m = strlen(key);
3 int n = strlen(s);
4 int idx_s = 0;
5 while (idx_s <= n − m) {
6 int idx_key = 0;
7 while (idx_key < m && s[idx_s] <= key[idx_key]) {
8 if (s[idx_s] < key[idx_key]) {
9 int idx_ans = idx_s − idx_key;
10 return idx_ans;
11 // the starting index of the answer
12 // that is, s[idx_ans] ∼ s[idx_ans + m − 1] is the answer
13 }
14 idx_key++;
15 idx_s++;
16 }
17 if (idx_key == 0) {
18 idx_s++;
19 }
20 }
21 return −1; // not found
22 }
Appendix
The original problem description with a story can be found at the link below. Enjoy!
https://hackmd.io/@IE85Ixx3QjC19iPEZUp1dA/ryeJGJeVn
However, please base your answer on the problem description in this PDF.
6
Problem 2 - Sorting in Linear Time (100 pts)
1. (15 pts) You are given an array A with 6 elements [152, 234, 57, 8, 601, 310]. Please perform
radix sort on A. Show your progress by presenting the content of A after each of the three
rounds of radix sort.
For the problems listed below, we define an array to have a range of K if the array elements
only have values within the interval [0, K].
2. (25 pts) Given an array B of length N and range K, an inversion exist with two indices
i and j, i < j, and B[i] > B[j]. Now, assume that you can make use of a data structure
called Magic Counter. The magic counter C supports two functions: insert(x) and
sum(R). Let M be the maximum value that will be inserted into the Magic Counter C 1 .
Design an algorithm that counts the total number of inversions in B in O(N log K)-time.
You can use the two functions of Magic Counter C directly in your pseudo code.
3. (25 pts) Given a string of length N and a character σ, describe an algorithm to calculate
the number of its substrings that do not contain the specific character σ in the character
set, and show that it runs in O(N )-time. For example, if the string is ”APPLE” and the
character σ = ‘P’, the string has four substrings “A”, “L”, “E”, and “LE” that do not
contain the character ‘P’.
1
In fact, the magic counter can be implemented by data structures such as Fenwick Tree, you can find more
details here.
7
4. (35 pts) Given a string of length N and a character set of size C (which include all
the possible characters that are in the string), describe an algorithm to calculate the
number of its substrings that do not contain each character in the character set and
show that it runs in O(C + N )-time. That is, complete the task specified in the previous
problem for each of the C characters in the set, but in O(C + N )-time. (Hint: You can
match any character in the character set into an index in O(1)-time.)
8
Problem 3 - Fans of H7Lin (100 pts)
Hsin is a super big fan of H7Lin. After she learned the interesting data structure, disjoint set,
in school, she would like to optimize the disjoint set in her own way. Specifically, she creates
the disjoint set data structure that supports the following functions:
• UNION(x,y): Merge the set that contains x and the set that contains y into one set. If x
and y are in the same set originally, then nothing changes. Note that x can be equal to
y.
Her implementation can be found here in this link, in which she applies a “randomized” way
to union two sets in line 23. Besides, she uses a variable cnt to estimate the time it takes to
run her code.
1. (15 pts) Try to generate a test case that makes the variable cnt printed in line 52 more
than 120. Give the results of your test data when the program is executed (e.g. screen-
shots). Briefly explain how you achieve this using at most three lines of text. The
test case should consist of 15 UNION(x,y), while 0 ≤ x ≤ 15, 0 ≤ y ≤ 15 ∀x, y. Note that
you cannot change the random seed, so all the values of rand() can be predicted. For
example, the test case shown in the code can be expressed by:
UNION(0, 1)
UNION(1, 2)
UNION(2, 3)
UNION(3, 4)
UNION(4, 5)
UNION(5, 6)
UNION(6, 7)
UNION(7, 8)
UNION(8, 9)
UNION(9, 10)
UNION(10, 11)
UNION(11, 12)
9
UNION(0, 0)
UNION(1, 1)
UNION(2, 2)
After you generate the test case, Hsin is so angry, as she thinks running her code takes too
much time. Therefore, she applies a more powerful randomized technique to union the sets.
The implementation can be found in this link.
2. (20 pts) Try to generate a test case that makes the variable cnt printed in line 53 more
than 120. Give the results of your test data when the program is executed (e.g. screen-
shots). Briefly explain how you achieve this with at most five lines of text. The test
case should consist of 15 UNION(x, y), while 0 ≤ x ≤ 15, 0 ≤ y ≤ 15 ∀x, y.
Note: In fact, the average case time complexity of this algorithm is much better than
that of the previous one.
One day, as H7Lin has so many fans around the world, she would like to invite her fans to
her country and play a game altogether. Hsin, of course, joins in the game. The game is held
in a big cornfield, but there is no corn in it. The cornfield is a rectangle, and it consists of
n × m grids, where n, m is the number of rows and the number of columns, respectively. The
grid in the r-th row and the c-th column is denoted by (r, c), 0 ≤ r < n, 0 ≤ c < m. All the
participants are in the grid (0, 0) at the beginning.
In the following two problems, you can directly call these functions without any explanation:
MAKE-SET(x), FIND-SET(x), and UNION(x,y) with the linked list representation, the union-by-
size technique, and the path compression technique. Let α(N ) is the inverse of the Ackermann
function. From the textbook, we know that a sequence of M MAKE-SET(x), FIND-SET(x), and
UNION(x,y) operations, N of which are MAKE-SET(x), can be performed in worst-case time
O(M α(N )), when applying the union-by-size technique and the path compression technique.
10
• QUERY(x,y): H7Lin wants to know whether the participants can move from (0, 0) to
(x, y) through an arbitrary number of bounces with the installed bouncing devices
(possibly 0).
There are Q operations in total. Please implement the two operations such that the Q
operations run in O((n + m + Q)α(n + m))-time.
However, Hsin feels that it’s so boring. To pursue extra excitement, whenever she bounces,
she only wants to double-bounce with two bouncing devices. However, both of the two bouncing
devices must bounce vertically or horizontally. For instance, in one “double-bounce”, Hsin may
bounce from (0, 0) to (4, 0) and bounce from (4, 0) to (2, 0). However, she cannot bounce from
(0, 0) to (4, 0) and bounce from (4, 0) to (4, 1).
Nevertheless, the severe weather in the country where H7Lin lives makes the bouncing
devices break down after a period of time.
11
To be able to implement these operations, we can leverage an additional function for the
disjoint set data structure:
• UNDO(): Remove the change caused by the last UNION(x,y). You can assume that
it runs in O(log N )-time after N MAKE-SET(x).
12
Programming Part
Problem Description
Remember the magical fairy in the house of Demeter Sphynx Abyssinian (DSA)? The fairy’s
magic spell can be represented by a string S, consisting of uppercase English alphabets of length
of N . Each substring of S is called a segment of the spell, and a segment generates magic effect
if it is exactly the same as the pattern P , which has a length of M . However, a segment may
produce the opposite effect if it is a rotation of P . For example, given a string A(A = c1 c2 ...cn )
consisting of n alphabets, ci ...cn + c1 ...ci−1 is considered a rotation of A, for any i ∈ [2, n].
To help the fairy debug zir magic spell, can you calculate the total number of segments in
zir spell that may generate either the original or the opposite magic effect?
Input
The first line contains two integers N and M . The second line contains the magic spell string
S, which has length N . The third line contains the pattern P , which has length M .
Output
Output the answer as an integer in a single line.
Constraints
• 1 ≤ M ≤ N ≤ 106
• S and P consist of only uppercase English letters.
• 1 ≤ N ≤ 104
• 1 ≤ M ≤ 102
• N =M
13
Subtask 3 (25 pts)
• 1 ≤ N × M ≤ 2 × 107
• No other constraints.
Sample Testcases
Sample Input 1
7 5
DEABCDE
ABCDE
Sample Output 1
Hints
Explanation of Sample 1:
Substrings of length M in S are [DEABC, EABCD, ABCDE].
The first two are rotations of P and the last one is equal to P .
14
Problem 5 - Mega Knights (100 pts)
Problem Description
There are n mega knights initially located at n distinct positions. Each mega knight Ki (1 ≤
i ≤ n) can cause damage ai (i.e., his attack points) if he attacks, and has hi health points,
representing the maximum damage the knight can bear.
When a mega knight Ka receives an instruction to attack another mega knight Ks (i.e., the
target), he will gather all knights at the same position as Ka to move to the position of Ks to
perform the attack. Because the attack has an area of effect, all knights at the same position
as Ks suffer the attack, after which their health points are all reduced by the sum of the attack
points of all attacking knights. Any mega knight whose health points is less or equal to 0 after
the attack is considered dead. After the attack, the attacking mega knights stay at the position
where Ks is located.
A dead mega knight cannot perform an attack, nor can it be targeted, even when an
instruction is given to do so. When such an instruction is given, no attack or movement
will happen. That is, in the above example, if either Ka or Ks is dead, nothing happens
subsequently. In addition, when the attacking mega knight is located at the same position as
the target, even when such an instruction is given, no attack or movement will happen either.
That is, in the above example, if Ka and Ks are located at the same position before the attack,
nothing happens subsequently.
You will be given m attacking instructions. Calculate how many times each mega knight
successfully attacks, including those that the knight is on the instruction, and those that the
knight is following others.
Input
The first line contains two positive integers n and m separated by a space – the number of
mega knights and the number of attack rounds.
The second line contains n positive integers h1 , h2 , . . . , hn separated by spaces – the health
points of each mega knight.
The third line contains n positive integers a1 , a2 , . . . , an , separated by spaces – the attack points
of each mega knight.
The next m lines each contain two positive integers Ka and Ks separated by a space – the
indices of attacking and the target mega knight.
15
Output
Output a line containing n integers separated by spaces. The i-th integer represents the total
number of successful attacks made by the i-th mega knight.
Constraints
• 1 ≤ n, m ≤ 2 × 105
• 1 ≤ ai , hi ≤ 109
• 1 ≤ Ka , K s ≤ n
• Ka ̸= Ks
• 1 ≤ n, m ≤ 103
• No other constraints
Sample Testcases
Sample Input 1
6 5
7 7 7 7 7 14
3 1 4 1 5 9
2 3
2 1
4 3
1 6
6 3
Sample Output 1
1 3 2 2 0 0
16
Sample Input 2
6 5
7 7 7 7 7 14
3 1 4 1 5 9
1 6
3 2
4 5
5 3
2 6
Sample Output 2
1 0 1 2 1 0
17