CS293 Tutorial Solutions
CS 213/293 UG TAs
2023
Contents
Tutorial 1
Tutorial 2
Tutorial 3
Tutorial 1
1. The following is the code for insertion sort. Compute the exact worst-case running time of the
code in terms of n and the cost of doing various machine operations.
Solution: To compute the worst-case running time, we must decide the code flow leading to
the worst case. When the if-condition evaluates to false, the control breaks out of the inner
loop. The worst case would have the if-condition evaluating false only when the iterator i
reaches 0. This will happen when the input is in decreasing order.
Counting the operations for the outer iterator j (which goes from 1 to n − 1):
• the outer loop itself contributes 1 comparison, 4 assignments and 2 memory accesses per iteration
• the if-condition evaluates to true j − 1 times (all elements to the left are larger) and to false 1 time (along with 1 comparison, 1 assignment and 1 decrement for each inner iteration)
• for the true case, we have 2 memory accesses and 1 jump; for the false case, we have just 1 jump
The total cost would then be (terms are for the outer loop, inner loop, if and else):
(C + 4A + 2M)(n − 1) + (C + J + A)(n(n − 1)/2) + (C + A + 2M + J)((n − 1)(n − 2)/2) + J(n − 1).
2. What are the time complexities of binary addition, binary multiplication, and unary addition?
Solution: Note that time complexity is measured as a function of the input size, since we
aim to determine how the time the algorithm takes scales with the input size.
Binary Addition
Assume two numbers A and B. In binary notation, their lengths (number of bits) are m
and n. Then the time complexity of binary addition would be O(m + n). This is because
we can start from the right end and add (keeping carry in mind) from right to left. Each
bit requires an O(1) computation since there are only 8 combinations (2 each for bit 1,
bit 2, and carry). Since the length of a number N in bits is log N, the time complexity is
O(log A + log B) = O(log( AB)) = O(m + n).
Binary Multiplication
Similar to above, but here the difference would be that each bit of the larger number
(assumed to be A without loss of generality) would need to be multiplied by the smaller
number, and the result would need to be added. Each bit of the larger number would take
O(n) computations, and then m such numbers would be added. So the time complexity
would be O(m × n) = O(mn). Following the definition above, the time complexity is
O(log A × log B) = O(mn).
Side note: How do we know if this is the most efficient algorithm? Turns out, this is NOT
the most efficient algorithm. The most efficient algorithm (in terms of asymptotic time
complexity) has a time complexity of O(n log n) where n is the maximum number of bits in
the 2 numbers.
Unary Addition
The unary addition of A and B is just their concatenation. This means the result would have
A + B number of 1’s. Iterating over the numbers linearly would give a time complexity of
O( A + B) which is linear in input size.
3. Let f(n) = a_0 + a_1·n + . . . + a_d·n^d and g(n) = b_0 + b_1·n + . . . + b_e·n^e be polynomials with e < d and a_d > 0. Show that f(n) ∉ O(g(n)).
Solution: Let us begin by assuming the proposition is false, ergo, f(n) ∈ O(g(n)). By
definition, then, there exist constants c and n_0 such that
∀n ≥ n_0, f(n) ≤ c·g(n).
Hence, we have
∀n ≥ n_0, a_0 + a_1·n + . . . + a_d·n^d ≤ c(b_0 + b_1·n + . . . + b_e·n^e)
∀n ≥ n_0, ∑_{i=0}^{e} (a_i − c·b_i)·n^i + a_{e+1}·n^{e+1} + . . . + a_d·n^d ≤ 0.
Dividing by n^d and letting n → ∞, every term except a_d vanishes, so by the definition of the limit
lim_{n→∞} [ ∑_{i=0}^{e} (a_i − c·b_i)·n^{i−d} + a_{e+1}·n^{e+1−d} + . . . + a_d ] ≤ 0
=⇒ a_d ≤ 0.
Since a_d > 0, this is a contradiction; thus, the original proposition is proved.
4. What is the difference between “at” and “..[..]” accesses in C++ maps?
Solution: The at function throws an exception (std::out_of_range) when the accessed key is
not present in the map. The [] operator does not check for presence: for a missing key, it
calls the default constructor of the value type, inserts the resulting object into the map, and
returns a reference to it. This may be considered bad behaviour because a read action causes
changes in the data structure, which is undesirable: the map will be filled with many dummy
entries if there are many out-of-range reads. On the other hand, throwing exceptions is not
ideal from a programmer’s perspective. They have to constantly add code that handles
exceptions; if such code is not added, the exceptions may cause failure of the entire system.
5. C++ does not provide active memory management. However, smart pointers in C++ allow us
the capability of a garbage collector. The smart pointer classes in C++ are
• auto_ptr
• unique_ptr
• shared_ptr
• weak_ptr
Write programs that illustrate the differences among the above smart pointers.
Solution: A memory leak occurs when a program allocates some memory and stops referencing
it. C++ does not automatically deallocate memory when a program no longer references a
part of memory. However, the language supports smart pointers. Smart pointers are classes
that count references to an object. If the number of references hits zero, the memory is
deallocated. shared_ptr allows one to allocate memory without worrying about memory leaks.
unique_ptr is like shared_ptr, but it does not allow a programmer to have two references to an
address. weak_ptr allows one to refer to an object without having the reference counted. This
kind of pointer is needed in the case of cycles among pointers: due to the cycle, the reference
counter never reaches zero even if the program stops referencing the memory, and weak_ptr is
used to break the cycle. auto_ptr is now a deprecated class; it was replaced by unique_ptr.
Please refer to the code snippets shown below.
Tutorial 2
1. The span of a stock’s price on the ith day is the maximum number of consecutive days (up to
the ith day) the price of the stock has been less than or equal to its price on day i.
Example: for the price sequence 2 4 6 5 9 10 of a stock, the span of prices is 1 2 3 1 5 6.
Give a linear-time algorithm that computes the span s_i for a given price series.
Solution: The idea in this problem is to check the history till the ith day and compute the
span based on the blocks of past sets of consecutive days. Let’s define a valid stretch to be
“a stretch of consecutive days so far where the price never exceeded the current price”. The
invariant that we need to maintain here is that the span of day i equals the length of the
longest valid stretch ending on day i.
To do this, at the position i, we make a linear pass over the prices till ith day and do the
following:
• we check all the positions where the price did not exceed the current price and take
the maximum value out of these
• we also maintain a counter for the length of the most recent valid stretch
Based on the above computations, we can conclude that the span of the ith day is the
maximum of the span value obtained in linear iteration (maximum among spans of past days
with non-exceeding prices) and the length of the most recent valid stretch (including the
current day).
The base case is that at day 1 the price is the only encountered price, and hence is the largest,
so the span is 1.
Alternate interpretation: This is the more common version of the problem, where we start
from current day and go backwards.
The idea here is to find out the latest (most recent) day till the ith day where the price was
greater than the price on day i. To do this, we will maintain a stack of indices, and for each
index i, we will keep the stack in a state such that the following invariant holds for each day
i:
If the stack is not empty, then the price on the day whose index is at the top of the stack is
the most recent price that was strictly greater than the current price.
With this invariant, the updates to the stack and the stock span follow as:
• we remove the top index in the stack till the price corresponding to the top index
becomes strictly greater, or the stack becomes empty
• if the stack is empty, the current price is the largest, so the span is the number of days
so far
• otherwise the span is the number of days since the index at the top of the stack
The base case is that at day 1 the price is the only encountered price, and hence is the largest,
so the span is 1.
Both implementations can be found here.
2. There is a stack of dosas on a tava, of distinct radii. We want to serve the dosas of increasing
radii. Only two operations are allowed:
(i) serve the top dosa
(ii) insert a spatula (flat spoon) in the middle, say after the first k, hold up this partial stack and
flip it upside-down and put it back
Design a data structure to represent the tava, input a given tava, and to produce an output in
sorted order. What is the time complexity of your algorithm?
This is also related to the train-shunting problem.
Solution: Read the first line of the question with the emphasis being on stack. We will use
the array representation of a stack for the tava.
Our stack-tava has the following abstraction:
• Initialization and input: Initialize an array S of size n, to represent the stack. Using
the given input, fill the array elements with the radii of dosas on the tava in the initial
stack order, with the bottom dosa at index 0 and the top at n − 1. Maintain a variable
sz which contains the current size of the stack, and initialize it to n.
• Serving the top dosa: This method is essentially carrying out the “pop” operation
on our stack S. Return the top dosa in the stack, S[sz − 1], and decrement sz by 1.
Clearly, this method takes a constant O(1) time for every call.
• Flipping the top k dosas: This method is equivalent to reversing the slice of the array S
from index sz − k to sz − 1. This is simple enough: initialize an index i to sz − k and
another index j to sz − 1, swap elements at indices i and j, increment i and decrement
j by 1, and repeat the swap and increment/decrement while j > i. This method takes
O(k) time for every call.
Note that in this implementation, the serve operation is the only way by which the problem
can be reduced in size (number of dosas reduced by 1). The flip operation, if used wisely,
can give us access to dosas that can help us reduce the problem size.
With this implementation, the algorithm to serve the dosas in increasing order of the radii
can be designed as follows. While the stack is not empty, repeat the following:
• Iterate over the array S and find the index of the minimum element in the array. Let
this index be m. This is the position of the dosa having the smallest radius
• Flip over the top sz − m dosas in the stack, using the flipping operation. This brings
the smallest dosa from index m to index sz − 1, id est, the top of the stack
• Serve the top dosa, using the serving operation. This pops the smallest dosa off the
stack
It is easy to argue the correctness. The invariant is that we keep serving the minimum radii
dosa from the remaining stack. The resulting order has to be sorted.
For computing the time complexity: in a stack of size z, finding the index of the minimum
element takes O(z) time, flipping over the top k ≤ z elements takes O(k) time (and thus
O(z) time) and serving off the top dosa takes O(1) time. Thus serving the smallest dosa
takes O(z) time overall. This has to be repeated for all dosas, so the stack size z goes from
n to 1, decrementing by 1 every time. Therefore the algorithm has a time complexity of
O(n2 ).
3. (a) Analyse the performance of the exponential growth policy if the growth factor is three instead
of two. Does it give better or worse performance than the doubling policy?
(b) Can we do a similar analysis for growth factor 1.5?
Solution: Let us do the analysis for a general growth factor α. Suppose initially N = 1 and
there are n = α^i consecutive pushes. The array expands when it is full, at sizes 1, α, α², . . . , α^{i−1},
and each expansion copies all current elements into the new array. So, the total cost of expansion is
(refer to slides for a detailed explanation):
1 + α + α² + . . . + α^{i−1} = (α^i − 1)/(α − 1) ≈ n/(α − 1),
which is O(n), so the amortized cost per push remains a constant for any α > 1. For α = 3,
the total copying cost is about n/2, versus about n for the doubling policy, so tripling does
fewer copies per push but wastes more memory (the capacity can be up to 3× the number of
elements, versus 2× for doubling). The same analysis holds for α = 1.5: the copying cost is
about 2n, slower than doubling but with tighter memory usage. Growth factor trades
copying time against unused capacity.
4. Give an algorithm to reverse a linked list. You must use only three extra pointers.
Solution: If head is null, the linked list is empty and we simply return. Otherwise, initialise
three pointers prev, curr, and next to null, head and head->next respectively, and iterate over
the list. In each iteration, perform the following updates in order:
• curr->next = prev (reverse the current link)
• prev = curr
• curr = next
• next = next->next (only if next is not null)
The loop terminates when curr is null, in which case, set head to prev.
5. Give an algorithm that finds the middle element of a linked list in a single pass.
Solution: Maintain two pointers into the list, mid and end, both starting at head. In each
step, advance mid by one node and end by two.
• Continue this process until end reaches the end of the linked list; mid then points to the
middle element
• return null if the head itself was null (empty linked list)
Note: When there is an odd number of elements, the above algorithm returns the element at
the median location. What happens when there is an even number of elements?
6. The mess table queue problem: There is a common mess for k hostels. Each hostel has some
N1 , . . . , Nk students. These students line up to pick up their trays in the common mess. However,
the queue is implemented as follows:
If a student sees a person from his/her hostel, she/he joins the queue behind this person.
This is the “enqueue” operation. The “dequeue” operation is as usual, at the front. Think about
how you would implement such a queue. What would be the time complexity of enqueue and
dequeue?
Do you think the average waiting time in this queue would be higher or lower than a normal
queue? Would there be any difference in any statistic? If so, what?
Solution: First, note that such a queue would consist of at most K sub-queues, each one
containing students from the same hostel. New students who show up to join the queue get
attached to the end of one of these hostel sub-queues.
We implement this with the following:
• A global linked list of all students currently in the queue (each student is a node in the
list). This linked list is initially empty.
• Pointers head and tail, to the front and end of the queue respectively. Both are initially
null.
• An array of pointers sqend of size K. sqend[i], for 0 ≤ i ≤ K-1, has a pointer to the
last student in the sub-queue of hostel i + 1, if that sub-queue currently exists in the
queue; otherwise, sqend[i] is null. (The hostel number is taken as i + 1 here and not i
only because of 1-indexing for hostel numbers but 0-indexing in arrays.) All sqend[i]’s
are initially null. Note that enqueueing requires pointers to all the sub-queue ends,
because it may occur at a number of places in the queue, whereas dequeueing only
requires a single pointer at the front.
Enqueueing: When a new student s shows up to join the queue:
• Obtain the hostel number h of s. This is a constant O(1) time operation if the hostel
numbers are stored within the student nodes themselves.
• If sqend[h-1] is not null, i.e., there is a sub-queue for hostel h in the current queue, then
insert s in the mess queue, immediately after the student to which sqend[h-1] points.
• If sqend[h-1] is null, then there is no sub-queue for the student’s hostel, and he/she
must be attached at the end of the entire queue. Using the tail pointer, insert s at the
end. It may so happen that tail is null too. This means the entire queue is empty; in
this case, simply make both head and tail point to s.
• If s has been inserted at the end of the queue, then we have to change tail to point at
s. This should be done if either sqend[h-1] is null (no sub-queue already present for s),
or sqend[h-1] and tail point to the same student (sub-queue for s present at the end).
All the above operations take constant O(1) time, so enqueueing is a constant time operation.
Dequeueing: Dequeueing a student only happens at the front:
• Use the head pointer to obtain the student s to be removed, and his/her hostel number
h.
• Change head to point to the student immediately after s in the queue. If there are no
more students in the queue, head becomes null.
• If the new head of the queue is either null, or has a hostel number different from h, that
means the sub-queue of hostel h no longer exists after dequeueing s; set sqend[h-1] to
null.
• If the new head of the queue is null, i.e., it’s empty, set tail to null too.
We make the (reasonable) assumption that dequeueing at the front of the queue takes place
at a constant rate, say 1 person per unit time. Then, the waiting time for any student in the
queue is equal to the number of students in front of him/her
• In an ordinary queue of size n, a newcomer gets attached at the end. Then, the waiting
time for the newcomer is n, and the waiting times of all the other students remain
unchanged. The sum of waiting times (total load of the queue) increases by n.
• In a mess queue of size n which operates as in this problem, a newcomer may get
attached at one of many positions in the queue. Suppose the newcomer is inserted
after the first k students in the queue. Then, the waiting time for the newcomer is k,
and there is an increase of 1 in the waiting times of all the n − k students behind the
newcomer. Thus, the increase in the sum of waiting times in the queue is k + (n − k) = n.
Because the increase is the same in both cases, it can be easily seen that the average waiting
time would also be the same; that is, the average waiting time in a queue is not influenced
by the arrival process distribution. For more on this, see Little’s Law.
Tutorial 3
1. Given that k elements have to be stored using a hash function with target space n. What is
the probability of the hash function having an inherent collision? What is an estimate of the
probability of a collision in the insertion of N elements?
Solution: We assume that the hash function is perfect, that is, it hashes each element to a
uniformly random element of the target space. The probability of such a hash function not
incurring a collision is simpler to compute: every element hashes to a different element of the
space, so the probability is simply
q = k!·C(n, k) / n^k    (1)
where C(n, k) is the binomial coefficient. The probability of a collision is then p = 1 − q.
Alternately: We write the probability as a product of conditionals; basically, imagine picking
the objects one by one. The probability of no collision when the ith element is inserted is
(n − (i − 1))/n, since the previous (i − 1) elements occupy different places. The probability
of no collision is then the product of these numbers:
q = 1 · (1 − 1/n) · (1 − 2/n) · . . . · (1 − (k − 1)/n) = k!·C(n, k) / n^k    (2)
as before.
For the second question, one notes that if a collision happened during the insertion of the ith
of N elements, there is a collision even at the end. And if there was no collision throughout
the insertion, there will not be any at the end too. Thus the probability of a collision occurring
during insertion is simply the probability above with k = N.
We estimate it by bounding it suitably from above and below. On the one hand, since
1 − x ≤ e^{−x} for x ≥ 0, we get from Equation (2) that
q ≤ exp( − ∑_{i=0}^{k−1} i/n ) = exp( −(k² − k)/(2n) )
so that
p ≥ 1 − exp( −k(k − 1)/(2n) ).
If k < √(2n), the term in the exponent is less than 1 and we can use 1 − e^{−x} ≥ x/2 (which
holds for 0 ≤ x ≤ 1) to deduce
p ≥ k(k − 1)/(4n).
Next, to upper-bound the probability. Let A_ij denote the event that elements i and j collided.
It is easy to see P(A_ij) = 1/n. The probability, then, of a collision existing is
p = P( ∪_{1≤i<j≤k} A_ij ) ≤ ∑_{1≤i<j≤k} P(A_ij) = C(k, 2)/n = k(k − 1)/(2n).
Finally, we get the desired bounds: let λ := k(k − 1)/(2n). Then
λ/2 ≤ p ≤ λ,
or p = Θ(λ).
Remark. This is, in essence, the birthday problem.
2. Let C (i ) be the chain of array indices that are queried to look for a key k in linear probing where
h(k ) = i.
1. How does this chain extend by an insertion, and how does it change by a deletion?
2. A search for a key k ends when an empty cell is encountered. What if we mark the end of
C (i ) with an end marker? We stop the search when this marker is encountered. Would this
work? Would this be efficient?
3. Is there a way of not using tombstones?
Solution:
1. Suppose key k with hash value j = h(k ) was (successfully) inserted into the table.
Suppose C ′ (i ) is the new chain starting at i. Then of course, no element of C (i ) was
removed which means that C (i ) ⊆ C ′ (i ). Also, C ′ (i ) can contain at most one more
element than C (i ), because only one element was inserted. This will be the case iff the
chain C ( j) ends in an element of C (i ) (think!). Otherwise, C ′ (i ) = C (i ).
In the case of a deletion - say k was deleted from index j of the hash table - C ′ (i )
is C (i ) if j ̸∈ C (i ) or if j is not the last entry in C (i ) (the latter because we insert a
tombstone and will still continue to look for the key in the cells ahead). If j is the last
entry in C (i ), the size of C (i ) reduces by 1 - C ′ (i ) = C (i ) − { j}.
2. Yes. We get rid of tombstones: instead of looking for an empty cell to end our
search, we stop when we reach an end marker. An end-marker cell is always empty.
• When we need to insert an element, we reach this end during probing and insert
the element there. If the next element is also empty, we put an end marker
there. Otherwise, we leave it as is.
• During searches, we search till we reach the desired key or an end marker.
• During deletion (of a key with hash value j, say), we find the element and turn
the cell into an empty cell. We then iterate backwards towards j to find an
element with hash value j, maintaining the index of the latest empty spot
we found. We stop when we locate such an element or reach j, and we
put up an end marker in the latest empty spot found.
This backward scan in the deletion step is what maintains the invariant that an end
marker describes the end of a chain and that we search till we reach an end marker.
Efficiency: Deletions are slower due to iteration over a chain. But there are no
tombstones, and so searching is faster. Thus, if searches are priority with not many
deletes, this may prove faster.
3. The method of end markers described above is one such way. Another is as follows:
Suppose we delete key k with hash value j. When we empty a cell in chain C(j) (on
deletion), we “reel in” the rest of the chain C(j). We search ahead of the now-empty
cell in the cluster for the first key k′ that has the same hash value j and move that
into our empty cell. That leaves us with another empty cell, and we repeat the process
(we search ahead, and try to find a key k′′, and so on) till we get to the end of the
chain. The last cell that became empty can remain empty (why
does this work?). This has linear (O(m)) complexity in the worst case for deletion, but
is generally faster. Of course, later searches are faster because there are no tombstones
lying around.
Remark. Tombstones are very useful in hash table implementations. To get partially
around the problem of too many tombstones, what is done is the following: Once α
(the load factor) becomes large (we count the tombstones too), we remove all the
elements, empty the array and then rehash and insert the elements into the array.
3. Insert the keys 41, 22, 44, 59, 32, 31, 74, in that order, into a hash table of size 11, using the
hash function h(k) = k mod 11 and, on collisions, double hashing with step 6 − (k mod 6).
Solution: Let us trace the execution of inserting the above elements into a hash table.
• For element 41, we have the hash index as (41 mod 11) which equals 8 (unused), hence
41 gets stored at index 8.
• For element 22, we have the hash index as (22 mod 11) which equals 0 (unused), hence
22 gets stored at index 0
• For element 44, we have the hash index as (44 mod 11) which equals 0 (occupied).
Hence, our update step is 6 - (44 mod 6) which equals 4. Hence, we check (44 mod
11) + 4 = 4 (unused) and thus, 44 is stored at index 4.
• For element 59, we have the hash index as (59 mod 11) which equals 4 (occupied).
Hence, our update step is 6 - (59 mod 6) which equals 1. Hence, we check (59 mod
11) + 1 = 5 (unused) and thus, 59 gets stored at index 5.
• For element 32, we have the hash index as (32 mod 11) which equals 10 (unused),
hence 32 gets stored at index 10.
• For element 31, we have the hash index as (31 mod 11) which equals 9 (unused), hence
31 gets stored at index 9.
• For element 74, we have the hash index as (74 mod 11) which equals 8 (occupied).
Hence, our update step is 6 - (74 mod 6) which equals 4. Hence, we check (8 + 4)
mod 11 which equals 1 (unused), hence 74 gets stored at index 1.
4. Suppose you want to store a large set of key-value pairs, for example, (name, address). You have
operations, which are addition, deletion, and search of elements in this set. You also have queries
whether a particular name or address is there in the set, and if so then count them and delete all
such entries. How would you design your hash tables?
Solution: The idea is to hash on both components. Instead of hashing just on the basis of
the name or the address, we will use both of them. Consider a 2-d array arr[i][j], where i is
the index corresponding to the hash value of the name and j the index corresponding to the
hash value of the address. We use the technique of chaining in the case of collisions. All the
steps are the same as in one dimension; only the hash value has changed to a tuple. The
time complexities of the various operations are:
• Insertion : O(1)
• Deletion : O(1 + α)
• Search : O(1 + α)
For a particular name or address, we will iterate over the corresponding row or column
accordingly and check for all the elements for counting or deleting all entries.