Chapter 02
Stewart Weiss
Introduction to Recursion
1 Recursion
Recursion is a powerful tool for solving certain kinds of problems. Recursion breaks a problem into smaller
problems that are identical to the original, in such a way that solving the smaller problems provides a solution
to the larger one.
It can be used to define mathematical functions, languages and sets, and algorithms or programming language
functions. It has been established that these are essentially equivalent in the following sense: the set of all
functions that can be computed by algorithms, given some reasonable notion of what an algorithm is, is
the same as the set of all functions that can be defined recursively, and each set (or language) for which
membership can be determined by an algorithm corresponds to a function that can be defined recursively.
We are interested here mostly in the concepts of recursive algorithms and recursion in programming
languages, but we also informally introduce recursive definitions of functions.
2 Recursive Algorithms
2.1 Example: The Dictionary Search Problem
Suppose we are given a problem to find a word in a dictionary. This is known as a dictionary search.
Suppose the word is yeoman. You could start at the first page and look for it and then try the second page,
then the third, and so on, until finally you reach the page that contains the word. This is called a sequential
search. Of course no one in their right mind would do this because everyone knows that dictionaries are
sorted alphabetically, and that therefore there is a faster way that takes advantage of the fact that they are
sorted. A dictionary by definition has the property that the words are listed in it in alphabetical order,
which in computer science means it is a sorted list.
One more efficient solution might be to use binary search, which is described by the following recursive
algorithm:
The recursion in this algorithm occurs in lines 8 and 10, in which the instructions state that we must use
this algorithm again, but on a smaller set of pages. The algorithm basically reduces the problem to one in
which it compares the word being sought to a single word, and if it is smaller, it looks for the word in the
first half, and if it is larger, it looks in the second half. When it does this looking again, it repeats this
exact logic. This approach to solving a problem by dividing it into smaller identical problems and using
the results of conquering the smaller problems to conquer the large one is called divide-and-conquer.
Divide-and-conquer is a problem-solving paradigm, meaning it is a model for solving many different types
of problems.
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 Int'l License.
CSci 235 Software Design and Analysis II Prof. Stewart Weiss
The binary search algorithm will eventually stop because each time it checks to see how many pages are in
the set, and if the set contains just one page, it does not do the recursive part but instead scans the page,
which takes a finite amount of time. It is easier to see this if we write it as a pseudo-code function:
Observations
2. When it calls itself recursively, the size of the set of pages passed as an argument is at most one-half
the size of the original set.
3. When the size of the set is 1, the function terminates without making a recursive call. This is called
the base case of the recursion.
4. Since each call either results in a recursive call on a smaller set or it terminates without making a
recursive call, the function must eventually terminate in the base case code.
5. If the keyword is on the page checked in the base case, the algorithm will print its definition; otherwise
it will say it is not there. This is not a formal proof that it is correct!
In general, if we can show of a recursive function
• that each of its recursive calls diminishes the problem size, and
• that the function takes a finite number of steps when the problem size is less than some fixed constant
size,
then it is guaranteed to terminate eventually. Later we will see a more general rule for guaranteeing that a
recursive function terminates.
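The dictionary search described above can be sketched in C++ as follows. This is a minimal sketch under stated assumptions, not the pseudo-code of the original algorithm: the `Page` type, the name `find_word`, the choice to return a `bool` rather than print, and the rule of comparing against the last word of the middle page are all illustrative. It is meant only to show the two properties at work: each recursive call receives a strictly smaller range of pages, and a range of one page is handled without recursion.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Assumption: a page is a non-empty, alphabetically sorted list of words,
// and the pages themselves appear in alphabetical order.
using Page = std::vector<std::string>;

// Returns true if word is on one of the pages first..last (inclusive).
bool find_word(const std::vector<Page>& pages, int first, int last,
               const std::string& word)
{
    assert(0 <= first && first <= last &&
           last < static_cast<int>(pages.size()));
    if (first == last) {
        // Base case: a single page is left, so scan it directly.
        const Page& page = pages[first];
        return std::find(page.begin(), page.end(), word) != page.end();
    }
    int middle = first + (last - first) / 2;
    assert(!pages[middle].empty());
    if (word <= pages[middle].back())
        return find_word(pages, first, middle, word);     // first half
    return find_word(pages, middle + 1, last, word);      // second half
}
```

A call such as `find_word(pages, 0, pages.size() - 1, "yeoman")` (with the size converted to `int`) searches the whole dictionary.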
Although the factorial function (n!) can be computed by a simple iterative algorithm, it is a good example
of a function that can be computed by a simple recursive algorithm. As a reminder, for any positive integer
n, factorial(n), written mathematically as n!, is the product of all positive integers less than or equal to n.
It is often defined by n! = n · (n − 1) · (n − 2) · ... · 2 · 1. The "..." is called an ellipsis. It is a way of avoiding
writing what we really mean because it is impossible to write what we really mean, since the number of
numbers between (n − 2) and 2 depends on the value of n. The reader and the author both agree that the
ellipsis means "and so on" without worrying about exactly what "and so on" really means.
n! = n · (n − 1) · (n − 2) · ... · 2 · 1
and
(n − 1)! = (n − 1) · (n − 2) · ... · 2 · 1
By substituting the left-hand side of the second equation into the right-hand side of the first, we get
n! = n · (n − 1)!
This would be a circular definition if we did not create some kind of stopping condition for the application
of the definition. In other words, if we needed to find the value of 10!, we could expand it to 10 · 9! and then
10 · 9 · 8! and then 10 · 9 · 8 · 7! and so on, but if we do not define what 0! is, it will remain undefined. Hence,
this circularity is removed by defining the base case, 0! = 1.
The definition then becomes:
\[
n! = \begin{cases}
1 & \text{if } n = 0 \\
n \cdot (n-1)! & \text{if } n > 0
\end{cases}
\]
This is a recursive definition of the factorial function. It can be used to find the value of n! for any
non-negative number n, and it leads naturally to a recursive algorithm for computing n!, which is written in C
below.
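A minimal sketch of such a function follows. The function body is valid as both C and C++; the `assert` on the argument is an added assumption that makes the requirement n >= 0 explicit.

```cpp
#include <cassert>

// Computes n! for n >= 0 by applying the recursive definition directly.
int factorial(int n)
{
    assert(n >= 0);                  // the definition covers only non-negative n
    if (n == 0)
        return 1;                    // base case: 0! = 1
    return n * factorial(n - 1);     // recursive case: n! = n * (n-1)!
}
```

For example, `factorial(5)` expands to 5 * 4 * 3 * 2 * 1 * 1 and returns 120.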
Observations
1. This does not result in an infinite sequence of calls, provided the function is called with n >= 0: each
successive call is given an argument 1 less than the preceding call, so eventually the value passed to
factorial is 0. When the argument is 0, the function returns 1 without making a recursive call; this
is the base case, and it stops the recursion.
2. This function does not really compute n! for every n, because on any computer the number of bits used
to hold an int is finite, and for large enough n, the value of n! will exceed the largest storable integer. For
example, 13! = 6,227,020,800, which is larger than 2,147,483,647, the largest value of a 32-bit int.
There are elementary functions of the non-negative integers that are so simple that they do not need to be
defined recursively. Two examples are
n(x) = 0
s(x) = x+1
The first has the value zero for all numbers, and the second is the successor of the number. If we allow only
two methods of defining new functions, recursion and function composition, what functions can we define?
a(x, 0) = x
a(x, y + 1) = s(a(x, y)) = a(x, y) + 1
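The two equations for a(x, y) define addition using nothing but recursion and the successor function. A minimal sketch in C++ makes this concrete; the names `zero`, `successor`, and `add` are illustrative stand-ins for n, s, and a, and the use of `int` with an assertion stands in for the non-negative integers.

```cpp
#include <cassert>

// n(x) = 0: the function that is zero everywhere.
int zero(int /* x */) { return 0; }

// s(x) = x + 1: the successor function.
int successor(int x) { return x + 1; }

// a(x, 0)     = x
// a(x, y + 1) = s(a(x, y))
int add(int x, int y)
{
    assert(x >= 0 && y >= 0);        // defined on the non-negative integers
    if (y == 0)
        return x;                    // base case
    return successor(add(x, y - 1)); // recursion on the second argument
}
```

Each call reduces y by 1, so `add(3, 4)` applies the successor function four times to 3 and returns 7.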
The box method is a way to organize a trace of a recursive function. Each call is represented by a box
containing the value parameters of the function, the local variables of the function, a place to store the
return value of the function if it has a return value, placeholders for the return values of the recursive calls
(if any), and a place for the return address. The steps are as follows.
Steps
1. Label each recursive call in the function (e.g., with labels like 1, 2, 3, ... or A, B, C, ...). When a recursive
call terminates, control returns to the instruction immediately following one of the labeled points. If
it returns a value, that value is used in place of the function call.
(c) a placeholder for each value returned by the recursive calls from the current call,
3. Now you start simulating the function's execution. Write the instruction that calls the recursive
function with the given arguments. For example, it might be
4. Using your box template, create a box for the first call to the function. Draw an arrow from the
instruction you wrote in step 3 to this first box.
5. Execute the function by hand, updating values of local variables and reference parameters as needed.
For each recursive call that the function makes, create a new box for that call, with an arrow from the
old box to the box for the called function. Label the arrow with the label of the function being called.
7. Each time a function exits, if it has a return value, update the value in the box that called it, i.e., the
one on the source side of the arrow, and then cross off the exiting box. Resume execution of the box
that called the function at the instruction immediately following the label of the arrow.
We can trace the factorial function with this method to demonstrate the method. The factorial function has
a single value parameter, n, and no local variables. It has a return value and a single recursive call. Its box
should be
n = ______
A: fact(n-1) = _____
return = _____
There are three placeholders, one for n, one for the return value of the recursive call, which is labeled A, and
one for the return value of the function. The name of the function is abbreviated.
We trace the function for the call when the argument is 3. The figure below illustrates the sequence of boxes.
Each row represents a new step in the trace. The first row shows the initial box. The value of n is 3. The
other values are unknown. The function is called recursively so the next line shows two boxes. The box in
bold is the one being traced. In that box, n=2, since it was called with n-1. That function calls factorial
again, so in the next line there are three boxes. Eventually n becomes 0 in the fourth line. It does not make
a recursive call. Instead it returns 1, the box is deleted in the next line, and the 1 is sent to the fact(n-1)
placeholder of the box that called it. This continues until the return values make their way back to the first
box, which returns its value to the calling program.
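The same trace can be produced mechanically. The following is a minimal sketch, not part of the box method itself: `traced_fact`, its `depth` parameter, and the text format of the log are all assumptions made for illustration. Each call appends a line when its box is created, a line when the call labeled A returns, and a line for its own return value, indented by the number of boxes already open.

```cpp
#include <cassert>
#include <string>

// Same recursion as factorial, but every call records its box in log.
// depth is the number of boxes already open; it controls indentation only.
int traced_fact(int n, int depth, std::string& log)
{
    assert(n >= 0 && depth >= 0);
    const std::string indent(2 * depth, ' ');
    log += indent + "box: n = " + std::to_string(n) + "\n";
    int result = 1;                                   // value in the base case
    if (n > 0) {
        int a = traced_fact(n - 1, depth + 1, log);   // the call labeled A
        log += indent + "A: fact(n-1) = " + std::to_string(a) + "\n";
        result = n * a;
    }
    log += indent + "return = " + std::to_string(result) + "\n";
    return result;
}
```

Calling `traced_fact(3, 0, log)` on an empty string and printing `log` shows four nested boxes, for n = 3, 2, 1, and 0, followed by the return values filling in from the innermost box outward; the last line is `return = 6`.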
(Figure: box trace of factorial for the argument 3; first two rows.)

Row 1:
n = 3
A: fact(n-1) = _____
return = _____

Row 2:
n = 3                     n = 2
A: fact(n-1) = _____      A: fact(n-1) = _____
return = _____            return = _____
4 Other Examples
4.1 Fibonacci Numbers
Recursion is not usually the most efficient solution, although it is usually the easiest to understand. One
example of this is the Fibonacci sequence. The Fibonacci numbers are named after Leonardo of Pisa, who
was known as Fibonacci. He did a population study of rabbits in which he simplified how they mated and
how their population grew. In short, the idea is that rabbits never die, that they can mate starting at two
months old, and that at the start of each month every rabbit pair gives birth to a male and a female (with
a very short gestation period!).
From these premises, it is not hard to show that the number of rabbit pairs in month 1 is 1, in month 2 is
also 1 (since they are too young to mate), and in month 3 is 2, since the pair mated and gave birth to a new
pair. Let f(n) be the number of rabbit pairs alive in month n. Then, in month n, where n > 2, the number of
pairs must be the number of pairs alive in month n − 1, plus the number of new offspring pairs born at the start
of month n. All pairs alive in month n − 2 contribute a new pair in month n, so there are f(n − 1) + f(n − 2)
rabbit pairs alive in month n. The recursive definition of this sequence is thus
\[
f(n) = \begin{cases}
1 & \text{if } n \le 2 \\
f(n-1) + f(n-2) & \text{if } n > 2
\end{cases}
\]
This will generate the sequence 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, and so on. A recursive algorithm to
compute the nth Fibonacci number, for n > 0, is given below, written as a C/C++ function.
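A minimal sketch of such a function, transcribing the recursive definition directly; the name `fibonacci` and the assertion on the argument are assumptions.

```cpp
#include <cassert>

// Returns the nth Fibonacci number for n > 0.
int fibonacci(int n)
{
    assert(n > 0);
    if (n <= 2)
        return 1;                                  // base cases: f(1) = f(2) = 1
    return fibonacci(n - 1) + fibonacci(n - 2);    // f(n) = f(n-1) + f(n-2)
}
```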
Although this looks simple, it is very inefficient. If you write out a box trace of this function you will see
that the number of function calls needed to find the nth Fibonacci number grows exponentially with n
(it is bounded above by 2^n). This is because it computes many values needlessly: f(n − 2), for example,
is computed once for f(n) and again for f(n − 1). There are far better ways to compute these numbers.
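To make the cost visible, the sketch below instruments the naive recursion with a call counter and sets it beside one possible better way, a simple loop. Both function names and the counter parameter are assumptions introduced for illustration; the loop is only one of several alternatives.

```cpp
#include <cassert>

// Naive recursion, instrumented: calls is incremented once per invocation.
long fib_counted(int n, long& calls)
{
    assert(n > 0);
    ++calls;
    if (n <= 2)
        return 1;
    return fib_counted(n - 1, calls) + fib_counted(n - 2, calls);
}

// Iterative alternative: keeps only the last two values, so it does
// n - 2 additions instead of an exponential number of calls.
long fib_iterative(int n)
{
    assert(n > 0);
    long previous = 1, current = 1;        // f(1) and f(2)
    for (int i = 3; i <= n; ++i) {
        long next = previous + current;    // f(i) = f(i-1) + f(i-2)
        previous = current;
        current = next;
    }
    return current;
}
```

Starting from a counter of 0, `fib_counted(10, calls)` returns 55 and leaves `calls` at 109, whereas `fib_iterative(10)` reaches the same value with eight additions.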
A well-known and common combinatorial (counting) problem is how many distinct ways there are to pick k
objects from a collection of n distinct objects. For example, if I want to pick 10 students in the class of 30,
how many different sets of 10 students can I pick? I do not care about the order of their names, just who is
in the set. Let c(n, k) represent the number of distinct sets of k objects out of a collection of n objects.
The solution can be difficult to find with a straightforward attack, but a recursive solution is quite simple.
Let me rephrase the problem using the students in the class. Suppose I single out one student, say student
X. Then there are two possibilities: either X is in the group I choose or X is not in the group.
How many solutions are there with X in the group? Since X is in the group, I need to pick k − 1 other
students from the remaining n − 1 students in the class. Therefore, there are c(n − 1, k − 1) sets that contain
student X.
What about those that do not contain X? I need to pick k students out of the remaining n−1 students in
the class, so there are c(n − 1, k) sets.
It follows that
c(n, k) = c(n − 1, k − 1) + c(n − 1, k)
when n is large enough. Of course there are no ways to form groups of size k if k > n, so c(n, k) = 0 if
k > n. If k = n, then there is only one possible group, namely the whole class, so c(n, k) = 1 if k = n. And
if k = 0, then there is just a single group consisting of no students, so c(n, k) = 1 when k = 0. In all other
cases, the recursive definition applies. Therefore the recursive definition, with its base cases, is
\[
c(n,k) = \begin{cases}
1 & k = 0 \\
1 & k = n \\
0 & k > n \\
c(n-1,k-1) + c(n-1,k) & 0 < k < n
\end{cases}
\]
Once again it is easy to write a C/C++ function that computes this recursively by applying the definition:
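A minimal sketch of such a function; the name `combinations`, the `int` types, and the assertion on the arguments are assumptions. The four branches follow the four cases of the definition in order.

```cpp
#include <cassert>

// Returns c(n, k), the number of k-element subsets of an n-element set.
int combinations(int n, int k)
{
    assert(n >= 0 && k >= 0);
    if (k == 0)
        return 1;        // the empty group
    if (k == n)
        return 1;        // the whole collection
    if (k > n)
        return 0;        // not enough objects to choose from
    // 0 < k < n: sets that contain X plus sets that do not contain X
    return combinations(n - 1, k - 1) + combinations(n - 1, k);
}
```

For example, `combinations(5, 2)` returns 10.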
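The interesting question is how to show that this always terminates. In each recursive call, n is diminished
by 1, while k is either diminished by 1 or left unchanged. If k reaches 0, the call ends in the base case
k = 0. Otherwise n keeps decreasing while k stays positive, so after finitely many calls k >= n, which is
also a base case. Therefore, every chain of calls eventually reaches either k >= n or k = 0.
In any case, this is again a very inefficient way to compute c(n,k) and it should not be used.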
The binary search algorithm was presented in pseudo-code earlier. Now we can work out some of the
programmatic details.
• First we consider an arbitrary array, not a dictionary with pages and words on pages. The algorithm
is given an array of values.
• Second we remove printing from the algorithm. An algorithm should return its results to its caller, not
print them on a device. In general, functions whose purpose is not to perform I/O should not perform
any I/O, as this makes them less portable and less cohesive, and it reduces their performance. The
return value should be either the index in the array where the item is found or an indication that it is
not in the array at all. Since array indices are always non-negative, we can use -1 to indicate that the
search failed.
• Third is the issue of how to find the middle of the array. The middle is the index halfway between the
top index and the bottom index of the part of the array being searched. Since the part of the array
being searched must vary depending on the results of comparisons, the top and bottom of the search
range will be parameters of the function.
Notes.
1. The array parameter is declared constant in order to prevent the code from accidentally modifying it,
since it is passed by reference in C++ and passed as a pointer in C.
2. The type of the array must be a type for which comparison operations are defined; the type ordered_type
represents an arbitrary ordered type such as int, char, or double.
3. The comparisons are ordered so that first it checks if the keyword is less than the array element, then
larger, and if both fail, it must be equal. This is more efficient than checking equality first. Why?
/**
* @precondition 0 <= bottom && top < ARRAYSIZE && for every i 0 <= i <= top-1
* theArray[i] <= theArray[i+1] && ARRAYSIZE is the size of the array
* @postcondition none (the array is unchanged )
* @param theArray the array to search
* @param bottom low index of range to search in the array
* @param top high index of range to search in the array
* @param keyword value to look for (the search key)
* @return if keyword is in the array between bottom and top, its index
* otherwise -1
*/
int binary_search( const ordered_type theArray[],
int bottom,
int top,
ordered_type keyword ) ;
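One way the body of this function might be written is sketched below. It is an illustration under stated assumptions rather than the definitive implementation: `ordered_type` is fixed to `int` so that the block compiles on its own, the middle index is computed as `bottom + (top - bottom) / 2`, and an empty range (bottom > top) is used as the failing base case. The comparisons are made in the order described in the notes: less than, then greater than, then equal by elimination.

```cpp
#include <cassert>

using ordered_type = int;   // assumption: stands in for any ordered type

int binary_search(const ordered_type theArray[], int bottom, int top,
                  ordered_type keyword)
{
    assert(bottom >= 0);                         // part of the precondition
    if (bottom > top)
        return -1;                               // empty range: not found
    int middle = bottom + (top - bottom) / 2;    // halfway between the ends
    if (keyword < theArray[middle])
        return binary_search(theArray, bottom, middle - 1, keyword);
    else if (keyword > theArray[middle])
        return binary_search(theArray, middle + 1, top, keyword);
    else
        return middle;                           // neither less nor greater
}
```

To search an entire array of `size` elements, the initial call is `binary_search(theArray, 0, size - 1, keyword)`. Each recursive call excludes the middle element, so the range shrinks on every call and the recursion must end.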
Exercise 1. Suppose the array contains the integers 5, 10, 16, 17, 19, 30, 32, 33, 34, 67, 68, 69, 81, 83, 87,
91, 92 and suppose the keyword is 10.
1. What is the set of array values against which the keyword will be compared?
3. What if the keyword is 3? What is the sequence of array values against which it will be compared, and
how many comparisons are executed?
The last problem we will consider is the famous Towers of Hanoi problem. You are given three pegs, labeled
A, B, and C. Each peg can hold n disks. Initially, n disks of different sizes are arranged on peg A in such a
way that above any disk are only smaller disks. The largest is therefore at the bottom of the pile and the
smallest at the top. The problem is to devise an algorithm that will move all of the disks to peg B moving
one disk at a time subject to the following two rules:
• Only the top disk can be moved from its stack; and
• A larger disk can never be placed on top of a smaller disk.
This is a problem with no obvious, simple, non-recursive solution, but it does have a very simple recursive
solution.
1. Ignore the bottom disk and solve the problem for n−1 disks, moving them to peg C instead of peg B.
2. Move the bottom disk from peg A to peg B.
3. Solve the problem for moving n−1 disks from peg C to peg B.
If we write towers(count, source, destination, spare) to represent the algorithm that moves count
many disks from source to destination using the spare as needed, subject to the rules above, then the
algorithm we just described can be written as follows:
towers(n-1, A, C, B);
towers(1, A, B, C);
towers(n-1, C, B, A);
It is astonishingly simple. The base case is when there is a single disk. Putting this together, we have
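A minimal sketch of the complete function follows. It makes two assumptions worth stating: pegs are identified by single characters, and instead of printing, each move is appended as a string to a `moves` vector passed by reference. The format of each string ("A -> B") is likewise illustrative.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Moves count disks from source to destination, using spare as needed.
// Each single-disk move is recorded in moves in the order it is made.
void towers(int count, char source, char destination, char spare,
            std::vector<std::string>& moves)
{
    assert(count >= 1);
    if (count == 1) {
        // Base case: a single disk is moved directly.
        moves.push_back(std::string(1, source) + " -> " + destination);
    } else {
        towers(count - 1, source, spare, destination, moves);  // clear the way
        towers(1, source, destination, spare, moves);          // bottom disk
        towers(count - 1, spare, destination, source, moves);  // finish
    }
}
```

For two disks, `towers(2, 'A', 'B', 'C', moves)` records the moves A -> C, A -> B, C -> B, in that order.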
This is an example of a recursive algorithm with three recursive calls. In each call the problem size is
smaller. The problem size is the number of disks. The recursion stops when the problem size is 1. The
function produces, as its output, a sequence of statements showing which disk was moved. Alternatively,
we could just store this sequence in an array of strings that could be passed back via a parameter and then
printed by the calling code.
What is not clear is how many steps this takes. If we were to measure the steps by how many times a disk
is moved from one peg to another, then the interesting question is how many moves this makes when given
an initial set of N disks on a peg. We will return to this when we discuss recurrence relations a bit later.
Exercise 2. Write up this algorithm and run it to verify that it works. Add statements to see how many
recursive calls were made. Then, add the pre- and post-conditions to make the contract for the function's
prototype.