Overview

In this unit we will be introducing 2 seemingly separate topics that actually have a deep connection: the induction proof technique, and recurrences.

Both of these are vital tools in computer science, and we will see some examples that demonstrate this idea. We will also be using these ideas directly in the next unit.

Mathematical Induction

To start off, let’s talk about the induction proof technique. This is the last of the main proof techniques; we left it out of Unit 1 because it deserves this much attention on its own. It is one of the handiest tools we have for proving statements we care about.

Revisiting An Example: The Arithmetic Series

To show you what I mean, let’s revisit an example that I mentioned when introducing and motivating proofs in the first place:

Can we show that $\forall n \in \mathbb{N}$, $\sum_{i=0}^{n} i = \frac{n(n+1)}{2}$?

(Recall that $\sum_{i=0}^{n} i = 0 + 1 + 2 + \dots + n$.)

Based on the tooling we’ve covered so far, you might be tempted to start a proof by saying:

Let $n \in \mathbb{N}$, arbitrarily chosen.

And after that point, you’ll need to show that $\sum_{i=0}^{n} i = \frac{n(n+1)}{2}$. But even to me, this sounds difficult. If anything, perhaps we were all told this fact in high school, so we take it to be true. But what if I told you there was a way to prove it?

Here’s the high-level strategy:

  1. (Base Case) Firstly, we will prove that when $n = 0$, the left-hand side is the same as the right-hand side. I.e. $\sum_{i=0}^{0} i = \frac{0(0+1)}{2}$.
  2. (Inductive Case) Secondly, we will assume that for $n = k$, the left-hand side is the same as the right-hand side, then prove that for $n = k + 1$, the left-hand side is the same as the right-hand side.

Let me show you what I mean, then after that explain why this makes sense.

Proof

  1. (Base Case) Let $n = 0$. Then $\sum_{i=0}^{0} i = 0 = \frac{0(0+1)}{2}$ [Basic Algebra]
  2. (Inductive Case) Assume that for $n = k$, where $k \in \mathbb{N}$, $\sum_{i=0}^{k} i = \frac{k(k+1)}{2}$.
    1. $\sum_{i=0}^{k+1} i = \left(\sum_{i=0}^{k} i\right) + (k+1)$ [Basic Algebra]
    2. $\sum_{i=0}^{k} i = \frac{k(k+1)}{2}$ [By assumption on line 2]
    3. $\frac{k(k+1)}{2} + (k+1) = \frac{(k+1)(k+2)}{2}$ [Basic Algebra]
    4. $\sum_{i=0}^{k+1} i = \frac{(k+1)(k+2)}{2}$ [Basic algebra using lines 2.1, 2.2, 2.3]
  3. $\forall n \in \mathbb{N}, \sum_{i=0}^{n} i = \frac{n(n+1)}{2}$ [Principle of mathematical induction]

There are quite a few notable features of this proof; let’s talk about the most important one first: what happened on line 2.

We assumed the equality works for $n = k$, and we used that to prove that the equality works for $n = k + 1$.

And what the induction principle does is the following. Given a statement $P(n)$ (in our example, $P(n)$ states that $\sum_{i=0}^{n} i = \frac{n(n+1)}{2}$):

  1. If you have proven the statement for a base case (in our example, when $n = 0$).
  2. You have proven the statement for the inductive case (in our example, we assumed that $P(k)$ is true, then showed that $P(k+1)$ is true). Then the induction principle allows you to conclude that $\forall n \in \mathbb{N}$, the statement $P(n)$ is true. In other words, the statement is true no matter the natural number we give it.

Formally:

Induction Rule

If we have proven the statement $P(0)$, and $P(k) \to P(k+1)$ for some arbitrarily chosen $k \in \mathbb{N}$, then we may conclude $\forall n \in \mathbb{N}, P(n)$.

Why does this work? Here’s the intuition:

We know that the statement works when $n = 0$. We also know that if the statement works for $n = 0$, it also works for $n = 1$. So we can conclude that it also works for $n = 1$. Similarly, we know that if the statement works for $n = 1$, it also works for $n = 2$. Since we do know that it works for $n = 1$, we can now also happily conclude that it works for $n = 2$. And so on and so forth.

Proof By Induction Template

So in general, to prove some statement $\forall n \in \mathbb{N}, P(n)$, we use the following idea:

  1. Prove $P(0)$.
  2. Prove that if $P(k)$ is true, then $P(k+1)$ is also true.

Or, more formally:

  1. Prove $P(0)$.
  2. Prove that $\forall k \in \mathbb{N}, P(k) \to P(k+1)$.

In the very first example we gave just now, the statement $P(n)$ is defined to be $\sum_{i=0}^{n} i = \frac{n(n+1)}{2}$.
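
Of course, checking a handful of values is not a proof, but it can build confidence in a statement before we prove it (or catch a false one early). Here is a small, purely illustrative Python check of the arithmetic series formula for some small values of $n$:

# Sanity check: sum_{i=0}^{n} i == n(n+1)/2 for small n.
for n in range(0, 20):
	lhs = sum(range(n + 1))     # 0 + 1 + 2 + ... + n
	rhs = n * (n + 1) // 2      # the closed form
	assert lhs == rhs, (n, lhs, rhs)
print("Formula holds for n = 0, ..., 19")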

One more example: Bernoulli’s inequality

Let’s do one more example. Here’s an inequality we sometimes use in computer science:

Theorem

Let $x \in \mathbb{R}$ be such that $x \geq -1$. Then, for all $n \in \mathbb{N}$, $(1+x)^n \geq 1 + nx$.

To be clear, from a high-level overview, we want to show that:

  1. The statement is true when $n = 0$.
  2. Assuming that the statement is true when $n = k$, then the statement is also true when $n = k + 1$.

Once we do these two things, we can conclude that for all $n \in \mathbb{N}$, $(1+x)^n \geq 1 + nx$.

Proof

  1. Let $x \in \mathbb{R}$ such that $x \geq -1$.
  2. (Base Case) Let $n = 0$. Then $(1+x)^0 = 1 \geq 1 = 1 + 0 \cdot x$.
  3. (Inductive Case) Assume that for $n = k$, $(1+x)^k \geq 1 + kx$.
    1. $(1+x)^{k+1} = (1+x)^k (1+x)$ [Basic algebra]
    2. $(1+x)^k (1+x) \geq (1+kx)(1+x)$ [By assumption on line 3, and since $1 + x \geq 0$]
    3. $(1+kx)(1+x) = 1 + (k+1)x + kx^2 \geq 1 + (k+1)x$ [Basic algebra]
    4. $(1+x)^{k+1} \geq 1 + (k+1)x$ [Combining lines 3.1, 3.2, 3.3]
  4. $\forall n \in \mathbb{N}, (1+x)^n \geq 1 + nx$ [Principle of mathematical induction]

Again, pay special attention to line 3.2: we used the assumption that $(1+x)^k \geq 1 + kx$ in order to prove that $(1+x)^k(1+x) \geq (1+kx)(1+x)$. This is also where the hypothesis $x \geq -1$ matters: multiplying both sides of the assumed inequality by $1 + x$ only preserves the direction of the inequality because $1 + x \geq 0$.
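
As with the arithmetic series, we can spot-check the inequality numerically. The following short Python snippet is only an illustration; it samples a few values of $x \geq -1$ and small $n$:

# Spot-check (1 + x)^n >= 1 + n*x for some x >= -1 and small n.
for x in [-1.0, -0.5, 0.0, 0.3, 2.0]:
	for n in range(0, 15):
		assert (1 + x) ** n >= 1 + n * x, (x, n)
print("No counterexamples found among the sampled values")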

What Happens If: Did Not Prove It For A Base Case

Let’s talk about a common issue that happens in induction proofs. It is quite common for people to forget to include base cases in their proofs, and they end up thinking some statement is true because they thought they had a proof for it.

Here’s an example, let’s say we wanted to prove that:

For all , is odd.

Or a little more mathematically:

Consider the following faulty proof:

  1. (Inductive case) Assume that for ,

    1. [Existential Generalisation]

Have we actually proven the statement? Well, no! In fact, the exact opposite is true, for all , is actually even, not odd.

So it’s very important to remember to cover the base case.

What Happens If: The Base Case Was Not 0?

At some point you might come across statements that look something like this:

$$\forall n \in \mathbb{N}[n \geq 4 \to 2^n \leq n!]$$

Which in English states that:

For all natural numbers $n$ such that $n \geq 4$, $2^n \leq n!$.

(To be clear, $n! = 1 \times 2 \times 3 \times \dots \times n$.)

Let’s test this for a few values to see when it starts being true:

| $n$ | $2^n$ | $n!$ |
| --- | --- | --- |
| 1 | 2 | 1 |
| 2 | 4 | 2 |
| 3 | 8 | 6 |
| 4 | 16 | 24 |
| 5 | 32 | 120 |

Notice how $n!$ seems to overtake $2^n$ only from around $n = 4$ onwards. So how do we prove this? We change the base case!

Proof Attempt

  1. (Base case) Let $n = 4$. Then $2^4 = 16 \leq 24 = 4!$.
  2. (Inductive Case) Let $k \geq 4$, assume that $2^k \leq k!$.
    1. $2^{k+1} = 2 \cdot 2^k \leq 2 \cdot k! \leq (k+1) \cdot k! = (k+1)!$ [Basic Algebra]
  3. $\forall n \geq 4, 2^n \leq n!$ [Principle of mathematical induction]

Again, pay attention to how the base case has changed on line 1 in order to do our proof. Basically, we are only claiming that $2^n \leq n!$ from $n = 4$ onwards, and not saying anything about when $n < 4$.
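
If you want to see where the crossover happens without computing by hand, here is a small illustrative check in Python (math.factorial comes from the standard library):

import math

# Compare 2^n with n! to see where n! overtakes 2^n.
for n in range(1, 9):
	print(n, 2 ** n, math.factorial(n), 2 ** n <= math.factorial(n))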

Proof By Strong Induction Template

Let’s modify the “proof by induction” template a little bit more: instead of assuming the statement for $n = k$ and trying to prove it for $n = k + 1$, let’s tweak this to the following:

Strong Induction Template

  1. Prove $P(b)$ for some base case value $b$.
  2. Prove that, assuming $P(b), P(b+1), \dots, P(k)$ are all true, $P(k+1)$ is also true.

But why would we do this? Here’s an example:

Example Using Strong Induction:

Let’s say that we lived in a country where the only denominations are the 4-dollar and the 5-dollar bills. Your friend, coming from Singapore, is doubtful that such a system would work. Let’s convince them that we can use 4-dollar and 5-dollar denominations to make any dollar value that is 8 or larger.

This seems true right? For example:

8 itself uses two of the 4-dollar bills. 9 uses one 4-dollar bill, and one 5-dollar bill. 10 uses two 5-dollar bills.

Formally, we want to prove the following statement:

$$\forall n \geq 8, \exists a \in \mathbb{N}, \exists b \in \mathbb{N}[n = 4a + 5b]$$

Again, take note, we wish to prove this for $n$ from $8$ onwards, which hints to us that our base case should be $n = 8$. Pay special attention to the fact it is 8 onwards. This will become very important later. (Subtle foreshadowing…)

So let’s look at a proof sketch:

Proof Sketch

  1. (Base case) Let $n = 8$, then $8 = 4 \cdot 2 + 5 \cdot 0$.
  2. (Inductive case) Let $k \geq 8$, assume that for all $m$ with $8 \leq m \leq k$, $\exists a \in \mathbb{N}, \exists b \in \mathbb{N}$ such that $m = 4a + 5b$.
    1. Since $k + 1 - 4 \leq k$, and we assumed that for all $8 \leq m \leq k$ it was true that $\exists a, b \in \mathbb{N}$ such that $m = 4a + 5b$, then we can say that $k + 1 - 4 = 4a + 5b$ for some $a, b \in \mathbb{N}$.
    2. Thus, $k + 1 = 4(a + 1) + 5b$.
    3. $\exists a' \in \mathbb{N}, \exists b' \in \mathbb{N}$ such that $k + 1 = 4a' + 5b'$.
  3. Therefore, $\forall n \geq 8$, $\exists a \in \mathbb{N}, \exists b \in \mathbb{N}[n = 4a + 5b]$.

Okay, so the proof looks reasonable. What if I said there’s an issue?

Let’s think about 11, can we actually use only 4-dollar notes and 5-dollar notes to make the dollar amount of 11? We actually can’t!

So where did we go wrong in our proof? It was our assumption. We assumed that for all values $m$ with $8 \leq m \leq k$, we can express $m$ using 4-dollar and 5-dollar notes. So in our proof that it was possible for $k + 1 = 11$, we had to assume that it was true for $11 - 4 = 7$. Did we prove this? No we didn’t, and that was the issue.

So let’s look back at our proof and see how we can fix this. To prove it for $k + 1$, we needed to assume it for $k + 1 - 4$. Since in our base case we only proved it for 8, this means we can only know for sure that values like 12, 16, 20, 24, … and so on can be expressed using 4-dollar and 5-dollar denominations.

So how do we prove this for all numbers? Notice that if (emphasis on if), in our base case, we had also proved it for the starting values 8, 9, 10, and 11, then we could prove it for all numbers: if we know it works for 8, we know it works for 12. If we know it works for 9, then it works for 13. If it works for 10, it works for 14. And if it works for 11, then it works for 15. We can keep repeating this to prove it for all the numbers from 8 onwards. Pictorially speaking, it looks like this:

Then why are we convinced that 16, 17, 18, and 19 are true? Similarly, because if we know 12, 13, 14, and 15 are true, then we can say 16, 17, 18, and 19 are also true.

Okay, let’s take a step back: we just said that we cannot do this for 11, which means that we actually cannot say “All dollar values from 8 onwards can be made using 4-dollar and 5-dollar denominations.”

Okay, but what if I said that we could actually do this for dollar values from 12 onwards? Let’s see the proof of this. This time the proof is correct. We have also shortened the proof a little bit, so all the important parts remain, but it is less verbose than in the previous weeks.

Proof

  1. (Base cases) $12 = 4 \cdot 3 + 5 \cdot 0$, $13 = 4 \cdot 2 + 5 \cdot 1$, $14 = 4 \cdot 1 + 5 \cdot 2$, $15 = 4 \cdot 0 + 5 \cdot 3$
  2. (Inductive Case) Let $k + 1 \geq 16$, assume that for all $m$ with $12 \leq m \leq k$, $\exists a \in \mathbb{N}, \exists b \in \mathbb{N}[m = 4a + 5b]$
    1. Since $k + 1 \geq 16$, and $12 \leq k + 1 - 4 \leq k$, $\exists a \in \mathbb{N}, \exists b \in \mathbb{N}[k + 1 - 4 = 4a + 5b]$
    2. Let $a_0, b_0 \in \mathbb{N}$ be such that $k + 1 - 4 = 4a_0 + 5b_0$ [Existential instantiation on line 2.1]
    3. $k + 1 = 4(a_0 + 1) + 5b_0$ [Basic Algebra]
    4. Since $a_0 \in \mathbb{N}$, $a_0 + 1 \in \mathbb{N}$ [Basic algebra]
    5. $\exists a \in \mathbb{N}, \exists b \in \mathbb{N}[k + 1 = 4a + 5b]$ [Existential generalisation]
  3. $\forall n \geq 12, \exists a \in \mathbb{N}, \exists b \in \mathbb{N}[n = 4a + 5b]$ [Principle of mathematical induction]

Pay special attention to the following:

  1. There are now 4 base cases, because in the inductive proof, we are taking exactly 4 steps back.
  2. In our inductive case, we start from 16, which is 4 above the base case.
  3. We also, for induction, assume that the statement we wish to prove holds true for values from 12 up until $k$.
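
To double-check the claim (and the counterexample at 11) computationally, here is a small illustrative brute-force search in Python; the helper name expressible is just for this sketch:

# Which dollar amounts can be made out of 4-dollar and 5-dollar bills?
def expressible(n):
	# Look for natural numbers a, b with n == 4a + 5b.
	return any((n - 4 * a) % 5 == 0 for a in range(n // 4 + 1))

print([n for n in range(1, 30) if not expressible(n)])
# Prints [1, 2, 3, 6, 7, 11]: 11 cannot be made, but everything from 12 to 29 can.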

Strong induction is especially handy when it comes to analysing recurrences. So we will see more examples after the second topic (Recurrences) in this unit, and next week.

Recurrences

Let’s talk about another concept that is commonly used in computer science: Recurrences.

Let’s start with a motivating example. You might have seen this snippet of code before, for an algorithm called binary search that looks for key in a sorted array of $n$ elements:

def binary_search(arr, left, right, key):
	# Search for key in the sorted slice arr[left:right] (right is exclusive).
	if left + 1 == right:
		return arr[left] == key

	mid = (right + left) // 2
	if arr[mid] <= key:
		return binary_search(arr, mid, right, key)
	else:
		return binary_search(arr, left, mid, key)

How do we analyse the running time of such an algorithm? While this is an advanced topic that we will not cover too comprehensively in this module, I think it serves as a good example of why we need to know about recurrences and the proof by induction: It will help us analyse and understand recursive algorithms (among many other concepts in computer science).

Recurrence Relations

To put it simply: a recurrence relation is a way to describe a sequence of numbers recursively.

Example 1: A Simple Recurrence

For example, we say is a recurrence defined as:

Notice that . But what about ? Well . Since we also know that , this means that .

Perhaps you have spotted the pattern, is actually just , as long as is a natural number that is at least . Perhaps this is obvious, but even something like this can be proven by induction! In fact, this baby example is the perfect practice question!

Try proving the following:

Theorem

Let be defined as above, then .

Admittedly, our very first example of a recurrence was probably not very exciting. But I think it is a simple example to point out some features of recurrences. Pay attention to how this example of a recurrence defines the sequence using two cases:

  1. When , is defined to be . This is our base case. It does not refer to any other values of .
  2. When , is defined to be . This is our inductive case. Notice how refers to .

For a recurrence to make sense, it needs to have at least one base case. And the inductive cases have to eventually reach a base case. For example, to compute , we need to know what is. To know what is, we need to know what is. is a base case, so we definitely know what it is. Which then means we know that is, and what is.

Let’s look at more advanced examples of recurrences to show you more interesting concepts we can solve.

Example 1.5: Factorial

Recall that $n!$ (pronounced “$n$ factorial”) is defined to be:

$$n! = 1 \times 2 \times 3 \times \dots \times n$$

So for example, $1! = 1$, $2! = 2$, $3! = 6$, and $4! = 24$.

Can we make a recurrence $F$ such that $F(n) = n!$? It’s yet another good example you might want to think about before reading how to define it. Think about what the base case is, and what the inductive case is.

Anyway, here’s the recurrence!

$$F(n) = \begin{cases} 1 & \text{if } n = 0 \\ n \times F(n-1) & \text{if } n \geq 1 \end{cases}$$

So let’s run through some examples. $F(0)$ is $1$. $F(1) = 1 \times F(0) = 1$. $F(2) = 2 \times F(1) = 2$. $F(3) = 3 \times F(2) = 6$. Lastly, $F(4) = 4 \times F(3) = 24$.

Let’s run through an example of how to compute the value:

$$F(4) = 4 \times F(3) = 4 \times 3 \times F(2) = 4 \times 3 \times 2 \times F(1) = 4 \times 3 \times 2 \times 1 \times F(0) = 24$$

Also, here’s a proof that the recurrence matches what we want.

Theorem

For all $n \in \mathbb{N}$, $F(n) = n!$.

As expected, we are going to do this by induction.

Proof

  1. (Base case) Let $n = 0$. Then $F(0) = 1 = 0!$.
  2. (Inductive Case) Let $k \in \mathbb{N}$, assume that $F(k) = k!$.
    1. $F(k+1) = (k+1) \times F(k)$ [Definition of $F$]
    2. $F(k) = k!$ [Assumption on line 2]
    3. $F(k+1) = (k+1) \times k! = (k+1)!$ [Combining lines 2.1 and 2.2]
  3. $\forall n \in \mathbb{N}, F(n) = n!$ [Principle of mathematical induction]

Example 2: Climbing Stairs

Let’s say that there is a flight of stairs with $n$ steps, and we want to reach the top of the stairs. But we can either take 1 step at a time, or 2 steps at a time. How many possible ways are there to reach the top?

For example, if we had $3$ steps, then the number of ways would be $3$, because:

  1. Either we only take single steps.
  2. We take one single step, then a double step.
  3. We take a double step, and then a single step.

So in general, what is a recurrence $S$ such that $S(n)$ tells us how many ways there are to climb a flight of stairs with $n$ steps? To solve this, we should think about a few base cases first.

For $n = 1$, there is only one way: take a single step. For $n = 2$, there are two ways: take 2 single steps, or take 1 double step.

What about general $n$? Think about it this way: to reach the $n$-th step, we can count the number of ways to reach the $(n-1)$-th step, and then take a single step after that to reach the $n$-th step. Or count the number of ways to reach the $(n-2)$-th step, and then take a double step after that to reach the $n$-th step.

So the recurrence looks like this:

$$S(n) = \begin{cases} 1 & \text{if } n = 1 \\ 2 & \text{if } n = 2 \\ S(n-1) + S(n-2) & \text{if } n \geq 3 \end{cases}$$

Pay attention to how this time around, we have 2 base cases, $n = 1$ and $n = 2$. What happens if we only had a single base case of $n = 1$? Are there any issues with this?

Think about it: how many ways are there to climb 5 steps? Well, the number of ways you reach the 3rd step plus the number of ways you reach the 4th step. Why? Because for each way you reach the 3rd step, you can take a double step. For each way you reach the 4th step, you can take a single step.

Notice that in the left box, that is essentially $S(3)$ (the number of ways to reach the 3rd step), and in the right box, that is actually $S(4)$ (the number of ways to reach the 4th step). Notice that we need to take a double step after reaching the 3rd step, or take a single step after reaching the 4th step.

And $S(5)$ is really just the sum of those two counts!

You can even write a Python program that does this for you:

def stair_climbing(n):
	if n == 1:
		return 1
	if n == 2:
		return 2
	return stair_climbing(n - 1) + stair_climbing(n - 2)

You might notice that for large enough values of $n$, the program becomes very slow. This is something we will talk about in the tutorial and the next unit!

Example 3: Binary Search Recurrence

Let’s end by bringing it back to the motivating example of binary search. In terms of the big picture, recurrences are a tool used in program/algorithm analysis, among many other things. So how about we ask ourselves: how many array accesses does the binary search program make?

Let’s look at the code again:

def binary_search(arr, left, right, key):
	# left and right are inclusive indices into the sorted array arr.
	if left == right:
		return arr[left] == key

	# Round the midpoint up so that both halves are strictly smaller than
	# the current range; otherwise the recursion would not make progress
	# when right == left + 1.
	mid = (right + left + 1) // 2
	if arr[mid] <= key:
		return binary_search(arr, mid, right, key)
	else:
		return binary_search(arr, left, mid - 1, key)


binary_search(arr, 0, len(arr) - 1, key)

How do we begin? Well, let’s look at the base case of our algorithm: it’s basically saying that when the sub-array that we care about has only one element left, we access arr[left] and compare it against key. So when our sub-array is of size $1$, only one array access is made.

What about in general? Well, then our array is split into two sub-arrays: arr[left..mid - 1] and arr[mid..right].

In either case, if we started with an array of length $n$, we are going to recurse on an array of length at most $\lceil n/2 \rceil$.

So if we let $T(n)$ be the number of array accesses our binary search makes on an array of length $n$, we can then say that:

$$T(n) = \begin{cases} 1 & \text{if } n = 1 \\ T(\lceil n/2 \rceil) + 1 & \text{if } n \geq 2 \end{cases}$$

How do we analyse this recurrence? This is something we will study in detail in the next unit. But for now can we get a sense of its long-term behaviour?

We want to say this algorithm does not need to access too many elements in the array, only about $\log_2(n)$ many elements. How do we do this?

Let’s prove that for all $n \geq 1$, $T(n) \leq \lceil \log_2(n) \rceil + 1$.

Proof

  1. (Base case). Let $n = 1$, then $T(1) = 1 \leq \lceil \log_2(1) \rceil + 1 = 1$.
  2. (Inductive case). For $n \geq 2$, assume that for all $m$ with $1 \leq m \leq n - 1$, $T(m) \leq \lceil \log_2(m) \rceil + 1$.
    1. Since $n \geq 2$, then $1 \leq \lceil n/2 \rceil \leq n - 1$. Therefore, our assumption applies to $\lceil n/2 \rceil$.
    2. $T(n) = T(\lceil n/2 \rceil) + 1 \leq \left(\lceil \log_2(\lceil n/2 \rceil) \rceil + 1\right) + 1 = \lceil \log_2(n) \rceil + 1$ [By assumption on line 2.1, and since $\lceil \log_2(\lceil n/2 \rceil) \rceil = \lceil \log_2(n) \rceil - 1$ for $n \geq 2$]
  3. For all $n \geq 1$, it holds true that $T(n) \leq \lceil \log_2(n) \rceil + 1$ [Principle of mathematical induction]

So what does this mean? This means that we know that for arrays of size 1 onwards, the function $T$ is upper bounded by a logarithmic curve. This is a hint to us that we are not making many array accesses, and therefore the program is actually efficient!
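
To tie the recurrence back to code, here is a small illustrative Python snippet that evaluates $T$ directly from the recurrence and checks the logarithmic bound for a range of sizes:

import math

def T(n):
	# The recurrence for the number of array accesses, as defined above.
	if n == 1:
		return 1
	return T(math.ceil(n / 2)) + 1

# Check the bound T(n) <= ceil(log2(n)) + 1 for n = 1, ..., 999.
for n in range(1, 1000):
	assert T(n) <= math.ceil(math.log2(n)) + 1, n
print("Bound holds for n = 1, ..., 999")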