Overview
This unit introduces the notion of proofs, propositional and first order logic, and certain proof techniques. Along the way we will talk about proofs in the context of both discrete mathematics and computer science. The unit is split into 3 sub-parts:
- Motivation and introduction to propositional logic, first order logic and proofs.
- Propositional logic
- First order logic.
- Proof techniques (and patterns).
Part 0: Unit Introduction
Let’s begin by asking ourselves how exactly it is that people decide what “facts” are true and which are false in reality. Even this is a little philosophical, but it’s a great starting point to contrast with how other fields do it:
Example
- How do physicists agree that certain statements like or is true or not?
- How do pharmacists decide if a certain drug is effective in treating a certain disease or not?
- How does the criminal justice system decide if a certain person has committed a crime?
These are examples of statements whose truth has to be ascertained. In the case of science, the answer is usually that experiments are run, and in light of the data available, one might infer with reasonable certainty that a statement is true (or otherwise). In the case of the justice system, two lawyers argue their case in front of a judge. Every party presents their interpretation of the laws in human language and the “truth” is decided by the judge (or jury) involved. In a manner of speaking, we can consider this process as giving proofs.
What about in mathematics? In its purest form, there are no experiments to run, and ideally we do not appoint random people as arbiters of truth in math (like judges in the legal system). Because of the nature of the statements (as you will see) we want something more robust and universal. That said, the job is pretty much the same: We want to figure out which statements are true, and which statements are not.
Formulating statements themselves is a bit of a tricky task (perhaps as compared to other fields) and in this unit we will be talking about how to express ourselves in this language that is somewhat free from human language. Besides that, we will need to talk about how do we go about determining if certain statements are indeed true or not. As it turns out, that will be the bulk of the work for this course.
Let’s start being a little more concrete and a little less abstract! Say I came up to you and said something like “The sum of the numbers , , and so on until is exactly .” You might first of all wonder “Why do we even care?“. As it turns out, even such a benign statement has uses in algorithm and program analysis. For example if we had the following for loop:
for i in range(n):
for j in range(i + 1):
do_something()
How many times is do_something()
called? Perhaps we want to know because do_something()
takes very long to execute, and we want to know how often are are doing it.
Looking at this program and being able to reason about this sort of behaviour is something discrete math is about (in computer science). For example, we know that when , the inner loop runs time. When , the inner loop runs times, and so on. So the inner loop is run , which is written as . But how do we know that this is the same as ?
Sure we could try it for and realise that the two are the same. But what if there is some magical funny number for which they are different? How do we know for a fact that it works for all numbers?
This is the kind of question whose truth we wish to determine or ascertain in discrete math: “Is it true that for all numbers , adding numbers up until is the same as ?”
Reformulating the statement to look more mathematical, we would instead write something like:
Is it true that for all numbers , is ?
Here, means .
There’s a lot of these statements that we can ask. And in due time, we will ask, and answer a lot of these. But first! We also need to learn how to speak in a new language. Why? There are a few reasons:
- We want something that isn’t 100% bound to any particular language.
- Human language is quite ambiguous at times. And different languages are ambiguous in different ways.
We can’t eliminate the use of human language completely without making our jobs entirely miserable, but hopefully as a huge plus you’ll see that after we pick up this new system, it actually makes it easier to reason about statements in computer science and math.
Why must we do this?
It is a little hard to explain the ease of use this new skillset would bring to someone who hasn’t ever used it before. Perhaps the best way is to show an example and how unwieldy plain human language is for the task.
Let’s say we had the following code:
def find_max(list_of_numbers):
max_value = list_of_numbers[0]
for idx in range(1, len(list_of_numbers)):
max_value = max(max_value, list_of_numbers[idx])
return max_value
How do we argue it is correct? I know the code probably looks correct and most people would not really question it. But for the sake of example, how does one really look at this and convince themselves the code is correct? You might say something like:
Well we start from the first element of the list, set that to
max_value
, then go through every element of the list, and we keep updatingmax_value
to the larger so it has to be the maximum of the entire list.
Which pretty much matches our intuition. And intuition isn’t a bad thing! But what happens if we start looking at more complicated programs, bigger algorithms? We might want something a little more reliable.
The other issue is that some terms here in English are often vague. For example, here’s a common sign that we see in SoC: “No food and drink allowed in the computer lab.”
So which of these options does it mean?
- No (food and drink) allowed in the computer lab.
- (No food) and drink allowed in the computer lab.
- (No food) and (no drink) allowed in the computer lab
And another question you might ask yourself is whether option 1 and option 3 are the same. Do they mean the same thing? Some might interpret them to be the same. So what happens to the following people?
- The one that brings both food and drink into the lab.
- The one that brings only drink into the lab.
- The one that brings only food into the lab.
Which ones get banned and which ones do not? This quirk in human language is something we will want to avoid.
If you’ve ever noticed that legal documents tend to be verbose, it has in part to do with the fact that they aim to eliminate all forms of confusion as possible. Sometimes legal arguments are about how to reasonably interpret agreed upon literature. In computer science we take a slightly different approach of having a slightly modified language where the interpretation is mostly agreed upon by everyone.
The reason for this difference is that unlike in the legal field, we are more concerned with statements and truths that are inherently mathematical in nature. For that reason, almost no amount of verbosity in English will be helpful here. Furthermore, it is far more efficient and straightforward to pick up a new way of speaking and reading. But don’t worry! You will have an entire semester to practice this and I promise your journey in formal CS will be smoother because of it.
In English, the grand scheme of this unit is this: We want to be able to write proofs. To do this, there are a few things you will need to know. How to read and understand basic sentences in proofs, the rules that are applicable in proofs and how to avoid errors. And how to “say what you want to say” in proofs. The roadmap laid out for you is the following: Learning the basic sentences of proofs. Learning how to connect sentences together. Learning how to make deductions starting from assumptions and how to avoid errors when doing so. Then higher level proof strategies.
But to sum up:
Unit Learning Outcomes
- Part 1: Understanding propositional logic
- Knowing and forming propositions and propositional formulae
- Knowing and using logical connectives on propositions to form propositional formulae
- Evaluating propositional formulae
- Understanding logical equivalences
- Part 2: Understanding first order logic
- Understanding Predicates
- Understanding quantifiers
- Evaluating first order logic statements
- Part 3: Understanding basic proofs
- Knowing what assumptions, premises, and conclusions are
- Knowing what a proof needs to do in order to prove a statement
- Inference/Deduction rules for proofs
- General Proof Strategies
Just remind yourself, big picture: we want to do proofs.
Part 1: Propositional logic
Basic Propositions
Our starting point is forming propositions. Think of this as part 1 of the grammatical system. Let’s re-examine that sentence again and see how we might view this in a new light:
Instead of “No food and drink allowed in the computer lab.”, let’s first re-write it into “No food allowed in the computer lab and no drink allowed in the computer lab.”
Now in propositional logic, we will consider “No food allowed in the computer lab” as a proposition. We will also consider “no drink allowed in the computer lab” to also be a proposition.
Think of a proposition as a statement that is either true or false. So these are more examples of propositions:
- (20 + 25)(20 + 25) = 2025
- 50 - 20 = 1
- Birds can fly.
In physics there are atoms the widely-assumed-to-be indivisible objects of this world. Here in propositional logic, propositions are the atoms (in that these are the smallest things).
Now, the first statement is true, the second is false, but what about the last one? One might think yes, of course birds can fly. But what about penguins, ostriches, and emus? So is the statement false? Is it true? Depending on who you ask, the answer might be different but thankfully, this is not the type of statement that we are concerned with in CS and math.
Logical Connectives
Next thing to note is that we can form bigger and bigger formulae from smaller ones by using the following logical connectives:
- and
- or
- if, then
- not
Again, using our example, we have the propositions “No food allowed in the computer lab”, “No drink allowed in the computer lab”, we can make the following formula:
No food allowed in the computer lab and no drink allowed in the computer lab.
Likewise, we could also make the following other examples:
- No food allowed in the computer lab or no drink allowed in the computer lab.
- If no food allowed in the computer lab then no drink allowed in the computer lab.
- Not (no food allowed in the computer lab).
Perhaps the last one is a little clunky, and over time we’ll start avoiding English and these clunky issues will stop persisting.
In general, we expect to do the following with logical connectives:
- [Proposition 1] and [Proposition 2]
- [Proposition 1] or [Proposition 2]
- If [Proposition 1] then [Proposition 2]
- Not [Proposition]
To be clear, here some other examples of simple propositional formulae that we might see in math and CS:
- 1 + 1 = 3 and 25 + 1 = 26
- not 1 + 1 = 3
- 1 + 1 = 3 or 25 + 1 = 26
- if 1 + 1 = 3 then 25 + 1 = 26
We can also chain these to make even bigger ones, like so:
- and and
- if and then or
Rounding Up
- Propositions are statements that are only either true or false, and never both at the same time.
- We can create even bigger formulae by connecting other propositions using either and, or, if, then, and not.
The next thing to ask is how do we determine the truth values of the bigger propositions? This is a key step in eliminating vagueness from our language: the wider math and CS community has agreed on these meanings and this behavior.
Logical symbols
One more thing before we proceed, in line with understanding conventions, let be propositions (substitute them with anything you like), we write the following:
logical connective | respective symbol |
---|---|
and | |
or | |
if then | |
not |
Behaviour of logical connectives
Let’s define the behaviours of the operations now.
true | true | true | true | false | true |
true | false | false | true | false | false |
false | true | false | true | true | true |
false | false | false | false | true | true |
Okay that table might be a little overwhelming, but it’s a good summary of what you need to understand in this section. Let’s get into it.
The And Connective
Let’s focus on the and operation, here’s the table containing only those relevant columns.
true | true | true |
true | false | false |
false | true | false |
false | false | false |
So what’s going on here? If we have two propositions , then is true only when both of them are true. Otherwise, if at least one of them is false, then is false. This might be the most intuitive one.
The Not Connective
Moving on, let’s talk about the not operation.
true | false |
false | true |
So this one might be a little intuitive too, when a proposition is true, applying the not operation makes it false, and vice versa.
The Or Connective
Let’s move on to something slightly more unintuitive, the or operation.
true | true | true |
true | false | true |
false | true | true |
false | false | false |
Let’s get the obvious stuff out of the way, when and are both false, has to be false. And when at least one of is true, then is true. But what about when both are true? By convention we have chosen to say that is also true. It might seem slightly unintuitive, in English it is common we think or means that only one of or is true but not both. But in mathematics, this is the more wieldy definition.
The Implication Connective
Lastly, the most unintuitive of the bunch. Let’s spend some time on this one:
true | true | true |
true | false | false |
false | true | true |
false | false | true |
Let’s use this statement as an analogy:
If “Sam is a cat”, then “Sam has paws”.
When is the statement true? What is the statement false?
Notice that if Sam is indeed a cat, but Sam does not have paws, then the entire statement is false. Think of this as a promise that has been broken.
Here’s the intuition in English: The statement is a commitment. A promise that as long as Sam is a cat, then in return we know Sam must have paws. So the moment we find out it is otherwise, the statement is false.
Okay, so far we’ve been considering when Sam is a cat. But what about if Sam is not a cat? Do we care if Sam has paws? Should we expect the promise to hold? No. It is not applicable anymore, so in this case we consider the statement to be true regardless.
So let’s move back to the bigger picture, given a statement , here are the important things to note:
- We call the antecedent.
- We call the consequent.
- When the antecedent (which is here) is true, then must be true.
- When the antecedent (which is here) is false, then it does not matter what is, is always true.
You might wonder at this point “why is it defined this way?” and you will see the answer when we start doing mathematical proofs with them. The answer is like an onion, there’s many layers to it:
- At a beginner level, the answer is that “It is actually quite intuitive to think of it that way.”
- At an intermediate level, the answer is that “A lot of the proofs line up and things work out.”
- At an even higher level, there’s nothing too special about the “if, then” connective… but it does work out nicely for what we want it to do.
Here’s yet another example of how to use :
Which reads:
If is , then is .
What happens if is not ? Then can we say is not ? We can’t! After all, consider when . Then is , but is still true.
Anyway! Don’t worry too much about it, my recommended way of viewing it right now is just that these are very common logical operations we wish to perform, and therefore we have chosen to give these a name.
Evaluating formulae
So given a formula, for example, , a truth value assignment to this formula is when we set each proposition (which we can also call a variable) to either true or false. Here’s an example, means that: is now because of the values of . Furthermore, is now because of the values of . Now, the entire formula is basically , which is, as we know, .
In plainer terms: we are just substituting variables for truth values then seeing what the resulting truth value is. Maybe in high school you might have seen something like , and if you substitute then we know .
Truth tables, logical equivalences
At this point it might be a good thing to talk about when two formulae are the same. Let’s consider these two as an example:
Are these two the same? Perhaps we could work intuitively first and see what it means. When is the first formula true? When is it false? One way to figure that out is work it out by hand, and using something called a truth table.
true | true | true | false |
true | false | true | false |
false | true | true | false |
false | false | false | true |
In the table, we write out all possible truth value assignments (we have so there are 4 possible truth value assignments) and work out each intermediate step. The last column is the last one we care about. And here notice that the third column depends on the first two, and the final column depends only on the third column.
Let’s do the same for the second formula:
true | true | false | false | false |
true | false | false | true | false |
false | true | true | false | false |
false | false | true | true | true |
This time, the third column depends on the first. The fourth column depends on the second, and the final column depends on the third and fourth.
Oh look! The final column is the same. This means the two formulae are logically equivalent. The other way of seeing this, is that no matter how we set in the first formula, if we also set the same way in the second formula, it evaluates to the same truth value.
Consider instead these 2 examples:
Are these equivalent? Let’s try making another truth table:
true | true | true | true |
true | false | false | true |
false | true | true | false |
false | false | true | true |
Notice how when is set to true, and is set to false, is false, but is true. So these two formulae are not logically equivalent.
Is there a different way for us to tell if two formulae are equivalent or not?
Yes, there are a few ways, but for now this is the most reliable way. To be clear: If for all possible truth value assignments, the two formulae always evaluate to the same value, they are logically equivalent, otherwise, they are not.
Rounding Up
- We can evaluate the truth value of formulae.
- We can create truth tables from formulae.
- We can compare to see if two formulae are logically equivalent.
What
In due time you’ll see that this is one of the few tools we have to understand and navigate this new language. It is a little hard to see the forest for the trees right now. But trust that understanding this and getting used to it is builds a strong foundation for everything we will be doing throughout the semester.
A sneak peek into how we might use this: Keeping sight of the goal
We have not gone into yet but here’s a rough idea of how we might expect to use this tool.
- If Socrates is a human, then Socrates is mortal.
- Socrates is a human.
- Therefore, Socrates is mortal.
Think of this as a proof that Socrates is mortal. It consists of a few parts that we have not yet covered, and will see at the end of this unit. But for now, what you can understand is that each line is made of propositions. The propositions here are “Socrates is a human”, “Socrates is mortal”. Then re-writing this, we can think “Socrates is a human” as , and “Socrates is mortal” as .
Then instead the above proof looks like:
- Therefore, .
For now we have not talked about how to take lines 1 and 2 and create the conclusion on line 3. But the first step is being able to build up the sentences. Perhaps one thing we can take away right now is the following: “If is true then is true. is true, therefore is true.”.
Let’s move onto part 2 where we want slightly more sophisticated sentences.
Part 2: First Order Logic
Let’s compare the two following proofs:
Proof 1:
- If Socrates is a human, then Socrates is mortal.
- Socrates is a human.
- Therefore, Socrates is mortal.
Proof 2:
- All humans are mortal.
- Socrates is a human.
- Therefore, Socrates is mortal.
The two proofs seem to have the same idea, they want to conclude that Socrates is mortal. Yet, the way that they do it is very different. Based on part 1, we can clearly mark out what the propositions are. If we tried to do the same with Proof 2, we would instead get the following: “All humans are mortal” is a proposition, “Socrates is a human” is a proposition, and “Socrates is mortal” is a proposition.
So we would end up seeing this:
- Therefore .
At this level, the 3 lines of proof 2 look unrelated to each other. And that seems to be an issue. After all, we would be very happy to accept something like Proof 2. It does look quite reasonable. So what are we missing? Remember that when talking about basic propositions, we said that propositions were like “atoms”, indivisible and we were not allowed to pick apart the internals or the meanings of it. The other thing that will be useful will be to say “everything” or “something”. So to do this, we have 2 new concepts to introduce to our “sentences”: Predicates, and Quantifiers.
Predicates
The first thing to do, is to create a new kind of “word” in our sentences, called predicates. In proof 2, we want to create a predicate called . You can think of like a function that takes objects, and outputs either true or false. So based on this, instead of saying “Socrates is a human.”, we will instead say ” is true”. Here, is an object, and the predicate evaluates to true when given as input. Perhaps is another kind of object, and evaluates to false.
Very similarly, instead of “Socrates is mortal”, we will instead write .
With that, we have changed lines 2, and 3 of Proof 2. But what about line 1?
Quantifiers
The second thing to do, is to introduce quantifiers. We want to be able to say “every human is mortal”. In order to do so, we will write the following:
Okay, maybe a little intimidating. How do we read this? Let’s begin with "". is mathematical notation that means ” is in the set of humans”. Think of a set here as just a collection of objects. Here we are basically saying that is a human. The symbol means “every”. So putting it together, the first part is essentially saying “Every human”.
The second part is probably a little more readable: ” is mortal.”. So putting that together, we have:
This is what we call the universal quantifier. Think of this as a quantifier that says something about the entire universe. Here, the universe is the set of humans.
To elaborate a little bit more, let’s for now pretend that all the only two humans in the world were John and Sam. Then we would write the following:
Then coming back to , since (in other words, comes from the set of humans), we have it so that will take value , and then will also take value . So is true, and is also true.
There is another quantifier we have not mentioned, the existential quantifier. What if we instead wanted to say “some humans are mortal”? We write the following:
How do we read this? Now the means “there exists is some human that we will call “. And the second part says that ” is mortal.”. In English:
Coming back to Proof 2, here is how we will write it:
- Therefore
Again, we have not talked about how to tell this proof is valid (or even what is a proof), but the goal of this part is to make sure you are able to at least read back each line to yourself in English and be convinced of its meaning.
Here’s a diagram that roughly explains the format:
Another example: Expressing Even Numbers
How should we say a number is even? In English we might say something like “A number is even if it is divisible by 2.” What does it mean here to be “divisible by 2”? After all, we can divide by , we just get . Perhaps what we mean to say is that a whole number is even when is also a whole number. We will also need to take our numbers from a set. For this, the symbol denotes the set of all whole numbers (integers).
In discrete math, we say say that is an even number if:
Let’s read this back in English, and see what it means:
There exists a value from the set called that we will call . For this value , times is equal to .
When we write , this means is in the set of whole numbers. That is to say: is a whole number. (It can be negative, it can be , it can be positive, but it is a whole number). We call the set of integers.
Rounding Up
- We have seen uses of predicates.
- We have seen uses of existential and universal quantifiers.
For now, perhaps when and how we can make predicates is a little vague but the best way to understand them is via seeing them in action in Part 3 (and the rest of the semester). For now, take them to be the way we give “properties” to objects, like how we can say “Socrates” (as an object) has both the property of being human, and also mortal.
Certain manipulations and properties about quantifiers
There is an interesting aspect about quantifiers we need two talk about: up to the previous part, we have been able to say things like:
“Every person is mortal.”
“There exists an even number.”
What about if we wanted to say something like:
“Every car has a steering wheel.”
“There is a planet that everyone lives on.”
Then, we need to use more than one quantifier.
What happens if we have more than a few of them? Let’s see some examples relating to numbers that does that. We will need a set to work with, let’s consider the set of all non-negative integers, i.e. the set that contains and so on. This set is denoted by the symbol . We call these natural numbers.
We will use the symbol to mean “smaller than or equals to”, and to mean “greater than or equals to”. What if we wanted to write the following mathematically?
- There exists a natural numbers that is smaller than or equals to all natural numbers.
- It is not the case that there exists a natural numbers that is greater than or equals to all natural numbers.
Let’s begin with the first one, this is a prime example in nesting quantifiers. That is to say, using more than one.
Very succinct right? Reading it back, here’s how we should parse it:
There exists a non-negative number that we will call , fix this . For this , for all non-negative numbers , is less than or equals to .
One thing to take note of here is that is chosen before considering all values of . Do we believe this statement to be true? To prove that this statement is true, we need to pick a value for . What should the value be? It should be !
Let’s look at the second statement.
Notice that we have surrounded the entire statement with a "". This is done to say that we want the negation of the inner statement. What is the inner statement saying? It is saying “there exists a non-negative number that we will call . Fix this , for every non-negative number , is greater than or equals to “. Since we want the opposite of that statement, we added the negation on the outside.
Alternating Quantifiers
The first question we might want to ask is: Do the order of the quantifiers matter? For example, for the first statement, what if we had instead written:
Reading this back, this now says:
For every possible value, call it , we can find at least one for which is smaller than or equals to .
Do they mean the same thing? The original is saying we can find a value that is smaller than or equals to all other values. The latter is saying that no matter the value we pick, we can always find something smaller than or equals to it. These do not mean the same thing!
Here’s an analogy, are these two statements the same?
“Every car has a steering wheel.” vs. “There is a steering wheel, that every car has.”
See how they don’t mean the same thing?
Negating Quantifiers
Let’s also take a look at what it means to negate a statement that has quantifiers in it. Let’s think about when a number is not even. We know that we can write this as:
That’s simply by negating it. But we can also write this as:
Which basically says, “for every integer , is it not the case that is equal to .” In plainer terms: it means we cannot write as , where is an integer. Let’s go through on more example, the second statement from the previous section:
Can we think of a way to write this where we do not have a negation on the outside? We’re trying to say “It is not the case that there is a single value that is greater than or equals to all values”. Why is this the case? We can think of this statement equivalently in the following way:
For every value , it is not the case that is greater than or equal to all values.
Mathematically:
Take a while sitting on this and reading it to convince yourself it makes sense.
We can go a little further, and say:
For every value , we can find a value , for which is it not the case that is greater than or equals to .
Mathematically:
Again, take a while to sit on this and convince yourself that they are the same.
In general: We can move a further to the right past a quantifer by changing it from a to a , and vice versa.
So for example, all of the following are equivalent:
Variable Naming does not matter
The last thing you might wonder is whether the variable names matter. It does not! So, for example, these are all the same:
Pay special attention to lines 2 and 3 and notice that we have swapped the names for and , but we have also swapped how we use them in the predicate .
Implications, and Equivalences
Let us end part 1 and 2 on 2 important concepts: Implications, and Equivalences.
Implications of Statements
We have been talking a lot about forming statements, and it’s time to start talking about two potential relationships between statements.
For example, let’s say we had the two following fictional statements:
Statement 1: All blargs have paws. Statement 2: All zorps and blargs have paws.
What can we say about Statement 1 vs Statement 2? Let’s say we believe in statement 1. Can we then say “Therefore statement 2 is true.”?
What about the other way around? If we believe statement 2, can we then say “Therefore statement 1 is true.”?
We have yet to talk about how to formally make these deductions (in Part 3), but let’s appeal to your sense of intuition for now. It makes more sense that Statement 2 follows from Statement 1. Because of this, we will say that “Statement 2 implies Statement 1”. Reminder that we can write this as “Statement 2 Statement 1”.
Okay that was the intuitive direction. But can we also say “Statement 1 Statement 2”? Let’s think about whether that seems reasonable. A good counter-argument might be the following:
If we believed statement 1, we are only convinced that all blargs have paws. From this statement alone, we actually don’t know anything about zorps.
In the case that it turns out that zorps did not actually have paws, we cannot believe that Statement 2 is true.
So we should not be able to say “Statement 1 Statement 2”.
Contrapositivity
This covers the idea of when a statement implies another statement. There are also a few other key features we should talk about implications in general. Let’s say we knew "". I.e. if is true, then must be true.
We can also argue that if is false, is false. I.e. . This is called the contrapositive form of the first statement.
Equivalences
The last thing to round up on is talking about when two statements are equivalent. Consider the two following statements:
I like ice-cream or I like cake
It is not the case that (I do not like ice-cream and I do not like cake)
The idea behind the first statement is that either the person likes ice-cream, or likes cake, or likes both ice-cream and cake.
What about the second statement? It might look a bit confusing, let’s take this step by step. Reading the inner portion, “I do not like ice-cream and I do not like cake”. This means that there is only one possible case, the person dislikes both. But once we add the outer “not”, it means that this is the case that is impossible. So what are the possible cases then? If the person in the second statement either:
- Likes only cake
- Likes only ice-cream
- Likes both cake and ice-cream
Then we can say it is not the case that they dislike both ice-cream and cake. But wait! Isn’t that the same as the first statement? We used different words to say the same thing.
This might already look familiar to you, earlier on we talked about how these two formulae are logically equivalent: and .
This is part of a general phenomenon. Here are some other intuitively equivalent statements:
- and are equivalent
- and are equivalent
- and are equivalent
- and are equivalent
How do we tell? One way is to use the method used in the section: Truth tables, logical equivalences.
Part 3: Proofs in First Order Logic
Okay! We are finally in place to start making proofs! Now that we know what the words and sentences look like, the next and final step in this unit is how we are to go about deducing statements that we want. For us to do this, we need to recognise the form a proof, what it is, what are steps that we can take in proofs.
Our plan of attack for Part 3 is roughly the following:
- We will look at informal proofs in English with a hint of math.
- We will talk about proofs and how they correspond to the statements they prove.
- We will talk about what rules we use in math in our proofs.
- We will end on looking at bigger, and bigger proofs.
As for point 3, it will be a bit overwhelming, but my reasoning is that I would like the page to also be a reference that you can come back to, to look at all the rules that are allowed. Over the semester we will try to get you more and more accustomed to the rules by doing proofs.
Going into this part, it’s helpful to take into the mindset that we are trying to understand a systematic way to form arguments. And to do this, the broad idea is that we start with our assumptions, and we make step-by-step logical deductions.
First example of a proof:
Let’s re-visit the example we had just now:
- [Premise 1]
- [Premise 2]
- Therefore [Conclusion]
I understand it takes a little getting used to reading symbols, but the more you do it, the sooner you get used to it. The first feature of a proof are what we call the premises. In the above proof, lines and are premises. Think of premises as statements that we assume to be true. After all, we do believe every human is mortal, and we do believe the Socrates was a human.
What about line ? Line is the conclusion of the proof. This is the final statement that we wish to conclude. To be clear, we are not assuming that Socrates is mortal, we want to be able to conclude it. To do so, we must deduce line using lines and .
In order to do so, we will use rules of deductions. We will exhaustively list them out later. But for this current introductory example, we are using a rule called universal modus ponens.
It’s a very fancy name, but what it means is that if you see any line that looks like:
where is some set, like . And is any statement about , like ,
and you also see a line like:
like when we said ,
then the rule modus ponens allows you to deduce that is true in your proof.
Okay this is a little abstract, what does universal modus ponens mean in English? Let’s take a step back and try to think about it. If we have a line that says:
we are essentially saying “For every possible object from set , holds true.”
Furthermore, the line
is saying that is from set .
So since we know comes from set , we can happily conclude that “A-ha! satisfies predicate !“.
Notice here that the rule doesn’t care about what we said about humanity or mortality. As long as it matches the pattern, it will be allowed. That means we can also write something like this:
- Therefore .
The first line is saying that Tabby is a cat (or rather that Tabby is in the set of all Cats). The second line effectively is saying all cats have paws, and the last line is saying Tabby has paws.
So let’s re-cap a little at this point what has gone on. Lines 1 and 2 are premises (notice we didn’t prove lines 1 and 2, we are assuming they are true on good faith). Line 3 is a deduced line using lines 1 and 2, and the deduction rule used modus ponens.
Now, very importantly, what have we done here? We have written a proof that effectively has proven the following statement:
“Assuming that Tabby is a cat and assuming that all cats have paws, then we conclude that Tabby has paws”
Formally, we will write the proven statement in the following way:
What is the above proven statement? The above statement says that if is true, and is true, then is true. See how this matches what we have in quotes? Take some time to appreciate the similarities between what we have in English, and what we have written out here in the formula.
Okay, that was our first example. To do more involved things, we need to first look at some rules of inferences. In the later parts, we will show examples of proofs that we want to do. Focus on the following:
- What the premises are
- How we obtain the intermediate steps using rules of inferences
- What is the conclusion
Correspondence between proofs and statements
Bear in mind that very importantly, if we have given a proof that starts with premises , and we derive statement as our conclusion, then we have the following proven statement.
Assume , then it follows that .
We call proven statements as theorems.
Look at the example again the two things: (1) the proof that Tabby has paws, and also (2) the proven statement that we obtained due to the proof. Look at how it corresponds.
An Aside
Take some time to appreciate that what we are doing is actually making formal, rigorous arguments using first order logic.
Why do we do this? The idea is that we want a systematic approach in telling us what is true and what is not. In some sense, in the future when we are concerned with whether our algorithms/programs are correct, whether we can apply our concurrency guarantees, whether our database schemas are doing what we want, we want something better than having an arbitrary human be the arbiter of truth.
In other words, whether an algorithms works should not be based on gut feeling, or based on our subjective moods. Having an intuition and being convinced that something works is important, yes. But the tools that we are about to present to you are say that you may derive truth in a more objective manner.
Allowable Rules of Deductions/Inferences
In this section, we are going to list almost all of the allowable rules in proofs. We will show one or two proofs that try to demonstrate how each rule is used. In general, the example proofs will be really tiny to try to use only that rule in isolation. (But sometimes that might not be possible.)
At the end of the unit we will show bigger proofs that use some of these rules in combination.
Rule: Definition Unpacking
Throughout discrete math, we like giving common and important concepts names. Again, a formal way of saying is even is to write:
Formally, we can say:
The predicate is defined to be .
It is hard to demonstrate this rule in isolation so we will see it being used later on in the subsequent rules.
Definition Unpacking Rule
Given a definition, e.g. , and a line of the proof , we may derive the line on the other side of the , which is .
Similarly, if we are given the line , we may derive the line .
Throughout the course we will see more and more definitions (also in the tutorial). For now let us use this one for our remaining examples for this unit.
Rule: Logical Equivalences
Remember that we talked about how to check if two statements are logically equivalent? This is a step that we will allow in our proofs! Here’s an example:
Theorem
Assuming is true, then is true.
Now this probably looks familiar, these was one of the examples we actually used to talk about equivalences. So here’s how the proof step goes.
Proof
- Assume .
- [Logically equivalent to line 1]
To be clear, we know that and are logically equivalent because before this, we verified it with a truth table. You can find it again in section Truth tables, logical equivalences.
Here’s another example, i.e. the opposite direction:
Theorem
Assuming is true, then is true.
How do you think you should prove this? Try writing it down by hand first if you wish. We have the solution here, you can click on it to reveal the answer.
Solution
- Assume .
- [Logically equivalent to line 1]
So to be clear, when can we use this rule? We can, if we have separately checked/verified their logical equivalence via a truth table.
Logical
Given a statement, we may derive a new statement from the previous if the new statement it is logically equivalent. Note that we may verify if two statements are logically equivalent via truth tables.
In the tutorial sheet, we will cover some special equivalences that are very useful and handy.
Rule: Basic Algebra
Example usage:
Theorem
Assuming , then .
Proof
- Assume that .
- Then [By Basic Algebra from line ]
- Then [By Basic Algebra from line ]
Here, line is our premise. Line is our conclusion. And the justifications are laid out in square brackets. Basic algebra is something we are happy for you to use (for free)! You can think of line as an intermediate step. It is neither a premise nor a conclusion, but we can write line because it is a derivation from line . Similarly, line is derived from line .
One other thing to take note of is the theorem statement vs the proof. The statement starts with “Assuming “. This must be the very first line of our proof. Secondly, the proof ends with “then “. This is the conclusion we must prove. So this must be the very last line of our proof. Every other intermediate line must be justified.
Don’t worry too much about how much algebra you need to know. If you know how to add, multiply, divide, square root, exponentiate, and logarithms, that is all the algebra you need to know.
Rule: Specialisation
Example usage:
Theorem
Assume (), then .
Proof
- Assume that ().
- Then . [By Specialisation on line ]
Again, line is our premise, line is our conclusion. How did we derive our conclusion? We used the rule of specialisation on line . What is specialisation? In English, it takes a statement like , and says that you are allowed to conclude . Let’s think about what it means. Intuitively, if you are convinced that both and are both true. We should be able to say that is true.
Specialisation Rule
Given statement , we are able to derive statement . Furthermore, given statement , we are able to derive statement .
Rule: Conjunction
Example usage:
Theorem
Assuming then .
Proof
- Assume that .
- Then . [Basic Algebra from line 1]
- Then . [Basic Algebra from line 1]
- [Conjunction on lines 2 and 3]
This time, the notice that lines 2, and 3 followed from line 1. Since we derived both of those statements, we know both of them to be true. Therefore we can say line 2 and line 3 are true. So, we can use the connective on both lines.
Conjunction Rule
Given statement , and separately . We are able to derive statement .
Rule: Generalisation
Example usage:
Theorem
Assume , then .
Proof
- Assume that .
- Then . [By Generalisation on line 1]
This looks a little different. Let’s think about what this means intuitively in English: “If we are convinced that statement is true, then we are convinced that statement is true.”.
Generalisation Rule
Given statement , we are able to derive statement . Furthermore, given statement , we are able to derive statement .
Rule: Proof By Cases
Example usage:
Theorem
Assume , then .
Proof:
- Assume .
- Case 1: Assume
- Then [Basic algebra]
- Case 2: Assume
- Then [Basic algebra]
- [Proof by cases on lines 1, 2.1, 3.1]
What is going on here? We are saying that if is either or , then in both cases they are bigger than . We prove this for each case separately (in this small example this was pretty straightforward). Importantly, if we had more than cases, we need to prove more things. Here’s yet another example:
Theorem
Assume , then .
And notice here how the proof changes:
Proof:
- Assume .
- Case 1: Assume
- Then [Basic algebra]
- Then [Basic algebra]
- Case 2: Assume
- Then [Basic algebra]
- Case 3: Assume
- Then [Basic algebra]
- Then [Basic algebra]
- Therefore [Proof by cases on lines 1, 2.2, 3.1, 4.2]
In general:
Proof by cases rule
Given a statement , and if we can assume to prove , and if we can assume to prove , then we can conclude .
Must we handle each case?
Yes. Here’s an example of how you could go wrong if you don’t. Consider this faulty statement:
Which says that if is , or is , then . So let’s consider setting . Then evaluates to true, but , which means is false.
Here’s a faulty proof that skips a case:
Faulty Proof:
- Assume .
- Case 1:
- Then [Basic algebra]
- In all cases, it is shown that .
Rule: Modus Ponens
Example usage:
Theorem
Assume that , and further assume . Therefore .
Proof
- Assume .
- Assume .
- Therefore [By Modus Ponens on lines 1 and 2]
This example is a demonstration of a classic rule of inferences. It takes 2 lines:
- If we believe in , we must also believe is true.
- We believe in
And makes the following conclusion: 3. We believe in .
Modus Ponens Rule
Given statements , and , we are able to derive statement .
Rule: Modus Tollens
To make things a little simpler in our proof system, and a little more flexibility: let’s also think (intuitively first, before formally) about what else we could say. What if instead we were given the following?
- If it is raining, then I will bring an umbrella.
- It is not the case that I will bring an umbrella.
Can we say something about whether it is raining? Well we were promised if it was raining, we would have brought an umbrella. Considering how we are not bringing an umbrella, it cannot be raining. So we can actually also do the following:
Theorem
Assume that , and further assume . Therefore .
Proof
- Assume .
- Assume .
- Therefore [By Modus Tollens on lines 1 and 2]
In general, here is the rule:
Modus Tollens Rule
Given statements , and , we are able to derive statement .
Rule: Implication Introduction
So far, in the previous rules, we have been using implication statements in one way or another. What if we wanted to create implication statements? Here’s an example statement we can try to prove:
Theorem
Okay this might look a little intimidating. Let’s read it back in English, what is it saying? The statement here is that “If we believe is true, we believe is true”. Hold on a minute, this looks very familiar! Doesn’t this look like the Specialisation rule? Yes! Except now the statement has an implication instead of “Assume , therefore “.
Here’s the proof and how we use the deductive rule.
Proof:
- Assume .
- [By Specialisation on line 1].
- [By Implication Introduction on lines 1 and 1.1]
So what is going on here?
- On line we have made it very explicit that we are making an assumption that is true.
- We derived line 1.1 using the Specialisation rule on line 1.
- We derived our concluding line 2 by using the Implication Introduction rule on lines 1 and 1.1.
What’s the idea? Intuitively, our proof system makes an assumption that is true, so since we assumed it to be true, we can now start deriving other lines from it as well. In fact, line 1.1 is such a line. Line 1.1 is also the sub-conclusion from line 1.
Since we assumed and we concluded from it, the Implication Introduction rule takes the assumption, and also the sub-conclusion, to create the final line. In this case line 2. It takes on the form . So in our example, we obtain line .
In general, here is the rule:
Implication Introduction Rule
Assuming statement , if statement is derived as a sub-conclusion, then the Implication Introduction rule derives statement .
Rule: Double Negation
Here’s another (perhaps intuitive rule) that we have about the negations. In math, a double negative is pretty much the same as the original thing. That is to say: is logically equivalent to . While this might not make sense in real life, this is something that math abides by.
Double Negation Rule
If we have a statement , we are able to derive statement .
Frankly speaking this rule is rarely ever used in isolation. We will see uses of this in bigger proofs.
Rules: (Existential/Universal) (Generalisation/Instantiation)
For the sake of exposition, it is a lot more natural to consider all these 4 rules together at the same time for this section.
Let’s begin with a smaller example that demonstrates the use of universal instantiation. Let’s see that in action by proving this theorem formally:
Theorem
Assuming , then .
What is this theorem saying? It is saying that if we believe that any integer squared is non-negative, then squared is non-negative. How do we prove this? Let’s see this in action:
Proof
- Assume .
- [Basic Algebra]
- [Universal instantiation on lines 1 and 2].
What has happened here? Let’s explain the idea of the proof in English. Line 1 is our premise, it is assuming that all integers are such that if you square them, they are non-negative. Line 2 is bringing up the fact that is an integer. And line 3 is basically stating the following:
Since all integers are such that if you square them, they are non-negative. It is also true for any specific integer. We are convinced on line 2 that is an integer. Therefore, we are convinced by combining lines 1 and 2 that is also non-negative.
For the final proof of this section, let’s think about how to prove the following statement:
Theorem
.
Let us read the statement we wish to prove in English, it is basically saying, “Take any integer , if it is even, then is even as well”. Intuitive right? Let’s see how a mathematician does it.
Why is this true? Here’s the high level idea, we know that if is an even integer, we can re-write as . Then we also know that . Since is an integer, is also an integer. So that means we can write can be written as where is some integer.
Okay that’s the idea, but how do we do it formally? Again, we will want to use some rules to help us make the deduction. Let’s see them in action:
Proof:
- Let be arbitrarily chosen from .
- Assume that .
- [Unpacking definition of ]
- Let be such that [Existential instantiation on line 1.2]
- [Basic Algebra]
- [Basic Algebra]
- Since , [Basic Algebra]
- [Existential generalisation on lines 1.5 and 1.6]
- [Unpacking definition of ]
- [Implication introduction on lines 1.1 and 1.8]
- . [Universal generalisation on lines 1 and 1.9]
Okay! This is a lot of text, let’s go through this slowly, it re-uses some old rules you were already shown, and it uses 3 new rules here. What is going on?
The proof starts off by taking to be an integer value (i.e. from ). So the subsequent lines (1.1 through 1.9) are allowed to treat as any arbitrarily chosen integer from . It may be , it may be , who knows. Then line 1.1 assumes that we consider only values that are even. This means that if we were given a value like for , our proof is not applicable anymore. But that’s okay! We don’t want to say anything about odd numbers anyway.
Next up, we unpack the definition of . Remember that an integer is even if we can write it as for some value . Line 1.2 is just reminding us of the definition of being an even value.
What about line 1.3? This is the first new rule we have encountered. Intuitively in English, what is being done here is the following:
On line 1.2 we are saying ” is equals to times an integer.” Therefore, we are able to deduce line 1.3 which states “Let us call that integer .“.
This might feel pedantic, but imagine how in English there is a subtle difference between:
“Something is cold” vs “Call the cold thing “.
The former sentence has not given the “cold thing” a name. The latter sentence has given it a name. Then deduction rule is basically trying to say “Since we know a cold thing exists, we can give it a name. Let’s call it “. Similarly, in our proof above, the deduction rule is basically trying to say “Since we know is some value times , we can give such a value a name, call it .” In doing so, notice that the new line has effectively removed the symbol.
Let’s keep going. Now that we’ve given that value a name , we can start referring to it, and manipulating it. So lines 1.4 through 1.6 are all just basic algebra.
What about line 1.7? What is it doing? In some sense it is actually doing the opposite of line 1.3.
On line 1.5 we said that . On line 1.6 argue that is also in . Since we know a specific value from for which , we know some value from exists for which . Therefore, .
Again, this might feel weird but it’s basically doing the reverse of what we had explained earlier:
“We know ice is a cold thing” vs “Exists something that is cold”.
The deduction rule here basically takes the former sentence and deduces the latter.
Okay! Let’s keep chugging on. 1.8 is more definition unpacking, and line 1.9 re-uses the previous deduction rule of creating an implication statement.
Finally, what’s going on on line 2? It’s saying the following:
Since we took arbitrarily from , and we were able to create the sub-conclusion , we are able to write .
And if you read back the concluding line, it makes sense! It’s saying:
For every possible value taken from that we shall call , if is even, then is even.
Why is this reasonable? We took arbitrarily. What about the assumption we made? We used the implication introduction rule to turn that back into , so you could technically say we made no assumptions about and did take it arbitrarily.
Here are the 4 final deduction rules in detail:
Existential Generalisation Rule
Given a line where , where is some object in some set , and another line that makes a statement about , e.g. , we can then derive the line .
Existential Instantiation Rule
Given a line , we are able to derive the line “Let be such that .
Universal Generalisation Rule
Given a line that states was arbitrarily chosen from set , and another line that makes a statement about , e.g. , we can derive the line .
Universal Instantiation Rule
Given a line , and another line that says , we are able to derive the line .
Rule: Using Lemma
Think of a lemma as a helper statement. They are proven theorems that can now be used in other, bigger proofs.
This is like how how in programming we have library functions, think of lemmas as “given for free” truths we can use in our proofs.
You’ll see an example of this rule in action in the later examples. Keep an eye out for it!
Rule: Contradiction
Before our example, let’s think about the following idea: What happens if someone comes up to you and says the following:
I am in the house and I am not in the house.
What do we make of this? Does this sound absurd? It doesn’t make sense right? Similarly, we have a rule in first order logic that does exactly that. We call this concept a contradiction. Since it is seemingly contradictory to both be in the house and not in the house at the same time. Here’s an example in math. What if we said:
We should be able to say “this makes no sense”. We have a symbol for this: . We call the symbol “bot”. But you can think of this as just the “contradiction symbol”.
Allow me to state the deduction rule first before giving an example, since it is a little involved.
Contradiction Rule
Given statement , we are able to derive statement .
Rule: Proof by Contradiction
Let’s build off of the previous rule, and continue exploring that idea. Recall in the previous rule we talked about how if we have two contradictory statements, we can write a line in our proof that says . Basically that line is declaring “A-ha! We’ve found a contradiction.”
What can we do with that line?
Here’s the rough idea: let’s say we want to prove as a conclusion that a statement like is true. One way to do that is to assume , then using our assumption, somehow derive . (See how we can use the previous rule to do this)? Then from there, the rule of proof by contradiction tells us that if from we derived , we can conclude in our proof.
This rule is a little tricky, and let’s take a step back to think about what it means, and why this is okay. Here’s the basic example of this idea in action (in English). Let’s say we want to convince someone that the moon is not made of cheese. Here’s one way we might do that:
- Let’s assume that the moon is made of cheese.
- If so, it would get mouldy.
- If so, we should notice a greenish or bluish hue whenever we look at the moon.
- Do you notice how absurd that is?
- Therefore, the moon is not made of cheese.
We can do the same thing in math, and that is via the proof by contradiction rule.
Proof By Contradiction Rule
If assuming , we are able to derive , we may conclude in our proof .
An example of this proof is deferred to the end of this unit. We will first start showing a few proofs before ending on a proof that uses this rule.
Proof Strategies
For the remainder of this part, we will be talking about how to prove certain types of statements. To do this we will give general strategies you can stick to to try to prove everything throughout the semester. After this unit, we will basically start doing proofs for most of the topics.
Also, you may have noticed right now we are only showing very basic statements about math. This is a deliberate choice. As we go on in the semester we will be showing newer and newer concepts, and applying what we have learned here.
Direct Proof
So if the goal of a task is to prove something like:
If then .
Then one way is to do this via a direct proof. This is the most straightforward, and we have been doing this all the time, this is when we assume the antecedent of the statement, and prove the consequent. As an example, here is how we accomplish our task:
Proof
- Assume .
- Then [Basic Algebra]
- Then [Basic Algebra]
- Therefore
Think of it this way, if we wish to prove a theorem like “Assuming , then is true.”, then our proof should work in the same way, where the first line starts with “Assume ”, then we make some logical deductions along the way, and our concluding line should be .
What about the following statement?
Theorem
Let’s try.
Proof:
- Let , arbitrarily chosen.
- Assume that .
- [Unpacking definition of even]
- Let be such that [Existential instantiation of line 2.1]
- Then [Basic algebra]
- [Basic algebra]
- [Existential generalisation on lines 2.3 and 2.4]
- [Unpacking definition of even]
- [Implication introduction on lines 2 and 2.6]
- [Universal generalisation on lines 1 and 3]
So what have we effectively said? We have effectively said that any even integer squared is also even.
Proof by Contraposition
What about the other direction? Can we say the following?
Theorem
With what we have right now this looks tricky, here’s a first attempt:
Attempted Proof:
- Let , arbitrarily chosen.
- Assume that .
- [Unpacking definition of even]
- Let be such that [Existential instantiation of line 2.1]
- … what now?
We could try saying but.. that doesn’t prove to us that it is even.
Let’s instead make use of the following statements for free (they can be proven but let’s take these as lemmas). I hope they are intuitive.
- If an integer is not even, it is odd.
- If an integer is odd, it is not even.
- is odd if (This is the definition of an odd integer)
Writing this out in math, we have:
- (Lemma 1)
- (Lemma 2)
Now that we have these facts, let’s try proving the statement again. Pay attention to how we start, and how we end. Contrast it against the direct proof idea. Notice that we want to prove , but we begin by assuming , and proving .
- Let , arbitrarily chosen.
- Assume that .
- [Universal instantiation of Lemma 1]
- [Modus ponens on lines 2 and 2.1]
- [Unpacking definition of odd]
- Let be such that [Existential instantiation on line 2.3]
- [Basic algebra]
- [Basic algebra]
- [Existential generalisation on lines 2.5 and 2.6]
- [Unpacking definition of odd]
- [Universal instantiation of Lemma 2]
- [Modus ponens on lines 2.8 and 2.9]
- [Implication introduction on lines 2 and 2.10]
- [Logically equivalent to line 3]
- [Universal generalisation on lines 1 and 4]
Notice, we set out to prove:
But instead we proved:
Why is this okay? Recall in section Contrapositivity we talked about how is logically equivalent to . We are doing the same thing here: is logically equivalent to .
Even though they are logically equivalent, why do we prefer doing this?
Look at the first proof again see how we got stuck. Then look at the second proof and notice that we could actually make something happen.
Proof by Contradiction
This one might be one of the coolest ones you can do. We will show two examples of this proof strategy. A small one, and we will end on a really big proof.
Example 1
Let’s think about the following idea, take some number . And consider all possible ways we can write this as , with integer values for . Ever notice how no matter how hard we try, either or has to be at most ?
For example, take a number like . We could write it as: , , (or also , , ). Notice across all 6 possible ways to write it, we have at least one of the numbers being at most .
How about something like ? That has: , , . Again, in all possible ways to write as , either or is at most .
Okay, we’ve tried this for and . You might ask yourself at this point, are and special? Or does this work for all numbers? We’ll set out to prove that this is indeed true! Let’s do that by proving the theorem below.
Theorem
Let’s read this back and see what it’s saying:
Let be any integer, let be any two integers. If , then either or .
Okay, let’s try proving this. Also note that is the same as . Since if is not less than or equals to , then has to be larger than . They’re actually the same.
Here’s how the proof goes:
Proof:
- Let , arbitrarily chosen.
- Let , arbitrarily chosen.
- Let , arbitrarily chosen.
- Assume that .
- Assume for the sake of contradiction that .
- [Logically equivalent to line 4.1]
- [Basic algebra]
- [Basic algebra]
- [Basic algebra]
- [Basic algebra]
- [Conjunction rule on lines 4.4 and 4.5]
- [Contradiction rule on line 7]
- Therefore [Proof by contradiction rule on lines 4.1 and 4.8 ]
- [Implication introduction rule on lines 4 and 4.9]
- [Universal generalisation on lines 3 and 5]
- [Universal generalisation on lines 2 and 6]
- [Universal generalisation on lines 1 and 8]
Okay, as usual I like reading proofs back in English to see what sort of intuition they can convey. We are basically:
- Take any integers .
- Assuming that are integers such that . (I.e. we only care about the cases when is written as , whatever way that may be)
- Assume for a contradiction that “okay let’s say it is possible that it is not the case that ( or that )“.
- Then if that’s the case, line 3 is also equivalent to saying "" and "".
- But wait, means . Also, , means .
- Okay so both and .
- Well okay… but doesn’t that mean that ?
- That also means that cannot be equal to . I.e. .
- We can re-write line 8 as .
- Okay so
- But that’s a contradiction!
- That means that our assumption for a contradiction on line 3 was false. So therefore .
- Now, we made an assumption on line 2 that . So we can say
- Since we took any to prove the statement on line 13, it means that:
Phew that was a bit of a doozy. But at the end of this chapter I will show you how this might be used in computer science.
Example 2
How about the following statement?
Theorem
is an irrational number.
To be clear a rational number is a number that can be written as for some value , where . Here, a rational number means a number that can be written as a fraction where are integer values, and is not . So for example, , and are examples of rational numbers. Even something like since that can be written as . An example of an irrational number is something like or (this is not obvious, but take this to be true for now).
Definition
A number is rational if .
A number that is not rational can be referred to as irrational.
The set of rational numbers is denoted by .
So we can re-state the theorem as:
Theorem
Aside
As an aside, why do we care? One potential reason might be that if we know that we can represent as a fraction, we might want to do so when involving this number in our programs.
Here is a proof by contradiction. Pay attention to how we are starting it by assuming the opposite of the theorem statement. The theorem statement says that is irrational, and we start by assuming that is not irrational (i.e. rational).
To simplify things, let’s use this following (yet unproven) fact:
Lemma
For any , can be written as , where:
- If is a divisor of and a divisor of , then is .
Which is basically saying that:
Any rational number can be written as a fraction where , and both are integers, and the fraction is simplified.
For example, instead of writing , we should write the fraction as . How do we simplify fractions? We take the common divisors between the two numbers and remove them. E.g. and have a common divisor of . So and . So now the only common divisor between and is . Similarly, instead of , the common factor here is , so we should instead write the fraction as . Again, between and , the common divisor is .
So line 3 is basically promising us that the only divisor between and is .
Again, notice that we want to formalise this in math, instead of English. So here’s my proposed formalisation:
Lemma 2
The first two statements probably are familiar, we are saying that is a fraction , and that the denominator is non-zero. What about the last part? It’s saying that we go through all the numbers, call them . If divides both and , then it must be . This is our way of saying that “If is a divisor of and a divisor of , then is .”
We will also make use of the previously proven fact:
Lemma 1
I will first show you the proof, it is really long and perhaps quite intimidating, there are a few things I need you to bear in mind. Every line is either:
- A fact we are assuming to be true, e.g., Fact 1.
- An explicit assumption we are making.
- A previously proven theorem.
- A line created by an application of a rule.
Try to appreciate how this means we are basically starting from assumptions and facts, and everything else is a deduction. We are basically like Sherlock!
The final line is then the conclusion of our proof. (Notice how we used the lemma on line 2! We proved it previously so now we get to call it like a library function.)
Proof
- Assume for the sake of contradiction that: is rational, i.e.,
- [Using Lemma 2]
- [Universal instantiation of line 2, replacing with ]
- Let be such that [Existential instantiation on line 3]
- [Specialisation on line 4]
- Now [Basic algebra]
- [Basic algebra]
- [Existential generalisation on lines 6 and 7]
- [Unpacking definition of even]
- [Universal instantiation of Lemma 1]
- [Modus ponens on lines 9 and 10]
- [Unpacking definition of even]
- Let , such that [Existential instantiation on line 12]
- Therefore [Basic algebra]
- [Basic algebra, merging lines 14 and 6]
- [Basic algebra]
- Therefore [Existential generalisation on line 16]
- Therefore [Unpacking definition of even]
- [Universal instantiation of previously proven theorem]
- [Modus ponens on lines 18 and 19]
- [Conjunction of lines 10 and 19]
- [Basic Algebra]
- [Specialisation on line 4]
- [Basic algebra]
- [Universal instantiation on line 23]
- [Modus Ponens on lines 24 and 25]
- [Basic algebra]
- [Contradiction rule on line 28] (Look! We used it here!)
- is not rational. I.e. . [Proof by contradiction rule. Assumption on line 1, on line 29]
I think this proof warrants a read-back in English, here’s the proof again in English that skips the rules and contains the essence of what we are trying to say:
- Assume for the sake of contradiction that can be written as a fraction , where are integers.
- We know all fractions can be simplified, so let’s re-write as , where is simplified. In other words, the only common factor between is , and are integers.
- Now we know that .
- That means that is even.
- That means that is also even. (Proven from previous parts)
- So for some integer .
- So .
- So that means that .
- So that means that is also even.
- Which means that is also even. (Proven from previous parts)
- Which means for some integer .
- Combining lines 5 and 10, we conclude is even, and is even.
- This means is a common divisor of and .
- The only common divisor of and should be .
- This means is .
- But is not . 😱
- Therefore, our assumption needs to be negated. cannot be written as a fraction where are integers.
Mini Quiz
In our very big proof above, in the second half of the proof, we are missing a line in the proof. One of the rules needs 2 lines to be applied, but we used the rule with only a single line. Where is the faulty application of the rule? What is missing? And how should we fix it?
. This can be done by arguing that , and therefore .
Line 17 is applied wrongly. We need to also justify that
Proof Technique: Goal Statements vs. Steps in Proofs
Okay we’ve seen quite a few proofs and perhaps at some point you might have started asking yourself “ok but how do we know what to do in each step? I don’t even know how to begin the proof.”
Okay, there are a few general features of proofs that I can start showing you. These might be a little helpful. (Full disclaimer, this stuff does take a while to get used to. Don’t rush it, but I will encourage you to try out different proofs through either the extra questions or the tutorials. Just like how best way to learn programming is to write them yourself. The best way to learn proofs is to do them yourself, and have our tutors vet them for you.)
A certain way to look at proofs:
Let’s re-visit the proofs we did in the previous sections, and I want to show you certain techniques I use to figure out what I need to do.
There are a few key aspects to this:
- Knowing what the goal statement is.
- How do steps change what the goal statement is.
To be clear: knowing these alone will not create the entire proof for you. But it helps keep you focused on what you need to do next.
Example 1:
Okay, let me use an initial example to show you what I mean.
Let’s say we had the following statement again:
Theorem
, where .
Reading this back, it’s saying that “For every integer , if is even, then is even. And a number is considered ‘even’ means we can find an integer such that .”
Right now at the very beginning, our goal statement, is really just . This is the statement we want to prove. We haven’t written any lines of proof yet. Also, we on the side, we have the definition that .
(Knowing what the goal statement is.) The original statement to prove was . Remember, this means we need to show the statement works for all integers. One way (but not the only way) to do this, is to start by letting be arbitrarily chosen.
After we do that on line 1, the goal changes. Here’s the idea, we already said that we were taking any from on line 1. So that means is now some chosen value. So what should we do next? The next thing to do is to prove that for this , the statement is true.
Okay, well how do we prove that? One possible way to do this, is to assume that is true. And that’s exactly what we do on line 2.
Now that we have assumed is true? What is there left to do? Well.. the only thing left to do is to try to prove is true. And the middle part, lines 2.1 through 2.6 do exactly that. Notice that on line 2.6, we actually manage to prove the statement . But are we done? Not quite!
Here’s the reason: Originally we set out to prove . Along the way, we took arbitrarily from , and we also assumed . After we did that, we said “ok then all that remains to do is prove .” We still need to go back to our original goal statement. So if you look at lines 3 and 4, you’ll notice we need to recover the original statement. But that is okay because lines 1 and 2 help us out there. We use implication introduction rule to recreate with the help of line 2. And we use universal generalisation rule to recreate with the help of line 1. In fact, that’s precisely why line 1 and line 2 were there in the proof!
(How do steps change what the goal statement is.) Notice how the moment we created line 1, in some sense, the in the goal statement was taken care of. We knew that the moment we proved , we can then use the universal generalisation rule to put the back into the statement. That’s exactly what happened on line 3.
Similarly, the moment we created line 2, the goal statement was . We know for a fact that the moment we can prove that, we can use implication introduction rule to recreate .
One other thing you might notice is that the original goal statement wanted a , so our line 1 literally starts with “Let , arbitrarily chosen.”. This isn’t the only way to do it, but it’s usually a great starting point.
If you notice, the proof for the other direction looks the same!
Example 2: Delving a little deeper
Okay, in light of what we mentioned just now. How do we think about proving this theorem?
Theorem
, where
Okay so the goal statement at the very beginning is: . So perhaps what might happen is the following:
Instead of doing any lines of the proof first, we could also think about certain ways we can re-write the goal statement. For example:
Okay… but let’s think about this a little bit. The moment we start line 1, we’re setting out to prove that no integer value is such that . While this is definitely true… how you can prove this might feel clunky. Or at least even personally, I don’t know how to convince someone directly that is never equal to times any integer value.
Let’s roll back a little bit, and instead try the following:
In fact, this seems doable! Let’s see the proof.
- Assume for the sake of contradiction that
- [Logically equivalent to line 1]
- Let be such that [Existential instantiation on line 2]
- [Basic Algebra]
- [Basic Algebra on lines 3 and 4]
- [Basic Algebra]
- [Basic Algebra]
- [Basic Algebra]
- [Basic Algebra from line 7]
- [From line 3]
- [Conjunction rule on line 9 and 10]
- [Contradiction rule on line ]
- [Proof by contradiction rule on lines 1 and 12]
- [Definition unpacking on line 13]
So what’s the moral of the story here? Being mindful of the goal of the proof, how to accomplish the goal, and how to shift the goalpost are all tricks in the bag you can try. It is true that at the beginning, most people won’t know what to try. Knowing how to do proofs is like puzzle-solving of any kind: Every tried solving chess puzzles, sudoku puzzles or crossword puzzles? After a while you build your own techniques and tricks. The same thing applies here!
Bonus: How this math is useful
Let’s look at the statement proven previously again:
Theorem
This looks seemingly useless. Maybe just some random math “fun-fact”. But what if I told you that ideas like this were useful in computer science?
Let’s think of the following idea, right now you’re given an input number . And your boss needs you to write a program that tests whether is prime or not. So let’s state what means.
Definition
This basically says that a number is called a prime number, if the only two numbers that divide it are and itself. Examples of prime numbers are and so on. On the other hand, numbers like are not prime. Prime numbers are super useful for many reasons! And we might want to test for whether a number is prime so we can use it in our program.
So now that we know this, maybe we could write a Python script like this:
def is_prime(x):
d = 2
while d < x:
if x % d == 0:
return False
d += 1
return True
So now our loop takes around iterations in the worst case. Because if a number is prime, we actually test all numbers up to before quitting. But! We can actually use the above theorem to help us!
What if I told you, the following code also tests for whether is prime, but only uses around iterations?
import math
def faster_is_prime(x):
d = 2
while d <= math.sqrt(x):
if x % d == 0:
return False
d += 1
return True
It only tests divisors from up to . Why is this enough? Because the theorem tells us that any number has a divisor that is at most . If a number is prime, it only has 2 divisors: , and itself. That means that a prime number has no divisors from to . On the other hand, a non-prime number will have a divisor somewhere between to . So we only need to test those numbers.
Ideas like this, speed the world up.