Overview

In this unit, we will start moving on to topics that are very distinct from proofs and far closer to the traditional topics in math that are useful in computer science. To do this, we will first begin with combinatorics! Think of this as an advanced way of counting. The reason this is useful is that when it comes to analysing certain mathematical constructs, or when it comes to counting how many cases our programs have to deal with, the techniques here are the ones we fall back on.

We have broken this unit up into 4 parts:

  1. Basic Counting
  2. Principle of Inclusion-Exclusion
  3. Permutations and Combinations
  4. Applying Combinatorics for Problem Solving

Counting Based on Sets

Let’s begin by simplifying things a little bit: for now, let’s just count items in sets. Since we are counting, we need to talk about the size of sets. So given a set $A$, the size of the set is written as $|A|$.

So for example, $|\{1, 2, 3\}| = 3$, since this set has $3$ items. Also, since $\emptyset$ is empty, it holds no items, so $|\emptyset| = 0$. We won’t consider infinitely sized sets, so we won’t be talking about what $|\mathbb{N}|$ is.

Basic Counting Techniques

So let’s start with the most general ideas that will be repeated for the next few weeks: adding disjoint cases and multiplying successive choices.

Rule of Sum: Adding Disjoint Cases

Let’s say we had two sets, $A$ and $B$, that had no common elements. That is to say, $A \cap B = \emptyset$. We call these two sets disjoint.

So what is the size of $A \cup B$? In other words, what if we knew three things:

  1. $|A|$, the size of $A$
  2. $|B|$, the size of $B$
  3. $A \cap B = \emptyset$, i.e., the sets are disjoint

What is $|A \cup B|$? Well, $|A \cup B| = |A| + |B|$.

Example

Let and . Then . So .

Example

Let and . Then . So .

What about in general? What if we had $3$ sets? What if we had $n$ sets? What should the condition be? This is a little subtle.

Let’s try this for $3$ sets. Let’s say we had sets $A$, $B$, and $C$ for which we knew $|A|$, $|B|$, $|C|$, and that $A \cap B \cap C = \emptyset$.

Can we conclude that $|A \cup B \cup C| = |A| + |B| + |C|$? It feels very tempting to conclude so; after all, the sets are disjoint, right?

What about the following example of $3$ sets?

$A = \{1, 2\}$, $B = \{2, 3\}$, $C = \{1, 3\}$. Now notice that $A \cap B \cap C = \emptyset$. Check it! There is not a single element that is in all $3$ sets!

So $|A| = 2$, $|B| = 2$, $|C| = 2$. So we expect $2 + 2 + 2 = 6$ elements. But $|A \cup B \cup C| = |\{1, 2, 3\}| = 3$. So our condition here isn’t quite right. Where did we go wrong?

Let’s think of it this way. We want to use the fact that when two sets are disjoint, we can just add their sizes together. How should we use this to work on $3$ sets? Let’s consider an alternative set of conditions:

  1. We know $|A|$
  2. We know $|B|$
  3. We know $|C|$
  4. $A \cap B = \emptyset$
  5. $A \cap C = \emptyset$
  6. $B \cap C = \emptyset$

From conditions 4, 5, and 6, we can derive the following lemma:

Lemma 1: $(A \cup B) \cap C = \emptyset$.

So what does this mean? It means that the set $A \cup B$ and the set $C$ are disjoint! This means that $|(A \cup B) \cup C| = |A \cup B| + |C|$.

So let’s try counting again:

$$|A \cup B \cup C| = |(A \cup B) \cup C| = |A \cup B| + |C| = |A| + |B| + |C|$$

Now it works! So what about even more sets? Here’s the general statement:

Given $n$ sets $A_1, A_2, \ldots, A_n$, such that:

  1. $\forall i$, $|A_i| = a_i$ (the $i$-th set has size $a_i$)
  2. $\forall i$, $\forall j$, if $i \neq j$, then $A_i \cap A_j = \emptyset$ (any two distinct sets are disjoint)

Then, $|A_1 \cup A_2 \cup \cdots \cup A_n| = a_1 + a_2 + \cdots + a_n$. The size of the union of all of the sets is just their individual sizes added up.

Conclusion: So what’s the conclusion here? Think of it this way: when we can make either a choice from set $A$ or a choice from set $B$, where $A$ and $B$ are disjoint, the total number of choices is just $|A| + |B|$.
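As a quick sanity check, here is a minimal Python sketch (the sets are illustrative, not from the notes) that verifies the rule of sum for two disjoint sets.

```python
# Rule of sum: for disjoint sets, |A ∪ B| = |A| + |B|.
A = {1, 2, 3}
B = {10, 20}

assert A & B == set()                   # the sets are disjoint
assert len(A | B) == len(A) + len(B)    # 5 == 3 + 2
```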

Rule of Product: Multiplying Successive Choices

Okay, what about considering making a sequence of choices? Given two sets $A$ and $B$ (this time not necessarily disjoint), how do we count how many ways there are to make a choice from $A$ and then a choice from $B$?

This one is a little more straightforward. After all, we’ve already constructed sets that represent exactly this!

To make a choice from $A$ and then a choice from $B$ is to ask how many pairs $(a, b)$ there are with $a \in A$ and $b \in B$, i.e., the set $A \times B$.

So we are essentially asking: what is the size of $A \times B$? And this is pretty much just $|A \times B| = |A| \times |B|$.

Example

Let and . Then .

, , and .

Example

Let and . Then .

, , and .

This one is far less involved. Here’s the general statement:

Given $n$ sets $A_1, A_2, \ldots, A_n$ such that:

  1. $\forall i$, $|A_i| = a_i$ (the $i$-th set has size $a_i$)

Then, $|A_1 \times A_2 \times \cdots \times A_n| = a_1 \times a_2 \times \cdots \times a_n$. The size of the Cartesian product of all of the sets is just their individual sizes multiplied.
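Here is a minimal Python sketch (again with illustrative sets) of the rule of product: the number of ways to choose from $A$ and then from $B$ is $|A| \times |B|$.

```python
from itertools import product

# Rule of product: the number of (a, b) pairs is |A| * |B|.
A = {1, 2, 3}
B = {'x', 'y'}

pairs = list(product(A, B))             # every way to choose from A, then B
assert len(pairs) == len(A) * len(B)    # 6 == 3 * 2
```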

Counting Subsets

Let’s see a simple and repeated application of the rule of product. Let’s say we had a set $S$ such that $|S| = n$. How many elements are in the power set of $S$? I.e., how many subsets are there?

Let’s lay out the elements of $S$ as $x_1, x_2, \ldots, x_n$. Notice that for each of these elements, we can make a choice of “take” or “don’t take”. And any sequence of choices creates for us a subset of $S$.

For example, if $S = \{x_1, x_2, x_3\}$, then a sequence like “take”, “don’t take”, “take” will create the subset $\{x_1, x_3\}$. Whereas the sequence “don’t take”, “don’t take”, “don’t take” creates the subset $\emptyset$.

We can think of this as making, for each element $x_i$, a set of choices $C_i = \{\text{take}, \text{don't take}\}$. And we are basically asking what is:

$$|C_1 \times C_2 \times \cdots \times C_n|$$

By the rule of product, we know this is:

$$|C_1| \times |C_2| \times \cdots \times |C_n|$$

We know that $|C_i| = 2$ for every $i$. So the above is basically:

$$\underbrace{2 \times 2 \times \cdots \times 2}_{n \text{ times}} = 2^n$$

So the number of subsets of $S$ is $2^n$.
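Here is a minimal Python sketch (with an illustrative $4$-element set) that enumerates every subset and confirms the $2^n$ count.

```python
from itertools import combinations

# Generate every subset of S, grouped by size, and count them.
S = {1, 2, 3, 4}
subsets = [set(c) for k in range(len(S) + 1) for c in combinations(S, k)]
assert len(subsets) == 2 ** len(S)   # 16 subsets of a 4-element set
```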

Rule of Division

So while the previous two rules might be a little clearer, there’s one less common rule that is based on a nice trick. Let’s try it with a scenario:

Let’s say it was right after the midterm, and we have just collected a bunch of exam scripts. We know each script has the same number of pages to scan, say $p$ pages. We put our entire stack of exam scripts in the scanner, and the scanner reports that there are $N$ pages scanned.

How many scripts have we scanned? Since each script had $p$ pages, this means there were $N / p$ scripts.

This might sound a little obvious, but this trick will eventually be taken to extremes in the subsequent sections.

In general, if we have $n$ items, but we are able to group them into groups of $d$ items that we consider identical, then we have $n / d$ groups.

Subtracting Cases

One more useful idea is when we have two sets $A$ and $B$, and we are promised that $B \subseteq A$. Then, we know that:

$$|A \setminus B| = |A| - |B|$$

Counting Consecutive Numbers

The last of the basic counting facts is the following: given a set of integers that are at least $a$, and at most $b$, i.e.,

$$\{a, a + 1, a + 2, \ldots, b\}$$

how many integers are there? It’s very tempting to say $b - a$, but the actual answer is $b - a + 1$.

For example, there are $20 - 10 + 1 = 11$ integers in the set $\{10, 11, \ldots, 20\}$.

Counting Multiples

What if we wanted to count the numbers divisible by some number $d$ within some range $[a, b]$?

This is a little tricky. Let’s try a simpler idea:

Given a positive number $n$, how many numbers in the range $[1, n]$ are there that are divisible by $d$? Well, there are $n$ numbers in that range, and every $d$-th one is divisible by $d$, so that would be:

$$\left\lfloor \frac{n}{d} \right\rfloor$$

If we instead wanted the numbers in the range $[0, n]$ that are divisible by $d$, then this becomes:

$$\left\lfloor \frac{n}{d} \right\rfloor + 1$$

because $0$ is divisible by any number (including $d$).

What if we wanted to count the numbers between $a$ and $b$ (inclusive) that are divisible by $d$? Now we can use the idea from Subtracting Cases! Notice we are counting the numbers from $1$ to $b$ that are divisible by $d$, then subtracting away the numbers from $1$ to $a - 1$ that are divisible by $d$. Which gives us:

$$\left\lfloor \frac{b}{d} \right\rfloor - \left\lfloor \frac{a - 1}{d} \right\rfloor$$

Let’s test out the formula.

Example

For example, how many numbers are there divisible by in the range ? There are . .

Example

For example, how many numbers are there divisible by in the range ? There are . .

Example

What about if we asked how many numbers are divisible by in the range ? There should be of them: .

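Here is a minimal Python sketch (with illustrative values for $a$, $b$, and $d$) that checks the counting-multiples formula against brute force.

```python
# Count the multiples of d in [a, b]: floor(b/d) - floor((a-1)/d).
def count_multiples(a, b, d):
    return b // d - (a - 1) // d

a, b, d = 11, 100, 7
brute = sum(1 for x in range(a, b + 1) if x % d == 0)
assert brute == count_multiples(a, b, d)   # both give 13
```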

Principle of Inclusion-Exclusion

Sticking to the topic of counting sets for now, can we be a little more precise in counting $|A \cup B|$ when $A$ and $B$ are not necessarily disjoint? There actually is a formula for this!

$$|A \cup B| = |A| + |B| - |A \cap B|$$

Example

Let , . Then .

Then , , and . Then .

Why does this work? Because the idea is that if we just add $|A|$ and $|B|$, we’re double counting the elements in $A \cap B$, since they will be added once due to $A$, then added once again due to $B$.

In fact, this tells us why the additive rule works when $A$ and $B$ are disjoint: in that case, $|A \cap B| = 0$.
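A minimal Python sketch (illustrative sets) of the two-set inclusion-exclusion formula:

```python
# |A ∪ B| = |A| + |B| - |A ∩ B|, even when A and B overlap.
A = {1, 2, 3, 4}
B = {3, 4, 5}

assert len(A | B) == len(A) + len(B) - len(A & B)   # 5 == 4 + 3 - 2
```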

This technique alone is quite useful. Here is an example that makes use of this idea.

Example:

Here’s a question: How many -digit numbers are there that are divisible by or ? So this is simple enough that we could have manually counted this: .

For this example we’re picking -digit numbers because it makes it easy for us to verify that we’re indeed correct. This becomes very infeasible if we wanted to do this for -digit numbers and so on.

But we could have made $2$ sets, $A$ and $B$, one for each divisor, and then asked: what is $|A \cup B|$?

So let’s compute $|A|$, $|B|$, and $|A \cap B|$. The first two numbers are computed based on Counting Multiples.

For $|A|$, as before:

For $|B|$:

And for $|A \cap B|$, we want the numbers that are divisible by both divisors at once. So:

So now, we can figure out $|A \cup B| = |A| + |B| - |A \cap B|$.

Follow-Up 1:

What if we wanted to count the numbers in the range that were not divisible by either of the two divisors? In other words, we want to see how many numbers are in the whole range but not in $A \cup B$. Notice again that $A \cup B$ is a subset of the whole range. So again, through Subtracting Cases, we really just want the size of the whole range minus $|A \cup B|$. Which happens to be:

Follow-Up 2:

What if we changed the question to: how many -digit numbers are there that are divisible by and ? If we made set the numbers in the range that are divisible by , and the set of integers in the range that are divisible by .

But what should be? Should we take this to be ? No, we want numbers that are divisible by and , which happens to be numbers that are divisible by . In fact, we can check this: and . And of course, .

So, in general, how do we determine the divisibility condition for the set $A \cap B$? If set $A$ contains integers that are divisible by $p$, and set $B$ contains integers that are divisible by $q$, then for $A \cap B$, we want integers divisible by both $p$ and $q$. These are exactly the integers that are divisible by the lowest common multiple $\operatorname{lcm}(p, q)$.

So if , , we want , since is the smallest multiple of both and . On the other hand, if and , we want .
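Here is a minimal Python sketch (illustrative values) that counts the numbers in a range divisible by $p$ or $q$, using `math.lcm` for the intersection.

```python
from math import lcm

def count_div(a, b, d):
    # Multiples of d in [a, b].
    return b // d - (a - 1) // d

a, b, p, q = 1, 100, 4, 6
by_formula = count_div(a, b, p) + count_div(a, b, q) - count_div(a, b, lcm(p, q))
brute = sum(1 for x in range(a, b + 1) if x % p == 0 or x % q == 0)
assert by_formula == brute   # lcm(4, 6) = 12, not 24
```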

So let’s try one more example:

How many integers are there in the range that are divisible by or ? To test it out, let’s also list this manually. The integers we want are: .

So letting set be the integers in that range that are divisible by , and letting be the integers in that range that are divisible by , the lowest common multiple of and is not , but . So set contains numbers in that range divisible by .

So finally:

Extending This to 3 Sets

We can actually extend this idea to $3$ sets (actually it also works beyond $3$, but the formula starts becoming quite unwieldy).

For sets $A$, $B$, $C$, it holds that:

$$|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|$$

Why is this true? Let’s look at the following Venn diagram.

Notice that if we just compute $|A| + |B| + |C|$, we would have double counted the elements in the pairwise intersections $A \cap B$, $A \cap C$, and $B \cap C$, and triple counted the elements in $A \cap B \cap C$.

So consider $|A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C|$. We would have removed the double counts, but the elements in $A \cap B \cap C$, which were triple counted and then subtracted three times, are now not counted at all. So we have to add $|A \cap B \cap C|$ back in.
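A minimal Python sketch (illustrative sets) of the three-set formula:

```python
# Inclusion-exclusion for three sets.
A, B, C = {1, 2, 3, 4}, {3, 4, 5, 6}, {1, 4, 6, 7}

lhs = len(A | B | C)
rhs = (len(A) + len(B) + len(C)
       - len(A & B) - len(A & C) - len(B & C)
       + len(A & B & C))
assert lhs == rhs   # both are 7
```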

At a food popularity contest, there are $3$ options being voted on by the people who are surveyed. We want to know how many people in total were surveyed. The options were: (A) Lor Mee, (B) Nasi Lemak, (C) Chicken Rice. Anyone who is surveyed can vote for any combination of the options, i.e., a person could vote for all $3$ choices, any $2$ of the choices, or any single choice. But they must vote for at least something, i.e., everyone who was surveyed voted for at least one option, and at most all $3$. We know the following counts:

  1. There were people voting for option A (they might have also voted for other options).
  2. There were people voting for option B (they might have also voted for other options).
  3. There were people voting for option C (they might have also voted for other options).
  4. There were people voting for only both option A and option B.
  5. There were people voting for only both option A and option C.
  6. There were people voting for only both option B and option C.
  7. Only person voted for exactly all options.

How many people were surveyed?

So in total, people were surveyed.
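Since the survey gives “exactly two” and “exactly three” tallies, here is a hedged Python sketch with made-up numbers (not the ones from the survey) showing how the computation goes.

```python
# a, b, c: everyone who voted for that option (possibly among others).
# ab_only, ac_only, bc_only: people who voted for exactly those two options.
# abc: people who voted for all three options.
a, b, c = 20, 18, 15
ab_only, ac_only, bc_only = 4, 3, 2
abc = 1

# Turn "exactly two" tallies into pairwise intersections, then apply
# inclusion-exclusion to count distinct people.
ab, ac, bc = ab_only + abc, ac_only + abc, bc_only + abc
total = a + b + c - ab - ac - bc + abc
print(total)   # 42 with these made-up numbers
```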

Permutations and Combinations

Let’s move on and away from sets, and start talking about permutations and combinations. Some of these concepts might already be familiar ground, and we will be using them to build up to some bigger concepts.

Permutations of Distinct Items

Let’s begin with the following question:

Given $n$ distinct items, how many ways are there to order them?

Let’s try this with $3$ items: $a$, $b$, $c$. Here are all the possible ways to order them:

$$abc, \; acb, \; bac, \; bca, \; cab, \; cba$$

There are $6$ ways.

What about in general for $n$ distinct items? How many permutations are there? A permutation of the items is a way to order them.

We can think of it recursively. Let $P(n)$ be the number of ways to permute $n$ distinct items. If $n = 1$, there is only $1$ possible ordering: the item itself.

When $n > 1$, how do we count this? There are $n$ choices for which item goes first, and then $P(n - 1)$ ways to order the remaining $n - 1$ items, so $P(n) = n \times P(n - 1)$.

So in particular:

$$P(n) = n \times (n - 1) \times (n - 2) \times \cdots \times 2 \times 1$$

Doesn’t this look familiar? This is $n$ factorial, or $n!$.
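A minimal Python sketch (illustrative items) confirming that the number of orderings is $n!$:

```python
from itertools import permutations
from math import factorial

# The number of ways to order n distinct items is n!.
items = ['a', 'b', 'c', 'd']
assert len(list(permutations(items))) == factorial(len(items))   # 24
```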

Permutations With Duplicates

What happens when the items we need to permute have duplicates? For example, let’s say we want to count how many permutations there are of $aab$.

Notice that the answer is not $3! = 6$, but rather $3$. In fact, here are the possible permutations:

  1. $aab$
  2. $aba$
  3. $baa$

Let’s say we had $2$ items of type $a$ and $1$ item of type $b$. Then the way we obtain the count is $\frac{3!}{2!} = 3$.

What about in general? Here’s an idea: we can first label each duplicate. For example, to count the number of ways to permute $aab$, we can treat it as $a_1 a_2 b$ first, permute all of the labelled items, and notice that, for example, permutations like $a_1 a_2 b$ should be treated the same as $a_2 a_1 b$.

So, using the Rule of Division, for this example, we can argue that the total number of permutations is $\frac{3!}{2!} = 3$.

So in general, if we had $k$ types of items, with $n_i$ many copies of the $i$-th type, then the total number of items is $n = n_1 + n_2 + \cdots + n_k$, and the total number of permutations is:

$$\frac{n!}{n_1! \, n_2! \cdots n_k!}$$

Here’s an example. How many permutations are there of $aabbb$? There are $2$ copies of $a$ and $3$ copies of $b$. There are $5$ items in total. So the answer is:

$$\frac{5!}{2! \, 3!} = 10$$
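Here is a minimal Python sketch (using the illustrative word “banana”, not one from the notes) that checks the formula against a brute-force count.

```python
from collections import Counter
from itertools import permutations
from math import factorial

# Distinct arrangements of a word = n! divided by the factorial of each letter count.
word = "banana"
counts = Counter(word)                    # {'a': 3, 'n': 2, 'b': 1}
formula = factorial(len(word))
for c in counts.values():
    formula //= factorial(c)

assert len(set(permutations(word))) == formula   # 60 distinct arrangements
```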

$k$-Permutations of $n$ Distinct Elements

Now, let’s go back to looking at having $n$ distinct elements, but now we only care about picking $k$ of them. How many ways are there to pick and order $k$ of the $n$ elements?

For example, let’s say we had $3$ elements $a$, $b$, $c$, and $k = 2$. Then here are all the possible ways: $ab$, $ba$, $ac$, $ca$, $bc$, $cb$.

But what about general values of $n$ and $k$? We can again think about this recursively. When $k = 1$, we are just asking how many ways there are to pick $1$ item out of $n$ items. That happens to just be $n$ itself. On the other hand, for $k > 1$, we can think of it as having to pick $1$ of the $n$ items to put into the first slot; then there are $n - 1$ items remaining, among which we need to pick and permute $k - 1$ items.

So the recurrence looks something like this:

$$P(n, k) = \begin{cases} n & \text{if } k = 1 \\ n \times P(n - 1, k - 1) & \text{if } k > 1 \end{cases}$$

What does this resolve to? If you expand out the recurrence, you might notice that it looks something like this:

$$P(n, k) = n \times (n - 1) \times (n - 2) \times \cdots \times (n - k + 1)$$

Which is basically:

$$\frac{n!}{(n - k)!}$$

There is another way we can see this. Imagine we considered all $n!$ possible permutations, and then said “any two permutations are considered the same as long as their first $k$ elements are the same”. If we fix the first $k$ elements, then there are $(n - k)!$ possible ways to permute the remaining elements. So every permutation has $(n - k)!$ permutations that share the same first $k$ elements. Using the Rule of Division, this means there are in total:

$$\frac{n!}{(n - k)!}$$

ways to permute $k$ elements out of $n$ distinct elements.

In fact, you can check that:

$$n \times (n - 1) \times \cdots \times (n - k + 1) = \frac{n!}{(n - k)!}$$

We write this quantity as $P(n, k)$.
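A minimal Python sketch (illustrative $n$ and $k$) comparing a brute-force count of $k$-permutations with the formula:

```python
from itertools import permutations
from math import factorial, perm

# The number of k-permutations of n distinct elements is n! / (n - k)!.
n, k = 5, 3
brute = len(list(permutations(range(n), k)))   # 60 ordered triples
assert brute == perm(n, k) == factorial(n) // factorial(n - k)
```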

Combinations Without Duplicates

Let’s move on to another kind of counting that we commonly do: combinations! Let’s say we had $n$ distinct items, and this time, we want to count how many ways there are to choose $k$ items from the $n$ items.

For example, if there are $3$ items, $a$, $b$, and $c$, how many ways are there to choose $2$ items out of these $3$?

  1. $\{a, b\}$
  2. $\{a, c\}$
  3. $\{b, c\}$

Notice that the ordering does not matter. What is the formula this time? Again, there are a few ways to derive it, but perhaps the most straightforward way is to again make use of the Rule of Division.

Let’s compare choosing $2$ items out of $3$ versus permuting $2$ items out of $3$. Previously, in [[#k-permutations-of-n-distinct-elements|k-Permutations of n Distinct Elements]], notice that the cases we had were: $ab$, $ba$, $ac$, $ca$, $bc$, $cb$.

For permutations, the ordering matters, whereas if we simply want to choose items, it does not. Again, for this example, notice how we are basically saying that $ab$ and $ba$ are the same, because they have the same elements and we do not care about their ordering. Since we picked $k$ items, there are $k!$ ways to permute the chosen items. So this is:

$$\frac{n!}{(n - k)! \, k!}$$

We write this quantity as $C(n, k)$, or, most commonly, as $\binom{n}{k}$.

Interestingly, we can also show that $\binom{n}{k} = \binom{n}{n - k}$. This might be “obvious” if you look at the fraction alone. In fact:

$$\binom{n}{n - k} = \frac{n!}{(n - (n - k))! \, (n - k)!} = \frac{n!}{k! \, (n - k)!} = \binom{n}{k}$$

But there’s a very nice intuition behind why this is the case. Again, falling back to our trusty example of $n = 3$ and $k = 2$:

Instead of picking $2$ items out of the $3$ distinct items $a$, $b$, and $c$ to create a set of size $2$, we can think about which $1$ item to exclude out of the $3$ distinct items to create a set of size $2$.

  1. $\{a, b\}$ (means we picked $c$ to be excluded)
  2. $\{a, c\}$ (means we picked $b$ to be excluded)
  3. $\{b, c\}$ (means we picked $a$ to be excluded)
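A minimal Python sketch (illustrative $n$ and $k$) that counts size-$k$ subsets and checks the symmetry $\binom{n}{k} = \binom{n}{n - k}$:

```python
from itertools import combinations
from math import comb

# C(n, k) counts the size-k subsets of an n-element set.
n, k = 5, 2
assert len(list(combinations(range(n), k))) == comb(n, k)   # 10 subsets
assert comb(n, k) == comb(n, n - k)                         # choosing == excluding
```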

The special case of choosing/permuting nothing.

One last thing we need to talk about: What is $\binom{n}{0}$? Or what is $P(n, 0)$?

How many ways are there to choose nothing from a set of $n$ elements? How many ways are there to permute nothing? It might be tempting to think the answer is $0$, but it’s actually $1$.

There is exactly one way to choose nothing from a set of $n$ elements: The empty set!

There is exactly one way to sequence/permute nothing: The empty sequence!

For that reason, $n!$ is defined to be $1$ when $n = 0$.

This actually helps our computations make more sense. For example, we can say that

$$\binom{n}{0} = \frac{n!}{(n - 0)! \, 0!} = \frac{n!}{n! \cdot 1} = 1$$

Applying Combinatorics

Now that we’ve covered the basic quantities, let’s see a few applications of the counting methods shown.

Stars and Bars

Let’s say we had $k$ variables $x_1, x_2, \ldots, x_k$, all of which need to be integers $\geq 0$. How many value assignments can we give the $x_i$, so that:

$$x_1 + x_2 + \cdots + x_k = n$$

For example, if $n$ and $k$ are even moderately large, there are far too many ways for us to list out here. As a simpler example, if $k = 3$ and $n = 2$, there are $6$ ways.

  1. $x_1 = 2$, $x_2 = 0$, $x_3 = 0$
  2. $x_1 = 0$, $x_2 = 2$, $x_3 = 0$
  3. $x_1 = 0$, $x_2 = 0$, $x_3 = 2$
  4. $x_1 = 1$, $x_2 = 1$, $x_3 = 0$
  5. $x_1 = 1$, $x_2 = 0$, $x_3 = 1$
  6. $x_1 = 0$, $x_2 = 1$, $x_3 = 1$

How should we even count the number of possible ways? Even for small values of $n$ and $k$, the number of cases becomes astronomically large.

The technique is called stars and bars. Here’s the idea: again consider $k = 3$ and $n = 2$. Each assignment corresponds to an arrangement of $n$ stars and $k - 1$ bars; for example, $\star\star\,|\,|$ corresponds to $x_1 = 2, x_2 = 0, x_3 = 0$, while $\star\,|\,\star\,|$ corresponds to $x_1 = 1, x_2 = 1, x_3 = 0$.

Think about how, if we had $n + k - 1$ slots, we can pick $k - 1$ slots to be bars in order to “force” the remaining $n$ slots to be stars. If we do this, then the first grouping of stars before the first bar is the value of $x_1$. The second grouping of stars, between the first and second bars, is the value of $x_2$, and so on. The last grouping of stars, after the last bar, is the value of $x_k$.

Notice the sum of the values is exactly the number of stars we have: $n$.

So to find out how many ways there are to assign values to the variables, we just pick the $k - 1$ bar positions out of the $n + k - 1$ possible positions. In other words:

$$\binom{n + k - 1}{k - 1}$$
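A minimal Python sketch (illustrative $n$ and $k$) comparing a brute-force count of the solutions with the stars-and-bars formula:

```python
from itertools import product
from math import comb

# Count non-negative integer solutions of x_1 + ... + x_k = n.
n, k = 5, 3
brute = sum(1 for xs in product(range(n + 1), repeat=k) if sum(xs) == n)
assert brute == comb(n + k - 1, k - 1)   # 21 solutions
```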

Block Walking

Let’s try another one. Given a grid with $m$ columns and $n$ rows, where we start from the bottom left, and we are only allowed to take either one step up or one step to the right each time, how many ways are there for us to go from the bottom left to the top right?

In the picture below, the red and blue paths are examples of paths you can take.

So here’s the idea: the red path and the blue path are just different sequences of “up” and “right” moves. Notice that for $n$ rows and $m$ columns, we have to take $n$ “up” steps and $m$ “right” steps in total. A different sequencing of the “up” and “right” steps gives us a different path.

Again, this is similar to stars and bars. But this time, we have $m + n$ slots, of which we need to pick $n$ to be “up” steps, and the remaining $m$ to be “right” steps.

In general, for $m$ columns and $n$ rows, the number of paths is given as:

$$\binom{m + n}{n} = \binom{m + n}{m}$$
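A minimal Python sketch (illustrative grid size) that counts the monotone paths recursively and compares with the binomial coefficient:

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def paths(m, n):
    # Number of up/right paths across a grid with m columns and n rows.
    if m == 0 or n == 0:
        return 1            # only one straight-line path remains
    return paths(m - 1, n) + paths(m, n - 1)

m, n = 4, 3
assert paths(m, n) == comb(m + n, n)   # 35 paths
```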

Counting Subsets

Let’s say we had a set $S$ of $n$ distinct elements.

  1. How many subsets are there of size $k$?
  2. How many subsets are there of size at most $k$?
  3. How many subsets are there of size at least $k$?

Part 1: To answer the first question, this is pretty much just $\binom{n}{k}$. But what about the other two?

Part 2: Note that there are $\binom{n}{0}$ many subsets of size $0$, $\binom{n}{1}$ many subsets of size $1$, and in general $\binom{n}{i}$ many subsets of size $i$. Adding up the disjoint cases, the number of subsets of size at most $k$ is:

$$\sum_{i = 0}^{k} \binom{n}{i}$$

Part 3: As for this part, we could solve it directly, like so:

$$\sum_{i = k}^{n} \binom{n}{i}$$

Or, perhaps a little more simply: we could notice that there are $2^n$ possible subsets of $S$. To count the number of subsets of size $k$ or larger, we can just subtract away the subsets of size up to $k - 1$ from $2^n$. Which gives us:

$$2^n - \sum_{i = 0}^{k - 1} \binom{n}{i}$$

Turns out, both these quantities are the same!
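A minimal Python sketch (illustrative $n$ and $k$) confirming the two counts agree:

```python
from math import comb

# Subsets of size at least k, counted directly and by complement.
n, k = 10, 4
direct = sum(comb(n, i) for i in range(k, n + 1))
complement = 2 ** n - sum(comb(n, i) for i in range(k))
assert direct == complement   # both are 848
```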

Counting Permutations

Let’s say there were $4$ people: Alice, Bob, Clara, and Dean. Let’s say that we had to arrange them in a line, but Bob refuses to stand next to Dean. How many ways are there to arrange them in a line?

One way we could do this would be to count three cases separately: we make $4$ slots, place Bob and Dean down first in non-adjacent slots, and then count how many permutations we could get.

There are $3$ possible pairs of non-adjacent slots that we can treat similarly: $\{1, 3\}$, $\{1, 4\}$, and $\{2, 4\}$. For each pair of slots, there are $2$ ways to arrange Bob and Dean, and then there are also $2$ ways to arrange Alice and Clara. So there are $3 \times 2 \times 2 = 12$ ways.

There is an alternative way to count this: what if we count the number of ways where Bob and Dean are arranged together? We first treat them like a pair, attached at the hip. Then we count the number of ways we can arrange everyone so that Bob and Dean are stuck together, and subtract that number away from $4! = 24$.

So suppose we treat Bob and Dean as being stuck together. Then they’re either arranged as Bob-Dean, or as Dean-Bob. For each of these cases (of which there are $2$), there are $3! = 6$ possible ways to permute Alice, Clara, and the Bob-Dean pair. So in total, there are $2 \times 6 = 12$ ways to arrange the $4$ people such that Bob and Dean are next to each other. Subtracting this away from $24$ yields $24 - 12 = 12$ as our answer.
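A minimal Python sketch that brute-forces all $4! = 24$ line-ups and confirms the count of $12$:

```python
from itertools import permutations

people = ["Alice", "Bob", "Clara", "Dean"]

def apart(order):
    # True if Bob and Dean are not standing next to each other.
    return abs(order.index("Bob") - order.index("Dean")) > 1

count = sum(1 for order in permutations(people) if apart(order))
assert count == 12
```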

A Sense of Scale

Let’s see some examples of why combinatorics is useful and helpful.

Binomial Coefficients and Nested For Loops

Let’s say we had some code that looked like this:

```python
total = 0
# One iteration for every triple (i, j, k) with 0 <= i < j < k < n.
for i in range(n):
    for j in range(i + 1, n):
        for k in range(j + 1, n):
            total += arr[i][j][k]
```

How many iterations does this for loop make? Notice here that there is an iteration whenever $i < j < k$, and all $3$ values are taken from the range $\{0, 1, \ldots, n - 1\}$. How do we count how many iterations we have done?

One very straightforward way is to literally count based on the possible values:

$$\sum_{i = 0}^{n - 1} \; \sum_{j = i + 1}^{n - 1} \; \sum_{k = j + 1}^{n - 1} 1$$

Then, looking at the innermost sum, we notice the values of $k$ range from $j + 1$ to $n - 1$. So we can say the following:

$$\sum_{i = 0}^{n - 1} \; \sum_{j = i + 1}^{n - 1} (n - 1 - j)$$

Now the inner sum is just summing over an arithmetic progression:

$$\sum_{i = 0}^{n - 1} \frac{(n - 1 - i)(n - 2 - i)}{2}$$

Since $i$ ranges from $0$ to $n - 1$, the fraction $\frac{(n - 1 - i)(n - 2 - i)}{2}$ ranges from $\frac{(n - 1)(n - 2)}{2}$ down to $0$. So again, by changing variables (letting $t = n - 1 - i$):

$$\sum_{t = 0}^{n - 1} \frac{t(t - 1)}{2} = \frac{n(n - 1)(n - 2)}{6}$$

That was a lot of effort! What if I told you there was a much more straightforward way? We basically just want to count how many distinct triplets $\{i, j, k\}$ we can pick from the set $\{0, 1, \ldots, n - 1\}$.

This happens to be $\binom{n}{3}$. In fact, we can check this:

$$\binom{n}{3} = \frac{n!}{3! \, (n - 3)!} = \frac{n(n - 1)(n - 2)}{6}$$

Interestingly, this tells us that the program makes $\binom{n}{3} = \frac{n(n - 1)(n - 2)}{6}$ iterations.
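A minimal Python sketch (illustrative $n$) counting the loop iterations directly and comparing with $\binom{n}{3}$:

```python
from math import comb

# Count the iterations of the triple loop for a small n.
n = 12
iterations = sum(1 for i in range(n)
                   for j in range(i + 1, n)
                   for k in range(j + 1, n))
assert iterations == comb(n, 3)   # 220
```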

Some Useful Bounds on Combinatorial Properties

Let us end this unit by getting a sense of scale whenever we are counting. This is particularly useful when we are thinking about “generating all possibilities” or “just trying all possible solutions”.

Oftentimes, having an approximation of the quantities is “good enough” for us to eyeball values. Let’s look at two very common approximations: for factorials and for binomial coefficients.

Stirling’s Approximation

The first is Stirling’s asymptotic approximation for $n!$. Think of “asymptotic approximation” as meaning “the larger the value of $n$, the more accurate the approximation”.

Stirling’s approximation states that:

$$n! \sim \sqrt{2 \pi n} \left(\frac{n}{e}\right)^n$$

This makes the term on the right sometimes a little more useful to deal with than $n!$ itself.
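A minimal Python sketch showing the ratio between $n!$ and the Stirling estimate approaching $1$:

```python
from math import factorial, sqrt, pi, e

# The ratio n! / (sqrt(2*pi*n) * (n/e)**n) tends to 1 as n grows.
for n in (5, 10, 20, 50):
    stirling = sqrt(2 * pi * n) * (n / e) ** n
    print(n, factorial(n) / stirling)   # roughly 1.017, 1.008, 1.004, 1.002
```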

Bounding Binomial Coefficients

The second is for binomial coefficients. Given any $n$ and $k$ with $1 \leq k \leq n$:

$$\left(\frac{n}{k}\right)^k \leq \binom{n}{k} \leq \left(\frac{en}{k}\right)^k$$

An example application

For example, consider the following problem statement:

Given a list of $n$ pairs such that the pair $(s_i, e_i)$ denotes the start and end time of the $i$-th event, we will say two distinct events $i$ and $j$ are clashing if ($s_i \leq s_j$ and $s_j \leq e_i$) or ($s_j \leq s_i$ and $s_i \leq e_j$). Find the largest subset $S$ of events we can schedule so that none of the events in $S$ clash with each other.

This is a very classical algorithms problem; while we won’t have the tools to solve it here, let’s at least think about some very “straightforward” solutions.

What about the following strategy?

  1. Set a variable $best = 0$.
  2. Go through all possible subsets of the events.
  3. For each subset $S$, check whether it contains any clashing events. If it does, ignore this subset. If it doesn’t, update $best = \max(best, |S|)$.
  4. Return $best$.
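A hedged Python sketch of this brute-force strategy on made-up events (the clash test here is the standard interval-overlap check):

```python
from itertools import combinations

# Each event is a (start, end) pair; two events clash if their intervals overlap.
events = [(1, 4), (3, 6), (5, 8), (7, 9)]

def clash(a, b):
    return a[0] <= b[1] and b[0] <= a[1]

best = 0
for size in range(len(events) + 1):
    for subset in combinations(events, size):
        if not any(clash(x, y) for x, y in combinations(subset, 2)):
            best = max(best, size)
print(best)   # 2, e.g. the events (1, 4) and (5, 8)
```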

This way we do find the largest subset. But how long did it take? What is the relationship between the running time of this strategy and the number of events?

For a subset of size $\ell$, we need to check $\binom{\ell}{2}$ possible pairings to see if the subset contains any clashing events. There are $\binom{n}{\ell}$ many subsets of size $\ell$, and we need to check all possible subsets where $\ell$ ranges from $2$ to $n$. So in total, we take this many steps:

$$\sum_{\ell = 2}^{n} \binom{n}{\ell} \binom{\ell}{2}$$

Let’s say we want to lower bound the minimum number of steps needed. How do we even begin approximating this?

One way would be to lower bound the original summation:

$$\begin{align*} \sum_{\ell = 2}^n \binom{n}{\ell}\binom{\ell}{2} &\geq \sum_{\ell = 2}^n\left(\frac{n}{\ell}\right)^\ell \left(\frac{\ell}{2}\right)^2 \\ &\geq \left(\frac{n}{\sqrt{n}}\right)^{\sqrt{n}} \left(\frac{\sqrt{n}}{2}\right)^2 \\ &\geq \left(\sqrt{n}\right)^{\sqrt{n}} \cdot \frac{n}{4} \\ &\geq n^{\frac{\sqrt{n}}{2}} \\ &\geq 2^{\frac{\sqrt{n}}{2}} \end{align*}$$

Which is close to being exponential in $\sqrt{n}$. This means the strategy scales quite badly when we have more and more events.