Handout #52
November 14, 2008
CS103A
Robert Plummer

Combinatorics

This handout presents in prose form many of the principles and examples discussed in class.

Combinatorics is the study of counting, which is important in Computer Science in many ways:

- To understand the performance of algorithms, we need to count the steps they execute.
- We also need to count the amount of memory used as algorithms execute.
- Counting is important in the study of probability, which is used in many algorithms and games.
- Counting alternatives is often important in algorithm design.

Warm-up questions discussed in class:

Suppose there are 18 math majors and 200 CS majors at Stanford.

- How many ways are there to pick one representative who is either a math major or a CS major?
- How many ways are there to pick two representatives, so that one is a math major and one is a CS major?
- How many ways are there to pick two representatives, regardless of their majors?

Sum Rule and Product Rule

The Sum Rule: If a task can be accomplished by choosing one of the $n_A$ alternatives in set A or by choosing one of the $n_B$ alternatives in set B, and if the sets A and B are disjoint, then there are $n_A + n_B$ ways to accomplish the task. This can be generalized to any number of sets.

The Product Rule: If a task consists of a sequence of two subtasks, and there are $n_1$ ways to accomplish the first subtask, and for each of these there are $n_2$ ways to accomplish the second subtask, then there are $n_1 n_2$ ways to accomplish the overall task. This can be generalized to any number of subtasks.

Sets

Before proceeding, we will give a few definitions concerning sets:

A set is an unordered collection of distinct objects, which we call the elements of the set.
The set with no elements is called the empty set.
If A is a finite set, |A| denotes the number of elements in A, which is called the cardinality of A.
The union of sets A and B, denoted $A \cup B$, is the set of all elements in A or B.
The intersection of sets A and B, denoted $A \cap B$, is the set of all elements in both A and B.

Generalized Sum and Product Rules

The Sum Rule: If a task can be accomplished by choosing one of the alternatives from the sets $S_1, S_2, \ldots, S_m$, and these sets are pairwise disjoint (i.e., $S_i \cap S_j = \emptyset$ for all $i \neq j$), and $n_i$ is the number of elements in $S_i$, then the number of ways to accomplish the task is $n_1 + n_2 + \cdots + n_m$. Using the notation of set theory, we would write $|S_1 \cup S_2 \cup \cdots \cup S_m| = |S_1| + |S_2| + \cdots + |S_m|$ (where the sets are pairwise disjoint).

The Product Rule: If $E_1, E_2, \ldots, E_m$ is a sequence of events such that $E_1$ can occur in $n_1$ ways and, whenever $E_1, E_2, \ldots, E_{k-1}$ have occurred, $E_k$ can occur in $n_k$ ways, then there are $n_1 n_2 \cdots n_m$ ways in which the entire sequence of events can occur.

Examples from class:

- Suppose you are either going to go to an Italian restaurant that serves 15 entrées or to a French restaurant that serves 10 entrées. How many choices for an entrée do you have?
- Suppose you choose the French restaurant, and find out that the prix fixe menu is three courses, with a choice of 4 appetizers, the 10 entrées, and 5 desserts. How many different meals can you have?
- How many different three-letter uppercase initials are there (with repetition and without)?
- How many different three-letter uppercase initials are there that begin with the letter A?
- How many binary numbers of length 10 begin and end with a 1?
- How many strings of lowercase letters are there of length four or less?

You may be wondering why we care about counting at all as computer scientists. Well, take a look at the following:

What is the value of k after the following code has been executed?

    k = 0;
    for (i1 = 1; i1 <= n1; i1++)
        k = k + 1;
    for (i2 = 1; i2 <= n2; i2++)
        k = k + 1;
    for (i3 = 1; i3 <= n3; i3++)
        k = k + 1;

What is the value of k after the following code has been executed?
    k = 0;
    for (i1 = 1; i1 <= n1; i1++) {
        for (i2 = 1; i2 <= n2; i2++) {
            for (i3 = 1; i3 <= n3; i3++) {
                k = k + 1;
            }
        }
    }

More complex problems can involve using the sum and product rules together.

In one early version of BASIC, the name of a variable is a string of one or two alphanumeric characters, where uppercase and lowercase are not distinguished. (So much for meaningful variable names.) Alphanumeric means either one of the 26 English letters or one of the 10 digits. In addition, all variables must begin with a letter and must be different from the five reserved words. How many different variable names are possible in this (very simplistic) version of BASIC?

Solution: Let V equal the number of different variable names. Let $V_1$ be the number of variable names one character long, and $V_2$ be the number of variable names two characters long. So, by the Sum Rule, $V = V_1 + V_2$. $V_1$ must equal 26 since we can't start with a digit. By the Product Rule, there are 26 * 36 = 936 two-character strings that begin with a letter. But five of these strings are reserved and must be excluded, so we get $V_2$ = 26 * 36 - 5 = 931. Since $V = V_1 + V_2$, there are V = 26 + 931 = 957 different names.

In solving combinatorial problems, we must be careful not to double count. Consider the following:

How many binary numbers of length eight either start with a 1 or end with 00?

Solution: It follows from the Product Rule that there are $2^7 = 128$ ways to construct an 8-digit binary number that starts with a 1, since there is 1 way to choose the first digit and 2 ways to choose each of the other 7 digits. Likewise, there are $2^6 = 64$ ways to construct an 8-digit number ending with 00. The answer to the question is not 128 + 64, however, because we would be counting some numbers twice. To get the correct answer, we observe that there are $2^5 = 32$ ways to construct an 8-digit number that starts with 1 and ends with 00, and that these are exactly the numbers that are double counted in our sum. So the number of 8-digit binary numbers that start with 1 or end with 00 is 128 + 64 - 32 = 160.
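The count of 160 in the last example is small enough to confirm by brute force. The following is only an illustrative sketch, not part of the original handout; the function names are made up for this example. One function enumerates all binary strings of length eight and tests each one, the other applies the corrected sum from the solution above.

```python
from itertools import product

def count_by_enumeration(length=8):
    """Count binary strings of the given length that start with '1' or end with '00'."""
    strings = ("".join(bits) for bits in product("01", repeat=length))
    return sum(1 for s in strings if s.startswith("1") or s.endswith("00"))

def count_by_formula(length=8):
    """Add the two overlapping counts, then subtract the strings counted twice."""
    starts_with_1 = 2 ** (length - 1)   # first digit fixed
    ends_with_00 = 2 ** (length - 2)    # last two digits fixed
    both = 2 ** (length - 3)            # first digit and last two digits fixed
    return starts_with_1 + ends_with_00 - both

print(count_by_enumeration(), count_by_formula())
```

Both functions should agree for any length of three or more, which is a quick way to catch a double-counting mistake in a hand computation.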
The Principle of Inclusion-Exclusion

We often have to solve problems that involve finding the number of elements in the union of two sets, and as we saw above, we have to watch out for double counting:

What is $|A \cup B|$?

[Venn diagram of two overlapping sets A and B, with |A| = 8 and |B| = 6. |A| + |B| = 14, which double counts the intersection.]

Since adding the sizes of the two sets gives an answer that double counts the intersection, we can get the correct answer by subtracting the size of the intersection:

$|A \cup B| = |A| + |B| - |A \cap B|$

This is known as the Principle of Inclusion-Exclusion. Let's extend this to problems involving three sets. Again we will start by adding the number of elements in each set, and we'll note how many times each element in the union is counted:

What is $|A \cup B \cup C|$?

[Venn diagram of three overlapping sets A, B, and C, showing how many times each region is counted by |A| + |B| + |C|: once for each region in exactly one set, twice for each region in exactly two sets, and 3 times for the region in all three sets.]

Again we can correct the formula, but if we eliminate the double counting of the intersections of the pairs of sets, we will eliminate all counting of the intersection of the three sets, so that intersection must be added back in. So the correct formula is:

$|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|$

A survey of 200 TV viewers found that 110 watch sports, 120 watch comedy, 85 watch drama, 50 watch drama and sports, 70 watch comedy and sports, 55 watch comedy and drama, and 30 watch all three. How many people watch sports, comedy, or drama? How many do not watch any of these categories?

We can extend the Principle of Inclusion-Exclusion to any number of sets. Here is the case for four:

[Venn diagram of four overlapping sets A, B, C, and D.]

$|A \cup B \cup C \cup D| = |A| + |B| + |C| + |D|$
$\quad - |A \cap B| - |A \cap C| - |A \cap D| - |B \cap C| - |B \cap D| - |C \cap D|$
$\quad + |A \cap B \cap C| + |A \cap B \cap D| + |A \cap C \cap D| + |B \cap C \cap D|$
$\quad - |A \cap B \cap C \cap D|$

We have to add on the sizes of all the intersections of three sets, but a final correction is needed so that we don't double count the intersection of all four. This pattern continues as we go to higher numbers of sets (and the Venn diagrams get really hard to draw!). BTW, can you draw a Venn diagram for four sets using circles rather than ellipses?
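The three-set formula can be checked mechanically against a direct computation of the union. The sketch below is illustrative only: the sets A, B, and C are made-up data chosen so that every region of the Venn diagram is non-empty, and the function name is hypothetical.

```python
def union_size_three(a, b, c):
    """Size of the union of three sets, computed by inclusion-exclusion."""
    return (len(a) + len(b) + len(c)
            - len(a & b) - len(a & c) - len(b & c)
            + len(a & b & c))

# Made-up example sets (an assumption for illustration, not data from the handout).
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7}
C = {1, 5, 7, 8, 9}

# The formula and the direct union should give the same size.
print(union_size_three(A, B, C), len(A | B | C))
```

Dropping the final `+ len(a & b & c)` term makes the two numbers disagree, which shows why the triple intersection has to be added back.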
The Pigeonhole Principle

Suppose a bunch of pigeons fly into a bunch of pigeonholes to roost. The Pigeonhole Principle states that if there are more pigeons than pigeonholes, then there must be at least one pigeonhole with at least two pigeons in it. Seems obvious, and fortunately for us, we can apply this observation to objects besides just pigeons.

The Pigeonhole Principle: If k+1 or more objects are placed in k boxes, then there is at least one box containing two or more of the objects.

Proof by Contradiction: Suppose that none of the k boxes has more than one object. Then the total number of objects would be at most k. This is a contradiction since we have k+1 or more objects.

Here are some applications of this principle:

1) Among any group of 367 people, there must be at least two with the same birthday since there are only 366 possible birthdays.
2) In any group of 27 English words, there must be at least two that start with the same letter, since there are 26 letters in the alphabet.

All of this may seem painfully obvious to you, but this really is a useful tool once we generalize it:

The Generalized Pigeonhole Principle: If N objects are placed in k boxes, then there is at least one box containing at least ceil(N/k) objects.

Recall: The ceiling function ceil(x) assigns to the real number x the smallest integer that is greater than or equal to x.

Proof by Contradiction: Suppose that none of the k boxes contains more than ceil(N/k) - 1 objects. Then in the k boxes we have

    total number of objects <= k(ceil(N/k) - 1)

Using the inequality ceil(N/k) < (N/k) + 1, we have

    k(ceil(N/k) - 1) < k(((N/k) + 1) - 1)

Simplifying the last term and putting these inequalities together, we have

    total number of objects < k(N/k), or total number of objects < N

This is a contradiction since there are a total of N objects.

Here are some more applications of the generalized version:

1) Among 100 people there are at least ceil(100/12) = 9 who were born in the same month.
2) What is the minimum number of students required in a class to be sure that at least six will receive the same grade, given the five possible grades A, B, C, D, NP?

Now that you have the basic idea down, we can look at a more elegant application of this principle. This example illustrates an important area of combinatorics called Ramsey Theory, which deals with the distribution of subsets of elements of sets.

Assume that in a group of six people, each pair of individuals consists of two friends or two enemies (that is, any two individuals are either friends or enemies--there are no other relationships). Show that there are either three mutual friends or three mutual enemies in the group.

Solution from K. Rosen, Discrete Mathematics: Let A be one of the six people. Of the five other people in the group, there are either three or more who are friends of A or three or more who are enemies of A. This follows from the generalized pigeonhole principle, since when five objects are divided into two sets, one of the sets has at least ceil(5/2) = 3 elements. In the former case, suppose B, C, and D are friends of A. If any two of these three individuals are friends, then these two and A form a group of three mutual friends. Otherwise B, C, and D form a group of three mutual enemies. The proof in the latter case, where there are three or more enemies of A, proceeds in a similar manner.

Permutations

A permutation of a set of objects is an ordering of the objects. For example, the set of elements {a, b, c} can be ordered in the following ways:

    abc  acb  cba  bac  bca  cab

giving us six possible permutations. In general, given a set of n objects, how many permutations does the set have? Imagine forming a permutation as an n-step process:

step 1: choose an element to put in position 1
step 2: choose an element to put in position 2
...
step n: choose an element to put in position n

This is similar to our application of the product rule when certain constraints (like no repetition) are applied. There are n ways to do the first step, n-1 ways to do the second step (since one element was used in the first step), n-2 ways to do the third step, etc. By the time we get to the nth step, there is only one element left. So, by the product rule we get:

    n * (n-1) * (n-2) * ... * 2 * 1 = n!

For any integer n with n >= 1, the number of permutations of a set with n elements is n!.

- How many ways can the letters in the word "boinga" be arranged in a row?
- How many ways can the letters in the word "boinga" be arranged if the letters "bo" must remain next to each other (in order) as a unit?

Now, what if we introduce the idea of ordering objects into a circular arrangement? This adds a little twist to things. Say we have to seat representatives of six countries. An old trick of diplomacy is to use a circular table so there are no ends of the table which might confer a particular status. How many different ways can we seat these representatives?

We will name our representatives A, B, C, D, E and F. Since only relative position matters, start with one of them, say A, and place that person anywhere, say in the top seat in the following diagram.

[Diagram: a circular table with A in the top seat.]

B through F can be arranged in the seats around A in all possible orders. So there are 5! = 120 ways to seat the group.

r-Permutations

Another twist on this permutation idea: say we want to determine the number of ways to select some number of ordered elements from a set. For example, given the set {a, b, c}, in how many different orders can we select two letters from the set?

    ab  ac  ba  ca  cb  bc

Each such ordering of 2 elements is called a 2-permutation of the set {a, b, c}.

An r-permutation of a set of n elements is an ordered selection of r elements taken from the set of n elements. The number of r-permutations of a set of n elements is denoted P(n, r).
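The six 2-permutations of {a, b, c} listed above can also be enumerated mechanically. This is a minimal sketch using `itertools.permutations` from the Python standard library; the helper name `r_permutations` is hypothetical and not part of the handout.

```python
from itertools import permutations

def r_permutations(elements, r):
    """Return every ordered selection of r distinct elements, joined into a string."""
    return ["".join(p) for p in permutations(elements, r)]

two_perms = r_permutations("abc", 2)
print(two_perms)       # the same six orderings as the list above
print(len(two_perms))  # how many 2-permutations there are
```

Counting the results for other small sets is a quick way to check a hand computation of P(n, r).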
If n and r are integers and 1 <= r <= n, then P(n, r) = n! / (n - r)!.

The proof of this formula is fairly straightforward. Here is the basic idea: Suppose a set of n elements is given. Formation of an r-permutation can be thought of as an r-step process:

step 1: choose an element to be first (there are n ways to do this)
step 2: choose an element to be second (there are n-1 ways to do this)
step 3: choose an element to be third (there are n-2 ways to do this)
...
step r: choose an element to be rth (there are n-(r-1) or n-r+1 ways to do this)

and we have thus finished forming an r-permutation. It follows from the product rule that the number of ways to form an r-permutation is n * (n-1) * (n-2) * ... * (n - r + 1). How do we get n! / (n - r)! from this result?

- How many different ways can three of the letters of the word "blurp" be chosen and written in a row?
- How many different ways can this be done if "b" must be the first letter?

Combinations

Consider the following: Suppose 5 members of a group of 12 are to be chosen as a team to work on a project. How many distinct 5-person teams can be selected?

Or in general: Given a set S with n elements, how many subsets of size r can be chosen from S? Each individual subset of S of size r is called an r-combination.

Suppose n and r are non-negative integers with r <= n. An r-combination of a set of n elements is a subset of r of the n elements. The symbol $\binom{n}{r}$ (which we read as "n choose r") denotes the number of subsets of size r that can be chosen from the n elements. This is also denoted C(n, r).

If we are going to select elements out of a set, we now have two methods to apply. We could do an ordered selection, where we are interested not only in the elements chosen, but also in the order in which the elements are chosen. This is our definition of an r-permutation. Or, we could do an unordered selection, where we are interested only in the elements chosen, and we don't care about the order.
This is what we mean by an r-combination.

How many unordered selections of 2 elements can be made from {0, 1, 2, 3}? In other words, what is C(4, 2)?

So how do we calculate C(n, r)? In order to answer this, we need to analyze the relationship between r-permutations and r-combinations. We'll do this with a simple example:

Write all 2-permutations of the set {0, 1, 2, 3}. Then, find an equation relating P(4, 2) and C(4, 2).

The reasoning we apply in this example works in the general case. To form an r-permutation of a set of n elements, first choose a subset of r of the n elements (there are C(n, r) ways to do this). Then, choose an ordering for the r elements (there are r! ways to do this). Then, the number of r-permutations is P(n, r) = C(n, r) * r!. If we solve for C(n, r) we get C(n, r) = P(n, r) / r!. We know that P(n, r) = n! / (n - r)!, so substitution gives us:

    C(n, r) = n! / (r! * (n - r)!)

assuming n and r are nonnegative and r <= n.

Now, we can find the answer to the question that began this section:

Suppose 5 members of a group of 12 are to be chosen as a team to work on a project. How many distinct 5-person teams can be selected?

We need to calculate C(12, 5) = 12! / (5! * (12 - 5)!). The best way to solve this is not by multiplying all the factorials out, even though it's pretty easy to punch these into a calculator. Instead we do this:

    (12 * 11 * 10 * 9 * 8 * 7!) / ((5 * 4 * 3 * 2 * 1) * 7!)

The 7! terms cancel; the 5 * 2 in the denominator cancels the 10 on top; the 4 * 3 in the denominator cancels the 12 on top; and we are left with 11 * 9 * 8 = 792.

Here's an even easier way: with Google, search on "12 choose 5". Yes, Google does discrete math!

As usual in the combinatorial universe, we can come up with all kinds of special situations. So, let's say that Fred and Ginger (2 of the 12 people in the above example) simply must work together or they will make everyone else's lives miserable.
Thus, any team of 5 that we form must either include both of them or neither of them (the latter being preferable to the other 10 people). Now how many 5-person teams can be formed?

Here is a diagram of the situation:

[Diagram: the set of all possible 5-person teams, containing two separate subsets: teams with both Fred and Ginger, and teams with neither Fred nor Ginger.]

A team with Fred and Ginger contains exactly three other people from the remaining ten. So there are as many such teams as there are 3-person subsets of the remaining ten: C(10, 3) = 120. The number of teams with neither of them is just C(10, 5) = 252. Add them together and we get 372 possible teams.

Now suppose Fred and Ginger have a terrible fight, and simply refuse to work on the same team. How many 5-person teams can be formed?

Binomial Coefficients and the Binomial Theorem

Recall that we can denote the number of r-combinations as C(n, r) or $\binom{n}{r}$. This value (no matter how we denote it) is called a binomial coefficient. We use this term because the numbers occur as coefficients in the expansion of powers of binomial expressions such as $(a + b)^n$. What is a binomial expression? It is an expression that is the sum of two terms, like x + y (these terms can be products of constants and variables, but that is not relevant here).

First, let's see why binomial coefficients are even involved in this area. Think about the expansion of $(x + y)^3$. We could just sit down and multiply it all out, or we could be clever little combinatoric wizards and make the following observation:

When (x+y)(x+y)(x+y) is expanded, all the products of a term in the first sum, a term in the second sum, and a term in the third sum are added. Terms of the form $x^3$, $x^2 y$, $x y^2$, and $y^3$ arise. To obtain a term of the form $x^3$, an x must be chosen from each of the three sums, and this can be done in only one way. Thus, $x^3$ must have a coefficient of 1. To obtain a term of the form $x^2 y$, we need one y chosen from one of the three sums (and two x's from the other two sums). The number of such terms must be C(3,1) since it is the number of 1-combinations of 3 objects.
To obtain a term of the form $x y^2$, we need two y's chosen from two of the three sums (and one x from the remaining sum). The number of such terms must be C(3,2). The reasoning for $y^3$ is the same as that for $x^3$, and we get:

$(x + y)^3 = x^3 + 3x^2 y + 3x y^2 + y^3$

When we generalize this result, we arrive at the binomial theorem, which gives the coefficients of the expansion of powers of binomial expressions.

The Binomial Theorem:

$(x + y)^n = \sum_{j=0}^{n} C(n, j) \, x^{n-j} y^j$

What is the coefficient of $x^{12} y^{13}$ in the expansion of $(x + y)^{25}$?

There are some important properties of binomial coefficients which we will present. The first comes from a very important mathematician of the 17th century by the name of Pascal:

Pascal's Identity: Let n and k be positive integers with n >= k. Then:

    C(n+1, k) = C(n, k-1) + C(n, k)

How would you prove this?

Pascal's Identity is the basis for an interesting geometric arrangement of the binomial coefficients into a triangle. The nth row of the triangle consists of the binomial coefficients C(n, k) for k from 0 to n.

                      C(0,0)
                  C(1,0)  C(1,1)
              C(2,0)  C(2,1)  C(2,2)
          C(3,0)  C(3,1)  C(3,2)  C(3,3)
                        ...

This is known as Pascal's Triangle. Pascal's Identity shows that when two adjacent binomial coefficients are added, we get the one that lies between them in the next row. For example, C(2,0) + C(2,1) = C(3,1) (1 + 2 = 3). This triangle turns out to be a useful little calculator for binomial coefficients.

Another useful identity concerning binomial coefficients:

$\sum_{k=0}^{n} C(n, k) = 2^n$

How would you prove this identity?

Permutations and Combinations with Repetition

All the permutation and combination problems we have seen thus far did not have any repeated elements. This is a serious constraint since in many counting problems, elements may be used repeatedly. For example, if we could not re-use letters and numbers on license plates, we would have much less of a traffic problem than we do now.
Let's start with permutations: How many strings of length n can be formed from the English alphabet? By the product rule, since there are 26 letters and since each letter can be used repeatedly, we see there are $26^n$ strings of length n. This simple example can be generalized to a formula:

The number of r-permutations of a set of n objects with repetition allowed is $n^r$.

As for combinations, consider the following example (from Rosen):

How many ways are there to select four pieces of fruit from a bowl containing apples, oranges, and pears if the order in which the fruit is selected does not matter, only the type of fruit and not the individual piece matters, and there are at least four of each type of fruit in the bowl?

Well, one way to solve this is to just go diving for fruit:

    4 apples                      4 oranges                     4 pears
    3 apples, 1 orange            3 apples, 1 pear              3 oranges, 1 apple
    3 oranges, 1 pear             3 pears, 1 apple              3 pears, 1 orange
    2 apples, 2 oranges           2 apples, 2 pears             2 oranges, 2 pears
    2 apples, 1 orange, 1 pear    2 oranges, 1 apple, 1 pear    2 pears, 1 apple, 1 orange

There are 15 ways, and it turns out that the solution is actually the number of 4-combinations with repetition allowed fr...