Algebra I – MATH235
Course Notes by Dr. Eyal Goren
McGill University
Fall 2007
Last updated: December 3, 2007.
© All rights reserved to the author, Eyal Goren, Department of Mathematics and Statistics,
McGill University.

Contents
1. Introduction

Part 1. Some Language and Notation of Mathematics
2. Sets
2.1.
3. Proofs: idea and technique
3.1. Proving equality by two inequalities
3.2. Proof by contradiction and the contrapositive
3.3. Proof by Induction
3.4. Prove or disprove
3.5. The pigeonhole principle
4. Functions
4.1. Injective, surjective, bijective, inverse image
4.2. Composition of functions
4.3. The inverse function
5. Cardinality of a set
6. Number systems
6.1. The polar representation
6.2. The Fundamental Theorem of Algebra
7. Fields and rings: definitions and first examples
7.1. Some formal consequences of the axioms

Part 2. Arithmetic in Z
8. Division
9. GCD and the Euclidean algorithm
9.1. GCD
9.2. The Euclidean algorithm
10. Primes and unique factorization
10.1. Further applications of the Fundamental Theorem of Arithmetic

Part 3. Congruences and modular arithmetic
11. Relations
12. Congruence relations
12.1. Fermat’s little theorem
12.2. Solving equations in Z/nZ
12.3. Public key cryptography; RSA method

Part 4. Polynomials and their arithmetic
13. The ring of polynomials
14. Division with residue
15. Arithmetic in F[x]
15.1. Some remarks about divisibility in a commutative ring T
15.2. GCD of polynomials
15.3. The Euclidean algorithm for polynomials
15.4. Irreducible polynomials and unique factorization
15.5. Roots
15.6. Eisenstein’s criterion
15.7. Roots of polynomials in Z/pZ

Part 5. Rings
16. Some basic definitions and examples
17. Ideals
18. Homomorphisms
18.1. Units
19. Quotient rings
19.1. The quotient ring F[x]/(f(x))
19.2. Every polynomial has a root in a bigger field
19.3. Roots of polynomials over Z/pZ
20. The First Isomorphism Theorem
20.1. Isomorphism of rings
20.2. The First Isomorphism Theorem
20.3. The Chinese Remainder Theorem
21. Prime and maximal ideals

Part 6. Groups
22. First definitions and examples
22.1. Definitions and some formal consequences
22.2. Examples
22.3. Subgroups
23. The permutation and dihedral groups
23.1. Permutation groups
23.2. Cycles
23.3. The Dihedral group
24. The theorem of Lagrange
24.1. Cosets
24.2. Lagrange’s theorem
25. Homomorphisms and isomorphisms
25.1. Homomorphisms of groups
25.2. Isomorphism
26. Group actions on sets
26.1. Basic definitions
26.2. Basic properties
26.3. Some examples
27. The Cauchy-Frobenius Formula
27.1. Some applications to Combinatorics
28. Cauchy’s theorem: a wonderful proof
29. The first isomorphism theorem for groups
29.1. Normal subgroups
29.2. Quotient groups
29.3. The first isomorphism theorem
29.4. Groups of low order
29.5. Odds and evens
29.6. Odds and Ends
Index

1. Introduction
The word “algebra” is derived from the title of a book, Hisab al-jabr w’al-muqabala, written by
the Farsi scholar Abu Ja’far Muhammad ibn Musa Al-Khwarizmi (790-840). The word al-jabr
itself comes from the root of reunite (in the sense of completing or putting together) and refers
to one of the methods of solving quadratic equations (by completing the square) described in
the book. The book can be considered as the first treatise on algebra. The word algorithm is in
fact derived from the name Al-Khwarizmi. The book was very much concerned with recipes for
known practical problems; Al-Khwarizmi intended to teach (in his own words) “... what is easiest
and most useful in arithmetic, such as men constantly require in cases of inheritance, legacies,
partition, lawsuits, and trade, and in all their dealings with one another, or where the measuring
of lands, the digging of canals, geometrical computations, and other objects of various sorts and
kinds are concerned.” 1
MATH 235 is a first course in Algebra. Little is assumed in the way of background. Though the
course is self-contained, it puts some of the responsibility of digesting and exploring the material
on the student, as is normal in university studies. You’ll soon realize that we are also learning
a new language in this course and a new attitude towards mathematics. The language is the
language of modern mathematics; it is very formal, precise and concise. One of the challenges of
the course is digesting and memorizing the new concepts and deﬁnitions. The new attitude is an
attitude where any assumptions one is making while making an argument have to be justiﬁed,
or at least clearly stated as a postulate, and from there on one proceeds in a logical and clear
manner towards the conclusion. This is called “proof” and one of the main challenges in the
course is to understand what constitutes a good proof and to be able to write proofs yourself. A
further challenge for most students is that the key ideas we learn in this course are very abstract,
bordering on philosophy and art, but yet they are truly scientiﬁc in their precision. You should
expect not to understand everything right away; you should expect to need time to reﬂect on the
meaning of the new ideas and concepts we introduce. Here are some pointers as to how to cope
with the challenges of this course:
• Read the class notes and the textbook over and over again. Try to give yourself
examples of the theorems and propositions and try to provide counterexamples when
some of the hypotheses are dropped.
• Do lots and lots of exercises. The more, the better.
• Explain to your friends, and possibly to your family, the material of the course. Work
together with your classmates on assignments, but write your own solution by yourself
in the end; try to understand different solutions to assignments and try to find flaws in
your classmates’ solutions.
• Use the instructor’s and the TA’s oﬃce hours, as well as the math help center, to quickly
close any gap and clarify any point you’re not sure about.

1 Cited from http://wwwgroups.dcs.stand.ac.uk/ history/Biographies/AlKhwarizmi.html.

So what is this course really about?
We are going to start by learning some of the notation and language of mathematics. We are
going to discuss sets and functions and various properties and operations one can perform on
those. We are going to talk about proofs and some techniques of proof, such as “induction”
and “proving the contrapositive”. We are going to discuss different structures in which one
can do arithmetic, such as rational, real and complex numbers, polynomial rings and yet more
abstract systems called rings and ﬁelds. We are going to see unifying patterns of such systems
and some of their applications. A finite field, perhaps a seemingly esoteric beast initially, is a key
notion in modern-day computer science. It is a concept absolutely essential and fundamental
for cryptographic schemes as well as data management.
We are then going to do some really abstract algebra for a while, learning about rings and
homomorphisms and ideals. Our motivation is mostly to know how to construct a ﬁnite ﬁeld and
work with it in practice. We also lay the basis for further study in the following algebra courses.
The ﬁnal section of the course deals with groups and group actions on sets. After the previous
section on rings and ﬁelds we’ll feel more comfortable with the abstract notions of group theory.
Furthermore, there are going to be plenty of concrete examples in this section and applications
to problems in combinatorics and to the study of symmetries. Here is one concrete example:
imagine that a jewelry company wants, as publicity, to display all necklaces one can make using
10 diamonds and 10 rubies, perhaps under the slogan “to each its own”. In each necklace 10
diamonds and 10 rubies are to be used. The necklace itself is just round, with no hanging or
protruding parts. Thus, we can provide an example of such a necklace as
DDRRDRDRRRDDDDRRDRDR
(where the last R is adjacent to the ﬁrst D). Now, when we consider a particular design such as
above, we do want to identify it with the following design
DRRDRDRRRDDDDRRDRDRD
(we’ve put the ﬁrst D at the last spot), because this is just the same pattern; if the necklace is
put on a table, it is just rotating it a bit, or, alternately, looking at it from a slightly diﬀerent
angle. Also, note that the pattern
DDRRDRDRRRDDDDRRDRDR
is identiﬁed with
RDRDRRDDDDRRRDRDRRDD
which corresponds to ﬂipping over the necklace. Now the question is how many rubies and
diamonds we need to purchase in order to make all the diﬀerent designs? It turns out that this
can be approached using the theory of group actions on sets and a general formula we’ll develop
can be applied here. It turns out that there are 4752 such different designs; that will require 47520
diamonds and 47520 rubies. Perhaps the idea should be reconsidered ;)

Part 1. Some Language and Notation of Mathematics

2. Sets
A set is a collection of elements. The notion of a set is logically not quite deﬁned (what’s a
“collection”? an “element”?) but, hopefully, it makes sense to us. What we have is the ability
to say whether an element is a member of a set or not. Thus, in a sense, a set is a property, and
its elements are the objects having that property.
There are various ways to deﬁne sets:
(1) By writing it down:
S = {1, 3, 5}.
The set is named S and its elements are 1, 3, 5. The use of curly brackets is mandatory!
Another example is
T = {2, 3, Jim’s football}.
This is a set whose elements are the numbers 2, 3 and Jim’s football. It is assumed here
that “Jim” refers to one particular individual.
A set can also be given as all objects with a certain property:
S_1 = {all beluga whales}.
Another example is
T_5 = {n : n is an odd integer, n^3 = n}.
The colon means that the part that follows is the list of properties n must satisfy, i.e.
the colon is shorthand for “such that”. Note that this set is equal to the set
U^+ = {n : n^2 = 1}.
Our eccentric notation T_5, S_1, U^+ is just to make a point that a set can be denoted in
many ways.
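As a quick sanity check of the claim that T_5 and U^+ are the same set, set-builder notation can be imitated with a Python set comprehension. This is only a sketch: a program cannot run through all integers, so the hypothetical constant WINDOW restricts n to a finite range. That restriction is an assumption of the example, not part of the definitions.

```python
# Finite window of integers standing in for Z (an assumption of this sketch).
WINDOW = range(-100, 101)

# T_5 = {n : n is an odd integer, n^3 = n}
T5 = {n for n in WINDOW if n % 2 != 0 and n**3 == n}

# U^+ = {n : n^2 = 1}
U_plus = {n for n in WINDOW if n**2 == 1}

print(T5, U_plus, T5 == U_plus)
```

The part after `if` plays the role of the list of properties that follows the colon.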
(2) Sometimes we write a set where the description of its elements is implicit, to be understood by the reader. For example:
N = {0, 1, 2, 3, . . . },
Z = {. . . , −2, −1, 0, 1, 2, . . . },
and
Q = {a/b : a, b ∈ Z, b ≠ 0}.
Thus N is the set of natural numbers, Z is the set of integers and Q the set of rational
numbers. The use of the letters N, Z, Q (and R, C below) is standard. Other standard
notation is
R = the set of real numbers ( = points on the line),
and the complex numbers
C = {a + bi : a, b ∈ R}.
Here i is the imaginary number satisfying i^2 = −1 (we’ll come back to that in §6).
Note that we sneaked in new notation. If A is a set, the notation x ∈ A means x
is an element (a member) of A, while x ∉ A means that x is not an element of A.
Thus, C = {a + bi : a, b ∈ R} is saying C is the set whose elements are a + bi, where a and b
are real numbers. For example, in the notation above, 3 ∈ S, 2 ∉ S, Jim’s football ∈ T
but ∉ U^+.
We haven’t really deﬁned any of these sets rigorously. We have assumed that the
reader understands what we mean. This suﬃces for the level of this course. A rigorous
treatment is usually given in a logic course (usually for N) or in an analysis course (for R). The
set of real numbers can also be thought of as the set of all numbers written in a, possibly
infinite, decimal expansion. Thus, 1, 2, 1/3 = 0.33333 . . . and indeed any rational number
are elements of R as are π = 3.1415926 . . . , e = 2.718281 . . . , √2 = 1.414 . . . and so on.
We say that A ⊆ B (A is contained in B ), or simply A ⊂ B , if every element of A is an element
of B . For example N ⊂ Z. We say that A = B if A is equal to B , that is, if every element of A is
an element of B and vice versa. Note that A = B holds precisely when both A ⊂ B and B ⊂ A.
The notation ∅ stands for the empty set. It is a set but it has no elements. We let
A ∩ B = {x : x ∈ A and x ∈ B }
be the intersection of A and B , and
A ∪ B = {x : x ∈ A or x ∈ B }
be the union of A and B. For example, {1, 3} ∩ {n : n^2 = n} = {1}, N ∩ {x : −x ∈ N} =
{0}, S_1 ∩ T_5 = ∅.
We shall also need arbitrary unions and intersections. Let I be a set (thought of as index set)
and suppose for each i ∈ I we are given a set A_i. Then
∩_{i∈I} A_i = {x : x ∈ A_i, ∀i},
(∀ means “for all”) and
∪_{i∈I} A_i = {x : x ∈ A_i, for some i}.
For example, define for i ∈ N,
A_i = {x ∈ N : x ≥ i}
(so A_0 = {0, 1, 2, 3, . . . }, A_1 = {1, 2, 3, 4, . . . }, A_2 = {2, 3, 4, 5, . . . } and so on). Then
∪_{i∈N} A_i = N,
while
∩_{i∈N} A_i = ∅.
Here’s another example: for every real number x, 0 ≤ x ≤ 1, define
S_x = {(x, y) : 0 ≤ y ≤ x}.
Then
∪_{0≤x≤1} S_x is the triangle in the plane whose vertices are (0, 0), (1, 0), (1, 1).
Yet another operation on sets is the difference of sets:
A \ B = {x : x ∈ A, x ∉ B}.
We remark that A ∪ B = B ∪ A, A ∩ B = B ∩ A but usually A \ B is not equal to B \ A.
Another definition we shall often use is that of the cartesian product. Let A_1, A_2, . . . , A_n be
sets. Then
A_1 × A_2 × · · · × A_n = {(x_1, x_2, . . . , x_n) : x_i ∈ A_i}.
In particular, A × B = {(a, b) : a ∈ A, b ∈ B}.
Example 2.1. Let A = {1, 2, 3}, B = {1, 2}. Then
A × B = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)}.
Note that (3, 1) ∈ A × B but (1, 3) ∉ A × B, because although 1 ∈ A, 3 ∉ B.
Example 2.2. Let A = B = R. Then
A × B = {(x, y) : x ∈ R, y ∈ R}.
This is just the presentation of the plane in cartesian coordinates (which explains why we call
such products “cartesian” products).
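For finite sets the cartesian product can be listed mechanically. The sketch below redoes Example 2.1 with `itertools.product` from the Python standard library; the variable name `AxB` is introduced only for this illustration.

```python
from itertools import product

A = {1, 2, 3}
B = {1, 2}

# A × B as a set of ordered pairs (a, b) with a in A and b in B.
AxB = set(product(A, B))

print(sorted(AxB))
print((3, 1) in AxB)   # membership test for the pair (3, 1)
print((1, 3) in AxB)   # membership test for the pair (1, 3)
```

Note that the order inside a pair matters, which is why the two membership tests can give different answers.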
2.1. We now wish to make some general statements relating some of these operations. Once a
statement is important enough to highlight it, it falls under the heading of Lemma, Proposition
or Theorem (or, more colloquially, Claim, Assertion and so on). Usually, “Lemma” is reserved for
technical statements often to be used in the proof of a proposition or a theorem. “Proposition”
and “Theorem” are more or less the same. They are used for claims that are more conceptual,
or central, with “Theorem” implying even more importance. However, none of these rules is
absolute.
Proposition 2.3. Let I be a set. Let A be a set and B_i, i ∈ I, be sets as well. Then
A ∩ (∪_{i∈I} B_i) = ∪_{i∈I} (A ∩ B_i),
and
A ∪ (∩_{i∈I} B_i) = ∩_{i∈I} (A ∪ B_i).
Furthermore,
A \ (∪_{i∈I} B_i) = ∩_{i∈I} (A \ B_i),
and
A \ (∩_{i∈I} B_i) = ∪_{i∈I} (A \ B_i).

3. Proofs: idea and technique
Proposition 2.3 is not obvious. It is not even clear at ﬁrst sight whether it’s correct. For that
reason we insist on proofs in mathematics. Proofs give us conﬁdence that we are making true
statements and they reveal, to a lesser or greater extent, why the statements hold true. The proof
should demonstrate that the statements made are true. In fact, a few decades ago, people were
using the word “demonstration” instead of “proof” and the famous QED at the end of a proof
stands for “quod erat demonstrandum”, meaning “which was to be demonstrated”. We shall
prove some of the statements in the proposition now, leaving the rest as an exercise. Our method
of proof for Proposition 2.3 is by a standard technique.
3.1. Proving equality by two inequalities. When one wants to show two real numbers x, y
are equal, it is often easier to show instead that x ≤ y and y ≤ x and to conclude that x = y.
In the same spirit, to show two sets A and B are equal, one may show that every element
of A is an element of B and that every element of B is an element of A. That is, we prove two
“inequalities”, A ⊂ B and B ⊂ A. Thus, our principle of proof is
A = B if and only if x ∈ A ⇒ x ∈ B and x ∈ B ⇒ x ∈ A.
(The notation ⇒ means “implies that”.)
Let us now prove the statement A ∩ (∪_{i∈I} B_i) = ∪_{i∈I} (A ∩ B_i). The way you should write it in an
assignment, test, or a research paper is:
Proposition 3.1. Let I be a set. Let A and B_i, i ∈ I, be sets. Then
A ∩ (∪_{i∈I} B_i) = ∪_{i∈I} (A ∩ B_i).
Proof. Let x ∈ A ∩ (∪_{i∈I} B_i). Then x ∈ A and x ∈ ∪_{i∈I} B_i. That is, x ∈ A and x ∈ B_{i_0} for some
i_0 ∈ I. Then x ∈ A ∩ B_{i_0} and so x ∈ ∪_{i∈I} (A ∩ B_i). We have shown so far that A ∩ (∪_{i∈I} B_i) ⊂
∪_{i∈I} (A ∩ B_i).
Conversely, let x ∈ ∪_{i∈I} (A ∩ B_i). Then there is some i_0 ∈ I such that x ∈ A ∩ B_{i_0} and so for that
i_0 we have x ∈ A and x ∈ B_{i_0}. In particular, x ∈ A and x ∈ ∪_{i∈I} B_i and so x ∈ A ∩ (∪_{i∈I} B_i). □
(The symbol □ designates that the proof is complete. It is equivalent to writing QED.)
Let’s also do
A \ (∪_{i∈I} B_i) = ∩_{i∈I} (A \ B_i).
We use the same technique. Let x ∈ A \ (∪_{i∈I} B_i); thus x ∈ A and x ∉ ∪_{i∈I} B_i. That means
that x ∈ A and for all i ∈ I we have x ∉ B_i. That is, for all i ∈ I we have x ∈ A \ B_i and
so x ∈ ∩_{i∈I} (A \ B_i).
Conversely, let x ∈ ∩_{i∈I} (A \ B_i). Then, for all i ∈ I we have x ∈ A \ B_i. That is, x ∈ A and
x ∉ B_i for every i. Thus, x ∈ A and x ∉ ∪_{i∈I} B_i and it follows that x ∈ A \ (∪_{i∈I} B_i).
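A proof settles the matter for all sets at once, but it can be reassuring to test the four identities of Proposition 2.3 on small examples. The following sketch does so for randomly chosen finite sets. The helper names `check_proposition` and `random_subset`, the ten-element universe and the number of trials are all choices made for this illustration, and the index set I is assumed to be finite and nonempty. Passing such a test is of course no substitute for the proof.

```python
import random

def check_proposition(A, family):
    """Return True if all four identities hold for A and a nonempty list of sets."""
    union = set().union(*family)
    inter = set.intersection(*family)
    return (
        A & union == set().union(*(A & B for B in family))
        and A | inter == set.intersection(*(A | B for B in family))
        and A - union == set.intersection(*(A - B for B in family))
        and A - inter == set().union(*(A - B for B in family))
    )

random.seed(0)
universe = range(10)

def random_subset():
    """A random subset of the small universe above."""
    return {x for x in universe if random.random() < 0.5}

all_ok = all(
    check_proposition(random_subset(),
                      [random_subset() for _ in range(random.randint(1, 4))])
    for _ in range(1000)
)
print(all_ok)
```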
3.2. Proof by contradiction and the contrapositive. Proof by contradiction is a very useful
technique, even though using it too often shows lack of deeper understanding of the subject.
Suppose that some statement is to be proven true. In this technique one assumes that the
statement is false and then proceeds to derive logical consequences of this assumption until an
obvious contradiction arises. Here is an easy example that illustrates this:
Claim. There are no solutions to the equation x^2 − y^2 = 1 in positive integers.
Proof. Assume not. Then there are positive integers x, y such that x^2 − y^2 = 1. Then (x −
y)(x + y) = 1. However, the only products of two integers giving 1 are 1 × 1 and (−1) × (−1) and, in either
case, it follows that x − y = x + y. It follows that 2y = (x + y) − (x − y) = 0 and so that y = 0.
Contradiction.
Here is another example.
Claim. If x and y are two integers whose sum is odd, then exactly one of them is odd.
Proof. Suppose not. Then, either both x and y are odd, or both x and y are even. In the ﬁrst
case x = 2a +1, y = 2b +1, for some integers a, b, and x + y = 2(a + b +1) is even – contradiction. In
the second case, x = 2a, y = 2b, for some integers a, b, and x + y = 2(a + b) is even – contradiction
again.
Somewhat related is the technique of proving the contrapositive. Let A and B be two assertions
and let ¬A, ¬B be their negations. Logically the implication
A =⇒ B
is equivalent to
¬B =⇒ ¬A.
Here is an example. Let A be the statement “it rains” and B the statement “it’s wet outside”.
Then ¬A is the statement “it doesn’t rain” and ¬B is the statement “it’s dry outside”. The
meaning of A =⇒ B is “it rains therefore it’s wet outside” and its contrapositive is “it’s dry
outside therefore it doesn’t rain”. Those statements are equivalent in the sense that each implies
the other. Here is a mathematical example:
Claim. If x and y are integers such that xy is even then either x or y is even.
Proof. The contrapositive is: If both x and y are odd then xy is odd. To prove that, write x =
2a + 1, y = 2b + 1 for some integers a, b. Then xy = 4ab + 2a + 2b + 1 = 2(2ab + a + b) + 1 is one
more than an even integer and so is odd.
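The three claims of this section can be spot-checked by a finite search. The sketch below is only that: the bound N is an arbitrary choice for the illustration, and a search that finds no counterexample below a bound proves nothing about larger integers, which is exactly why the proofs above are needed.

```python
from itertools import product

N = 50  # search bound, an arbitrary choice for this sketch

# Claim 1: x^2 - y^2 = 1 has no solutions in positive integers (checked up to N).
solutions = [(x, y) for x, y in product(range(1, N + 1), repeat=2)
             if x * x - y * y == 1]

# Claim 2: if x + y is odd then exactly one of x, y is odd.
claim2 = all((x % 2) + (y % 2) == 1
             for x, y in product(range(-N, N + 1), repeat=2)
             if (x + y) % 2 == 1)

# Claim 3: if xy is even then x or y is even.
claim3 = all(x % 2 == 0 or y % 2 == 0
             for x, y in product(range(-N, N + 1), repeat=2)
             if (x * y) % 2 == 0)

print(solutions, claim2, claim3)
```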
3.3. Proof by Induction. Induction is perhaps the most fun technique. Its logical foundations
also lie deeper than those of the previous methods. The principle of induction, to be explained below,
rests on the following axiom:
Axiom: Every nonempty subset of N has a minimal element.
We remark that the axiom is intuitively obvious. The reason we state it as an
axiom is that when one develops the theory of sets in a very formal way from fundamental axioms
the axiom stated above doesn’t follow from simpler axioms and, in one form or another, has to
be included as an axiom.
Theorem 3.2. (Principle of Induction) Suppose that for every natural number n ≥ n_0 we are
given a statement P_n. Suppose that we know that:
(1) P_{n_0} is true.
(2) If P_n is true then P_{n+1} is true.
Then P_n is true for every n ≥ n_0.
Proof. Suppose not. Then the set
S = {n ∈ N : n ≥ n_0, P_n is false}
is a nonempty set and therefore has a minimal element a. Note that a > n_0, because P_{n_0} is true
by (1), and so a − 1 ≥ n_0. Now a − 1 ∉ S because of the minimality of a and so P_{a−1} is true.
But then, by (2), also P_a is true. Contradiction.
Remark 3.3. The mental picture I have is of a staircase. The first step is marked n_0 and each
higher step by the integers n_0 + 1, n_0 + 2, n_0 + 3, . . . . I know I can reach that first step and I know
that if I can reach a certain step (say marked n) then I can reach the next one (marked n + 1).
I conclude that I can reach every step.
Example 3.4. Prove that for every positive integer n we have
1 + 2 + · · · + n = n(n + 1)/2.
(The statement P_n is 1 + 2 + · · · + n = n(n + 1)/2 and is made for n ≥ 1, that is n_0 = 1.)
The first case is n = 1 (this is called the base case of the induction). In this case we need to show
that 1 = 1·(1 + 1)/2, which is obvious.
Now, we assume that the statement is true for n, that is we assume that
1 + 2 + · · · + n = n(n + 1)/2,
and we need to show it’s true for n + 1. That is, we need to prove that
1 + 2 + · · · + n + (n + 1) = (n + 1)(n + 2)/2.
(By achieving that we would have shown that if P_n is true then P_{n+1} is true.) We use the
assumption that it’s true for n (that’s called the induction hypothesis) and write
1 + 2 + · · · + n + (n + 1) = n(n + 1)/2 + n + 1
= (n(n + 1) + 2(n + 1))/2
= (n^2 + 3n + 2)/2
= (n + 1)(n + 2)/2.
Example 3.5. Here we prove the following statement: Let q ≠ 1 be a real number. Then for
every n ∈ N we have
1 + q + · · · + q^n = (1 − q^{n+1})/(1 − q).
The statement P_n is “for every real number q ≠ 1, 1 + q + · · · + q^n = (1 − q^{n+1})/(1 − q)”. The base case, that is, the first n for which the statement is claimed true, is n = 0; the statement is then
1 = (1 − q)/(1 − q),
which is obviously true.
Now suppose that the statement is true for n. That is,
1 + q + · · · + q^n = (1 − q^{n+1})/(1 − q).
We need to show that
1 + q + · · · + q^{n+1} = (1 − q^{n+2})/(1 − q).
Indeed,
1 + q + · · · + q^{n+1} = (1 + q + · · · + q^n) + q^{n+1}
= (1 − q^{n+1})/(1 − q) + q^{n+1}
= (1 − q^{n+1})/(1 − q) + (1 − q)q^{n+1}/(1 − q)
= (1 − q^{n+1})/(1 − q) + (q^{n+1} − q^{n+2})/(1 − q)
= (1 − q^{n+2})/(1 − q).
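The two closed formulas proved by induction in Examples 3.4 and 3.5 can be compared with a term-by-term summation. The sketch below uses exact rational arithmetic (`fractions.Fraction` from the Python standard library) so that no rounding enters; the function names, the sample values of q and the ranges of n are choices made for this illustration only.

```python
from fractions import Fraction

def sum_first(n):
    """1 + 2 + ... + n, computed term by term."""
    return sum(range(1, n + 1))

def geometric_sum(q, n):
    """1 + q + ... + q^n, computed term by term."""
    return sum(q**k for k in range(n + 1))

# Example 3.4: compare with n(n + 1)/2 for n = 1, ..., 200.
ok_arithmetic = all(sum_first(n) == n * (n + 1) // 2 for n in range(1, 201))

# Example 3.5: compare with (1 - q^(n+1))/(1 - q) for a few rational q != 1.
samples = [Fraction(1, 2), Fraction(-3, 4), Fraction(5), Fraction(0)]
ok_geometric = all(
    geometric_sum(q, n) == (1 - q**(n + 1)) / (1 - q)
    for q in samples
    for n in range(0, 30)
)
print(ok_arithmetic, ok_geometric)
```

Agreement on finitely many values is evidence, not proof; the induction argument is what covers every n.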
3.4. Prove or disprove. A common exercise (and a situation one often faces in research) is to
prove or disprove a particular statement. For example,
“Prove or disprove: for every natural number n, 4n + 1 is either a square or a sum of two
squares.”
At that point you are requested ﬁrst to form a hunch, a guess, an opinion about whether the
statement is true or false. To form that hunch you can try some examples (4 ∗ 0 + 1 = 1 = 1^2, 4 ∗ 1 + 1 = 5 = 1^2 + 2^2, 4 ∗ 2 + 1 = 9 = 3^2, 4 ∗ 3 + 1 = 13 = 2^2 + 3^2, ...) to see if the statement
holds for these examples, see if it is similar to other statements you know to hold true, or, when
at a loss, throw the dice. After deciding on your initial position, if you believe the statement you
should proceed to find a proof. If you don’t, then you have two options. You can try to show
that if the statement is true it will imply a contradiction to a known fact, or provide one
counterexample. The statement being false doesn’t mean it’s false for every n; it means it’s false
for at least one n. In the case in hand, if we take n = 5 we find that 4 ∗ 5 + 1 = 21 is neither a square nor
a sum of two squares (just try all possibilities) and so the statement is false.
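“Just try all possibilities” is a task well suited to a short program. The sketch below searches for the smallest n for which 4n + 1 is neither a square nor a sum of two positive squares. The function names and the search limit are hypothetical choices for this illustration; `math.isqrt` is the integer square root from the Python standard library (Python 3.8 or later).

```python
from math import isqrt

def is_square(m):
    """True if m is a perfect square (m >= 0)."""
    return isqrt(m) ** 2 == m

def is_sum_of_two_squares(m):
    """True if m = a^2 + b^2 for some positive integers a, b."""
    return any(is_square(m - a * a)
               for a in range(1, isqrt(m) + 1)
               if m - a * a > 0)

def first_counterexample(limit):
    """Smallest n < limit with 4n + 1 neither a square nor a sum of two squares, else None."""
    for n in range(limit):
        m = 4 * n + 1
        if not (is_square(m) or is_sum_of_two_squares(m)):
            return n
    return None

print(first_counterexample(100))
```

A single counterexample found this way disproves the statement; had the search come back empty, that would only have meant there is no counterexample below the chosen limit.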
3.5. The pigeonhole principle. The pigeonhole principle is deceptively simple, given how useful it is. It states the following:
If there are more pigeons than pigeonholes then in one hole there must be at least two pigeons.
Example 3.6. Let a, b, c, d, e, f be 6 integers. Then there are among them two integers whose
diﬀerence is divisible by 5.
To prove this consider the residues of a, b, c, d, e, f upon division by 5. The residue is either 0, 1, 2, 3 or 4 (these are the 5 pigeonholes) but we get from our numbers 6 residues (those are
the pigeons). Therefore, there are two numbers among a, b, c, d, e, f with the same residue. Say a
and b have the same residue r. Then a − b is divisible by 5 (because a = 5a′ + r, b = 5b′ + r
for some integers a′, b′, and so a − b = 5(a′ − b′)).
A similar example is the following:
Example 3.7. In any group of n ≥ 2 people, there are two that have the same number of friends
within the group.
We prove that by induction on n. The case n = 2 is trivial. Suppose the claim holds for
all n ≤ N and consider a group of N + 1 people. If there is a person with 0 friends, we can look
at the rest of the people. This is a group of N people and the number of friends each has in
this smaller group is the same as in the original group. We can apply induction to conclude that
two have the same number of friends (initially in the smaller group, but in fact also in the larger
group).
The other case is when each person in the group of N + 1 people has at least 1 friend. By
considering the number of friends each person in the group has, we get N + 1 numbers between 1
and N and so, by the pigeonhole principle, two must be equal.

4. Functions
There are more formal and less formal ways to deﬁne a function. Here we take the most pedestrian
approach. Let A and B be sets. A function f from A to B ,
f : A −→ B,
is a rule assigning to each element of A a unique element of B. The set A is called the source,
or the domain, of the function, and B the target of the function. For a ∈ A, f (a) is called the
image of a (under f ) and f (A) = {f (a) : a ∈ A} is the image of f .
Example 4.1. The simplest example is the identity function. Let A be any set and deﬁne
1_A : A → A to be the function sending each element to itself. Namely,
1_A(x) = x,
for any x ∈ A.
Example 4.2. Let A = {1, 2, 3}, B = {1, 2} and consider the following rules for f : A −→ B.
(1) f (1) = 2, f (2) = 1, f (3) = 1.
(2) f (1) = 1 or 2, f (2) = 2, f (3) = 1.
(3) f (1) = 1, f (2) = 1.
The ﬁrst recipe deﬁnes a function from A to B . The second recipe does not, because 1 is assigned
two possible values. The third also doesn’t deﬁne a function because no information is given
about f (3).
Example 4.3. Consider
(1) f : R → R, f(x) = √x.
(2) f : R_{≥0} → R, f(x) = y, where y is a real number such that y^2 = x.
(3) f : R_{≥0} → R, f(x) = the nonnegative square root of x.
The first definition fails because −1 doesn’t have a real square root. The second definition fails because
every positive number has 2 square roots (differing by a sign) and it isn’t clear which root one is supposed
to take. This problem also exists in the first definition. The third definition does define a function.
There are various ways to deﬁne a function. It could be done by writing down f (a) for every a ∈ A
explicitly. It could be done by providing a formula and it could be done by giving some other
description. For example, A = B is the set of all people who ever lived, f : A → A is given by
f(a) = a’s mother.
(This definitely looks like a good definition at first sight. However, the astute reader will note
the problem here. If this function were truly well-defined then the set A would have to be infinite, because
if it were finite we would have a person who is their own descendant. Since a mother is older than
any of her children by at least one year (to be safe) it follows, if A and f are well-defined, that
people have existed forever. The ways to resolve the paradox are rather amazing and I leave it
to you as an amusing exercise.)
Here is some more notation: the symbol ∀ means “for all”. The symbol ∃ means “exists”. The
symbol ∃! means “exists unique”. A function can also be described as a set
Γ ⊂ A × B,
with the following property: ∀a ∈ A, ∃!b ∈ B such that (a, b) ∈ Γ. We then define f(a) to be the
unique b such that (a, b) ∈ Γ. Conversely, given a function f we let
Γ = Γ_f = {(a, f(a)) : a ∈ A}.
The set Γ_f is called the graph of f.
Example 4.4. Let A be a set and Γ ⊂ A × A the “diagonal”,
Γ = {(x, x) : x ∈ A}.
The function defined by Γ is 1_A.
Example 4.5. Let f_1, f_2 : A → B be functions. We say that f_1 = f_2 if for every a ∈ A we
have f_1(a) = f_2(a). Equivalently (exercise!) if Γ_{f_1} = Γ_{f_2}.
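The description of a function as a set Γ ⊂ A × B with the “exists unique” property translates directly into a test for finite sets. The sketch below applies it to the three recipes of Example 4.2, each written as a set of pairs. The function name `is_function_graph` and the way recipe (2) is encoded (both candidate values for 1 are listed as pairs) are assumptions of this illustration.

```python
def is_function_graph(gamma, A, B):
    """True if gamma is a subset of A × B and every a in A has exactly one b with (a, b) in gamma."""
    if not all(a in A and b in B for a, b in gamma):
        return False
    return all(sum(1 for (x, _) in gamma if x == a) == 1 for a in A)

A = {1, 2, 3}
B = {1, 2}

rule1 = {(1, 2), (2, 1), (3, 1)}           # f(1) = 2, f(2) = 1, f(3) = 1
rule2 = {(1, 1), (1, 2), (2, 2), (3, 1)}   # f(1) = 1 or 2: two values for 1
rule3 = {(1, 1), (2, 1)}                   # no value given for f(3)

print([is_function_graph(g, A, B) for g in (rule1, rule2, rule3)])
```

The second recipe fails the uniqueness part of ∃!, the third fails the existence part.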
4.1. Injective, surjective, bijective, inverse image. We introduce some attributes of functions. Let
f : A → B
be a function. Then:
(1) f is called injective if f(a) = f(a′) ⇒ a = a′. (I.e., different elements of A go to different
elements of B.) Such a function is also called one-one.
(2) f is called surjective (or onto) if ∀b ∈ B, ∃a ∈ A such that f(a) = b. (I.e., every element
in the target is the image of some element in the source.)
(3) f is called bijective if it is both injective and surjective. In that case, every element of B
is the image of a unique element of A.
Let f : A → B be a function. Let U ⊂ B. We define the preimage of U to be the set
f^{−1}(U) = {a : a ∈ A, f(a) ∈ U}.
If U consists of a single element, U = {u}, we also write f^{−1}(u) and call it the fibre of f over u.
Example 4.6.
(1) f : R → R, f(x) = x^2. Then f is neither surjective (a square is always
nonnegative) nor injective (f(x) = f(−x)).
We have f^{−1}([1, 4]) = [1, 2] ∪ [−2, −1] and f^{−1}(0) = {0}, f^{−1}(−1) = ∅.
(2) f : R → R_{≥0}, f(x) = x^2. Then f is surjective but not injective.
(3) f : R_{≥0} → R_{≥0}, f(x) = x^2. Then f is bijective.
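On finite sets these attributes can be tested by direct enumeration. The sketch below mimics Example 4.6 with the squaring map, replacing R by a small range of integers; that replacement, like the helper names, is an assumption made for the illustration, since a program cannot enumerate the real numbers.

```python
def is_injective(f, A):
    """True if different elements of A have different images."""
    images = [f(a) for a in A]
    return len(set(images)) == len(images)

def is_surjective(f, A, B):
    """True if the image f(A) is all of B."""
    return {f(a) for a in A} == set(B)

def preimage(f, A, U):
    """The set of a in A with f(a) in U."""
    return {a for a in A if f(a) in U}

def square(x):
    return x * x

A = range(-3, 4)          # finite stand-in for the source
B = {0, 1, 4, 9}          # exactly the squares of elements of A
A_nonneg = range(0, 4)    # the nonnegative part of A

print(is_injective(square, A), is_surjective(square, A, B))
print(is_injective(square, A_nonneg), is_surjective(square, A_nonneg, B))
print(preimage(square, A, {1, 4}), preimage(square, A, {-1}))
```

Restricting the source to nonnegative numbers is what turns the squaring map from merely surjective into bijective, as in part (3) of the example.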
4.2. Composition of functions. Let
f : A → B, g : B → C,
be functions. We define their composition, g ◦ f, to be the function:
g ◦ f : A → C, (g ◦ f)(x) = g(f(x)).
Lemma 4.7. We have the following properties:
(1) If g ◦ f is injective then f is injective.
(2) If g ◦ f is surjective then g is surjective.
Proof. Suppose that g ◦ f is injective. Let a, a′ ∈ A be elements such that f(a) = f(a′). We
need to show that a = a′. We have g(f(a)) = g(f(a′)) or, otherwise said, (g ◦ f)(a) = (g ◦ f)(a′).
Since g ◦ f is injective, a = a′.
Suppose now that g ◦ f is surjective. Let c ∈ C . We need to show that there is an element b ∈ B
with g (b) = c. Since g ◦ f is surjective, there is a ∈ A such that (g ◦ f )(a) = c. Let b = f (a)
then g (b) = g (f (a)) = (g ◦ f )(a) = c.
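For small finite sets, Lemma 4.7 can be confirmed by listing every pair of functions f : A → B, g : B → C and testing both implications. The sketch below does this with functions stored as Python dictionaries; the particular sets A, B, C and all helper names are choices made for this illustration.

```python
from itertools import product

def all_functions(source, target):
    """Yield every function source -> target as a dict."""
    source = list(source)
    for values in product(target, repeat=len(source)):
        yield dict(zip(source, values))

def injective(f):
    return len(set(f.values())) == len(f)

def surjective(f, target):
    return set(f.values()) == set(target)

A, B, C = [0, 1], [0, 1, 2], [0, 1]
lemma_holds = True
for f in all_functions(A, B):
    for g in all_functions(B, C):
        gf = {a: g[f[a]] for a in A}   # the composition g ∘ f
        if injective(gf) and not injective(f):
            lemma_holds = False
        if surjective(gf, C) and not surjective(g, C):
            lemma_holds = False
print(lemma_holds)
```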
4.3. The inverse function. Let f : A → B be a bijective function. In this case we can define
the inverse function
f^{−1} : B → A,
by the property
f^{−1}(a) = b if f(b) = a
(here a ∈ B and b ∈ A). This is well defined: since f is surjective such b exists for every a and is unique (because f is
injective). Thus f^{−1} is a function. It is easy to verify that
f^{−1} ◦ f = 1_A, f ◦ f^{−1} = 1_B.

5. Cardinality of a set
Imagine a group of students about to enter a lecture hall. The instructor wants to know if
there are suﬃcient chairs for all the students. There are two ways to do that. One is to count
both students and chairs separately and determine which is larger. The other is to ask each
student to take a seat. If there are students left standing, the number of chairs is too small. If
there are chairs left unoccupied, there are more chairs than students. In the remaining case there
is a perfect match between students and chairs and so their number (cardinality) is equal.
This idea proves very powerful in discussing the cardinality (“size”, “magnitude”, “number of
elements”) of sets, ﬁnite or inﬁnite.
Georg Cantor revolutionized mathematics, and human thought, by defining two sets A, B,
possibly infinite, to be of equal cardinality, denoted |A| = |B|, if there is a bijective function
f : A → B.
(Note that then there is an inverse function f^{−1} : B → A which is also a bijection, so it doesn’t
matter if we require a bijection from A to B or from B to A.) He defined the cardinality of A to
be no larger than B’s if there is an injective function
f : A → B,
and this is denoted |A| ≤ |B|. We also say that the cardinality of A is smaller than that of B, denoted |A| < |B|, if |A| ≤ |B|
and |A| ≠ |B|. As a sanity check we’d like to know that
Proposition 5.1. Let A, B, C be sets. If |A| = |B| and |B| = |C| then |A| = |C|.
Proof. Let f : A → B, g : B → C be bijections. Then
g◦f :A→C
is also a bijection. Indeed: if for some x, y ∈ A we have (g ◦ f )(x) = (g ◦ f )(y ) then g (f (x)) =
g (f (y )). Since g is injective, f (x) = f (y ) and, since f is injective, x = y .
To show g ◦ f is surjective, let c ∈ C and choose b ∈ B such that g (b) = c; such b exists since g
is surjective. Since f is surjective, there is an a ∈ A such that f (a) = b. Then (g ◦ f )(a) =
g (f (a)) = g (b) = c.
For the relation ≤ on cardinalities to make sense at all, we need to know the following.
Theorem 5.2 (Cantor-Bernstein). If |A| ≤ |B| and |B| ≤ |A| then |A| = |B|.
Although it does not require any sophisticated mathematics, the proof will not be given in this
course. It is an ingenious and intricate proof. We would like to explain though why the theorem
is not obvious.
What we are given is that there is some injective function f from A to B and some injective function g from B to A. We should conclude from that that there is some bijective function h from A to B. It is not true that h = f or g. One should somehow construct it from f and g. Here is an example: let A be the set of points in the plane at distance at most 1 from the origin (the closed unit disk) and B the square [−1, 1] × [−1, 1]. The function
f : A → B, f (x) = x,
is a well-defined injective function but not a bijection. The function
g : B → A, g (b) = b/√2,
is also a well-defined injective function, but not a bijection. One can find a bijection from A to B, but it is not immediately clear how to find it based on the knowledge of f and g (and in fact in this particular example, it is better to "rethink the situation" than to deduce it from f and g).
A set A is called countable (or enumerable ) if it has the same cardinality as N. In this case, there
is a bijective function f : N → A. The elements of A are thus {f (0), f (1), f (2), f (3), . . . }. If we introduce the notation ai = f (i) then we can also enumerate the elements of A as {a0 , a1 , a2 , a3 , . . . }
and this explains the terminology.
Example 5.3. Let A be the set {0, 2, 4, 6, . . . } and B the set {1, 4, 9, 16, 25, . . . }. Then
|N| = |A| = |B|.
Indeed, one verifies that the functions
f : N → A, f (x) = 2x, and g : N → B, g (x) = (x + 1)² (recall that 0 ∈ N),
are bijections. We can then conclude that |A| = |B| (though it is not hard either to write a bijection from A to B).
Example 5.4. The cardinality of N is the cardinality of A = {x ∈ N : x is not a square}.
Instead of trying to write a bijection explicitly, which is not that straightforward, we use the Cantor-Bernstein theorem. The function
f : A → N, f (a) = a,
is injective. Thus, |A| ≤ |N|. Consider the function
g : N → A, g (x) = 4x + 3.
First, this is well defined. Namely, 4x + 3 is really in A. This is because if n is a square, n leaves residue 1 or 0 when divided by 4 (if n = (2m)² = 4m² the residue is zero; if n = (2m + 1)² = 4(m² + m) + 1 the residue is one). But 4x + 3 leaves residue 3. Clearly g is injective and so |N| ≤ |A|.
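The residue argument in Example 5.4 can be spot-checked numerically. The following is a minimal Python sketch, not part of the original notes; the names `is_square` and `into_nonsquares` are hypothetical, and a finite check of course proves nothing beyond the range tested.

```python
import math


def is_square(n: int) -> bool:
    """Return True when the non-negative integer n is a perfect square."""
    r = math.isqrt(n)
    return r * r == n


def into_nonsquares(x: int) -> int:
    """The map x -> 4x + 3 from N into the set A of non-squares."""
    return 4 * x + 3


# Squares leave residue 0 or 1 when divided by 4 ...
square_residues = {(k * k) % 4 for k in range(1000)}
# ... so numbers of the form 4x + 3 are never squares.
values = [into_nonsquares(x) for x in range(1000)]
lands_in_A = all(not is_square(v) for v in values)
injective_here = len(set(values)) == len(values)
print(square_residues, lands_in_A, injective_here)
```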
Proposition 5.5. |N| = |Z|.
Proof. We define f : Z → N by
f (x) = 2x if x ≥ 0, and f (x) = −2x − 1 if x < 0.
Then f is a bijective function, as is easy to check.
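The "easy to check" step in Proposition 5.5 can be illustrated by writing the map and a candidate inverse explicitly. This is only a sketch in Python; the names `z_to_n` and `n_to_z` are hypothetical, and checking a finite window illustrates the bijection rather than proving it.

```python
def z_to_n(x: int) -> int:
    """Non-negative integers go to even numbers, negative ones to odd numbers."""
    return 2 * x if x >= 0 else -2 * x - 1


def n_to_z(n: int) -> int:
    """Candidate inverse: undo the even/odd encoding."""
    return n // 2 if n % 2 == 0 else -(n + 1) // 2


# The integers 0, -1, 1, -2, 2, ... are listed as 0, 1, 2, 3, 4, ...
print([z_to_n(x) for x in (0, -1, 1, -2, 2)])
# On a finite window the two maps undo each other.
print(all(n_to_z(z_to_n(x)) == x for x in range(-50, 51)))
```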
Proposition 5.6. |N| = |N × N|.
Proof. Define f : N → N × N by f (n) = (n, 0). This is an injective function. Define
g : N × N → N, g (n, m) = 2^n 3^m.
This is also an injective function: if 2^n 3^m = 2^a 3^b then n = a, m = b by unique factorization (to be discussed in § 10). By the Cantor-Bernstein theorem we conclude that |N| = |N × N|.
Corollary 5.7. |Z| = |Z × Z|.
Proof. Let h : N → N × N be a bijection. A bijection f : N → Z induces a bijection g = (f, f ) : N × N → Z × Z and the composition
Z → N → N × N → Z × Z,
given by f −1 , then h, then g, is also a bijection.
Exercise 5.8. Prove that |N| = |Q|. (Hint: there's an easy injection Q → Z × Z).
When Cantor laid down the foundations for the study of infinite cardinals he also dropped a bombshell:
Theorem 5.9. |N| ≠ |R|.
The argument in the proof became known as Cantor's diagonal argument and is used in many proofs.
Proof. Suppose that |N| = |R|. We can then enumerate the real numbers as a_0, a_1, a_2, . . . . Let us write the decimal expansion of each number as
a_0 = b^0_1 . . . b^0_{n(0)} . c^0_0 c^0_1 c^0_2 c^0_3 . . .
a_1 = b^1_1 . . . b^1_{n(1)} . c^1_0 c^1_1 c^1_2 c^1_3 . . .
a_2 = b^2_1 . . . b^2_{n(2)} . c^2_0 c^2_1 c^2_2 c^2_3 . . .
a_3 = b^3_1 . . . b^3_{n(3)} . c^3_0 c^3_1 c^3_2 c^3_3 . . .
where we agree to use 000000000 . . . instead of 999999999 . . . (so we write 1.00000000000 . . . and not 0.9999999999 . . . , etc.).
Now consider the number
0.e_0 e_1 e_2 e_3 e_4 . . . , where e_i = 3 if c^i_i ≠ 3, and e_i = 4 if c^i_i = 3.
This is a real number that differs from each a_i at the (i + 1)-th digit after the decimal dot and hence is not equal to any a_i. It follows that the list a_0, a_1, a_2, . . . cannot consist of all real numbers and so we arrive at a contradiction.
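The diagonal construction in the proof can be imitated on a finite list. The sketch below is an illustration only: it works with truncated digit strings, and the name `diagonal` and the sample rows are hypothetical, not taken from the notes.

```python
def diagonal(expansions: list[str]) -> str:
    """Given the digits after the decimal point of a_0, a_1, ..., return the
    digits e_0 e_1 ... with e_i = 3 if the i-th digit of a_i is not 3, else 4."""
    digits = []
    for i, row in enumerate(expansions):
        digits.append("4" if row[i] == "3" else "3")
    return "".join(digits)


# Four made-up expansions, each truncated to four digits.
rows = ["3333", "5000", "1428", "7183"]
d = diagonal(rows)
print("0." + d)  # 0.4334
# The new number differs from row i in position i, so it is not on the list.
print(all(d[i] != rows[i][i] for i in range(len(rows))))
```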
6. Number systems
Again we start with an apology of a sort. The formal discussion of number systems is a rather
involved piece of mathematics. Our approach is pragmatic. We assume that at some level we
all know what are integers and real numbers and no confusion shall arise there. We use those to
deﬁne more complicated notions.
As we have already said, we denote the natural numbers by
N = {0, 1, 2, . . . }, N+ = {1, 2, 3, . . . }.
We also denote the integers by
Z = {. . . , −2, −1, 0, 1, 2, . . . }.
The rational numbers are the set
Q = {a/b : a, b ∈ Z, b ≠ 0}.
The real numbers R are the "points on the line". Each real number has a decimal expansion such
as 0.19874526348 . . . that may or may not repeat itself from some point on. For example:
1/3 = 0.3333333 . . . , 1/2 = 0.5000000 . . . , 1/7 = 0.142857142857142857142857142857 . . .
It is a fact that a number is rational if and only if from some point on its decimal expansion
becomes periodic.
The complex numbers are deﬁned as the set
C = {a + bi : a, b ∈ R}.
Here i is a formal symbol. We can equally describe the complex numbers as points (a, b) ∈ R², the plane. The function f : C → R², f (a + bi) = (a, b) is bijective. The x-axis is now called the real axis and the y-axis the imaginary axis. If z = a + bi is a complex number, a is called the real
part of z and is denoted Re(z ) and b is called the imaginary part of z and is denoted Im(z ). We
therefore have
z = Re(z ) + Im(z )i.
The point corresponding to z in the plane model is (Re(z ), Im(z )).
One can perform arithmetic operations with the complex numbers using the following deﬁnitions:
−(a + bi) = −a − bi, (a + bi) + (c + di) = (a + c) + (b + d)i.
Up to this point things look nice in the plane model as well:
−(a, b) = (−a, −b), (a, b) + (c, d) = (a + c, b + d).
(This is the addition of vectors.) The key point is that we can also define multiplication. The definition doesn't have any prior interpretation in the model of the plane. We let
(a + bi)(c + di) = ac − bd + (ad + bc)i.
In particular,
i² = −1.
This shows that we have really gone beyond the realm of real numbers because there is no real number whose square is −1 (or just negative, for that matter). The operations described above satisfy the usual rules of arithmetic, such as
(z + z′) + z″ = z + (z′ + z″), z(z′ + z″) = zz′ + zz″, . . .
(We shall later say that the complex numbers form a field.)
Let z = a + bi be a complex number. We define the complex conjugate of z, denoted $\bar{z}$, as follows:
$\bar{z} = a - bi$.
Lemma 6.1. The complex conjugate has the following properties:
(1) $\bar{\bar{z}} = z$.
(2) $\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2$, $\overline{z_1 \cdot z_2} = \bar{z}_1 \cdot \bar{z}_2$.
(3) $\mathrm{Re}(z) = \frac{z + \bar{z}}{2}$, $\mathrm{Im}(z)\, i = \frac{z - \bar{z}}{2}$.
(4) Define for z = a + bi,
$|z| = \sqrt{a^2 + b^2}$.
(This is just the distance of the point (a, b) from the origin.) Then $|z|^2 = z \cdot \bar{z}$ and the following holds:
$|z_1 + z_2| \le |z_1| + |z_2|$, $|z_1 \cdot z_2| = |z_1| \cdot |z_2|$.
Proof. Denote $z = z_1 = a + bi$, $z_2 = c + di$.
We have $\bar{z} = a - bi$ and so $\bar{\bar{z}} = a + bi = z$. That is (1). For (2) we calculate
$\overline{z_1 + z_2} = \overline{(a + c) + (b + d)i}$
$= (a + c) - (b + d)i$
$= (a - bi) + (c - di)$
$= \overline{a + bi} + \overline{c + di}$
$= \bar{z}_1 + \bar{z}_2$.
Similarly,
$\overline{z_1 z_2} = \overline{(ac - bd) + (ad + bc)i}$
$= (ac - bd) - (ad + bc)i$
$= (a - bi)(c - di)$
$= \overline{a + bi} \cdot \overline{c + di}$
$= \bar{z}_1 \cdot \bar{z}_2$.
We have $(z + \bar{z})/2 = ((a + bi) + (a - bi))/2 = a = \mathrm{Re}(z)$ and $(z - \bar{z})/2 = ((a + bi) - (a - bi))/2 = bi = \mathrm{Im}(z)\, i$, which is (3). Next, $|z|^2 = a^2 + b^2 = (a + bi)(a - bi) = z \cdot \bar{z}$. Now,
$|z_1 z_2|^2 = z_1 z_2 \cdot \overline{z_1 z_2} = z_1 z_2 \cdot \bar{z}_1 \cdot \bar{z}_2 = z_1 \cdot \bar{z}_1 \cdot z_2 \cdot \bar{z}_2 = |z_1|^2 \cdot |z_2|^2$.
Thus, the assertion $|z_1 \cdot z_2| = |z_1| \cdot |z_2|$ follows by taking roots. The inequality $|z_1 + z_2| \le |z_1| + |z_2|$ viewed in the plane model for complex numbers is precisely the assertion that the sum of the lengths of two sides of a triangle is greater than or equal to the length of the third side.
Example 6.2. If z ≠ 0 then z has an inverse. Indeed, $z \cdot \frac{\bar{z}}{|z|^2} = \frac{z \cdot \bar{z}}{|z|^2} = 1$. We write
$z^{-1} = \frac{\bar{z}}{|z|^2}$.
Just to illustrate we calculate the expression 1 + 2i + (3 − i)/(1 + 5i). We have 1/(1 + 5i) = (1 − 5i)/26 and so (3 − i)/(1 + 5i) = (3 − i)(1 − 5i)/26 = −1/13 − (8/13)i and thus 1 + 2i + (3 − i)/(1 + 5i) = 12/13 + (18/13)i.
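The calculation in Example 6.2 can be reproduced with exact rational arithmetic. The following Python sketch represents a + bi as a pair (a, b) of fractions; the helper names `c_add`, `c_mul` and `c_inv` are hypothetical and simply transcribe the definitions of addition, multiplication and the inverse given in the text.

```python
from fractions import Fraction


def c_add(z, w):
    """(a + bi) + (c + di) = (a + c) + (b + d)i."""
    return (z[0] + w[0], z[1] + w[1])


def c_mul(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


def c_inv(z):
    """Inverse of z != 0: the conjugate of z divided by |z|^2."""
    a, b = z
    norm_sq = a * a + b * b
    return (a / norm_sq, -b / norm_sq)


three_minus_i = (Fraction(3), Fraction(-1))
one_plus_5i = (Fraction(1), Fraction(5))
quotient = c_mul(three_minus_i, c_inv(one_plus_5i))
total = c_add((Fraction(1), Fraction(2)), quotient)
print(quotient)  # (Fraction(-1, 13), Fraction(-8, 13))
print(total)     # (Fraction(12, 13), Fraction(18, 13))
```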
6.1. The polar representation. Considering z = a + bi in the plane model as the vector (a, b)
we see that we can describe each complex number by the length r of the vector and the angle θ it forms with the real axis. We have
r = |z|, sin θ = Im(z)/|z|, cos θ = Re(z)/|z|.
Lemma 6.3. If z1 has parameters r1, θ1 and z2 has parameters r2, θ2 then z1 z2 has parameters r1 r2, θ1 + θ2 (up to multiples of 360°).
Proof. We have r1 r2 = |z1||z2| = |z1 z2| and this shows that r1 r2 is the length of z1 z2. Let θ be the angle of z1 z2; then
sin θ = Im(z1 z2)/|z1 z2|
= (Re(z1)Im(z2) + Re(z2)Im(z1))/(|z1||z2|)
= (Re(z1)/|z1|)(Im(z2)/|z2|) + (Re(z2)/|z2|)(Im(z1)/|z1|)
= cos θ1 sin θ2 + cos θ2 sin θ1
= sin(θ1 + θ2).
Similarly, we get
cos θ = Re(z1 z2)/|z1 z2|
= (Re(z1)/|z1|)(Re(z2)/|z2|) − (Im(z1)/|z1|)(Im(z2)/|z2|)
= cos(θ1) cos(θ2) − sin(θ1) sin(θ2)
= cos(θ1 + θ2).
It follows that θ = θ1 + θ2 up to multiples of 360°.
Let θ be any real number. Let e^{iθ} denote the unit vector whose angle is θ. That is, the complex number with length 1 and angle θ. Clearly we have
e^{iθ} = cos θ + i sin θ.
If z is any complex number with length r and angle θ then we have the equality z = |z| e^{iθ}. The formula we have proven is
z1 z2 = |z1||z2| e^{i(θ1 + θ2)}
and in particular
e^{iθ1} e^{iθ2} = e^{i(θ1 + θ2)}.
(Conversely, the identity e^{iθ1} e^{iθ2} = e^{i(θ1 + θ2)} implies the formula for multiplication.) Let z = a + bi be a complex number; then we define
e^z = e^a e^{ib},
where e^a is the usual exponential and e^{ib} is as defined above. We then have
e^{z1} e^{z2} = e^{z1 + z2}.
We have defined here e^z in a purely formal way. One can show that for every complex number z the series
1 + z + z^2/2! + z^3/3! + · · · + z^n/n! + . . .
converges and is equal to e^z.
Example 6.4. Consider the equation x^n − a = 0, where a is a nonzero complex number. We claim that this has n distinct roots in C. Write a = re^{iθ} (so |a| = r and the line from 0 to a forms an angle θ with the real axis). A complex number z = Re^{iΘ} is a solution to the equation if and only if z^n = R^n e^{inΘ} = re^{iθ}. That is, if and only if
R^n = r, nΘ ≡ θ (mod 2π).
Thus, the solutions are exactly
z = r^{1/n} e^{i(θ + 2πj)/n}, j = 0, 1, . . . , n − 1.
In particular, taking a = 1 the solutions are called the roots of unity of order n. There are precisely n of them. Those are the points on the unit circle having angle 0, (1/n) · 2π, (2/n) · 2π, . . . , ((n − 1)/n) · 2π.
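The description of the n-th roots in Example 6.4 translates directly into a short computation. The sketch below uses Python's standard `cmath` module and floating-point arithmetic, so equalities hold only up to rounding; the function name `nth_roots` is hypothetical.

```python
import cmath


def nth_roots(a: complex, n: int) -> list[complex]:
    """Solutions of z**n = a for a != 0: length r**(1/n), angles (theta + 2*pi*j)/n."""
    r, theta = cmath.polar(a)
    return [
        cmath.rect(r ** (1 / n), (theta + 2 * cmath.pi * j) / n)
        for j in range(n)
    ]


# The roots of unity of order 4 are (approximately) 1, i, -1, -i.
roots_of_unity = nth_roots(1, 4)
print(roots_of_unity)
# Each computed root of z**5 = 2 + 3i satisfies the equation up to rounding error.
print(all(abs(z ** 5 - (2 + 3j)) < 1e-9 for z in nth_roots(2 + 3j, 5)))
```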
6.2. The Fundamental Theorem of Algebra. A complex polynomial f (x) is an expression
of the form a_n x^n + a_{n−1} x^{n−1} + · · · + a_1 x + a_0, where n is a nonnegative integer, x is a variable, and the coefficients a_i are complex numbers. (If all the coefficients are real we may call it a real polynomial; if all the coefficients are rational numbers we may call it a rational polynomial and so on. But note that x² + 1 is a rational, a real and a complex polynomial.) The zero polynomial, denoted 0, is the case when n = 0 and a_0 = 0.
A polynomial defines a function
f : C → C, z ↦ f (z) = a_n z^n + a_{n−1} z^{n−1} + · · · + a_1 z + a_0.
(The notation ↦ means "maps to".) If a_n ≠ 0 then we say f has degree n. If f (z) = 0 we say
that the complex number z is a root (or a solution, or a zero ) of the polynomial f .
Example 6.5. Consider the polynomial f (x) = x² + 1. It has degree 2 and f (i) = i² + 1 = −1 + 1 = 0, f (−i) = (−i)² + 1 = −1 + 1 = 0. So i and −i are roots of f . (This is a special case of Example 6.4.)
Theorem 6.6. (The Fundamental Theorem of Algebra) Let f (x) be a complex polynomial of
degree at least 1. Then f (x) has a root in C.
The proof of the theorem is beyond the scope of this course. It has many proofs. In Honours
Algebra 4 MATH371 one sees an algebraic proof; in Complex Variables and Transforms MATH
381 one sees an analytic proof.
Proposition 6.7. Let f (x) = a_n x^n + a_{n−1} x^{n−1} + · · · + a_1 x + a_0 be a complex nonzero polynomial of degree n. Then
f (x) = a_n ∏_{i=1}^{n} (x − z_i),
for suitable complex numbers z_i, not necessarily distinct. The numbers z_i are all roots of f and any root of f is equal to some z_i. Moreover, this factorization is unique.
Proof. We prove the result by induction on n. For n = 0 we understand the product ∏_{i=1}^{n} (x − z_i) as one (this is a convention: the empty product is equal to one, the empty sum is equal to zero) and so the claim is just that a constant polynomial is equal to its leading coefficient. Clear.
Now, assume that f has degree at least one. By the Fundamental Theorem of Algebra there is a complex number, z_n say, such that f (z_n) = 0. We claim that for every complex number z we can write
f (x) = (x − z)g(x) + r,
where g(x) is a polynomial of degree n − 1 with leading coefficient a_n and r is a complex number. Indeed, write g(x) = b_{n−1} x^{n−1} + · · · + b_1 x + b_0 and equate coefficients in (x − z)g(x) = b_{n−1} x^n + (b_{n−2} − z b_{n−1}) x^{n−1} + · · · + (b_0 − z b_1) x − z b_0 and f (x). We want complex numbers b_0, . . . , b_{n−1} such that
b_{n−1} = a_n, b_{n−2} − z b_{n−1} = a_{n−1}, . . . , b_0 − z b_1 = a_1,
and there is no problem solving these equations one after the other. Thus, we can choose g(x) with leading coefficient a_n such that f (x) − (x − z)g(x) = r is a constant.
Now, apply that for z = z_n. We have f (x) − (x − z_n)g(x) = r. We view r as a polynomial and substitute x = z_n. We get
f (z_n) − (z_n − z_n)g(z_n) = r.
Since f (z_n) = 0 we conclude that r = 0.
We showed that if f (z_n) = 0 then f (x) = (x − z_n)g(x), g(x) = b_{n−1} x^{n−1} + · · · + b_0. In fact, b_{n−1} = a_n. Using the induction hypothesis, we have
g(x) = a_n ∏_{i=1}^{n−1} (x − z_i),
for some complex numbers z_i and so
f (x) = a_n ∏_{i=1}^{n} (x − z_i).
We note that f (z_j) = a_n ∏_{i=1}^{n} (z_j − z_i) = 0, because the product contains the term (z_j − z_j).
If f (z) = 0 then a_n ∏_{i=1}^{n} (z − z_i) = 0. But if a product of complex numbers is zero, one of the numbers is already zero. Since a_n ≠ 0, we must have z = z_i for some i.
It remains to prove the uniqueness of the factorization. Suppose that
f (x) = a_n ∏_{i=1}^{n} (x − z_i) = a ∏_{i=1}^{n} (x − t_i).
Since the leading coefficient of f is a_n we must have a = a_n. We now argue by induction. The case of degree 0 is clear. Assume f has degree greater than zero. Then the t_i are roots of f and so t_1 is equal to some z_i. But we may reindex the z_i so that t_1 = z_1. Dividing both sides by x − z_1 we then conclude that²
a_n ∏_{i=2}^{n} (x − z_i) = a_n ∏_{i=2}^{n} (x − t_i),
and, by induction, z_i = t_i for all i.
We remark that for n = 1, 2 the result is well known:
ax + b = a (x − (−b)/a),
ax² + bx + c = a (x − (−b + √(b² − 4ac))/(2a)) (x − (−b − √(b² − 4ac))/(2a)).
There are also formulas for the roots of polynomials of degree 3 and 4, but in degrees 5 and higher no such general formulas in terms of radicals exist. This follows from Galois theory, taught in MATH371.
7. Fields and rings - definitions and first examples
In the examples we have already discussed, or the deﬁnitions we have made, there are implicit
structures that we want to deﬁne now in a formal way. At this point we just provide the deﬁnitions
and reconsider previous examples. Later we shall enter a systematic development of the theory.
An operation (more pedantically called a “binary operation”) on a set R is a function
w : R × R → R.
That is, it is a rule taking two elements of R and returning a new one. For example:
w : C × C → C, w(z1 , z2 ) = z1 + z2 ,
or
w : C × C → C, w(z1 , z2 ) = z1 z2 .
Often, for a general set R we may denote w(z1 , z2 ) by z1 + z2 , or z1 z2 , if we want to stress the fact that the operation behaves like addition, or multiplication.
² We say that f (x)/g(x) = h(x) if h(x) is a polynomial such that f (x) = g(x)h(x). We shall see later that h(x) is uniquely determined. In our case clearly f (x)/(x − z_1) = a_n ∏_{i=2}^{n} (x − z_i).
Definition 7.1. A ring R is a nonempty set together with two operations, called “addition” and “multiplication” that are denoted, respectively, by
(x, y) ↦ x + y, (x, y) ↦ xy.
One requires the following axioms to hold:
(1) x + y = y + x, ∀x, y ∈ R. (Commutativity of addition)
(2) (x + y ) + z = x + (y + z ), ∀x, y, z ∈ R. (Associativity of addition)
(3) There exists an element in R, denoted 0, such that 0 + x = x, ∀x ∈ R. (Neutral element
for addition)
(4) ∀x ∈ R, ∃y ∈ R such that x + y = 0. (Inverse with respect to addition)
(5) (xy )z = x(yz ), ∀x, y, z ∈ R. (Associativity of multiplication)
(6) There exists an element 1 ∈ R such that 1x = x1 = x, ∀x ∈ R. (Neutral element for
multiplication)
(7) z (x + y ) = zx + zy, (x + y )z = xz + yz, ∀x, y, z ∈ R. (Distributivity)
We remark that for us a ring always has an identity element with respect to multiplication. Not
all authors insist on that, though.
Note that the multiplication is not assumed to be commutative.
Definition 7.2. If xy = yx for all x, y ∈ R, we say R is a commutative ring. If for every nonzero x ∈ R there is an element y ∈ R such that xy = yx = 1, and also 0 ≠ 1 in R, we call R a division ring. A commutative division ring is called a field.
Example 7.3. Z is a commutative ring. It is not a division ring and so it is not a ﬁeld.
Example 7.4. The rational numbers Q form a ﬁeld. The real numbers R form a ﬁeld. In both
those cases we are forced to assume the properties as “well known”. The complex numbers also
form a ﬁeld, in fact we have at some level already used all the axioms implicitly in our calculations,
but now we prove it formally using that R is a ﬁeld.
Proposition 7.5. C is a ﬁeld.
Proof. Let z1 = a1 + b1 i, z2 = a2 + b2 i, z3 = a3 + b3 i. We verify the axioms:
1. z1 + z2 = (a1 + a2 ) + (b1 + b2 )i = (a2 + a1 ) + (b2 + b1 )i = z2 + z1 .
2. (z1 + z2 ) + z3 = [(a1 + a2 ) + (b1 + b2 )i] + a3 + b3 i = [(a1 + a2 ) + a3 ] + [(b1 + b2 ) + b3 ]i =
[a1 + (a2 + a3 )] + [b1 + (b2 + b3 )]i = z1 + [(a2 + a3 ) + (b2 + b3 )i] = z1 + (z2 + z3 ).
3. Clearly 0 + z1 = z1 .
4. We have (−a1 − b1 i) + (a1 + b1 i) = (−a1 + a1 ) + (−b1 + b1 )i = 0 + 0i = 0.
5. (z1 z2 )z3 = [(a1 + b1 i)(a2 + b2 i)](a3 + b3 i) = ((a1 a2 − b1 b2 ) + (a1 b2 + b1 a2 )i)(a3 + b3 i) =
(a1 a2 − b1 b2 )a3 − (a1 b2 + b1 a2 )b3 + ((a1 a2 − b1 b2 )b3 + (a1 b2 + b1 a2 )a3 )i = a1 a2 a3 − b1 b2 a3 − a1 b2 b3 −
b1 a2 b3 + (a1 a2 b3 − b1 b2 b3 + a1 b2 a3 + b1 a2 a3 )i. One now develops the product z1 (z2 z3 ) in the same
way and checks that the answers match. We don’t do that here.
6. Clearly 1 · z1 = z1 · 1 = z1 .
7. z1 (z2 + z3 ) = (a1 + b1 i)((a2 + a3 ) + (b2 + b3 )i) = a1 (a2 + a3 ) − b1 (b2 + b3 ) + (b1 (a2 + a3 ) + a1 (b2 + b3 ))i = (a1 a2 − b1 b2 ) + (b1 a2 + a1 b2 )i + (a1 a3 − b1 b3 ) + (b1 a3 + a1 b3 )i = (a1 + b1 i)(a2 + b2 i) + (a1 + b1 i)(a3 + b3 i) = z1 z2 + z1 z3 .
Before proving the next property, we check that z1 z2 = z2 z1 . We have z1 z2 = (a1 + b1 i) (a2 +
b2 i) = a1 a2 − b1 b2 + (a1 b2 + b1 a2 )i = a2 a1 − b2 b1 + (a2 b1 + b2 a1 )i = (a2 + b2 i)(a1 + b1 i) = z2 z1 .
In particular, (z1 + z2 )z3 = z3 (z1 + z2 ) = z3 z1 + z3 z2 = z1 z3 + z2 z3 .
Finally, as we have already seen, if z1 ≠ 0 then $z_1 \cdot \frac{\bar{z}_1}{|z_1|^2} = 1$. We proved that C is a field.
7.1. Some formal consequences of the axioms. We note some useful formal consequences
of the axioms deﬁning a ring:
(1) The element 0 appearing in axiom (3) is unique. Indeed, if q is another element with the
same property then q + x = x for any x and in particular q + 0 = 0. But also, using the
property of 0 and commutativity, we have q + 0 = 0 + q = q . So q = 0.
(2) The element y appearing in axiom (4) is unique. Indeed, if for a given x we have x + y = x + y′ = 0 then y = y + (x + y′) = (y + x) + y′ = (x + y) + y′ = 0 + y′ = y′. We shall denote y by −x.
(3) We have −(−x) = x and −(x + y ) = −x − y , where, technically −x − y means −x + (−y ).
To prove that, it is enough, after what we have just proven, to show that −x + x = 0
and that (x + y ) + (−x − y ) = 0. But, −x + x = x + (−x) = 0 and (x + y ) + (−x − y ) =
x + (−x) + y + (−y ) = 0 + 0 = 0.
(4) The element 1 in axiom (6) is unique. (Use the same argument as in (1)).
(5) We have x · 0 = 0, 0 · x = 0. Indeed, x · 0 = x · (0 + 0) = x · 0 + x · 0. Let y = x · 0; then y = y + y and so 0 = −y + y = −y + (y + y) = (−y + y) + y = 0 + y = y. The same argument, using the other distributive law, gives 0 · x = 0.
We shall see many examples of rings and fields in the course. For now, we just give one more definition and some examples.
Definition 7.6. Let R be a ring. A subset S ⊂ R is called a subring if 0, 1 ∈ S and if a, b ∈ S implies that a + b, −a, ab ∈ S .
Note that the definition says that the operations of addition and multiplication in R give
operations of addition and multiplication in S (namely, the outcome is in S and so we get
functions S × S → S ), satisfying all the axioms of a ring. It follows that S is a ring whose zero
element is that of R and whose identity is that of R.
For example, Q is a subring of R, Z is a subring of Q and so on.
Example 7.7. Consider the set {0, 1} with the following addition and multiplication tables.
+ | 0 1        × | 0 1
0 | 0 1        0 | 0 0
1 | 1 0        1 | 0 1
One can verify that this is a ring by directly checking the axioms. (We shall later see that this is the ring of integers modulo 2).
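Checking the axioms directly, as Example 7.7 suggests, is a finite task and can be delegated to a short program. The following Python sketch encodes the two tables as dictionaries and tests axioms (1) to (7) of the definition of a ring by brute force; the name `is_ring` and the encoding are assumptions made for illustration.

```python
from itertools import product

R = (0, 1)
ADD = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
MUL = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}


def is_ring(elements, add, mul, zero, one) -> bool:
    """Brute-force check of ring axioms (1)-(7) for finite operation tables."""
    pairs = list(product(elements, repeat=2))
    triples = list(product(elements, repeat=3))
    return all([
        all(add[x, y] == add[y, x] for x, y in pairs),                          # (1)
        all(add[add[x, y], z] == add[x, add[y, z]] for x, y, z in triples),     # (2)
        all(add[zero, x] == x for x in elements),                               # (3)
        all(any(add[x, y] == zero for y in elements) for x in elements),        # (4)
        all(mul[mul[x, y], z] == mul[x, mul[y, z]] for x, y, z in triples),     # (5)
        all(mul[one, x] == x == mul[x, one] for x in elements),                 # (6)
        all(mul[z, add[x, y]] == add[mul[z, x], mul[z, y]]
            and mul[add[x, y], z] == add[mul[x, z], mul[y, z]]
            for x, y, z in triples),                                            # (7)
    ])


print(is_ring(R, ADD, MUL, 0, 1))  # True
```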
Example 7.8. Consider all expressions of the form {a + b√2 : a, b ∈ Z}. We use the notation Z[√2] for this set. This set is actually a ring. Since Z[√2] ⊂ R and R is a ring (even a field!), it is enough to check it's a subring. Indeed, 0, 1 ∈ Z[√2]. Suppose a + b√2, c + d√2 ∈ Z[√2]. Then: (1) (a + b√2) + (c + d√2) = (a + c) + (b + d)√2 ∈ Z[√2]; (2) −(a + b√2) = −a − b√2 ∈ Z[√2]; (3) (a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2 ∈ Z[√2].
Let now
Q[√2] = {a + b√2 : a, b ∈ Q}.
This is a field. The verification that this is a subring of C is the same as above. It is thus a commutative ring in which 0 ≠ 1. We need to show the existence of an inverse for multiplication. If a + b√2 is not zero then either a or b is not zero. If
c = a² − 2b²
is zero then either b = 0 (but then a ≠ 0 and so c ≠ 0, so this case doesn't happen), or √2 = ±a/b is a rational number. We shall prove in Proposition 10.9 that this is not the case. Thus, c ≠ 0.
Now, a/c − (b/c)√2 ∈ Q[√2] and it is easy to check that
(a + b√2)(a/c − (b/c)√2) = 1.
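The inverse formula in Q[√2] can be tried out with exact fractions. In the Python sketch below an element a + b√2 is stored as the pair (a, b); the names `mul_sqrt2` and `inv_sqrt2` are hypothetical, and the code only restates the multiplication rule and the formula a/c − (b/c)√2 with c = a² − 2b².

```python
from fractions import Fraction


def mul_sqrt2(x, y):
    """(a + b*sqrt2)(c + d*sqrt2) = (ac + 2bd) + (ad + bc)*sqrt2."""
    (a, b), (c, d) = x, y
    return (a * c + 2 * b * d, a * d + b * c)


def inv_sqrt2(x):
    """Inverse of a nonzero a + b*sqrt2, namely a/c - (b/c)*sqrt2 with c = a^2 - 2b^2."""
    a, b = x
    c = a * a - 2 * b * b
    if c == 0:
        # For rational a, b this happens only when a = b = 0.
        raise ZeroDivisionError("0 has no inverse")
    return (a / c, -b / c)


x = (Fraction(3), Fraction(2))   # 3 + 2*sqrt2
y = (Fraction(1, 2), Fraction(5))  # 1/2 + 5*sqrt2
print(inv_sqrt2(x))                # (Fraction(3, 1), Fraction(-2, 1))
print(mul_sqrt2(y, inv_sqrt2(y)))  # (Fraction(1, 1), Fraction(0, 1))
```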
Part 2. Arithmetic in Z
In this part of the course we are going to study arithmetic in the ring of integers Z. We are
going to focus on particular properties. Our choice of properties is motivated by an analogy to
be drawn later between integers and polynomials. In fact, there are even more general classes of rings for which one can extend this analogy, called Euclidean rings, but in this course we shall not discuss them beyond providing some margin notes.
8. Division
Theorem 8.1. (Division with residue)³ Let a, b be integers with b ≠ 0. There exist integers q, r such that
a = qb + r, 0 ≤ r < |b|.
Moreover, q and r are uniquely determined.
Proof. For simplicity, assume b > 0. Very similar arguments prove the case b < 0.
Consider the set
S = {a − bx : x ∈ Z, a − bx ≥ 0}.
S is a nonempty set: indeed, if a ≥ 0 take x = 0, so that a ∈ S. If a < 0 take x = a, so that a − bx = a(1 − b) ≥ 0 (because b > 0 and so b ≥ 1). That is, a(1 − b) ∈ S. It follows that S has a minimal element r = a − bq for some q. Then r < b; otherwise, 0 ≤ r − b = a − b(q + 1) is an element of S as well and smaller than r, which is a contradiction. It follows that
a = bq + r, 0 ≤ r < b.
We now show that q and r are unique. Suppose
a = bq′ + r′, 0 ≤ r′ < b.
If q = q′ then also r = a − bq = a − bq′ = r′. Else, either q > q′ or q′ > q. We then get
0 = bq + r − (bq′ + r′) = b(q − q′) + (r − r′).
If q > q′ then r′ = r + b(q − q′) ≥ r + b ≥ b. Contradiction. If q < q′ we get r = r′ + b(q′ − q) ≥ b and again a contradiction.
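Division with residue as stated in Theorem 8.1 requires 0 ≤ r < |b| even when b is negative, which differs slightly from what Python's built-in `divmod` returns (there the remainder takes the sign of b). The following sketch, with the hypothetical name `divide_with_residue`, adjusts for that.

```python
def divide_with_residue(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a == q*b + r and 0 <= r < |b|, for b != 0."""
    if b == 0:
        raise ZeroDivisionError("b must be nonzero")
    q, r = divmod(a, b)       # a == q*b + r, and r has the sign of b (or is 0)
    if r < 0:                 # can only happen when b < 0
        q, r = q + 1, r - b   # (q + 1)*b + (r - b) == q*b + r
    return q, r


print(divide_with_residue(7, 2))    # (3, 1)
print(divide_with_residue(-7, 2))   # (-4, 1)
print(divide_with_residue(7, -2))   # (-3, 1)
print(divide_with_residue(-7, -2))  # (4, 1)
```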
³ A commutative ring is called an integral domain if xy = 0 implies x = 0 or y = 0. An integral domain R is called a Euclidean ring if there is a function | · | : R − {0} → N such that for all x, y in R with y ≠ 0 there are elements q, r ∈ R such that x = qy + r and either r = 0 or |r| < |y|. (But in general q, r are not unique.) Thus, the theorem we are proving is that Z is a Euclidean ring.
9. GCD and the Euclidean algorithm
Definition 9.1. Let a, b be integers. We say that a|b (read, a divides b) if there is an element c ∈ Z such that b = ac.
Here are some properties:
(1) a|b ⇒ a|(−b).
(2) a|b ⇒ a|bd for any d ∈ Z.
(3) a|b, a|d ⇒ a|(b ± d).
Proof. Write b = ac. Then −b = a · (−c) and so a|(−b). Also, bd = a · (cd) and so a|bd. Write also d = ae. Then b ± d = a · (c ± e) and so a|(b ± d).
Corollary 9.2. Let a ≠ 0. Then a|b if and only if in dividing b by a with residue, b = aq + r, the residue r is zero.
Proof. If the residue r = 0 then b = aq and so a|b. If a|b and b = aq + r then a|(b − aq), i.e., a|r. But 0 ≤ r < |a| and so that's possible only if r = 0.
9.1. GCD.
Deﬁnition 9.3. Let a, b be integers, not both zero. The greatest common divisor (gcd) of a
and b, denoted gcd(a, b) or just (a, b) if the context is clear, is the largest integer dividing both a
and b.
Theorem 9.4. Let a, b be integers, not both zero, and d = (a, b) their gcd. Then every common
divisor of a and b divides d. There are integers u, v such that
d = ua + vb.
Moreover, d is the minimal positive number that has the form ua + vb.
Proof. Let
S = {ma + nb : m, n ∈ Z, ma + nb > 0}.
First note that S ≠ ∅. Indeed, a · a + b · b = a² + b² ∈ S . Let D be the minimal element of S . Then, for some u, v ∈ Z we have D = ua + vb.
We claim that D = d. To show D|a, write a = qD + r, 0 ≤ r < D. Then, D > r = a − qD = a − q (ua + vb) = (1 − qu)a − qvb. If r ≠ 0 then r = (1 − qu)a − qvb is an element of S smaller than D and that's a contradiction. It follows that r = 0, that is D|a. In the same way, D|b.
On the other hand, let e be any common divisor of a and b. Then e also divides ua + vb = D. It follows that D is the largest common divisor of a, b, so D = d, and also that any other common divisor divides it.
Corollary 9.5. If a|bc and gcd(a, b) = 1 then a|c.
Proof. We have 1 = ua + vb for some integers u, v . Since a|uac and a|vbc we have a|(uac + vbc) = c.
9.2. The Euclidean algorithm. The question arises: how do we compute in practice the gcd of two integers? This is a very practical issue, even in the simple task of simplifying fractions! As we shall see, there are two methods. One method uses the prime factorization of the two numbers; we shall discuss that later. The other method, which is much more efficient, is the Euclidean algorithm.
Theorem 9.6. (The Euclidean Algorithm) Let a, b be positive integers with a ≥ b. If b|a then gcd(a, b) = b. Else perform the following recursive division with residue:
a = b q_0 + r_0, 0 < r_0 < b,
b = r_0 q_1 + r_1, 0 ≤ r_1 < r_0,
r_0 = r_1 q_2 + r_2, 0 ≤ r_2 < r_1,
. . .
For some t we must first get that r_{t+1} = 0. That is,
r_{t−2} = r_{t−1} q_t + r_t, 0 ≤ r_t < r_{t−1},
r_{t−1} = r_t q_{t+1}.
Then r_t is the gcd of a and b.
Before giving the proof we provide two examples.
1). Take a = 113, b = 54. Then
113 = 54 · 2 + 5
54 = 5 · 10 + 4
5 = 4 · 1 + 1
4 = 1 · 4.
Thus gcd(113, 54) = 1.
2). Now take a = 442, b = 182. Then
442 = 182 · 2 + 78
182 = 78 · 2 + 26
78 = 26 · 3
and so gcd(442, 182) = 26.
Proof. Let d = gcd(a, b). We claim that d|r_n for every n. We prove that by induction: First, d|a and d|b, so d|(a − b q_0) = r_0, and then also d|(b − r_0 q_1) = r_1. Suppose that d|r_i for i = 0, 1, 2, . . . , n. Since r_{n+1} = r_{n−1} − r_n q_{n+1} we get that d|r_{n+1} as well. In particular, d|r_t.
We now show that r_t|a, r_t|b. It then follows from Theorem 9.4 that r_t|d and therefore, since also d|r_t and both are positive, r_t = d. We again prove that by induction. We have r_t|r_t and r_t|r_t q_{t+1} = r_{t−1}. Suppose we have already shown that r_t divides r_t, r_{t−1}, . . . , r_n. Then, since r_{n−1} = r_n q_{n+1} + r_{n+1} we also get r_t|r_{n−1}. Therefore, r_t divides r_0, r_1, . . . , r_t. Again, b = r_0 q_1 + r_1 and so r_t|b and then a = b q_0 + r_0 and so r_t|a.
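The recursive division of Theorem 9.6 is short to program. The Python sketch below gives one possible rendering: `gcd` repeats division with residue until the residue is zero, and `extended_gcd` carries along coefficients so that it also returns integers u, v with gcd(a, b) = ua + vb, whose existence Theorem 9.4 asserts. Both function names and the bookkeeping variables are choices made here for illustration, not notation from the notes.

```python
def gcd(a: int, b: int) -> int:
    """Euclidean algorithm for non-negative a, b, not both zero."""
    while b != 0:
        a, b = b, a % b   # replace (a, b) by (b, residue of a divided by b)
    return a


def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, u, v) with d = gcd(a, b) = u*a + v*b."""
    # Invariant: old_r == old_u*a + old_v*b and r == u*a + v*b.
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v


print(gcd(113, 54), gcd(442, 182))  # 1 26
print(extended_gcd(113, 54))        # (1, 11, -23)
print(extended_gcd(442, 182))       # (26, -2, 5)
```

The pairs (u, v) are not unique; this particular bookkeeping happens to return 11 · 113 − 23 · 54 = 1 and −2 · 442 + 5 · 182 = 26.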
A further bonus supplied by the Euclidean algorithm is that it allows us to ﬁnd u, v such
that gcd(a, b) = ua + vb. We just illustrate it in two examples:
1). Take a = 113, b = 54. Then, as we saw,
113 = 54 · 2 + 5
54 = 5 · 10 + 4
5 = 4 · 1 + 1
4 = 1 · 4
and so gcd(113, 54) = 1. We have 1 = 5 − 4 · 1, and we keep substituting, for the residues we now have, expressions using previous residues (the important numbers to modify are the residues, not the quotients q_i). Since 4 = 54 − 5 · 10, we get 1 = 5 − (54 − 5 · 10) = −54 + 5 · 11. Next, 5 = 113 − 54 · 2 and we get 1 = −54 + 5 · 11 = −54 + (113 − 54 · 2) · 11 = 54 · (−23) + 113 · 11. Thus,
1 = gcd(54, 113) = −23 · 54 + 11 · 113.
2). Now take a = 442, b = 182. Then
442 = 182 · 2 + 78
182 = 78 · 2 + 26
78 = 26 · 3
and so gcd(442, 182) = 26. Here the process is easier: 26 = 182 − 78 · 2 = 182 − (442 − 182 · 2) · 2 = 5 · 182 − 2 · 442. Thus,
26 = gcd(182, 442) = 5 · 182 − 2 · 442.
10. Primes and unique factorization
Definition 10.1. An integer p ≠ 0, ±1 is called prime if its only divisors are ±1, ±p.
The phrase “prime number” is usually used to denote a prime positive integer. A positive integer p > 1 is prime if its only positive divisors are 1 and p.
The sieve of Eratosthenes:⁴ This is a method that allows one to construct rapidly a list of all primes less than or equal to a given number N. We illustrate that with N = 50. One writes all the numbers from 2 to 50:
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50
The first number on the list is prime. This is 2. We mark it as prime and cross out all its other multiples. The numbers not crossed out are:
2, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49
The first number on the list not marked and not crossed out is prime. This is 3. We mark it as prime and cross out all its other multiples. The numbers not crossed out are:
2, 3, 5, 7, 11, 13, 17, 19, 23, 25, 29, 31, 35, 37, 41, 43, 47, 49
The first number on the list not marked and not crossed out is prime. This is 5. We mark it as prime and cross out all its other multiples. The numbers not crossed out are:
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 49
The first number on the list not marked and not crossed out is prime. This is 7. We mark it as prime and cross out all its other multiples. The numbers not crossed out are:
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47
⁴ Eratosthenes of Cyrene, 276 BC - 194 BC, was a Greek mathematician who is famous for his work on prime numbers and for measuring the diameter of the earth. For more see http://wwwgroups.dcs.stand.ac.uk/%7Ehistory/Biographies/Eratosthenes.html
The next number, 11, is already greater than √N = √50 ≈ 7.071 . . . . So we stop, because any number is a product of prime numbers (see below) and so any number less than or equal to N which is not prime has a prime divisor smaller than or equal to √N. Thus, any number left on our list is prime.
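The sieve just described can be sketched in a few lines of Python. The function name `primes_up_to` is hypothetical. One small deviation from the description above is labelled in the comments: crossing out starts at p · p rather than at 2p, which is harmless because smaller multiples of p have a smaller prime factor and were crossed out earlier.

```python
def primes_up_to(n: int) -> list[int]:
    """Sieve of Eratosthenes: all primes less than or equal to n."""
    crossed = [False] * (n + 1)
    primes = []
    for p in range(2, n + 1):
        if crossed[p]:
            continue
        primes.append(p)           # first number not crossed out: it is prime
        if p * p <= n:             # once p exceeds sqrt(n) nothing new is crossed out
            # Assumption/optimization: start at p*p instead of 2*p.
            for multiple in range(p * p, n + 1, p):
                crossed[multiple] = True
    return primes


print(primes_up_to(50))
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
```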
The numbers left on the list are:
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47
Theorem 10.2. (The Fundamental Theorem of Arithmetic) Every nonzero integer n is a product of primes. (We allow the empty product, equal by definition to 1). That is, one can write every nonzero integer n as
n = ε p_1 p_2 · · · p_m,
where ε = ±1 and 0 ≤ p_1 ≤ p_2 ≤ · · · ≤ p_m are primes (m ≥ 0). Moreover, this way of writing n is unique.
Proof. We first show n can be written this way. We may assume n is positive (if n is negative, apply the statement for −n: −n = p_1 p_2 · · · p_m and thus n = −1 · p_1 p_2 · · · p_m).
Lemma 10.3. Every positive integer is a product of prime numbers. (We allow the empty
product, equal by definition to 1.)
Proof. Suppose not. Then the set S of positive integers that are not a product of prime numbers has
a minimal element, say $n_0$. Now $n_0$ is neither 1 nor a prime, because in those cases it is a product of
primes. Thus, there are integers $1 < s < n_0$, $1 < t < n_0$ such that $n_0 = st$. Note that s, t are not
in S because they are smaller than $n_0$. Thus, $s = q_1 q_2 \cdots q_a$ is a product of primes, $t = r_1 r_2 \cdots r_b$
is a product of primes and therefore $n_0 = q_1 q_2 \cdots q_a r_1 r_2 \cdots r_b$ is also a product of primes. This
contradicts our initial assumption that there are positive integers that are not a product
of prime numbers. Thus, every positive integer is a product of prime numbers.
Choosing the sign appropriately and ordering the primes in increasing order we conclude that
any nonzero integer can be written as $n = \epsilon\, p_1 p_2 \cdots p_m$, where $\epsilon = \pm 1$ and $0 \le p_1 \le p_2 \le \cdots \le p_m$ are primes. We
now show uniqueness. For this we need the following important fact.
Proposition 10.4. Let p > 1 be an integer. The following are equivalent: (i) p is a prime
number; (ii) if $p \mid ab$ then $p \mid a$ or $p \mid b$.
Proof. Suppose p is prime and $p \mid ab$. If $p \nmid a$ then gcd(p, a) = 1 and so, as we have already
seen, $p \mid b$.
Now suppose that p satisfies (ii). If p = st with s, t positive integers, then $p \mid st$ and so $p \mid s$, say. So $s = ps'$ and $p = ps't$.
But then we must have $s' = t = 1$, because $s'$, t are positive integers. So p has no proper
divisors and hence is prime.
We now finish the proof of the theorem. Suppose that
$$n = \epsilon\, p_1 p_2 \cdots p_m \quad \text{and also} \quad n = \mu\, q_1 q_2 \cdots q_t$$
are two expressions of n as in the statement of the theorem. First, $\epsilon$ is negative if and only if n is,
and the same holds for $\mu$. So $\epsilon = \mu$. We may then assume n is positive and $\epsilon = \mu = 1$, and
we argue by induction on n. The case n = 1 is clear: a product of one or more primes will be
greater than 1, so the only way to express n is as the empty product. Assume the statement holds
for 1, 2, . . . , n − 1 and consider two factorizations of n:
$$n = p_1 p_2 \cdots p_m \quad \text{and} \quad n = q_1 q_2 \cdots q_t.$$
First, note that m ≥ 1 and t ≥ 1 because n > 1. Assume that $p_1 \le q_1$ (the argument in the other
case goes the same way). We have $p_1 \mid n$ and so $p_1 \mid q_1 q_2 \cdots q_t$. It follows that $p_1$ divides some $q_i$
but then, $q_i$ being prime, $p_1 = q_i$. Furthermore, $p_1 \le q_1 \le q_i = p_1$, so $p_1 = q_1$. We then have the
factorizations
$$\frac{n}{p_1} = p_2 \cdots p_m = q_2 \cdots q_t.$$
Since $n/p_1 < n$ we may apply the induction hypothesis and conclude that m = t and $p_i = q_i$ for
all i.
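To make the statement of the theorem concrete, here is a minimal Python sketch that computes the sign and the nondecreasing list of primes for a nonzero integer by trial division. The function name `factor` is a name chosen here for illustration; trial division is just one simple way to find the factorization and is not a method prescribed by the notes.

```python
def factor(n):
    """Return (sign, primes) with n == sign * product(primes),
    where sign is +1 or -1 and primes is in nondecreasing order."""
    if n == 0:
        raise ValueError("0 has no factorization into primes")
    sign = 1 if n > 0 else -1
    n = abs(n)
    primes = []
    p = 2
    # Any composite n has a prime divisor p with p * p <= n.
    while p * p <= n:
        while n % p == 0:
            primes.append(p)
            n //= p
        p += 1
    if n > 1:
        # What remains has no divisor up to its square root, so it is prime.
        primes.append(n)
    return sign, primes

print(factor(-360))  # (-1, [2, 2, 2, 3, 3, 5])
print(factor(1))     # (1, []): the empty product
```

By the uniqueness part of the theorem, any other correct method must return the same sign and the same list.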
We next derive some consequences of the fundamental theorem of arithmetic. The theorem
exhibits the prime numbers as the building blocks of the integers. In itself, it doesn’t tell us if
there are ﬁnitely or inﬁnitely many such building blocks, that is, such primes.
Theorem 10.5. (Euclid⁵) There are infinitely many prime numbers.
⁵ Euclid of Alexandria, 325 BC - 265 BC, was a Greek mathematician best known for his treatise on
geometry: The Elements. This influenced the development of Western mathematics for more than 2000
years. For more see http://wwwgroups.dcs.stand.ac.uk/%7Ehistory/Biographies/Euclid.html
Proof. Let $p_1, p_2, \ldots, p_n$ be distinct prime numbers. We show that there is a prime not in
this list. It follows that there cannot be only finitely many prime numbers.
Consider the integer $N = p_1 p_2 \cdots p_n + 1$ and its prime factorization. Let q be a prime dividing N. If $q \in \{p_1, p_2, \ldots, p_n\}$ then $q \mid p_1 p_2 \cdots p_n$ and so $q \mid (N - p_1 p_2 \cdots p_n)$, that is $q \mid 1$, which is a
contradiction. Hence q is a prime not in the list.
So! We know every integer is a product of prime numbers, we know there are inﬁnitely many
prime numbers. That teaches us about the integers, and invites some more questions:
– How frequent are the prime numbers? The Prime Number Theorem asserts that the number
of primes in the interval [1, n] is roughly n/ log n, in the sense that the ratio between the true
number and the estimate n/ log n approaches 1 as n goes to inﬁnity. The result was conjectured
by Gauss⁶ at the age of 15 or 16 and proven by J. Hadamard and Ch. de la Vallée Poussin in
1896.
– How small can the gaps between consecutive primes be? For example, we have (3, 5), (5, 7), (11, 13), (17, 19),
... are there inﬁnitely many such pairs? The answer is believed to be yes but no one has proven
it yet (Fall 2007). This is called the Twin Prime Conjecture.
– How far does one need to go until the next prime shows up? For example, it is known that
there is always a prime between n and 2n, but this is a diﬃcult result.
– What about adding primes? Goldbach’s conjecture asserts that every even integer greater
than 2 is the sum of two prime numbers: 4 = 2 + 2, 6 = 3 + 3, 8 = 3 + 5, 10 = 3 + 7, 12 = 5 + 7,
14 = 3 + 11, 16 = 5 + 11, .... It has been verified (Fall 2007) for all even $n \le 4 \times 10^{17}$, but no proof is
currently known.
10.1. Further applications of the Fundamental Theorem of Arithmetic.
Proposition 10.6. Let a, b be nonzero integers. Then $a \mid b$ if and only if
$$a = \epsilon\, p_1^{a_1} \cdots p_m^{a_m} \quad \text{and} \quad b = \mu\, p_1^{a_1'} \cdots p_m^{a_m'} q_1^{b_1} \cdots q_t^{b_t}$$
(products of distinct primes) with $a_i' \ge a_i$ for all i = 1, . . . , m.
Proof. Clearly for such factorizations it follows that $a \mid b$; in fact
$$b/a = (\mu/\epsilon)\, p_1^{a_1' - a_1} \cdots p_m^{a_m' - a_m} q_1^{b_1} \cdots q_t^{b_t}.$$
Conversely, if $a \mid b$, write $a = \epsilon\, p_1^{a_1} \cdots p_m^{a_m}$ and $b/a = \nu\, p_1^{a_1''} \cdots p_m^{a_m''} q_1^{b_1} \cdots q_t^{b_t}$, with $\nu = \pm 1$ and $a_i'' \ge 0$.
Then
$$b = (\epsilon\nu)\, p_1^{a_1 + a_1''} \cdots p_m^{a_m + a_m''} q_1^{b_1} \cdots q_t^{b_t},$$
and we let $\mu = \epsilon\nu$, $a_i' = a_i + a_i''$.
⁶ Johann Carl Friedrich Gauss, 1777 - 1855, worked in a wide variety of fields in both mathematics
and physics including number theory, analysis, differential geometry, geodesy, magnetism, astronomy and
optics. His work has had an immense influence in many areas. For more see http://wwwgroups.dcs.stand.ac.uk/%7Ehistory/Biographies/Gauss.html
Corollary 10.7. Let $a = p_1^{a_1} \cdots p_m^{a_m}$, $b = p_1^{b_1} \cdots p_m^{b_m}$ with $p_i$ distinct prime numbers and $a_i$, $b_i$
nonnegative integers. (Any two positive integers can be written this way.) Then
$$\gcd(a, b) = p_1^{\min(a_1, b_1)} \cdots p_m^{\min(a_m, b_m)}.$$
The next proposition establishes the existence of real numbers that are not rational. It can
be generalized considerably. In fact, it is known that in randomly choosing a number in the
interval [0, 1] the probability of picking a rational number is zero. So, though there are inﬁnitely
many rational numbers, even in the interval [0, 1], they are still a rather meagre set inside the
real numbers. The advantage of the following proposition is that it shows that a speciﬁc number
is irrational. We will need the following result.
Proposition 10.8. Any nonzero rational number q can be written as
$$q = \epsilon\, p_1^{a_1} \cdots p_m^{a_m},$$
where $\epsilon = \pm 1$, the $p_i$ are distinct prime numbers and $a_1, \ldots, a_m$ are nonzero integers (possibly
negative). Moreover, this expression is unique.
Proof. By definition, for some integers a, b we have q = a/b. Let us write:
$$a = \epsilon_a\, r_1^{s_1} \cdots r_n^{s_n}, \qquad b = \epsilon_b\, r_1^{t_1} \cdots r_n^{t_n},$$
where $\epsilon_a, \epsilon_b \in \{\pm 1\}$, the $r_i$ are distinct primes and $s_i$, $t_i$ are nonnegative integers, possibly zero.
This is always possible because we allow zero exponents here. Then, clearly,
$$q = (\epsilon_a/\epsilon_b)\, r_1^{s_1 - t_1} \cdots r_n^{s_n - t_n}.$$
Now, omitting the primes such that $s_i - t_i = 0$ from the list, calling the remaining primes $p_1, \ldots, p_m$,
and letting $\epsilon = \epsilon_a/\epsilon_b$, we obtain an expression as desired.
Suppose that we have two such expressions for q. Again, by allowing zero exponents, we can
consider the following situation WLOG:
$$q = \epsilon\, p_1^{a_1} \cdots p_m^{a_m} = \epsilon'\, p_1^{a_1'} \cdots p_m^{a_m'}.$$
Since $\epsilon$, $\epsilon'$ determine the sign of q, they must be equal and we need to show that $a_i = a_i'$ for all i.
Dividing through, we get an expression of the form
$$1 = p_1^{c_1} \cdots p_m^{c_m},$$
where $c_i = a_i - a_i'$, and we need to show all the $c_i$ are zero. WLOG, by rearranging the primes,
we may assume that $c_1, \ldots, c_t$ are negative and $c_{t+1}, \ldots, c_m$ are nonnegative. We conclude that
$$p_1^{-c_1} \cdots p_t^{-c_t} = p_{t+1}^{c_{t+1}} \cdots p_m^{c_m}.$$
In this expression there are no negative exponents. Thus, from unique factorization for integers,
since all the primes are distinct all the exponents must be zero.
Proposition 10.9. $\sqrt{2}$ is not a rational number.
Proof. Suppose it is, and write $\sqrt{2} = p_1^{a_1} \cdots p_m^{a_m}$, with distinct primes and nonzero exponents (possibly negative). Then
$$2 = p_1^{2a_1} \cdots p_m^{2a_m}$$
must be the unique factorization of 2. However, 2 is prime. So there must be only one prime on
the right hand side, i.e. m = 1. Then $2 = p_1^{2a_1}$ and we must have $p_1 = 2$ and $2a_1 = 1$. But this
contradicts the fact that $a_1$ is an integer.
Here is another proof:
Suppose that $\sqrt{2}$ is rational and write $\sqrt{2} = m/n$, where (m, n) = 1. Then
$$2n^2 = m^2.$$
This implies that $2 \mid m^2$. If m = 2k + 1 then $m^2 = 4k^2 + 4k + 1$. Since $2 \mid (4k^2 + 4k)$ it would
follow that $2 \mid 1$, which is a contradiction. (In simple language, what we are saying is that m
odd implies that $m^2$ is odd.) Thus, m is even, that is, $2 \mid m$. Say m = 2k. It follows that $2n^2 = 4k^2$ and so $n^2 = 2k^2$. Therefore, $2 \mid n^2$ and so $2 \mid n$. This means that 2 divides both n and m, contrary
to our assumption. Thus, assuming $\sqrt{2}$ is rational leads to a contradiction and so $\sqrt{2}$ is not a
rational number.
Though the second proof is more elementary, the first proof is better. It is much easier to
generalize and indeed, in a similar way, one can show that $\sqrt{3}$, $\sqrt[5]{17}$ etc. are irrational.
It is also known that e is irrational (not too hard) and π is irrational (hard). But it is still an
open question (Fall 2007) if Euler’s constant
$$\gamma = \lim_{n \to \infty} \left( 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \log(n) \right) \approx 0.57721$$
is rational or not (it is believed to be irrational; if γ is rational, it was proved that its denominator
must exceed $10^{242080}$!).
Part 3. Congruences and modular arithmetic
11. Relations
A relation on a set S is best described as a subset Γ ⊂ S × S. We say that s is related to t
if (s, t) ∈ Γ. Though the format reminds one of functions, the actual relevance of the notion
of functions is minimal. For example, usually for a given s there will be many elements t such
that (s, t) ∈ Γ, which is the opposite of what we have for functions, where there is only one t.
We shall usually denote that x is related to y, namely that (x, y) ∈ Γ, by x ∼ y.
Note that so far the definition is wide enough to allow any Γ. A relation is called transitive
if x ∼ y and y ∼ z implies x ∼ z. A relation is called a partial order if it is transitive and in
addition we never have both x ∼ y and y ∼ x. We then use the notation x < y for x ∼ y. Thus
we require x < y, y < z ⇒ x < z and we do not allow both x < y and y < x. There may very
well be x, y for which neither x < y nor y < x holds. A linear order (or a simple order) is a
partial order such that for every x ≠ y we have either x < y or y < x.
Another important class of relations, even more important for this course, are the equivalence
relations. They are very far from order relations. A relation is called an equivalence relation if it
satisﬁes the following properties:
(1) (Reﬂexive) For every x we have x ∼ x.
(2) (Symmetric) If x ∼ y then y ∼ x.
(3) (Transitive) If x ∼ y and y ∼ z then x ∼ z.
Equivalence relations arise when one wishes to identify elements in a given set according to
some principle. Implicit in the word “identify” is that x is identified with x, that if x is identified
with y then y is identified with x, and that if x is identified with y and y is identified with z then,
by all accounts, x should be identified with z too. That is, we have an equivalence relation.
Lemma 11.1. Let ∼ be an equivalence relation on a set S. Define the equivalence class [x] of
an element x ∈ S as follows:
[x] = {y : y ∈ S, x ∼ y}.
This is a subset of S. The following holds:
(1) Two equivalence classes are either disjoint or equal.
(2) S is a disjoint union of equivalence classes.
Conversely, if S is a disjoint union S = ∪i∈I Ui of nonempty sets (this is called a partition of S)
then there is a unique equivalence relation on S for which the Ui are the equivalence classes.
Proof. Let x, y be elements of S and suppose that [x] ∩ [y] ≠ ∅. Then there is an element z
such that x ∼ z, y ∼ z. Since ∼ is symmetric also z ∼ y and, using transitivity, x ∼ y. Now,
if s ∈ [y] then y ∼ s and by transitivity x ∼ s and so s ∈ [x]; this shows [y] ⊂ [x]. Since x ∼ y
also y ∼ x and the same argument gives [x] ⊂ [y]. We conclude that [x] = [y].
Every element of S lies in its own equivalence class. It follows that S is a disjoint union
of equivalence classes.
To prove the second part of the lemma, we deﬁne that x ∼ y if both x and y lie in the same
set Ui . It is clearly reﬂexive and symmetric. It is also transitive: x ∼ y means x, y ∈ Ui for
some i, y ∼ z means y, z ∈ Uj for some j . But there is a unique Ui containing y because the
union is a disjoint union. That is Ui = Uj and so x, z ∈ Uj , meaning x ∼ z . The equivalence
classes are clearly the Ui .
We introduce the following terminology: we say that a set {xi : i ∈ I}, I some index set, is a
complete set of representatives if the equivalence classes [xi] are disjoint and S = ∪i∈I [xi]. This
means that every equivalence class is of the form [xi] for some i ∈ I and no two equivalence
classes [xi], [xj] for i, j ∈ I are the same, unless xi = xj.
12. Congruence relations
Let n be a positive integer. Define a relation x ∼ y if $n \mid (x - y)$ (we shall also write that
as x ≡ y (mod n), or simply x ≡ y if n is clear from the context). We say that x is congruent
to y modulo n.
Lemma 12.1. Congruence modulo n is an equivalence relation on Z. The set {0, 1, . . . , n − 1}
is a complete set of representatives.
Proof. First, $n \mid (x - x)$ so x ≡ x and the relation is reflexive. If $n \mid (x - y)$ then $n \mid -(x - y) = y - x$,
so the relation is symmetric. Suppose $n \mid (x - y)$ and $n \mid (y - z)$; then $n \mid (x - y) + (y - z) = x - z$ and
so the relation is transitive too.
Let x be any integer and write x = qn + r with 0 ≤ r < n. Then x − r = qn and so x ≡ r. It
follows that every equivalence class is represented by some r ∈ {0, 1, . . . , n − 1}. The equivalence
classes defined by elements of {0, 1, . . . , n − 1} are disjoint. If not, then for some 0 ≤ i < j < n
we have i ≡ j, that is, $n \mid (j - i)$. But 0 < j − i < n and we get a contradiction.
Theorem 12.2. Denote the equivalence classes of congruence modulo n by $\bar{0}, \bar{1}, \ldots, \overline{n-1}$. Denote
this set by Z/nZ. The set Z/nZ is a commutative ring under the following operations:
$$\bar{i} + \bar{j} = \overline{i + j}, \qquad \bar{i} \cdot \bar{j} = \overline{ij}.$$
The neutral element for addition is $\bar{0}$, for multiplication $\bar{1}$, and the additive inverse of $\bar{i}$ is $\overline{-i} = \overline{n - i}$.
Before proving the theorem we illustrate the definitions in a numerical example:
Example 12.3. We take n = 13 and calculate $\bar{5} \cdot \bar{6} - \bar{5}$. First, $\bar{5} \cdot \bar{6} = \overline{30} = \bar{4}$. Then
$\bar{4} - \bar{5} = \bar{4} + \overline{-5} = \overline{4 - 5} = \overline{-1} = \overline{12}$. Note that we could have also calculated $\bar{4} - \bar{5} = \bar{4} + \bar{8} = \overline{12}$,
or $\bar{5} \cdot \bar{6} - \bar{5} = \bar{5}(\bar{6} - \bar{1}) = \bar{5} \cdot \bar{5} = \overline{25} = \overline{12}$.
Modular arithmetic, that is calculating in the ring Z/nZ, is sometimes called “clock arithmetic”. The reason is the following. The usual clock really displays hours modulo 12. When
5 hours pass from 10 o’clock, the clock shows 3. Note that 3 ≡ 15 (mod 12). We are used
to adding hours modulo 12 (or modulo 24, for that matter), but we are not used to multiplying
hours; that doesn’t make sense. However, if you’d like, you can think about multiplication as
repeated addition: 5 · 3 = 5 + 5 + 5. So, in that sense, we are already familiar with the operations
modulo 12 and the definitions above are a generalization.
Continuing with our numerical example, let us try to solve the equation 4x + 2 = 7 in Z/13Z.
Now, and from now on, we just write 4, 2, 7 etc. for $\bar{4}, \bar{2}, \bar{7}$. So we need to solve 4x = 5. We
are now looking for a residue class r modulo 13 so that 4r ≡ 1 (mod 13). We guess that r = 10
and check: 4 · 10 = 40 ≡ 1 (mod 13). Then, 4x = 5 implies x ≡ 10 · 4x ≡ 50 ≡ 11 (mod 13).
Thus, the only possibility is x = 11. We go back to the original equation 4x = 5 and verify
that 4 · 11 ≡ 5 (mod 13). We found the solution x = 11. We remark that in general such an r
need not exist if the modulus n is not a prime, and that in that case one may need to go back
to the original equation and verify that indeed the solution to the reduced equation solves the
original equation. These issues will be discussed later.
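The computations in Z/13Z above can be reproduced with Python's built-in integer arithmetic. The sketch below is illustrative only: `solve_linear` is a helper name invented here, and it relies on the three-argument `pow` with exponent -1 (available in Python 3.8 and later) to find the inverse r instead of guessing it.

```python
def solve_linear(a, b, c, n):
    """Solve a*x + b = c in Z/nZ, assuming a is invertible modulo n."""
    r = pow(a, -1, n)          # r with a*r = 1 (mod n); raises ValueError if none exists
    return (r * (c - b)) % n

n = 13
print((5 * 6 - 5) % n)         # the class of 5*6 - 5 in Z/13Z
print(pow(4, -1, n))           # the inverse of 4 modulo 13
x = solve_linear(4, 2, 7, n)
print(x, (4 * x + 2) % n)      # the solution and a check of 4x + 2
```

Python's `%` operator always returns a representative in {0, 1, ..., n − 1} for positive n, which matches the complete set of representatives used in the text.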
Example 12.4. As another example, we give the addition and multiplication tables of Z/5Z.

 + | 0 1 2 3 4          · | 0 1 2 3 4
---+----------         ---+----------
 0 | 0 1 2 3 4          0 | 0 0 0 0 0
 1 | 1 2 3 4 0          1 | 0 1 2 3 4
 2 | 2 3 4 0 1          2 | 0 2 4 1 3
 3 | 3 4 0 1 2          3 | 0 3 1 4 2
 4 | 4 0 1 2 3          4 | 0 4 3 2 1

Proof. (Of Theorem) We first prove that the operations do not depend on the representatives for
the equivalence classes that we have chosen. Suppose $\bar{i} = \overline{i'}$, $\bar{j} = \overline{j'}$, where $i, i', j, j'$ need not be
in the set {0, 1, 2, . . . , n − 1}. We defined $\bar{i} + \bar{j} = \overline{i + j}$. We need to check that this is the same
as $\overline{i' + j'}$. Since $\bar{i} = \overline{i'}$, we have $n \mid (i - i')$ and similarly $n \mid (j - j')$. Therefore, $n \mid (i + j) - (i' + j')$, that
is, $\overline{i + j} = \overline{i' + j'}$. We also need to show that $\overline{ij} = \overline{i'j'}$. But $ij - i'j' = ij - ij' + ij' - i'j' = i(j - j') + j'(i - i')$
and so $n \mid (ij - i'j')$.
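The verification of the axioms is now easy if we make use of the fact that Z is a commutative
ring:
(1) $\bar{i} + \bar{j} = \overline{i + j} = \overline{j + i} = \bar{j} + \bar{i}$.
(2) $(\bar{i} + \bar{j}) + \bar{k} = \overline{i + j} + \bar{k} = \overline{(i + j) + k}$. Note that at this point we used the simplification
that we can use any representative of the equivalence class to carry out the operations.
Had we insisted on always using the representative in the set {0, 1, 2, . . . , n − 1} we would
usually need to replace i + j by its representative in that set and things would turn
messy. Now, $\overline{(i + j) + k} = \overline{i + (j + k)} = \bar{i} + \overline{j + k} = \bar{i} + (\bar{j} + \bar{k})$.
(3) $\bar{0} + \bar{i} = \overline{0 + i} = \bar{i}$.
(4) $\bar{i} + \overline{-i} = \overline{i + (-i)} = \bar{0}$. Note that $\overline{-i} = \overline{n - i}$.
(5) $(\bar{i} \cdot \bar{j})\bar{k} = \overline{ij} \cdot \bar{k} = \overline{(ij)k} = \overline{i(jk)} = \bar{i} \cdot \overline{jk} = \bar{i}(\bar{j} \cdot \bar{k})$.
(6) $\bar{1} \cdot \bar{i} = \overline{1 \cdot i} = \bar{i}$, $\bar{i} \cdot \bar{1} = \overline{i \cdot 1} = \bar{i}$.
(7) $\bar{i}(\bar{j} + \bar{k}) = \bar{i} \cdot \overline{j + k} = \overline{i(j + k)} = \overline{ij + ik} = \overline{ij} + \overline{ik} = \bar{i} \cdot \bar{j} + \bar{i} \cdot \bar{k}$. Similarly, $(\bar{j} + \bar{k})\bar{i} = \overline{j + k} \cdot \bar{i} = \overline{(j + k)i} = \overline{ji + ki} = \overline{ji} + \overline{ki} = \bar{j} \cdot \bar{i} + \bar{k} \cdot \bar{i}$.
Furthermore, this is a commutative ring: $\bar{i} \cdot \bar{k} = \overline{ik} = \overline{ki} = \bar{k} \cdot \bar{i}$.
In the proof we saw that the ring properties of Z/nZ, the set of equivalence classes modulo n, all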
follow from the ring properties of Z. We shall later see that this can be generalized to any ring R:
if we impose a correct notion of an equivalence relation, the equivalence classes themselves will
form a ring and the fact that the axioms hold follows from the fact they hold for R.
Theorem 12.5. Z/nZ is a ﬁeld if and only if n is prime.
Before providing the proof we introduce some terminology. Let R be a ring, x ∈ R a nonzero
element. x is called a zero divisor if there is an element y ≠ 0 such that either xy = 0 or yx = 0
(or both).
Lemma 12.6. Let R be a commutative ring. If R has zero divisors then R is not a field.
Proof. Let x ≠ 0 be a zero divisor and let y ≠ 0 be an element such that xy = 0. If R is
a field then there is an element z ∈ R such that zx = 1. But then z(xy) = z · 0 = 0 and
also z(xy) = (zx)y = 1 · y = y, so y = 0 and that is a contradiction.
Proof. (Of Theorem) If n = 1 then Z/nZ has a single element and so 0 = 1 in that ring. Therefore,
it is not a field. Suppose that n > 1 and n is not prime, n = ab where 1 < a < n, 1 < b < n.
Then $\bar{a} \ne \bar{0}$, $\bar{b} \ne \bar{0}$ but $\bar{a} \cdot \bar{b} = \overline{ab} = \bar{n} = \bar{0}$. So Z/nZ has zero divisors and thus is not a field.
Suppose now that n is prime and let $\bar{a} \ne \bar{0}$. That is, $n \nmid a$, which, since n is prime, means
that (n, a) = 1. Consider the list of elements
$$\bar{0} \cdot \bar{a},\ \bar{1} \cdot \bar{a},\ \ldots,\ \overline{n-1} \cdot \bar{a}.$$
We claim that they are distinct elements of Z/nZ. Suppose that $\bar{i} \cdot \bar{a} = \bar{j} \cdot \bar{a}$ for some 0 ≤ i ≤
j ≤ n − 1. Then $\overline{ia} = \overline{ja}$, which means that $n \mid (ia - ja) = (i - j)a$. Since (n, a) = 1, it follows
that $n \mid (i - j)$, but that means i = j. Thus, the list $\bar{0} \cdot \bar{a}, \bar{1} \cdot \bar{a}, \ldots, \overline{n-1} \cdot \bar{a}$ contains n distinct
elements of Z/nZ and so it must contain $\bar{1}$. That is, there is an i such that $\bar{i} \cdot \bar{a} = \bar{1}$ and therefore $\bar{a}$
is invertible.
Let p be a prime number. We denote Z/pZ also by $\mathbb{F}_p$. It is a field with p elements. In fact,
any finite field, that is any field with finitely many elements, has cardinality a power of a prime
and for any prime power there is a field with that cardinality. Finite fields, such as $\mathbb{F}_p$, play an
important role in coding and cryptography as well as in pure mathematics.
12.1. Fermat’s little theorem.
Theorem 12.7. (Fermat⁷) Let p be a prime number and let $a \not\equiv 0 \pmod p$. Then
$$a^{p-1} \equiv 1 \pmod p.$$
Before proving the theorem we state two auxiliary statements whose proofs are delegated
to the assignments.
Lemma 12.8. We have $p \mid \binom{p}{i}$ for every 1 ≤ i ≤ p − 1.
Lemma 12.9. Let R be a commutative ring and x, y ∈ R. Interpret $\binom{n}{i}$ as adding $\binom{n}{i}$ times
the element 1. Then the binomial formula holds in R:
$$(x + y)^n = \sum_{i=0}^{n} \binom{n}{i} x^i y^{n-i}.$$
Proof. (Of Fermat’s little theorem) We prove the theorem by induction on 1 ≤ a ≤ p − 1. For a = 1 the
result is clear. Suppose the result holds for a and consider a + 1, provided a + 1 < p. We have, by the
binomial formula, modulo p,
$$(a + 1)^p = \sum_{i=0}^{p} \binom{p}{i} a^i = 1 + \binom{p}{1} a + \binom{p}{2} a^2 + \cdots + \binom{p}{p-1} a^{p-1} + a^p$$
$$\equiv 1 + a^p \quad \text{(using the lemma)}$$
$$\equiv 1 + a \quad \text{(using the induction hypothesis)}.$$
Since $1 + a \not\equiv 0 \pmod p$ it has an inverse y in $\mathbb{F}_p$, $y(1 + a) \equiv 1$, and we get $y(1 + a)^p \equiv y(1 + a)$,
that is $(1 + a)^{p-1} \equiv 1$.
Example 12.10. We calculate $2^{100}$ modulo 13. We have $2^{100} = 2^{96} \cdot 2^4 = (2^{12})^8 \cdot 2^4 \equiv 2^4 \equiv 3$
modulo 13.
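A quick numerical check of Fermat's little theorem and of the example above, as a Python sketch. The helper name `fermat_holds` is chosen here for illustration; the built-in three-argument `pow(a, e, n)` computes a^e modulo n.

```python
def fermat_holds(p):
    """Check that a**(p-1) = 1 (mod p) for every 1 <= a <= p - 1."""
    return all(pow(a, p - 1, p) == 1 for a in range(1, p))

print(fermat_holds(13))      # 13 is prime, so the congruence holds for all a
print(fermat_holds(15))      # 15 is composite; here the check fails (e.g. for a = 2)

# Example 12.10: reduce the exponent modulo p - 1 = 12 first.
print(100 % 12)              # 4, so 2**100 = 2**4 (mod 13)
print(pow(2, 100, 13), pow(2, 4, 13))
```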
Fermat’s little theorem gives a criterion for numbers to be composite. Let n be a positive integer.
If there is 1 ≤ a ≤ n − 1 such that $a^{n-1} \not\equiv 1 \pmod n$ then n is not prime. Unfortunately, it is possible that for every 1 ≤ a ≤ n − 1 such that (a, n) = 1, one has $a^{n-1} \equiv 1 \pmod n$
and yet n is not prime. Thus, this test fails to recognize such n as composite numbers. Such
numbers are called Carmichael numbers. There are infinitely many such numbers, the first being
561, 1105, 1729, 2465, 2821, 6601, 8911, 10585, 15841, 29341, ...⁸
⁷ Pierre de Fermat, 1601 - 1665, was a French lawyer and government official most remembered for his
work in number theory; in particular for Fermat’s Last Theorem. He is also important in the foundations
of the calculus. For more, see http://wwwgroups.dcs.stand.ac.uk/%7Ehistory/Biographies/Fermat.html
Primality testing programs
first test divisibility by small primes available to the program as precomputed data and then
choose randomly some 1 ≤ a < n: if (a, n) ≠ 1 then n is not prime. If (a, n) = 1 the program
calculates $a^{n-1} \pmod n$. If the result is not 1 (mod n) then n is not prime. If the result is 1, the
program chooses another a. After a certain number of tests, say 10, if n passed all the tests it is
declared “prime”, though there is no absolute assurance that it is indeed a prime. We remark that
calculating $a^{n-1} \pmod n$ can be done quickly. One calculates $a, a^2, a^4, a^8, a^{16}, \ldots$ modulo n, as
long as the power is less than n; each term is the square of the previous one. One then expresses the exponent n − 1 in base 2 and multiplies
the corresponding powers to find the result. Here is an example: let us calculate $3^{54} \pmod{55}$ (random choice of numbers). We have, modulo 55, $3$, $3^2 = 9$, $3^4 = 81 = 26$, $3^8 = 26^2 = 676 = 16$, $3^{16} = 16^2 = 256 = 36$, $3^{32} = 36^2 = 1296 = 31$.
Now, 54 = 2 + 4 + 16 + 32 and so $3^{54} = 9 \cdot 26 \cdot 36 \cdot 31 = 4$. In particular, 55 is not a prime, not
that I’m particularly proud of drawing this conclusion...
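The repeated-squaring computation and the randomized test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code of any actual primality testing program: the names `power_mod` and `fermat_test` are invented here, the preliminary division by small precomputed primes is omitted, and the default of 10 rounds simply mirrors the number mentioned in the text.

```python
import random
from math import gcd

def power_mod(a, e, n):
    """Compute a**e modulo n by repeated squaring along the binary digits of e."""
    result = 1
    square = a % n                  # runs through a, a^2, a^4, a^8, ... modulo n
    while e > 0:
        if e & 1:                   # this binary digit of e is 1
            result = (result * square) % n
        square = (square * square) % n
        e >>= 1
    return result

def fermat_test(n, rounds=10):
    """Return False if n is certainly composite, True if n passed every round."""
    if n < 4:
        return n in (2, 3)
    for _ in range(rounds):
        a = random.randrange(2, n - 1)      # random a with 2 <= a <= n - 2
        if gcd(a, n) != 1:
            return False                    # a shares a factor with n
        if power_mod(a, n - 1, n) != 1:
            return False                    # a witnesses that n is composite
    return True                             # "prime", with no absolute assurance

print(power_mod(3, 54, 55))    # the worked example: 3^54 modulo 55
print(power_mod(2, 560, 561))  # 561 is a Carmichael number, so this prints 1
print(fermat_test(10007))
```

A `True` answer only means that no tested a exposed n as composite; a Carmichael number passes every round in which the chosen a is relatively prime to n.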
It is important to note that there is a polynomial time algorithm to decide, without any doubt,
if an integer is prime. Such an algorithm was discovered by Agrawal, Kayal and Saxena in 2002.
Note, however, that the algorithm does not produce a decomposition of n in case n is
composite. An efficient algorithm for that would compromise the very backbone of e-commerce and military
security. For more, see http://www.ams.org/notices/200305/feabornemann.pdf
12.2. Solving equations in Z/nZ. There is no general method for solving polynomial equations
in Z/nZ. We just present some selected topics.
12.2.1. Linear equations. We want to consider the equation ax + b = 0 in Z/nZ. Let us assume
that (a, n) = 1. Then, there are integers u, v such that 1 = ua + vn. We remark that u, v are
found by the Euclidean algorithm. Note that this implies that ua ≡ 1 (mod n). Thus, if x
solves ax + b = 0 in Z/nZ then x solves the equation uax + ub = 0 (mod n), that is x + ub = 0
and so x = −ub in Z/nZ. Conversely, if x = −ub in Z/nZ where ua = 1 in Z/nZ then ax =
a(−ub) = −aub = −b in Z/nZ.
We summarize: if (a, n) = 1 then the equation
ax + b = 0 (mod n)
has a unique solution x = −ub, where u is such that ua = 1 (mod n).
Here is a numerical example: let us solve 12x + 3 = 0 (mod 17). First 17 = 12 + 5, 12 =
2 ∗ 5 + 2, 5 = 2 ∗ 2 + 1, so (12, 17) = 1. Moreover, 1 = 5 − 2 ∗ 2 = 5 − 2 ∗ (12 − 2 ∗ 5) = 5 ∗ 5 − 2 ∗ 12 =
5 ∗ (17 − 12) − 2 ∗ 12 = 5 ∗ 17 − 7 ∗ 12. We see that −7 ∗ 12 = 1 (mod 17), so u = −7. Thus, the solution
is x = −ub = 7 ∗ 3 = 21 = 4 (mod 17).
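The recipe above (find u with ua + vn = 1, then take x = −ub) can be sketched in Python. The function names `extended_gcd` and `solve_linear_congruence` are chosen here for illustration; the first is a standard iterative form of the extended Euclidean algorithm, which may organize the back-substitution differently from the hand computation in the text.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with g = gcd(a, b) and g == u*a + v*b (a, b positive)."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v

def solve_linear_congruence(a, b, n):
    """Solve a*x + b = 0 in Z/nZ, assuming gcd(a, n) = 1."""
    g, u, _ = extended_gcd(a, n)
    if g != 1:
        raise ValueError("a and n are not relatively prime")
    return (-u * b) % n

g, u, v = extended_gcd(12, 17)
print(g, u, v, u * 12 + v * 17)
x = solve_linear_congruence(12, 3, 17)
print(x, (12 * x + 3) % 17)
```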
⁸ For more see http://mathworld.wolfram.com/CarmichaelNumber.html
12.2.2. Quadratic equations. Consider the equation $ax^2 + bx + c = 0$ in Z/nZ and assume n is a
prime greater than 2. In that case, assuming that $a \ne 0$ modulo n, there is an element $(2a)^{-1}$. One can prove that the equation has a solution if and only if $b^2 - 4ac$ is a square in Z/nZ (which
may or may not be the case). In case it is a square, the solutions of this equation are given by
the usual formula:
$$(2a)^{-1}\left(-b \pm \sqrt{b^2 - 4ac}\right).$$
For example, the equation $x^2 + x + 1 = 0$ has no solution in Z/5Z because the discriminant $b^2 - 4ac$
is in this case $1^2 - 4 = -3 = 2$ in Z/5Z and 2 can be checked not to be a square in Z/5Z (one just tries: $0^2 = 0$, $1^2 = 1$, $2^2 = 4$, $3^2 = 4$, $4^2 = 1$ in Z/5Z). On the other hand, $x^2 + x + 1 = 0$ can be
solved in Z/7Z. The solutions are $4(-1 \pm \sqrt{-3}) = -4 \pm 4\sqrt{4} = -4 \pm 8 = -4 \pm 1 = \{2, 4\}$.
When n is not prime, we shall not study the problem in this course, beyond remarking that
one can proceed by trying all possibilities if n is small and that the number of solutions can be
very large. For example, consider the equation $x^3 - x = 0$ in Z/8Z. We can verify that its solutions
are 0, 1, 3, 5, 7. There are 5 solutions but the equation has degree three. We shall later see that
in any field a polynomial equation of degree n has at most n roots.
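Both approaches mentioned above, the quadratic formula modulo an odd prime and simply trying all residues, are easy to experiment with. The following Python sketch uses names invented for this illustration (`roots_mod`, `quadratic_roots_mod_p`); square roots of the discriminant are found by brute force, which is only sensible for small moduli.

```python
def roots_mod(coeffs, n):
    """All x in {0, ..., n-1} with f(x) = 0 in Z/nZ, where coeffs[i] is
    the coefficient of x**i. Found by trying every residue."""
    def value(x):
        return sum(c * x**i for i, c in enumerate(coeffs)) % n
    return [x for x in range(n) if value(x) == 0]

def quadratic_roots_mod_p(a, b, c, p):
    """Roots of a*x^2 + b*x + c in Z/pZ via (2a)^(-1) (-b +- sqrt(disc)),
    assuming p is an odd prime and a is nonzero modulo p."""
    disc = (b * b - 4 * a * c) % p
    inv_2a = pow(2 * a, -1, p)
    # Every s with s*s = disc plays the role of a square root of the discriminant.
    roots = {(inv_2a * (-b + s)) % p for s in range(p) if (s * s) % p == disc}
    return sorted(roots)

print(quadratic_roots_mod_p(1, 1, 1, 5))   # discriminant 2 is not a square mod 5
print(quadratic_roots_mod_p(1, 1, 1, 7))
print(roots_mod([0, -1, 0, 1], 8))         # x^3 - x in Z/8Z
```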
12.3. Public key cryptography; RSA method.⁹ We cannot go too far here into the practical aspects of cryptography. Suffice it to say that in many cryptographic applications two parties
X and Y wish to exchange a secret. Given any large integer n, that secret can be represented as
a number modulo n, and we leave it to the reader’s imagination to devise methods for that. The
method proceeds as follows:
X chooses two large primes p < q.
X calculates n = pq.
X calculates k = (p − 1)(q − 1).
X chooses an integer d such that (d, k) = 1.
X finds e such that ed ≡ 1 (mod k).
X publishes for anyone to see the data e, n. This is called the public key.
The rest of the data p, q, k, d is kept secret. In fact, p, q, k can be destroyed altogether and only d
be kept, and kept secret. This is called the private key.
Y, wishing to send a secret, writes it as a number b modulo n, which is also relatively prime to n,
and sends $b^e \pmod n$ to X, allowing anyone interested to see that message. The point is that
it is believed to be very difficult to find what b is, even when one
knows $b^e$, e and n (this is known as the RSA problem; it is related to, but distinct from, the discrete log problem of finding an exponent). Thus, someone seeing Y’s message cannot find the secret b from it.
X, upon receiving Y’s message $b^e$, calculates $(b^e)^d$.
⁹ The RSA method described above is named after Ron Rivest, Adi Shamir and Len Adleman, who
discovered it in 1977.
Lemma 12.11. We have $b^{ed} \equiv b \pmod n$.
Proof. We need to show that $b^{ed} \equiv b \pmod p$ and $b^{ed} \equiv b \pmod q$. Then $p \mid (b^{ed} - b)$ and $q \mid (b^{ed} - b)$
and so (using that p, q are distinct primes) $n = pq \mid (b^{ed} - b)$. The argument being symmetric, we just show $b^{ed} \equiv b \pmod p$. Since ed ≡ 1 (mod k), we can write ed = 1 + vk for some integer v. We have modulo p
$$b^{ed} = b^{1 + vk} = b \cdot ((b^{p-1})^{q-1})^v = b \cdot (1^{q-1})^v \ \text{(Fermat’s little theorem)} = b.$$
We have shown that X can retrieve Y’s secret.
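The steps of the method, together with the lemma, can be traced in a few lines of Python. This is a toy sketch for illustration only: the function names `make_keys`, `encrypt` and `decrypt` are invented here, the primes are far too small for real use, and real implementations add padding and other safeguards not discussed in these notes.

```python
from math import gcd

def make_keys(p, q, d):
    """X's part: from primes p, q and a chosen d with (d, k) = 1,
    return the public key (e, n) and the private key d."""
    n = p * q
    k = (p - 1) * (q - 1)
    if gcd(d, k) != 1:
        raise ValueError("d must be relatively prime to k")
    e = pow(d, -1, k)            # e with e*d = 1 (mod k); needs Python 3.8+
    return (e, n), d

def encrypt(b, public_key):
    """Y's part: send b**e modulo n."""
    e, n = public_key
    return pow(b, e, n)

def decrypt(message, d, n):
    """X's part: recover b as (b**e)**d modulo n."""
    return pow(message, d, n)

public_key, d = make_keys(10007, 10009, 10001)
e, n = public_key
secret = 3
message = encrypt(secret, public_key)
print(e, n)
print(decrypt(message, d, n))    # equals the secret, by Lemma 12.11
```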
Here is a numerical example:
p = 10007, q = 10009;
n = p*q = 100160063
k = (p-1)*(q-1) = 100140048
d = 10001
e = 88695185
b = 3
b^e = 33265563 Mod n
33265563^10001 = 3 Mod n.
Part 4. Polynomials and their arithmetic
13. The ring of polynomials
Let R be a commutative ring. A good example to keep in mind is R = Z or R = C, but our
discussion allows any commutative ring. We define the ring of polynomials over R as
$$R[x] = \{a_n x^n + \cdots + a_1 x + a_0 : a_i \in R\}.$$
In the definition n is any nonnegative integer. Note that we allow some, or even all, coefficients
to be zero. The zero polynomial 0 is the choice n = 0 and $a_0 = 0$. We define addition as
(assume n ≥ m)
$$(a_n x^n + \cdots + a_1 x + a_0) + (b_m x^m + \cdots + b_1 x + b_0) = a_n x^n + \cdots + (a_m + b_m)x^m + \cdots + (a_1 + b_1)x + (a_0 + b_0).$$
We also define multiplication by
$$(a_n x^n + \cdots + a_1 x + a_0)(b_m x^m + \cdots + b_1 x + b_0) = c_{n+m} x^{n+m} + \cdots + c_1 x + c_0,$$
where
$$c_i = a_0 b_i + a_1 b_{i-1} + \cdots + a_{i-1} b_1 + a_i b_0.$$
Note that in the formula for $c_i$ it is entirely possible that some $a_j$ or $b_j$ are not defined; this
happens if j > n or j > m, respectively. In this case we understand $a_j$, or $b_j$, as zero.
Example 13.1. Take R = Z. Then
$$(2x^2 + x - 2) + (x^3 + x - 1) = x^3 + 2x^2 + 2x - 3, \qquad (2x^2 + x - 2)(x^3 + x - 1) = 2x^5 + x^4 - x^2 - 3x + 2.$$
A polynomial $f(x) = a_n x^n + \cdots + a_1 x + a_0$ is called monic if $a_n = 1$. It is said to have degree n
if $a_n \ne 0$. If f has degree 0, that is f(x) = a, a ∈ R, a ≠ 0, then f is called a constant polynomial.
The degree of the zero polynomial is not defined.
Proposition 13.2. With the operations defined above R[x] is a commutative ring, with zero being
the zero polynomial and 1 being the constant polynomial 1. The additive inverse of $a_n x^n + \cdots + a_1 x + a_0$ is $-a_n x^n - \cdots - a_1 x - a_0$.
Since the proof is straightforward we leave it as an exercise.
Proposition 13.3. If R is an integral domain then R[x] is an integral domain. If f(x), g(x) ∈
R[x] are nonzero polynomials,
deg(f(x)g(x)) = deg(f(x)) + deg(g(x)).
Proof. Say deg(f(x)) = n, deg(g(x)) = m. Then by definition $f(x) = a_n x^n + \cdots + a_1 x + a_0$ with $a_n \ne 0$ and $g(x) = b_m x^m + \cdots + b_1 x + b_0$ with $b_m \ne 0$. Then $f(x)g(x) = a_n b_m x^{n+m} + (a_n b_{m-1} + a_{n-1} b_m)x^{n+m-1} + \cdots$. Since R is an integral domain, $a_n b_m \ne 0$ and so $f(x)g(x) \ne 0$
and deg(f(x)g(x)) = n + m.
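The definitions of addition and multiplication translate directly into code once a polynomial is stored as its list of coefficients. In the sketch below, which is only an illustration, a polynomial over Z is a nonempty Python list with `f[i]` the coefficient of x^i; the names `poly_add` and `poly_mul` are chosen here and are not part of the notes.

```python
def poly_add(f, g):
    """Add two polynomials given as coefficient lists (lowest degree first)."""
    length = max(len(f), len(g))
    # Missing coefficients are understood as zero.
    f = f + [0] * (length - len(f))
    g = g + [0] * (length - len(g))
    return [a + b for a, b in zip(f, g)]

def poly_mul(f, g):
    """Multiply two nonempty coefficient lists using c_i = sum of a_j * b_(i-j)."""
    product = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            product[i + j] += a * b
    return product

f = [-2, 1, 2]        # 2x^2 + x - 2
g = [-1, 1, 0, 1]     # x^3 + x - 1
print(poly_add(f, g))   # coefficients of x^3 + 2x^2 + 2x - 3
print(poly_mul(f, g))   # coefficients of 2x^5 + x^4 - x^2 - 3x + 2
```

The product list has length (n + 1) + (m + 1) − 1 = n + m + 1, and over an integral domain its last entry $a_n b_m$ is nonzero, which is the degree formula of Proposition 13.3.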
14. Division with residue
Let F be a field. We have defined the ring of polynomials F[x]; it is an integral domain (but is
never a field; for example x does not have an inverse with respect to multiplication).
Theorem 14.1. Let f(x), g(x) be two polynomials in F[x], g(x) ≠ 0. Then, there exist unique
polynomials q(x), r(x) in F[x] such that
f(x) = q(x)g(x) + r(x), r(x) = 0 or deg(r(x)) < deg g(x).
Proof. We first show the existence and later the uniqueness. Consider the set
S = {f(x) − q(x)g(x) : q(x) ∈ F[x]}.
If 0 ∈ S then there is a q(x) such that f(x) = q(x)g(x) and we take r(x) = 0. Else, choose an
element r(x) in S of minimal degree. Since r(x) is in S we can write r(x) = f(x) − q(x)g(x) for
some q(x).
Claim. deg(r(x)) < deg(g(x)).
Let us write $r(x) = r_n x^n + \cdots + r_1 x + r_0$ and $g(x) = g_m x^m + \cdots + g_1 x + g_0$, with $r_n \ne 0$, $g_m \ne 0$.
Assume, by contradiction, that n ≥ m.
Then
$$r_1(x) = r(x) - r_n g_m^{-1} x^{n-m} g(x) = (r_{n-1} - r_n g_m^{-1} g_{m-1})x^{n-1} + \cdots$$
has degree smaller than r(x).
On the other hand,
$$r_1(x) = r(x) - r_n g_m^{-1} x^{n-m} g(x) = f(x) - q(x)g(x) - r_n g_m^{-1} x^{n-m} g(x) = f(x) - (q(x) + r_n g_m^{-1} x^{n-m})g(x)$$
shows
that $r_1(x) \in S$. Contradiction. We have therefore established the existence of q(x), r(x) such
that f(x) = q(x)g(x) + r(x), r(x) = 0 or deg(r(x)) < deg g(x).
We now prove uniqueness. Suppose that also
$f(x) = q_1(x)g(x) + r_1(x)$, $r_1(x) = 0$ or $\deg(r_1(x)) < \deg g(x)$.
We need to show that $q(x) = q_1(x)$, $r(x) = r_1(x)$. We have
$$(q(x) - q_1(x))g(x) = r_1(x) - r(x).$$
The right hand side is either zero or has degree less than that of g(x). If it is zero then, since F[x] is
an integral domain, we also have $q(x) = q_1(x)$. If $r(x) \ne r_1(x)$ then also $q(x) \ne q_1(x)$, but
then the degree of the left hand side is $\deg(q(x) - q_1(x)) + \deg(g(x)) \ge \deg(g(x))$ and we get a
contradiction.
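The degree-lowering step used in the existence proof (subtract $r_n g_m^{-1} x^{n-m} g(x)$ as long as the degree of the remainder is at least that of g) is exactly the long division procedure. Here is a minimal Python sketch over the field Q, using exact fractions; the name `poly_divmod` and the coefficient-list convention (lowest degree first, the empty list standing for the zero polynomial remainder) are assumptions made for this illustration.

```python
from fractions import Fraction

def poly_divmod(f, g):
    """Divide f by g in Q[x]. Polynomials are coefficient lists, lowest degree
    first. Returns (q, r) with f = q*g + r and r == [] or deg r < deg g."""
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    while g and g[-1] == 0:          # drop zero leading coefficients of g
        g.pop()
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    r = f[:]
    while r and r[-1] == 0:
        r.pop()
    while len(r) >= len(g):
        shift = len(r) - len(g)      # this is n - m in the proof
        factor = r[-1] / g[-1]       # this is r_n * g_m^(-1)
        q[shift] = factor
        for i, c in enumerate(g):
            r[shift + i] -= factor * c
        while r and r[-1] == 0:      # the leading term cancels; drop it
            r.pop()
    return q, r

# x^3 + x - 1 divided by x - 2 gives quotient x^2 + 2x + 5 and remainder 9.
q, r = poly_divmod([-1, 1, 0, 1], [-2, 1])
print(q, r)
```

Division by the leading coefficient of g is where the field assumption is used; over Z the same loop would fail as soon as that coefficient is not invertible.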
15. Arithmetic in F[x]
In this section F is a field. We denote by F× the set of nonzero elements of F.
15.1. Some remarks about divisibility in a commutative ring T. The definitions we made
in § 9 can be made in general and the same basic properties hold. Let T be a commutative ring
and a, b ∈ T. We say that a divides b, written a | b, if b = ac for some c ∈ T. We have the following properties:
(1) a | b ⇒ a | −b.
(2) a | b ⇒ a | bd for any d ∈ T.
(3) a | b, a | d ⇒ a | (b ± d).
In particular, these definitions and properties hold for the ring of polynomials R[x], where R is a
commutative ring.
15.2. GCD of polynomials.
Deﬁnition 15.1. Let f (x), g (x) ∈ F[x], not both zero. The greatest common divisor of f (x)
and g (x), denoted gcd(f (x), g (x)) or just (f (x), g (x)), is the monic polynomial of largest degree
dividing both f (x) and g (x). (We shall see below that there is a unique such polynomial.)
Theorem 15.2. Let f (x), g (x) be polynomials, not both zero. The gcd of f (x) and g (x), h(x) =
(f (x), g (x)), is unique and can be expressed as
h(x) = u(x)f (x) + v (x)g (x), u(x), v (x) ∈ F[x].
It is the monic polynomial of minimal degree having such an expression. If t(x) divides both g (x)
and f (x) then t(x) | h(x).
Proof. Consider the following set of monic polynomials
S = {a(x) : a(x) = u(x)f (x) + v (x)g (x) for some u(x), v (x) ∈ F[x], a(x) monic}.
It is a nonempty set because if f (x) ≠ 0, say $f(x) = bx^n + \text{l.o.t.}$ [10] with b ≠ 0, then $b^{-1} f(x) \in S$; if f (x) = 0 then g (x)
is not zero and the same argument can be applied to g (x). Let h(x) be an element of minimal
degree of S . We claim that h(x) divides both f (x) and g (x). Since the situation is symmetric,
we just prove h(x) | f (x). Suppose not; then we can write f (x) = q (x)h(x) + r(x), where r(x) is a
nonzero polynomial of degree smaller than that of h(x). Writing h(x) = u(x)f (x) + v (x)g (x), we get r(x) = f (x) − q (x)(u(x)f (x) + v (x)g (x)) =
(1 − q (x)u(x))f (x) − q (x)v (x)g (x) and so, if we let r1 (x) be r(x) divided by its leading coefficient,
we see that r1 (x) ∈ S and has degree smaller than that of h(x), which is a contradiction.
By construction, h(x) is the monic polynomial of minimal degree having such an expression.
If t(x) divides both g (x) and f (x) then t(x) | (u(x)f (x) + v (x)g (x)) = h(x). Therefore, h(x) is
a monic polynomial of the largest possible degree dividing both f (x), g (x). Suppose that h1 (x)
is another monic polynomial dividing f (x) and g (x) having the largest possible degree, i.e.,
the degree of h(x). Then, we have h(x) = h1 (x)b(x), by what we just proved. Since both
polynomials have the same degree, b(x) must be a constant polynomial, and then, since both are
monic, b(x) = 1. We have shown the gcd is unique.
[10] l.o.t. = lower order terms.
15.3. The Euclidean algorithm for polynomials.
Theorem 15.3. Let f (x), g (x) ∈ F[x] be nonzero polynomials, $g(x) = a_n x^n + \text{l.o.t.}$ If g (x) | f (x) then $(f(x), g(x)) = a_n^{-1} g(x)$. Else, define inductively,
$f(x) = q_0(x) g(x) + r_0(x)$, $\deg(r_0) < \deg(g)$
$g(x) = q_1(x) r_0(x) + r_1(x)$, $\deg(r_1) < \deg(r_0)$
$r_0(x) = q_2(x) r_1(x) + r_2(x)$, $\deg(r_2) < \deg(r_1)$
...
$r_{t-2}(x) = q_t(x) r_{t-1}(x) + r_t(x)$, $\deg(r_t) < \deg(r_{t-1})$
$r_{t-1}(x) = q_{t+1}(x) r_t(x)$.
This is indeed possible, and the process always terminates. Letting $r_t(x) = c_m x^m + \cdots + c_0$, we have $(f(x), g(x)) = c_m^{-1} r_t(x)$. Moreover, this algorithm also allows expressing (f (x), g (x)) in the
form u(x)f (x) + v (x)g (x).
Proof. Each step in the process is done based on Theorem 14.1. The process must terminate
because the degrees decrease.
It is easy to see that $r_t \mid r_{t-1}$. Suppose we know $r_t$ divides $r_{t-1}, r_{t-2}, \dots, r_a$; then, since $r_{a-1} = q_{a+1} r_a + r_{a+1}$, we get also that $r_t \mid r_{a-1}$. We conclude that $r_t$ divides $r_0, r_1, \dots, r_t$. Exactly the
same argument gives that $r_t$ divides g (x) and f (x).
Conversely, if a(x) divides f (x) and g (x) then a(x) | (f (x) − q0 (x)g (x)) = r0 (x) and so a(x) | (g (x) − q1 (x)r0 (x)) = r1 (x), etc. We see that a(x) | rt (x) and so rt (x), once divided by its leading coefficient, must be the greatest common divisor of f (x) and g (x).
Example 15.4. (1) $f(x) = x^2 + 1$, $g(x) = x^2 + 2ix - 1$, complex polynomials. We have
$f(x) = 1 \cdot (x^2 + 2ix - 1) + (-2ix + 2)$
$x^2 + 2ix - 1 = \left(\frac{1}{-2i} x - \frac{1}{2}\right)(-2ix + 2)$.
It follows that $(f(x), g(x)) = \frac{1}{-2i}(-2ix + 2) = x + i$. This implies that −i is a root of both polynomials, as one can verify.
(2) Now we choose F = Z/3Z, the field with 3 elements. We take $f(x) = x^3 + 2x + 1$, $g(x) = x^2 + 1$. We then have
$f(x) = x \cdot (x^2 + 1) + (x + 1)$
$x^2 + 1 = (x - 1) \cdot (x + 1) + 2$
$x + 1 = (2x + 2) \cdot 2$.
This implies that (f (x), g (x)) = 1. We have
$2 = (x^2 + 1) - (x - 1) \cdot (x + 1) = g(x) - (x - 1)(f(x) - x g(x)) = (-x + 1) f(x) + (x^2 - x + 1) g(x)$.
And so we find (note that 1 = −2 in F)
$1 = (f(x), g(x)) = (x - 1) f(x) - (x^2 - x + 1) g(x)$.
(3) Consider the polynomials $f(x) = x^3 + 5x^2 + 4x$, $g(x) = x^3 + x^2 - x - 1$ as rational
polynomials. Then
$f(x) = 1 \cdot g(x) + 4x^2 + 5x + 1$
$x^3 + x^2 - x - 1 = \left(\frac{1}{4}x - \frac{1}{16}\right)(4x^2 + 5x + 1) - \frac{15}{16}x - \frac{15}{16}$
$4x^2 + 5x + 1 = \frac{-16}{15}(4x + 1)\left(-\frac{15}{16}x - \frac{15}{16}\right)$.
It follows that (f (x), g (x)) = x + 1.
To express x + 1 as u(x)f (x) + v (x)g (x) we work backwards:
$\frac{-15}{16}(x + 1) = g(x) - \left(\frac{1}{4}x - \frac{1}{16}\right)(4x^2 + 5x + 1)$
$= g(x) - \left(\frac{1}{4}x - \frac{1}{16}\right)(f(x) - g(x))$
$= -\left(\frac{1}{4}x - \frac{1}{16}\right) \cdot f(x) + \left(\frac{15}{16} + \frac{1}{4}x\right) \cdot g(x)$.
Thus,
$x + 1 = (f(x), g(x)) = \left(-\frac{1}{15} + \frac{4}{15}x\right) \cdot f(x) - \left(1 + \frac{4}{15}x\right) \cdot g(x)$.
(4) Now consider the same polynomials over the field F = Z/3Z. We now have:
$f(x) = 1 \cdot g(x) + x^2 + 2x + 1$
$x^3 + x^2 - x - 1 = (x - 1)(x^2 + 2x + 1)$.
Therefore, now we have $(f(x), g(x)) = x^2 + 2x + 1 = (x + 1)^2$.
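The computations in Example 15.4 (2) and (4) can be reproduced mechanically. The following is a minimal Python sketch of the Euclidean algorithm of Theorem 15.3 over Z/pZ that also tracks u(x) and v(x); it assumes coefficient lists with the lowest degree first, p prime, and Python 3.8 or later for the modular inverse pow(c, -1, p). All function names are illustrative.

```python
def trim(a):
    """Drop trailing zeros; the zero polynomial is []."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a


def add(a, b, p):
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return trim([(x + y) % p for x, y in zip(a, b)])


def scale(a, c, p):
    return trim([(c * x) % p for x in a])


def mul(a, b, p):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return trim(out)


def divmod_poly(f, g, p):
    """Division with residue in (Z/pZ)[x]; g must be nonzero."""
    r = trim([c % p for c in f])
    g = trim([c % p for c in g])
    inv = pow(g[-1], -1, p)          # inverse of the leading coefficient
    q = [0] * max(len(r) - len(g) + 1, 0)
    while len(r) >= len(g):
        shift = len(r) - len(g)
        c = (r[-1] * inv) % p
        q[shift] = c
        for i, gc in enumerate(g):
            r[i + shift] = (r[i + shift] - c * gc) % p
        r = trim(r)
    return trim(q), r


def ext_gcd(f, g, p):
    """Return (h, u, v): h monic, h = gcd(f, g) = u*f + v*g; f, g not both zero."""
    r0, r1 = trim([c % p for c in f]), trim([c % p for c in g])
    u0, u1, v0, v1 = [1], [], [], [1]
    while r1:
        q, r = divmod_poly(r0, r1, p)
        r0, r1 = r1, r
        u0, u1 = u1, add(u0, scale(mul(q, u1, p), -1, p), p)
        v0, v1 = v1, add(v0, scale(mul(q, v1, p), -1, p), p)
    c = pow(r0[-1], -1, p)           # normalise the last nonzero remainder
    return scale(r0, c, p), scale(u0, c, p), scale(v0, c, p)


# Example 15.4 (2): f = x^3 + 2x + 1, g = x^2 + 1 over Z/3Z.
h, u, v = ext_gcd([1, 2, 0, 1], [1, 0, 1], 3)
```

Here u and v come out as x + 2 and 2x^2 + x + 2, which are x − 1 and −(x^2 − x + 1) modulo 3, in agreement with the example.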
15.4. Irreducible polynomials and unique factorization. Let F be a field. We define a
relation on polynomials f (x) ∈ F[x]. We say that f (x) ∼ g (x) if there is an element a ∈ F, a ≠ 0,
such that f (x) = ag (x).
Lemma 15.5. This relation is an equivalence relation. Related polynomials are called associates.
Proof. The relation is reflexive because f (x) = 1 · f (x), and symmetric because f (x) = ag (x) implies $g(x) = a^{-1} f(x)$. It is also transitive since f (x) = ag (x) and g (x) = bh(x) imply
f (x) = abh(x) and ab ≠ 0.
A nonconstant polynomial f is called irreducible if g | f implies that g ∼ 1 or g ∼ f. That is as
close as we can get to the notion of a prime. We cannot expect g (x) = 1, or g (x) = f (x), of course.
Note that if g | h and g1 ∼ g then g1 | h.
Proposition 15.6. Let f (x) ∈ F[x] be a nonconstant polynomial. The following are equivalent:
(1) f is irreducible.
(2) if f | gh then f | g or f | h.
Proof. Suppose that f is irreducible, f | gh and f ∤ g. The only monic polynomials dividing f are 1 and $a^{-1} f$, where a is the leading coefficient of f. Therefore, (f, g) = 1 and so, for suitable
polynomials u, v we have uf + vg = 1. Then uf h + vgh = h. Since f divides the left hand side,
it also divides the right hand side, i.e., f | h.
Suppose now that f has the property f | gh ⇒ f | g or f | h. Let g be a divisor of f. Then f = gh
for some h and so f | gh. Therefore, f | g or f | h. Since h | f as well, the situation is symmetric and we
can assume that g | f and f | g. This implies that deg(g) ≤ deg(f) and deg(f) ≤ deg(g), and
so deg(f) = deg(g). But then deg(h) = deg(f) − deg(g) = 0 and so h is a constant polynomial.
We find that f ∼ g.
Example 15.7. Here are some comments on irreducible polynomials.
(1) Every linear polynomial is irreducible.
(2) If $f = ax^2 + bx + c$ is reducible then f = (αx + β)(γx + δ), where α, β, γ, δ ∈ F and α ≠ 0, γ ≠ 0. It follows then that f has a root in F, for example $x = -\alpha^{-1}\beta$.
Conversely, suppose that f has a root α ∈ F; then, as we shall see shortly (Theorem 15.12), f = (x − α)g (x) for some polynomial g (x) ∈ F[x], and degree considerations
dictate that g (x) is also a linear polynomial.
Therefore, for quadratic polynomials one can say that f is reducible if and only if f has
a root in F. If, furthermore, 2 ≠ 0 in the field F, then as we have seen in the assignment,
we know that f has a root if and only if $b^2 - 4ac$ is a square in F. In fact, in that case,
the unique factorization of f is
$ax^2 + bx + c = a\left(x - \frac{-b + \sqrt{b^2 - 4ac}}{2a}\right)\left(x - \frac{-b - \sqrt{b^2 - 4ac}}{2a}\right)$.
(3) If f has degree 3 it is still true that f is reducible if and only if f has a root. But if f has
degree 4 or higher this may fail. For example, the polynomial $x^2 - 2$ is irreducible over Q
because $\sqrt{2}$ is irrational. Same for $x^2 - 3$. Thus, for example, the polynomials $(x^2 - 2)^2$, $(x^2 - 2)(x^2 - 3)$, $(x^2 - 3)^2$ are reducible over Q but don't have a root. (Indeed, if α
is a root of $(x^2 - 2)(x^2 - 3)$ then in C we have $(\alpha - \sqrt{2})(\alpha + \sqrt{2})(\alpha - \sqrt{3})(\alpha + \sqrt{3}) = 0$
and so α is $\pm\sqrt{2}$ or $\pm\sqrt{3}$ and, in any case, is not rational.)
(4) The property of f being irreducible depends on the field. It is not an absolute property.
For example, $x^2 - 2$ is irreducible in Q[x] but reducible in C[x] because there we can
write $x^2 - 2 = (x - \sqrt{2})(x + \sqrt{2})$.
Theorem 15.8. (Unique factorization for polynomials) Let f (x) ∈ F[x] be a nonzero polynomial.
Then there is an a ∈ F× and monic irreducible polynomials $f_1, \dots, f_g$ of positive degree and positive integers $r_1, \dots, r_g$ such that
$f = a f_1^{r_1} \cdots f_g^{r_g}$.
Moreover, if
$f = b h_1^{s_1} \cdots h_t^{s_t}$,
where b ∈ F×, the $h_i$ are monic irreducible polynomials of positive degree and $s_i > 0$, then a = b, g = t,
and after renaming the $h_i$'s we have $h_i = f_i$ for all i and $r_i = s_i$ for all i.
Proof. The proof is very similar to the proof for integers. We ﬁrst prove the existence of factorization. Suppose that there is a nonzero polynomial f (x) with no such factorization. Choose then a
nonzero polynomial f (x) of minimal degree for which no such factorization exists. Then f (x) is
not a constant polynomial and is not an irreducible polynomial either, else $f(x) = a_n x^n + \cdots + a_0 = a_n \cdot (a_n^{-1} f(x))$ is a suitable factorization. It follows that f (x) = f1 (x)f2 (x), where each fi (x) has
degree less than that of f (x).
Therefore, each fi (x) has a factorization
$f_1(x) = c_1 a_1(x) \cdots a_m(x)$, $f_2(x) = c_2 b_1(x) \cdots b_n(x)$,
with $c_i$ ∈ F and $a_i$, $b_j$ monic irreducible polynomials. It follows that
f (x) = (c1 c2 )a1 (x) · · · am (x)b1 (x) · · · bn (x),
has also a factorization as claimed. Contradiction. Thus no such f (x) exists and every polynomial
has a factorization as claimed.
We now show the uniqueness of the factorization. Suppose that
f (x) = c1 a1 (x) · · · am (x) = c2 b1 (x) · · · bn (x),
with ci ∈ F and ai , bj monic irreducible polynomials. We prove the result by induction on degree f .
Since each ci is the leading coefficient of f, we have c1 = c2. In particular, the case of deg(f) = 0
holds. Assume we have proved uniqueness for all polynomials of degree ≤ n and deg(f) = n + 1.
Since a1 (x) | c2 b1 (x) · · · bn (x) and a1 (x) is irreducible, it follows that a1 (x) | c2 (which is impossible
because c2 is a constant) or a1 (x) | bi (x) for some i (and in particular we must have n ≥ 1). But
since bi (x) is irreducible it then follows that a1 (x) ∼ bi (x) and so, both polynomials being
monic, a1 (x) = bi (x).
Let us renumber the bi so that a1 = b1 . Then, dividing by a1 (x) we have
c1 a2 (x) · · · am (x) = c2 b2 (x) · · · bn (x).
Induction gives that m = n and, after renumbering the bi , ai (x) = bi (x), i = 2, 3, . . . , n.
Example 15.9. Here are some examples.
(1) f (x) = ax + b, a ≠ 0, has unique factorization $a(x + a^{-1}b)$.
(2) $f(x) = ax^2 + bx + c$ is irreducible if and only if it has no root in F, as we have seen
above. If this is the case,
$f(x) = a(x^2 + a^{-1}bx + a^{-1}c)$
is the unique factorization. Otherwise, f has two roots (possibly equal), say α and β (and if 2 ≠ 0 we
have a formula for them) and
$f(x) = a(x - \alpha)(x - \beta)$
is the unique factorization.
(3) Consider the polynomial $f(x) = (x^2 - 2)(x^2 - 3)$ over the field $K = Q(\sqrt{2})$. We claim
that $x^2 - 3$ is irreducible over K. Indeed, if not, $\sqrt{3} \in K$ and so $\sqrt{3} = a + b\sqrt{2}$ for some
rational numbers a, b. Squaring, we get
$3 = a^2 + 2b^2 + 2ab\sqrt{2}$.
But this implies that $2ab\sqrt{2}$ is rational and so ab = 0. If b = 0 we get that $\sqrt{3} = a$ is
rational, which is a contradiction. If a = 0 we get that $3 = 2b^2$, which is a contradiction,
because of unique factorization of rational numbers: the power of 2 in the left hand
side (i.e., in 3) is 0, while in the right hand side (i.e., in $2b^2$) it is odd, whatever b may be.
Therefore,
$f(x) = (x - \sqrt{2})(x + \sqrt{2})(x^2 - 3)$
is the unique factorization of f over $Q(\sqrt{2})$.
(4) What is the unique factorization of $x^4 + x$ in $F_2[x]$? An obvious factor is x and then
we are left with $x^3 + 1$. We note that x = 1 is a root (2 = 0 now!) and so $x^3 + 1 = (x + 1)(x^2 + x + 1)$ (cf. Theorem 15.12). The polynomial $x^2 + x + 1$ is quadratic and so
is reducible over $F_2$ if and only if it has a root in $F_2$. We try and find that neither x = 0
nor x = 1 is a root. We conclude that the unique factorization is given in this case by
$x^4 + x = x(x + 1)(x^2 + x + 1)$.
(5) Over the complex numbers any nonconstant polynomial f factors as
$f(x) = a_n \prod_{i=1}^{n} (x - z_i)$,
(see Proposition 6.7) and this is precisely its unique factorization.
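Item (4) of the example is small enough to check by machine. The sketch below, in Python and with illustrative function names, multiplies the three claimed factors over the field with 2 elements and searches for roots of the quadratic factor; coefficient lists are assumed to have the lowest degree first.

```python
P = 2  # we work over the field with two elements


def mul(a, b, p=P):
    """Product of two nonzero polynomials with coefficients reduced modulo p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return out


def evaluate(f, x, p=P):
    """Value f(x) modulo p."""
    return sum(c * pow(x, i, p) for i, c in enumerate(f)) % p


def roots(f, p=P):
    """All roots of f in Z/pZ, found by trying every element."""
    return [x for x in range(p) if evaluate(f, x, p) == 0]


x_poly = [0, 1]          # x
x_plus_1 = [1, 1]        # x + 1
quadratic = [1, 1, 1]    # x^2 + x + 1

product_poly = mul(mul(x_poly, x_plus_1), quadratic)   # should be x^4 + x
quadratic_roots = roots(quadratic)                     # should be empty
```

Trying every element is only sensible for tiny fields; it is used here because a quadratic is reducible exactly when it has a root.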
We can now deduce from Theorem 15.8 the analogues of Proposition 10.6 and Corollary 10.7.
The proofs are the same.
Proposition 15.10. Let f, g be nonzero polynomials in F[x]. Then f | g if and only if $f = a p_1^{a_1} \cdots p_m^{a_m}$ and $g = b p_1^{a'_1} \cdots p_m^{a'_m} q_1^{b_1} \cdots q_t^{b_t}$ (products of distinct irreducible monic polynomials $p_i$, $q_j$; a, b
nonzero scalars) with $a'_i \geq a_i$ for all i = 1, . . . , m.
Corollary 15.11. Let $f = a p_1^{a_1} \cdots p_m^{a_m}$, $g = b p_1^{b_1} \cdots p_m^{b_m}$ with $p_i$ distinct irreducible monic polynomials, a, b nonzero scalars and $a_i$, $b_i$ nonnegative integers. (Any two nonzero polynomials
can be written this way.) Then
$\gcd(f, g) = p_1^{\min(a_1, b_1)} \cdots p_m^{\min(a_m, b_m)}$.
15.5. Roots. Let F be a field and let f (x) ∈ F[x] be a nonzero polynomial. Recall that an
element a ∈ F is called a root (or zero, or solution ) of f if f (a) = 0.
Theorem 15.12. Let f (x) ∈ F[x] be a nonzero polynomial.
(1) If f (a) = 0 then f (x) = (x − a)g (x) for a unique polynomial g (x) ∈ F[x]. In particular,
if f is irreducible of degree greater than 1 then f has no roots in F.
(2) Let deg(f) = d; then f has at most d roots.
Proof. Suppose that f (a) = 0 and divide f by x − a with a residue, getting f (x) = g (x)(x −
a) + r(x), where r(x) is either zero or a polynomial of degree less than that of x − a. That is,
in either case, r(x) = r is a constant. Substitute x = a. We get 0 = f (a) = g (a)(a − a) + r = r and
so f (x) = (x − a)g (x).
Consider the factorization of f into irreducible monic polynomials:
$f = A (x - a_1)^{s_1} \cdots (x - a_m)^{s_m} f_1(x)^{r_1} \cdots f_n(x)^{r_n}$,
where the $f_i$ are irreducible polynomials of degree larger than 1, and the $r_i$, $s_i$ are positive. Note
that if f (a) = 0 then, since $f_i(a) \neq 0$ (else $f_i(x) = (x - a) g_i(x)$), we must have $a = a_i$ for
some i. It follows that the number of roots of f, counting multiplicities, is $s_1 + s_2 + \cdots + s_m = \deg((x - a_1)^{s_1} \cdots (x - a_m)^{s_m}) \leq \deg(f) = d$.
A ﬁeld F is called algebraically closed if any nonconstant polynomial f (x) ∈ F[x] has a root in F.
The Fundamental Theorem of Algebra says that the ﬁeld of complex numbers C is algebraically
closed. It is a fact (proven in MATH370) that every ﬁeld is contained in an algebraically closed
ﬁeld.
If F is algebraically closed, then the only irreducible polynomials over F are the linear ones x −
a, a ∈ F. It follows then that
$f(x) = A (x - a_1)^{s_1} \cdots (x - a_m)^{s_m}$,
where A is the leading coefficient of f and $a_1, \dots, a_m$ are the roots (with multiplicities $s_1, \dots, s_m$).
A natural question is, for a given ﬁeld F and a given polynomial f (x), to tell if f has a root in F
or not. This is in general impossible, but we have some partial answers in special cases.
Proposition 15.13. Let $f(x) = a_n x^n + \cdots + a_1 x + a_0$ be a nonconstant polynomial with integer
coefficients. If a = s/t, (s, t) = 1, is a rational root of f then $s \mid a_0$ and $t \mid a_n$.
Proof. We have $a_n (s/t)^n + \cdots + a_1 (s/t) + a_0 = 0$ and so
$a_n s^n + a_{n-1} s^{n-1} t + \cdots + a_1 s t^{n-1} + a_0 t^n = 0$.
Since s divides $a_n s^n + a_{n-1} s^{n-1} t + \cdots + a_1 s t^{n-1}$, it follows that $s \mid a_0 t^n$. Then, since (s, t) = 1, we get that $s \mid a_0$. Similarly, t divides $a_{n-1} s^{n-1} t + \cdots + a_1 s t^{n-1} + a_0 t^n$, so t divides $a_n s^n$. Now (s, t) = 1
implies that $t \mid a_n$.
Example 15.14. Problem: Find the rational roots of the polynomial $x^4 - \frac{7}{2}x^3 + \frac{5}{2}x^2 - \frac{7}{2}x + \frac{3}{2}$. The
roots are the same as for the polynomial $2x^4 - 7x^3 + 5x^2 - 7x + 3$. They are thus of the form s/t,
where s = ±1, ±3, t = ±1, ±2. We have the possibilities ±1, ±1/2, ±3, ±3/2. By checking each
case, we find the roots are 1/2 and 3. We remark that after having found the root 1/2 we can
divide the polynomial $2x^4 - 7x^3 + 5x^2 - 7x + 3$ by x − 1/2, finding $2x^3 - 6x^2 + 2x - 6$, whose roots are the roots of $x^3 - 3x^2 + x - 3$. So, in fact, the only possibilities for additional roots are ±1, ±3.
We saved this way the need to check if ±3/2 are roots.
Here is another example. Is the polynomial $x^3 + 2x^2 + 5$ irreducible over Q? In this case, if it is
reducible then one of the factors would have to have degree 1 (this type of argument only works
for polynomials of degree at most 3; for higher degree, we might have a reducible polynomial with no
linear factor, e.g., $(x^2 + 1)(x^2 + 3)$). Namely, the polynomial would have a rational root. But the
rational roots can only be ±1, ±5 and one verifies those are not roots. Thus, the polynomial is
irreducible.
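Proposition 15.13 turns the search for rational roots into a finite check, which the following Python sketch carries out. It assumes integer coefficients listed lowest degree first with nonzero first and last entries; the names divisors and rational_roots are illustrative.

```python
from fractions import Fraction


def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def rational_roots(coeffs):
    """Rational roots of a_0 + a_1 x + ... + a_n x^n with a_0 != 0 and a_n != 0.

    By Proposition 15.13 every rational root s/t in lowest terms has
    s | a_0 and t | a_n, so only finitely many candidates need testing.
    """
    a0, an = coeffs[0], coeffs[-1]
    candidates = {
        Fraction(sign * s, t)
        for s in divisors(a0)
        for t in divisors(an)
        for sign in (1, -1)
    }

    def value(x):
        return sum(c * x**i for i, c in enumerate(coeffs))

    return sorted(x for x in candidates if value(x) == 0)


# 2x^4 - 7x^3 + 5x^2 - 7x + 3 from Example 15.14.
found = rational_roots([3, -7, 5, -7, 2])
# x^3 + 2x^2 + 5: no rational roots, hence irreducible over Q (degree 3).
none_found = rational_roots([5, 0, 2, 1])
```

Exact arithmetic with Fraction avoids rounding questions when testing whether a candidate is really a root.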
Proposition 15.15. If f (x) ∈ R[x] is a polynomial of odd degree then f has a root in R.
Proof. Since the roots of f are the roots of −f, we may assume that $f(x) = a_n x^n + \cdots + a_1 x + a_0$, $a_i \in R$, $a_n > 0$. An easy estimate shows that there is an N > 0 such that f (N) > 0
and f (−N) < 0. By the intermediate value theorem there is some a, −N ≤ a ≤ N, such
that f (a) = 0.
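The proof also suggests a numerical procedure: find N with f(−N) < 0 < f(N) and bisect. The Python sketch below is one way to do that in floating point. The choice N = 1 + max|a_i|/a_n is a standard bound on the size of the roots and is an assumption made here, not the estimate the notes have in mind; the function name is illustrative.

```python
def real_root_odd_degree(coeffs, steps=200):
    """Approximate a real root of an odd-degree real polynomial by bisection.

    coeffs lists a_0, ..., a_n (lowest degree first) with n odd and a_n != 0.
    """
    n = len(coeffs) - 1
    if n % 2 == 0 or coeffs[-1] == 0:
        raise ValueError("need odd degree and a nonzero leading coefficient")
    if coeffs[-1] < 0:
        coeffs = [-c for c in coeffs]   # f and -f have the same roots

    def f(x):
        acc = 0.0
        for c in reversed(coeffs):      # Horner evaluation
            acc = acc * x + c
        return acc

    # Assumed bound: every root has absolute value below N,
    # so f(-N) < 0 < f(N) once the leading coefficient is positive.
    N = 1 + max(abs(c) for c in coeffs[:-1]) / coeffs[-1]
    lo, hi = -N, N
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


root = real_root_odd_degree([5, 0, 2, 1])   # a real root of x^3 + 2x^2 + 5
```

The bisection keeps f(lo) ≤ 0 < f(hi) at every step, which is the intermediate value theorem made concrete; the result is only an approximation.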
15.6. Eisenstein’s criterion. (Planned)
15.7. Roots of polynomials in Z/pZ. Let p be a prime and let Z/pZ be the field with p
elements whose elements are congruence classes modulo p. By Fermat's little theorem, every
element of (Z/pZ)× is a root of $x^{p-1} - 1$. This gives p − 1 distinct roots of $x^{p-1} - 1$ and so these
must be all the roots and each with multiplicity one. It follows that the roots of $x^p - x$ are
precisely the elements of Z/pZ, again with multiplicity one. That is,
$x^p - x = \prod_{a=0}^{p-1} (x - \bar{a})$.
Proposition 15.16. Let f (x) be any polynomial in Z/pZ[x]. Then f (x) has a root in Z/pZ if
and only if $\gcd(f(x), x^p - x) \neq 1$.
Proof. If f (a) = 0 for some a ∈ Z/pZ then (x − a) | f (x), but also (x − a) | ($x^p - x$). It follows
that $\gcd(f(x), x^p - x) \neq 1$. Conversely, if $h(x) = \gcd(f(x), x^p - x) \neq 1$ then, since $h(x) \mid x^p - x = \prod_{a=0}^{p-1} (x - \bar{a})$, by unique factorization, we must have $h(x) = \prod_{i=1,\dots,n} (x - a_i)$ for some distinct
elements $a_1, \dots, a_n$ of Z/pZ. In particular, each such $a_i$ is a root of f (x).
The straightforward way to check if f (x) has a root in Z/pZ is just to try all possibilities for x.
Suppose that f (x) has a small degree relative to p, say 5, to fix ideas. We still have to try p
numbers, each in its turn, to see if any of them is a root. But p may be very large, much too
large for this method to be feasible. For example, p might be of cryptographic size $\approx 2^{512}$. Even
with a computer doing $10^{10}$ operations per second, checking all these possibilities will take more
than $10^{134}$ years!
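For a small prime, Proposition 15.16 can be compared directly with the try-everything search. The Python sketch below builds $x^p - x$ explicitly, which is only feasible when p is small, and tests whether its gcd with f is nonconstant. It assumes p prime, f nonzero, coefficient lists with the lowest degree first, and Python 3.8 or later for pow(c, -1, p); the function names are illustrative.

```python
def trim(a):
    """Drop trailing zeros; the zero polynomial is []."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_mod(f, g, p):
    """Residue of f on division by the nonzero polynomial g in (Z/pZ)[x]."""
    r = trim([c % p for c in f])
    g = trim([c % p for c in g])
    inv = pow(g[-1], -1, p)
    while len(r) >= len(g):
        shift = len(r) - len(g)
        c = (r[-1] * inv) % p
        for i, gc in enumerate(g):
            r[i + shift] = (r[i + shift] - c * gc) % p
        r = trim(r)
    return r


def poly_gcd(f, g, p):
    """Monic gcd of f and g in (Z/pZ)[x], not both zero."""
    a, b = trim([c % p for c in f]), trim([c % p for c in g])
    while b:
        a, b = b, poly_mod(a, b, p)
    inv = pow(a[-1], -1, p)
    return [(c * inv) % p for c in a]


def has_root_by_gcd(f, p):
    """Criterion of Proposition 15.16: gcd(f, x^p - x) is not 1."""
    xp_minus_x = [0, p - 1] + [0] * (p - 2) + [1]
    return len(poly_gcd(f, xp_minus_x, p)) > 1


def has_root_by_search(f, p):
    """Try every element of Z/pZ in turn."""
    return any(
        sum(c * pow(a, i, p) for i, c in enumerate(f)) % p == 0
        for a in range(p)
    )


# x^2 + 1 has no root modulo 7 but has the root 2 modulo 5.
no_root_mod_7 = has_root_by_gcd([1, 0, 1], 7)
root_mod_5 = has_root_by_gcd([1, 0, 1], 5)
```

Writing out $x^p - x$ costs as much as the search itself, so this sketch only illustrates the statement; it does not address the efficiency question raised in the text.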
Proposition 15.16 suggests a different method: Calculate $\gcd(f(x), x^p - x)$. Note that except
for the first step
$x^p - x = q_0(x) f(x) + r_0(x)$,
all the polynomials involved in the Euclidean algorithm would have very small degrees (smaller
than that of f, for example) and so the Euclidean algorithm will terminate very quickly. The first step,
though, could be very time consuming given what we know at this point. Later we shall see that
it can, in fact, be done quickly.
We have seen that many of the features of arithmetic in Z can be carried out in F[x]. We still
don't have an analogue of passing from Z to Z/nZ in the context of F[x]. This is one of our main
motivations for studying rings in much more detail.
Part 5. Rings
16. Some basic definitions and examples
Recall our deﬁnition of a ring
Deﬁnition 16.1. A ring R is a nonempty set together with two operations, called “addition”
and “multiplication” that are denoted, respectively, by
(x, y) ↦ x + y, (x, y) ↦ xy.
One requires the following axioms to hold:
(1) x + y = y + x, ∀x, y ∈ R. (Commutativity of addition)
(2) (x + y ) + z = x + (y + z ), ∀x, y, z ∈ R. (Associativity of addition)
(3) There exists an element in R, denoted 0, such that 0 + x = x, ∀x ∈ R. (Neutral element
for addition)
(4) ∀x ∈ R, ∃y ∈ R such that x + y = 0. (Inverse with respect to addition)
(5) (xy)z = x(yz), ∀x, y, z ∈ R. (Associativity of multiplication)
(6) There exists an element 1 ∈ R such that 1x = x1 = x, ∀x ∈ R. (Neutral element for
multiplication)
(7) z (x + y ) = zx + zy, (x + y )z = xz + yz, ∀x, y, z ∈ R. (Distributivity)
Recall also that a ring R is called a division ring (or sometimes a skew field) if 1 ≠ 0 in R and
any nonzero element of R has an inverse w.r.t. multiplication. A commutative division ring is
precisely what we call a ﬁeld.
Example 16.2. Z is a commutative ring. It is not a division ring and so is not a ﬁeld. The
rational numbers Q form a ﬁeld. The real numbers R form a ﬁeld. The complex numbers C form
a ﬁeld.
We have also noted some useful formal consequences of the axioms deﬁning a ring:
(1) The element 0 appearing in axiom (3) is unique.
(2) The element y appearing in axiom (4) is unique. We shall denote y by −x.
(3) We have −(−x) = x and −(x + x′) = −x − x′, where, technically, −x − x′ means (−x) +
(−x′).
(4) We have x · 0 = 0, 0 · x = 0.
Here are some further examples. We do not prove the ring axioms hold; this is left as an exercise.
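Example 16.3. Let F be a field and n ≥ 1 an integer. Consider the set of n × n matrices:
$M_n(F) = \left\{ \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix} : a_{ij} \in F \right\}$.
For example:
(1) for n = 1 we just get $(a_{11})$, $a_{11} \in F$;
(2) for n = 2, $\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$;
(3) for n = 3 we get $\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$.
In general we shall write an n × n matrix as $(a_{ij})$, or $(a_{ij})_{i,j=1}^{n}$. The index i is the row index and
the index j is the column index. We then define
$(a_{ij}) + (b_{ij}) = (a_{ij} + b_{ij})$, $(a_{ij})(b_{ij}) = (c_{ij})$, where $c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$.
For example:
(1) for n = 1 we get (a) + (b) = (a + b) and (a)(b) = (ab). Namely, we just get F again!
(2) for n = 2, we have
$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} + \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11} + b_{11} & a_{12} + b_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{pmatrix}$,
and
$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11} b_{11} + a_{12} b_{21} & a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} & a_{21} b_{12} + a_{22} b_{22} \end{pmatrix}$.
Under these definitions Mn (F) is a ring, called the ring of n × n matrices with entries in F, with
identity given by the identity matrix
$I_n = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & 1 \end{pmatrix}$,
and zero given by the zero matrix (the matrix all of whose entries are zero). For n ≥ 2 this is a
noncommutative ring. Indeed, for n = 2, for example, we have
$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$, $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}$.
These are never equal, else 2 = 1 in F, which implies 1 = 0 in F, which is never the case, by
definition.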
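The two products above are easy to recompute. A minimal Python sketch, with matrices as lists of rows and integer entries standing in for elements of F (the name mat_mul is illustrative):

```python
def mat_mul(A, B):
    """Product of two n x n matrices: c_ij = sum over k of a_ik * b_kj."""
    n = len(A)
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


A = [[1, 1],
     [0, 1]]
B = [[1, 0],
     [1, 1]]

AB = mat_mul(A, B)
BA = mat_mul(B, A)
```

Comparing AB with BA entry by entry shows the two products differ in the top-left and bottom-right corners.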
Example 16.4. Let ε be a formal symbol and F a field. The ring of dual numbers, F[ε], is defined
as
F[ε] = {a + bε : a, b ∈ F},
with the following addition and multiplication:
(a + bε) + (c + dε) = a + c + (b + d)ε, (a + bε)(c + dε) = ac + (ad + bc)ε.
Note that ε is a zero divisor: ε ≠ 0 but ε² = 0.
Example 16.5. Let n ≠ ±1 be a square free integer (i.e., if p | n, p prime, then p² ∤ n). Then $\sqrt{n}$
is not a rational number. Indeed, if $\sqrt{n}$ is rational, $\sqrt{n} = s/t$, (s, t) = 1, then $n = s^2/t^2$. Let p be a
prime dividing n, so that p² ∤ n. Then $nt^2 = s^2$ and so $p \mid s^2$. But then p | s. Looking at the power
of p in the unique factorization of both sides, it follows that p | t and thus p | (s, t), a contradiction.
Consider
$Z[\sqrt{n}] = \{a + b\sqrt{n} : a, b \in Z\}$.
This is a subset of C, containing 0 and 1, and is closed under addition and multiplication:
$(a + b\sqrt{n}) + (c + d\sqrt{n}) = a + c + (b + d)\sqrt{n}$,
$(a + b\sqrt{n})(c + d\sqrt{n}) = ac + bdn + (ad + bc)\sqrt{n}$.
We remark that any element of this ring has a unique expression as $a + b\sqrt{n}$. Indeed, if $a + b\sqrt{n} = c + d\sqrt{n}$, either b = d (and then obviously a = c) or $\sqrt{n} = (a - c)/(d - b)$ is a rational number,
which it is not.
Example 16.6. Let R1 , R2 be rings. Then R1 × R2 is a ring with the following operations:
$(a_1, b_1) + (a_2, b_2) = (a_1 + a_2, b_1 + b_2)$, $(a_1, b_1)(a_2, b_2) = (a_1 a_2, b_1 b_2)$.
The zero element is $(0_{R_1}, 0_{R_2})$ and the identity element is $(1_{R_1}, 1_{R_2})$. The ring $R_1 \times R_2$ is called
the direct product of R1 and R2.
Deﬁnition 16.7. Let S ⊂ R be a subset. S is called a subring of R if the following holds:
(1) 0R , 1R belong to S ;
(2) s1 , s2 ∈ S ⇒ s1 ± s2 ∈ S ;
(3) s1 , s2 ∈ S ⇒ s1 s2 ∈ S .
Note that in this case S is a ring in its own right.
Example 16.8. The easiest examples are Z, Q, R being subrings of C. We’ve already seen
examples of subrings of the ring of 2 × 2 matrices in the assignments.
Consider the subset {(r, 0) : r ∈ R} of the ring R × R. It is closed under addition and
multiplication. It is even a ring because (r, 0)(1, 0) = (r, 0) and so (1, 0) serves as an identity
element for this subset. Nonetheless, it is not a subring of R × R, because the identity element
of R × R, which is (1, 1), does not belong to the set {(r, 0) : r ∈ R}.
17. Ideals
Definition 17.1. Let R be a ring. A (two-sided) ideal I of R is a subset of R such that
(1) 0 ∈ I;
(2) if a, b ∈ I then a + b ∈ I;
(3) if a ∈ I, r ∈ R, then ra ∈ I and ar ∈ I.
Remark 17.2. Note that if a ∈ I then −a = −1 · a ∈ I. We shall use the notation I ◁ R to indicate that I is an ideal of R.
Example 17.3. I = {0} and I = R are always ideals. They are called the trivial ideals.
Example 17.4. Suppose that R is a division ring (e.g., a field) and I ◁ R is a nonzero ideal.
Then I = R. Indeed, there is an element a ∈ I such that a ≠ 0. Then $1 = a^{-1} a \in I$ and so for
every r ∈ R we have r = r · 1 ∈ I . That is, I = R. We conclude that a division ring has only the
trivial ideals. (Note also that the argument shows for any ring R that if an ideal I contains an
invertible element of R then I = R.)
Example 17.5. Let R be a commutative ring. Let r ∈ R. The principal ideal (r) is deﬁned as
(r) = {ra : a ∈ R} = {ar : a ∈ R}.
We also denote this ideal by rR or Rr. This is indeed an ideal: First 0 = r · 0 is in (r). Second,
given two elements $ra_1$, $ra_2$ in (r) we have $ra_1 + ra_2 = r(a_1 + a_2) \in (r)$ and for every s ∈ R we
have $s(ra_1) = (sr)a_1 = (rs)a_1 = r(sa_1) \in (r)$ (using commutativity!), $(ra_1)s = r(a_1 s) \in (r)$.
Deﬁnition 17.6. Let R be a commutative ring. If every ideal of R is principal, one calls R a
principal ideal ring.
Theorem 17.7. Z is a principal ideal ring. In fact, the list
(0), (1), (2), (3), (4), . . .
is a complete list of the ideals of Z. (Note that another notation is 0, Z, 2Z, 3Z, 4Z, . . . .)
Proof. We already know these are ideals and we note that for i > 0 the minimal positive number
in the ideal (i) is i. Thus, these ideals are distinct.
Let I be an ideal of Z. If I = {0} then I appears in the list above. Else, there is some nonzero
element a ∈ I . If a < 0 then −a = −1 · a ∈ I and so I has a positive element in it. Choose the
smallest positive element in I and call it i.
First, since i ∈ I so is ia for any a ∈ Z and so (i) ⊂ I . Let b ∈ I . Divide b by i with
residue: b = qi + r, where 0 ≤ r < i. Note that r = b − qi is an element of I , smaller than i. The
only possibility is that r = 0 and so b ∈ (i). Thus, I = (i).
Theorem 17.8. Let F be a field. The ring F[x] is a principal ideal ring. Two ideals (f (x)), (g (x))
are equal if and only if f ∼ g.
Proof. The proof is very similar to the case of Z. Let I be an ideal. If I = {0} then I = (0),
the principal ideal generated by 0. Else, let f (x) ∈ I be a nonzero polynomial whose degree is
minimal among all nonzero elements of I. On the one hand I ⊇ (f (x)). On the other hand,
let g (x) ∈ I and write g (x) = q (x)f (x) + r(x), where r(x) is either zero or of degree smaller than that of f.
But r(x) = g (x) − q (x)f (x) ∈ I. Thus, we must have r(x) = 0 and so g (x) = q (x)f (x) ∈ (f (x)).
That is, I ⊆ (f (x)).
Suppose that (f (x)) ⊃ (g (x)); then g (x) = f (x)h(x) for some polynomial h(x) ∈ F[x]. That
is, f (x) | g (x). Thus, if (f (x)) = (g (x)) then f | g and g | f and so f ∼ g.
If f | g, say g (x) = f (x)h(x), then any multiple of g (x), say g (x)t(x), is equal to f (x)[h(x)t(x)] and
so (g (x)) ⊂ (f (x)). If f ∼ g then f | g and g | f and so, by the argument above, (f (x)) = (g (x)).
Example 17.9. Let F be a field. One can show that all the ideals of F[ε] are {0} = (0), F[ε] = (1)
and (ε) = {bε : b ∈ F}, and so the ring of dual numbers is also a principal ideal ring.
Example 17.10. The ring of polynomials C[x, y] in two variables with complex coefficients is
not a principal ideal ring. We claim that the set of polynomials I = {f (x, y ) : f (0, 0) = 0},
namely, polynomials with zero constant term, is an ideal that is not principal. We leave that as
an exercise.
Example 17.11. Let R1 , R2 be rings with ideals I1 , I2 , respectively. Then I1 × I2 is an ideal
of R1 × R2 .
Example 17.12. Let us consider the ring F[x] and in it the set
$S = \{f(x) : f(x) = a_0 + a_2 x^2 + a_3 x^3 + \cdots\}$
of polynomials with no x term. Note that 0 ∈ S and s1, s2 ∈ S ⇒ s1 + s2 ∈ S. However, S is not
an ideal. We have 1 ∈ S but x = x · 1 ∉ S.
Example 17.13. Consider the ring $Z[\sqrt{5}]$. In this ring we consider
$I = \{5a + b\sqrt{5} : a, b \in Z\}$.
We claim that I is an ideal. Clearly 0 ∈ I and I is closed under addition. To show I is an ideal
it is enough to show that for every x ∈ I and c ∈ Z also cx and $\sqrt{5}x$ are in I. Indeed, then for
any element $c + d\sqrt{5}$ of the ring we have $(c + d\sqrt{5})x = cx + d(\sqrt{5}x)$; $\sqrt{5}x \in I$ so $d\sqrt{5}x \in I$, and
also cx ∈ I, and so $cx + d\sqrt{5}x \in I$.
The verification is easy: $c(5a + b\sqrt{5}) = 5(ac) + (bc)\sqrt{5}$ and so is in I. Also $\sqrt{5}(5a + b\sqrt{5}) = 5b + (5a)\sqrt{5}$ is in I.
Example 17.14. Let R be a ring and I1, I2 two ideals of R. Then
$I_1 + I_2 = \{i_1 + i_2 : i_1 \in I_1, i_2 \in I_2\}$
is an ideal of R. Inductively, the sum of n ideals $I_1 + I_2 + \cdots + I_n$ is an ideal. A particular case
is the following: Let R be a commutative ring and $I_i = r_i R$ a principal ideal. Then
$r_1 R + r_2 R + \cdots + r_n R$
is an ideal of R; we often denote it by $(r_1, r_2, \dots, r_n)$ or $\langle r_1, r_2, \dots, r_n \rangle$ (so in particular a principal
ideal (r) may also be denoted $\langle r \rangle$).
Let us consider the situation of the ring $R = Z[\sqrt{-5}]$ and the ideal $\langle 2, 1 + \sqrt{-5} \rangle$. We know
abstractly that this is an ideal. I claim that this ideal is not principal. In particular, this
shows that this ideal is not $Z[\sqrt{-5}]$ and, more importantly, gives us an example of a ring with
nonprincipal ideals.
Suppose that $\langle 2, 1 + \sqrt{-5} \rangle = \langle a + b\sqrt{-5} \rangle$. It follows that $2 = (a + b\sqrt{-5})(c + d\sqrt{-5})$ and so
that $2 = (a - b\sqrt{-5})(c - d\sqrt{-5})$ (check!). Therefore, by multiplying these two equations, $4 = (a^2 + 5b^2)(c^2 + 5d^2)$. This is an equation in integers and so a ∈ {±1, ±2}, b = 0 and we conclude
that $\langle 2, 1 + \sqrt{-5} \rangle = \langle a \rangle$ is equal to $\langle 1 \rangle$ or $\langle 2 \rangle$. Now, if $\langle 2, 1 + \sqrt{-5} \rangle = \langle 2 \rangle$ this implies that $1 + \sqrt{-5} = 2(c + d\sqrt{-5})$, which is a contradiction. If $\langle 2, 1 + \sqrt{-5} \rangle = \langle 1 \rangle$ then $1 = 2(c_1 + d_1\sqrt{-5}) + (1 + \sqrt{-5})(c_2 + d_2\sqrt{-5}) = (2c_1 + c_2 - 5d_2) + \sqrt{-5}(2d_1 + c_2 + d_2)$. Therefore, $2d_1 + c_2 + d_2 = 0$, that
is, $-2d_1 - c_2 = d_2$ and we get $1 = 2c_1 + c_2 - 5d_2 = 2c_1 + c_2 + 10d_1 + 5c_2 = 2(c_1 + 3c_2 + 5d_1)$.
This is an equation in integers and it implies that 1 is even. Contradiction.
18. Homomorphisms
Let R, S be rings. A function f : R → S is a ring homomorphism if the following holds:
(1) f (1R ) = 1S ;
(2) f (r1 + r2 ) = f (r1 ) + f (r2 );
(3) f (r1 r2 ) = f (r1 )f (r2 ).
Here are some formal consequences (that are nonetheless very useful).
• f (0R ) = 0S . Indeed, f (0R ) = f (0R +0R ) = f (0R )+ f (0R ). Let y = f (0R ) then y = y + y .
Adding −y to both sides we ﬁnd 0S = f (0R ).
• We have f (−r) = −f (r). Indeed: 0S = f (0R ) = f (r + (−r)) = f (r) + f (−r) and
so f (−r) = −f (r) (just because it sums with f (r) to 0S !)
• We have f (r1 − r2 ) = f (r1 ) − f (r2 ), because f (r1 − r2 ) = f (r1 +(−r2 )) (this, by deﬁnition)
and so f (r1 − r2) = f (r1) + f (−r2) = f (r1) − f (r2).
Note, in particular, that f (0R) = 0S is a consequence of axioms (2), (3). On the other hand f (1R) =
1S does not follow from (2), (3) and we therefore include it as an axiom (though not all authors
do that). Here is an example. Consider,
f : R → R × R, f (r) = (r, 0).
This map satisfies f (r1 + r2) = f (r1) + f (r2) and f (r1 r2) = f (r1)f (r2), but f (1) = (1, 0) is not
the identity element of R × R. So this is not a ring homomorphism.
On the other hand, if S ⊂ R is a subring then the inclusion map i : S → R, i(s) = s, is a ring
homomorphism. Note that this explains why in the definition of a subring we insisted on 1R ∈ S.
Proposition 18.1. Let f : R → S be a homomorphism of rings. The image of f is a subring
of S .
Proof. As we saw, f (0R ) = 0S . Also, by deﬁnition f (1R ) = 1S and so 0S , 1S ∈ Im(f ). Let
now s1 , s2 ∈ Im(f ), say si = f (ri ). Then, s1 ± s2 = f (r1 ) ± f (r2 ) = f (r1 ± r2 ) and so s1 ± s2 ∈
Im(f ). Similarly, s1 s2 = f (r1 r2 ) and so s1 s2 ∈ Im(f ).
Deﬁnition 18.2. Let f : R → S be a homomorphism of rings. The kernel of f , Ker(f ), is
deﬁned as follows:
Ker(f ) = {r ∈ R : f (r) = 0}.
Proposition 18.3. Ker(f ) is an ideal of R. The map f is injective if and only if Ker(f ) = {0}.
Proof. First, since f (0R ) = 0S we have 0R ∈ Ker(f ). Suppose that r1 , r2 ∈ Ker(f ) then f (ri ) = 0S
and we ﬁnd that f (r1 + r2 ) = f (r1 ) + f (r2 ) = 0S + 0S = 0S so r1 + r2 ∈ Ker(f ).
Now suppose that r1 ∈ Ker(f ) and r ∈ R is any element. We need to show that rr1 , r1 r ∈
Ker(f ). We calculate f (rr1 ) = f (r)f (r1 ) = f (r)0S = 0S , so rr1 ∈ Ker(f ). Similarly for r1 r.
We have so far proven that Ker(f ) is an ideal. Suppose now that f is injective. Then f (r) = 0S
implies f (r) = f (0R ) and so r = 0R . That is, Ker(f ) = {0R }.
Suppose conversely that Ker(f ) = {0R }. If f (r1 ) = f (r2 ) then 0S = f (r1 ) − f (r2 ) = f (r1 − r2 )
and so r1 − r2 ∈ Ker(f ). Since Ker(f ) = {0R }, we must have r1 − r2 = 0R ; that is, r1 = r2 . We
proved that f is injective.
We now look at some examples:
Example 18.4. Let n ≥ 1 be an integer. Deﬁne a function,
f : Z → Z/nZ,
by $f(a) = \bar{a}$ (the congruence class of a modulo n). Then f is a homomorphism:
(1) $f(1) = \bar{1}$ and $\bar{1}$ is indeed the identity element of Z/nZ;
(2) $f(a + b) = \overline{a + b} = \bar{a} + \bar{b} = f(a) + f(b)$;
(3) $f(ab) = \overline{ab} = \bar{a}\,\bar{b} = f(a)f(b)$.
The kernel of f is $\{a : a \equiv 0 \pmod{n}\} = \{a : n \mid a\} = (n)$.
Example 18.5. Let R, S be any rings and deﬁne
f : R × S → R, f ((r, s)) = r. This is a homomorphism:
(1) f (1R×S ) = f ((1R , 1S )) = 1R ;
(2) f ((r1 , s1 ) + (r2 , s2 )) = f ((r1 + r2 , s1 + s2 )) = r1 + r2 = f ((r1 , s1 )) + f ((r2 , s2 ));
(3) f ((r1 , s1 )(r2 , s2 )) = f ((r1 r2 , s1 s2 )) = r1 r2 = f ((r1 , s1 ))f ((r2 , s2 )).
The kernel of f is {(r, s) : r = 0} = {(0, s) : s ∈ S} = {0} × S.
Example 18.6. Let F be a field and F[ε] the ring of dual numbers. Define
f : F[ε] → F, f(a + bε) = a.
Then f is a homomorphism:
(1) f(1) = 1;
(2) f((a + bε) + (c + dε)) = f(a + c + (b + d)ε) = a + c = f(a + bε) + f(c + dε);
(3) f((a + bε)(c + dε)) = f(ac + (ad + bc)ε) = ac = f(a + bε)f(c + dε).
The kernel of f is {a + bε : a = 0, b ∈ F} = {bε : b ∈ F}. We claim that this is the ideal (ε). On the
one hand bε certainly is in (ε) for any b. That is, Ker(f) ⊆ (ε). On the other hand (c + dε)ε = cε
and that shows (ε) ⊆ Ker(f).
Example 18.7. Let F be a ﬁeld. Let a ∈ F be a ﬁxed element. Deﬁne
α : F[x] → F, α(g (x)) = g (a). Then α is a homomorphism:
(1) α(1) is the value of the constant polynomial 1 at a which is just 1, so α(1) = 1.
(2) We have α(f + g ) = (f + g )(a) = f (a) + g (a) = α(f ) + α(g );
(3) Similarly, α(f g ) = (f g )(a) = f (a)g (a) = α(f )α(g ).
Therefore, α is a homomorphism. It is called the specialization homomorphism or the evaluation
homomorphism. The kernel of α is {f ∈ F[x] : f(a) = 0} and is equal to the principal ideal (x − a).
Indeed: if g (x) ∈ (x − a) then g (x) = (x − a)g1 (x) and so g (a) = (a − a)g1 (a) = 0. Conversely,
if g (a) = 0, Theorem 15.12 says that g (x) = (x − a)g1 (x) for some polynomial g1 (x) and so
g (x) ∈ (x − a).
Example 18.8. Let A be the set of all continuous functions f : [0, 1] → R. Define the sum (resp.
product) of two functions f, g to be the function f + g (resp. fg) whose value at any x is f(x) + g(x)
(resp. f(x)g(x)). That is:
(f + g)(x) = f(x) + g(x), (fg)(x) = f(x)g(x).
This is a ring (in particular, these are operations: the sum and product of continuous functions
are continuous!). Its zero element is the constant function zero and its identity element is the
constant function 1. Let x ∈ [0, 1] be a fixed element. Define
ϕ : A → R, ϕ(f) = f(x).
Then ϕ is a ring homomorphism whose kernel consists of all the functions vanishing at the point x.
18.1. Units. Let R be any ring. The units of R are denoted R× and defined as follows:
R× = {x ∈ R : ∃y ∈ R, xy = yx = 1}.
For example, 1R is always a unit. If R is a ﬁeld then, by deﬁnition, R× = R − {0}.
Lemma 18.9. We have the following properties:
(1) If r_1, r_2 ∈ R× then r_1 r_2 ∈ R×.
(2) Let f : R → S be a homomorphism of rings; then f(R×) ⊆ S×.
Proof. Suppose that r_1, r_2 ∈ R× and r_1 y_1 = y_1 r_1 = 1, r_2 y_2 = y_2 r_2 = 1. Let y = y_2 y_1;
then (r_1 r_2)y = r_1(r_2 y_2)y_1 = r_1 · 1 · y_1 = r_1 y_1 = 1. A similar computation gives y(r_1 r_2) = 1 and so r_1 r_2 ∈ R×.
Let now f : R → S be a homomorphism and r ∈ R× with ry = yr = 1_R. Then f(r)f(y) =
f(ry) = f(1_R) = 1_S and f(y)f(r) = f(yr) = f(1_R) = 1_S. It follows that f(r) ∈ S×.
Example 18.10. We have Z× = {±1}. We have Q× = Q − {0}.
Example 18.11. We have F[ε]× = {a + bε : a ≠ 0}. Indeed, if a ≠ 0 then (a + bε)(a^{-1} − a^{-2}bε) = 1
(where a^{-2} is by definition (a²)^{-1}; it satisfies a^{-2}a = a^{-1}). Conversely, if (a + bε)(c + dε) = 1
then ac = 1 and so a ≠ 0.
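The inverse formula of Example 18.11 can be checked mechanically. The following is a minimal sketch, assuming F = Q (modelled with Python's `Fraction`); the class name `Dual` and its methods are illustrative names, not notation from these notes.

```python
from fractions import Fraction


class Dual:
    """An element a + b*eps of F[eps], where eps**2 = 0 (here F = Q, an assumption)."""

    def __init__(self, a, b=0):
        self.a = Fraction(a)
        self.b = Fraction(b)

    def __mul__(self, other):
        # (a + b eps)(c + d eps) = ac + (ad + bc) eps, because eps^2 = 0.
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

    def is_unit(self):
        # a + b eps is a unit exactly when a != 0.
        return self.a != 0

    def inverse(self):
        if not self.is_unit():
            raise ZeroDivisionError("a + b*eps is invertible only when a != 0")
        # The formula from the text: a^{-1} - a^{-2} b eps.
        return Dual(1 / self.a, -self.b / self.a ** 2)

    def __repr__(self):
        return f"{self.a} + {self.b}*eps"


u = Dual(2, 3)
print(u.inverse(), "|", u * u.inverse())
```

With `u = Dual(2, 3)` the product `u * u.inverse()` equals `Dual(1, 0)`, while `Dual(0, 5)` is reported as a non-unit, in line with the description of F[ε]×.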
Example 18.12. Let n ≠ ±1 be a square free integer. We have
Z[√n]× = {a + b√n : a² − b²n = ±1}.
Indeed, if a² − b²n = ±1 then (a + b√n)(a − b√n) = ±1 and so a + b√n is invertible with
inverse ±(a − b√n). Conversely, if a + b√n is invertible, say (a + b√n)(c + d√n) = 1 (for
some c, d ∈ Z) then ad + bc = 0 and so also (a − b√n)(c − d√n) = 1. We get that
(a + b√n)(a − b√n)(c + d√n)(c − d√n) = 1.
But (a + b√n)(a − b√n) = a² − b²n and (c + d√n)(c − d√n) = c² − d²n are integers. So
(a + b√n)(a − b√n) = a² − b²n = ±1.
Example 18.13. Let F be a field. The units of the ring M_2(F) are the matrices
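$$\left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : ad - bc \neq 0 \right\}.$$
Indeed, suppose that for the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ we have $ad - bc \neq 0$. Consider the matrix $(ad - bc)^{-1}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$ (where by $t\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ we mean $\begin{pmatrix} ta & tb \\ tc & td \end{pmatrix}$. It is equal to $\begin{pmatrix} a & b \\ c & d \end{pmatrix} t$). We claim that this is the inverse. We have
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} (ad - bc)^{-1} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = (ad - bc)^{-1} \begin{pmatrix} ad - bc & 0 \\ 0 & ad - bc \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$
Similarly, one checks that
$$(ad - bc)^{-1} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$
Suppose now that $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is invertible. The expression $ad - bc$ is called the determinant of the
matrix $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and is denoted det(M). One can verify by a laborious but straightforward
calculation that for any two matrices M, N we have det(MN) = det(M) det(N).
If the matrix M has an inverse, say MN = NM = I_2, then
det(MN) = det(M) det(N) = det(I_2) = 1,
and that shows that det(M) ≠ 0. (One can then show that N is necessarily $(ad - bc)^{-1}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$,
but we don't really need this fact now).
19. Quotient rings
Consider a surjective ring homomorphism f : R → S. Given an element s ∈ S let r be an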
Consider a surjective ring homomorphism f : R → S . Given an element s ∈ S let r be an
element of R such that f (r) = s. How unique is r? If a ∈ I := Ker(f ) then f (r + a) =
f (r) + f (a) = f (r) + 0 = f (r). Conversely, if f (r1 ) = s then f (r1 − r) = f (r1 ) − f (r) = s − s = 0
so a := r1 − r ∈ I and r1 = r + a. Let us use the notation
r + I = {r + i : i ∈ I }
(this is called a coset of I). Then we have proven that
f^{-1}(s) = r + I.
Thus, in a sense, we may identify elements of S with cosets of I in R and from this point of view we
may say that the cosets (thought of as being the elements of S) form a ring.
In this section we perform a key construction that eliminates the need for S. Given a ring R
and a two-sided ideal I ◁ R we construct a new ring R/I, whose elements are cosets of I.
Definition 19.1. Let R be a ring and I ◁ R a two-sided ideal. A coset of I is a subset of R of
the form
a + I := {a + i : i ∈ I},
where a is an element of R.
Example 19.2. Suppose that R = Z and I = (n) for some positive integer n. Then a + (n) =
{. . . , a − n, a, a + n, a + 2n, . . . } are precisely the numbers congruent to a modulo n.
Lemma 19.3. We have the following facts:
(1) Every element of R belongs to a coset of I.
(2) Two cosets are either equal or disjoint.
(3) The following are equivalent: (i) a + I = b + I; (ii) a ∈ b + I; (iii) a − b ∈ I.
Proof. The first claim is easy: the element r belongs to the coset r + I, because r = r + 0
and 0 ∈ I.
Suppose that (a + I) ∩ (b + I) ≠ ∅. Then, there is an element of R that can be written as
a + i_1 = b + i_2, for some i_1, i_2 ∈ I. We show that a + I ⊂ b + I; by symmetry we have the opposite inclusion
and so the cosets are equal. An element of a + I has the form a + i for some i ∈ I. We have
a + i = b + (i_2 − i_1) + i = b + (i_2 − i_1 + i). Note that i_2 − i_1 + i ∈ I and so a + i ∈ b + I.
We next prove the equivalence of (i), (ii) and (iii). Clearly (i) implies (ii) because a = a + 0 ∈
a + I = b + I. If (ii) holds then a = b + i for some i ∈ I and so a − b = i ∈ I and (iii) holds. If (iii)
holds then a − b = i for some i ∈ I and so a = a + 0 ∈ a + I, but also a = b + i ∈ b + I. That
is, (a + I) ∩ (b + I) ≠ ∅ and so a + I = b + I.
Theorem 19.4. Let R be a ring and I ◁ R a two-sided ideal. Denote the collection of cosets of I
in R by R/I. Define addition by
(a + I ) + (b + I ) = a + b + I,
and multiplication by
(a + I )(b + I ) = ab + I.
These operations are well deﬁned and make R/I into a ring (a quotient ring) with zero element
0 + I = I and identity element 1 + I .
Proof. First, our definition of the operations makes use of writing a coset as a + I. This way of
writing is not unique and so we should check that our definitions are independent of the choice
of a such that the coset is a + I. Namely, if
a + I = a′ + I, b + I = b′ + I,
we need to check that
a + b + I = a′ + b′ + I, ab + I = a′b′ + I.
Now, (a + b) − (a′ + b′) = (a − a′) + (b − b′). By the lemma above a − a′ ∈ I, b − b′ ∈ I and
so (a + b) − (a′ + b′) ∈ I and so, by the same lemma, a + b + I = a′ + b′ + I. Also ab − a′b′ =
(a − a′)b + a′(b − b′). Now, a − a′ ∈ I, b − b′ ∈ I and so (a − a′)b ∈ I, a′(b − b′) ∈ I and it follows
that ab − a′b′ ∈ I. Therefore, ab + I = a′b′ + I.
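We now verify the ring axioms. It will be convenient to write $\bar{a}$ for a + I. With this notation
we have
$$\bar{a} + \bar{b} = \overline{a + b}, \qquad \bar{a}\,\bar{b} = \overline{ab}.$$
The axioms follow from the definition of the operations and the fact that they hold for R. To
make clear when it is that we use that the axioms hold in R we use $\overset{!}{=}$ at that point.
(1) $\bar{a} + \bar{b} = \overline{a + b} \overset{!}{=} \overline{b + a} = \bar{b} + \bar{a}$.
(2) $\bar{a} + (\bar{b} + \bar{c}) = \bar{a} + \overline{b + c} = \overline{a + (b + c)} \overset{!}{=} \overline{(a + b) + c} = \overline{a + b} + \bar{c} = (\bar{a} + \bar{b}) + \bar{c}$.
(3) We have $\bar{0} + \bar{a} = \overline{0 + a} \overset{!}{=} \bar{a}$. (We remark that $\bar{0} = I$.)
(4) We have $\bar{a} + \overline{-a} = \overline{a + (-a)} \overset{!}{=} \bar{0}$.
(5) $\bar{a}(\bar{b}\,\bar{c}) = \bar{a}\,\overline{bc} = \overline{a(bc)} \overset{!}{=} \overline{(ab)c} = \overline{ab}\,\bar{c} = (\bar{a}\,\bar{b})\,\bar{c}$.
(6) We have $\bar{a}\,\bar{1} = \overline{a \cdot 1} \overset{!}{=} \bar{a}$ and $\bar{1}\,\bar{a} = \overline{1 \cdot a} \overset{!}{=} \bar{a}$.
(7) $(\bar{a} + \bar{b})\bar{c} = \overline{a + b}\,\bar{c} = \overline{(a + b)c} \overset{!}{=} \overline{ac + bc} = \overline{ac} + \overline{bc} = \bar{a}\,\bar{c} + \bar{b}\,\bar{c}$. Also, $\bar{c}(\bar{a} + \bar{b}) = \bar{c}\,\overline{a + b} = \overline{c(a + b)} \overset{!}{=} \overline{ca + cb} = \overline{ca} + \overline{cb} = \bar{c}\,\bar{a} + \bar{c}\,\bar{b}$.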
Proposition 19.5. The natural map,
$$\pi : R \to R/I, \qquad a \mapsto \pi(a) := \bar{a},$$
is a surjective ring homomorphism with kernel I. Thus, every ideal I ◁ R is the kernel of some
ring homomorphism from R to some other ring.
Proof. Note that $1 \mapsto \bar{1}$, which is the identity element of R/I. We have $\pi(a + b) = \overline{a + b} = \bar{a} + \bar{b} = \pi(a) + \pi(b)$. Also, $\pi(ab) = \overline{ab} = \bar{a}\,\bar{b} = \pi(a)\pi(b)$. We have shown that π is a ring homomorphism
and it is clearly surjective.
The kernel of π are the elements a ∈ R such that a + I = 0 + I . By the lemma above this is
the set of elements a such that a − 0 ∈ I , namely, the kernel is precisely I .
Example 19.6. Consider the ring Z. If we take the ideal {0} then Z/{0} can be identified
with Z; the map Z → Z/{0} is a bijective ring homomorphism. Let n > 0. Then the ring Z/(n)
has as elements the cosets a + (n). Two cosets a + (n), b + (n) are equal if and only if a − b ∈ (n),
that is, precisely when n | (a − b). We see that the elements of Z/(n) are just the congruence classes
modulo n and the operations on Z/(n) are then just the operations we defined on congruence
classes.
19.1. The quotient ring F[x]/(f(x)). Let F be a field, f(x) ∈ F[x] a nonconstant polynomial.
Consider the quotient ring F[x]/(f(x)). Suppose that f(x) = x^n + a_{n−1}x^{n−1} + · · · + a_0 is a monic
polynomial of degree n. The following lemma is an analogue of Lemma 12.1.
Lemma 19.7. Every element of F[x]/(f(x)) is of the form $\overline{g(x)} = g(x) + (f(x))$ for a unique
polynomial g(x) which is either zero or of degree less than n.
Proof. Let h(x) be a polynomial. To say that h(x) + (f(x)) = g(x) + (f(x)) is to say
that h(x) = q(x)f(x) + g(x) for some polynomial q(x). The requirement that deg(g) < deg(f) is exactly to say that the
expression
h(x) = q(x)f(x) + g(x)
is dividing h by f with residue. We know this is always possible and in a unique fashion.
Theorem 19.8. Let F be a field, f(x) ∈ F[x] a nonconstant irreducible polynomial of degree n.
Then F[x]/(f(x)) is a field. If F is a finite field of cardinality q then F[x]/(f(x)) is a field of q^n
elements.
Proof. We already know that F[x]/(f(x)) is a commutative ring. We note that $\bar{0} \neq \bar{1}$ because 1 =
1 − 0 ∉ (f) (if 1 were in (f), f would be a constant polynomial). Thus, we only need to show that a nonzero element has an inverse. Let $\overline{g(x)}$ be a nonzero element. That means that f(x) ∤ g(x)
and so that gcd(f, g) = 1 (here is where we use that f is irreducible). Therefore, there are
polynomials u(x), v(x) such that
u(x)f(x) + v(x)g(x) = 1.
Passing to the quotient ring, that means that $\bar{v}\,\bar{g} = \bar{1}$, which is the identity of the quotient ring.
Finally, by the Lemma, every element of F[x]/(f(x)) has a unique representative of the
form a_{n−1}x^{n−1} + · · · + a_1 x + a_0. If F has q elements, we get q^n such polynomials.
Example 19.9. A field with 4 elements. Take the field F to be Z/2Z and consider the
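polynomial x² + x + 1 over that field. Because it is of degree 2 and has no root in Z/2Z it must be
irreducible. Therefore, Z/2Z[x]/(x² + x + 1) is a field K with 4 elements. Let us list its elements:
$$K = \{\bar{0}, \bar{1}, \bar{x}, \overline{x+1}\}.$$
We can give tables of addition and multiplication:
$$\begin{array}{c|cccc}
+ & \bar{0} & \bar{1} & \bar{x} & \overline{x+1} \\ \hline
\bar{0} & \bar{0} & \bar{1} & \bar{x} & \overline{x+1} \\
\bar{1} & \bar{1} & \bar{0} & \overline{x+1} & \bar{x} \\
\bar{x} & \bar{x} & \overline{x+1} & \bar{0} & \bar{1} \\
\overline{x+1} & \overline{x+1} & \bar{x} & \bar{1} & \bar{0}
\end{array}
\qquad
\begin{array}{c|cccc}
\cdot & \bar{0} & \bar{1} & \bar{x} & \overline{x+1} \\ \hline
\bar{0} & \bar{0} & \bar{0} & \bar{0} & \bar{0} \\
\bar{1} & \bar{0} & \bar{1} & \bar{x} & \overline{x+1} \\
\bar{x} & \bar{0} & \bar{x} & \overline{x+1} & \bar{1} \\
\overline{x+1} & \bar{0} & \overline{x+1} & \bar{1} & \bar{x}
\end{array}$$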
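The two tables for the field K with 4 elements can be regenerated by machine. The sketch below is one possible encoding, not the only one: an element c_0 + c_1 x is stored as the pair `(c0, c1)` with entries in Z/2Z, and the names `add`, `mul`, `ELEMENTS`, `NAMES` and `table` are illustrative choices.

```python
# Elements of K = Z/2Z[x]/(x^2 + x + 1), stored as (c0, c1) meaning c0 + c1*x.
ELEMENTS = [(0, 0), (1, 0), (0, 1), (1, 1)]
NAMES = {(0, 0): "0", (1, 0): "1", (0, 1): "x", (1, 1): "x+1"}


def add(u, v):
    """Add coefficientwise modulo 2."""
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)


def mul(u, v):
    """Multiply and reduce using x^2 = x + 1, which holds in K."""
    a, b = u
    c, d = v
    # (a + b x)(c + d x) = ac + (ad + bc) x + bd x^2 = (ac + bd) + (ad + bc + bd) x
    return ((a * c + b * d) % 2, (a * d + b * c + b * d) % 2)


def table(op):
    """Return the operation table as rows of element names, in the order 0, 1, x, x+1."""
    return [[NAMES[op(u, v)] for v in ELEMENTS] for u in ELEMENTS]


for name, op in (("+", add), ("*", mul)):
    print(name)
    for u, row in zip(ELEMENTS, table(op)):
        print(f"{NAMES[u]:>4} | " + " ".join(f"{entry:>4}" for entry in row))
```

Each nonzero row of the multiplication table contains the entry 1, which is the statement that every nonzero element of K has an inverse.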
Example 19.10. A field with 9 elements. Consider the polynomial x² + 1 over Z/3Z. It
is quadratic and has no root in Z/3Z, hence is irreducible over Z/3Z. We conclude that L =
Z/3Z[x]/(x² + 1) is a field with 9 elements. Note that in Z/3Z the element −1 = 2 is not a square. However, in L we have x² = x² − (x² + 1) = −1 and so −1 is a square now; its root
is x (viewed as an element of L). In fact, any quadratic polynomial over Z/3Z has a root in L,
because the discriminant "b² − 4ac" is one of 0, 1, 2 and all those are squares in L.
For example, consider the polynomial t² + t + 2. It has discriminant −7 ≡ −1 (mod 3), which
is not a square in Z/3Z. In the field L we have x² = −1. The solutions of the polynomial are
then (−1 ± √−1)/2 = 2(−1 ± x) = 1 ± 2x.
On the other hand, one can prove that the polynomial t³ + t² + 2 is irreducible over Z/3Z and
stays irreducible over L. In MATH 370 we learn a systematic theory for deciding which polynomials
stay irreducible and which do not.
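The claims of Example 19.10 are small enough to verify by exhaustive search. This is a sketch under an assumed encoding: an element a + bx of L is the pair `(a, b)` with entries in Z/3Z, and `add`, `mul`, `evaluate` and `roots` are hypothetical helper names.

```python
P = 3  # we work over Z/3Z


def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)


def mul(u, v):
    """Multiply in L = Z/3Z[x]/(x^2 + 1), using x^2 = -1."""
    a, b = u
    c, d = v
    return ((a * c - b * d) % P, (a * d + b * c) % P)


# All 9 elements a + b*x of L.
L = [(a, b) for a in range(P) for b in range(P)]


def evaluate(coeffs, t):
    """Horner evaluation at t in L; coeffs are integers, highest degree first."""
    acc = (0, 0)
    for c in coeffs:
        acc = add(mul(acc, t), (c % P, 0))
    return acc


def roots(coeffs):
    """Brute-force list of the roots in L of the polynomial with the given coefficients."""
    return [t for t in L if evaluate(coeffs, t) == (0, 0)]


print(roots([1, 0, 1]))     # t^2 + 1
print(roots([1, 1, 2]))     # t^2 + t + 2
print(roots([1, 1, 0, 2]))  # t^3 + t^2 + 2
```

The roots found for t² + t + 2 are (1, 1) and (1, 2), that is, 1 + x and 1 + 2x, which are the two values 1 ± 2x (note that −2 = 1 in Z/3Z). The cubic has no root in L, consistent with it staying irreducible there, since a cubic is reducible only if it has a root.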
Example 19.11. Fields with 8 and 16 elements. A polynomial of degree 3 is irreducible if
and only if it doesn’t have a root. We can verify that x3 + x + 1 doesn’t have a root in Z/2Z
and conclude that Z/2Z[x]/(x3 + x + 1) is a ﬁeld with 8 elements. Consider the ﬁeld K with 4
elements constructed above. We note that the polynomial t² + t + x̄ is irreducible over K (simply
by substituting for t any of the four elements of K and checking). Thus, we get a field L with 16
elements
L = K[t]/(t² + t + x̄).
19.2. Every polynomial has a root in a bigger field.
Theorem 19.12. Let F be a field and f(x) ∈ F[x] a nonconstant polynomial. There is a field L
containing F and an element α ∈ L such that f(α) = 0.
Proof. If g | f and g(α) = 0 then also f(α) = 0, so we may assume that f is irreducible. Let L =
F[x]/(f(x)). This is a field. We have a natural map F → L, $a \mapsto \bar{a}$. This map is an injective ring
homomorphism and we identify F with its image in L so as to say that L ⊃ F.
Now, suppose that f(x) = a_n x^n + · · · + a_1 x + a_0. To say that f has a root in L is to say that for
some element α ∈ L we have
a_n α^n + · · · + a_1 α + a_0 = 0.
We check that this holds for the element $\alpha = \bar{x}$. Indeed,
$$a_n \bar{x}^n + \dots + a_1 \bar{x} + a_0 = \overline{a_n x^n + \dots + a_1 x + a_0} = \overline{f(x)} = 0_L.$$
Example 19.13. According to this result, −1 has a square root in the field R[x]/(x² + 1). One
can show that R[x]/(x² + 1) ≅ C.
19.3. Roots of polynomials over Z/pZ. We can now continue our discussion, begun in § 15.7,
of the efficient determination of whether a small degree polynomial f(x) over Z/pZ has a root
in Z/pZ. Recall that the only remaining point was whether the Euclidean algorithm step,
x^p − x = q(x)f(x) + r(x),
can be done rapidly. Now we can answer that affirmatively. Note that r(x) + x is exactly the
representative of x^p in the ring F[x]/(f(x)), where F = Z/pZ. This representative can be calculated quickly by the
method we already used for calculating powers. We need to calculate
x, x², x⁴, x⁸, . . .
and then express p in base 2, $p = \sum_i a_i 2^i$, $a_i \in \{0, 1\}$, so that
$$x^p = \prod_{\{i \,:\, a_i \neq 0\}} x^{2^i}.$$
We see that the slowing factor now is how quickly we can carry out multiplication in the ring F[x]/(f(x)). It is
not hard to see that this depends on the degree of f and not on p.
20. The First Isomorphism Theorem
20.1. Isomorphism of rings.
Deﬁnition 20.1. Let R, S be rings. A ring homomorphism f : R → S is called an isomorphism
if f is bijective.
Lemma 20.2. If f : R → S is a ring isomorphism then the inverse function g = f −1 , g : S → R
is also a ring homomorphism, hence an isomorphism. (The inverse function is deﬁned by g (s) =
r, where r is the unique element such that f (r) = s.)
Proof. First, because f (1R ) = 1S we have g (1S ) = 1R . Next, let s1 , s2 ∈ S . We need to
prove g (s1 + s2 ) = g (s1 ) + g (s2 ) and g (s1 s2 ) = g (s1 )g (s2 ). It is enough to prove that
f(g(s_1 + s_2)) = f(g(s_1) + g(s_2)), f(g(s_1 s_2)) = f(g(s_1)g(s_2)),
because f is injective. But f(g(s_1) + g(s_2)) = f(g(s_1)) + f(g(s_2)) = s_1 + s_2 = f(g(s_1 + s_2))
and f(g(s_1)g(s_2)) = f(g(s_1))f(g(s_2)) = s_1 s_2 = f(g(s_1 s_2)).
Deﬁnition 20.3. Let R, S be rings. We say that R and S are isomorphic if there is a ring
isomorphism R → S .
Lemma 20.4. Being isomorphic is an equivalence relation on rings.
Proof. First, the identity function is always a ring homomorphism from R to R, so this relation is
reﬂexive. Secondly, if f : R → S is an isomorphism then g : S → R is an isomorphism, where g is
the inverse function to f . Thus, the relation is symmetric. Now suppose f : R → S and g : S → T
are ring isomorphisms between the rings R, S, T . To show the relation is transitive we need to
prove that g ◦ f : R → T is an isomorphism. Indeed:
(1) (g ◦ f)(1_R) = g(f(1_R)) = g(1_S) = 1_T;
(2) (g ◦ f)(r_1 + r_2) = g(f(r_1 + r_2)) = g(f(r_1) + f(r_2)) = g(f(r_1)) + g(f(r_2)) = (g ◦ f)(r_1) +
(g ◦ f)(r_2);
(3) (g ◦ f)(r_1 r_2) = g(f(r_1 r_2)) = g(f(r_1)f(r_2)) = g(f(r_1))g(f(r_2)) = (g ◦ f)(r_1)(g ◦ f)(r_2).
Moreover, g ◦ f is bijective, being a composition of bijective functions.
We shall denote the fact that R is isomorphic to S by R ≅ S.
20.2. The First Isomorphism Theorem.
Theorem 20.5. Let f : R → S be a surjective homomorphism of rings. Let I = Ker(f); then
there is an isomorphism F : R/I → S, such that the following diagram commutes

         f
    R ───────> S
    │        ↗
  π │      ╱ F
    ↓    ╱
    R/I

where π : R → R/I is the canonical map $g \mapsto \bar{g}$.
Proof. We define a function,
F : R/I → S,
by $F(\bar{g}) = f(g)$. We first prove that this map is well defined. Suppose that $\bar{g} = \bar{g}_1$. We need to
show that f(g) = f(g_1). This holds because $\bar{g} = \bar{g}_1$ means g − g_1 ∈ I = Ker(f). Now:
• $F(1_{R/I}) = F(\overline{1_R}) = f(1_R) = 1_S$;
• $F(\bar{g} + \bar{h}) = F(\overline{g + h}) = f(g + h) = f(g) + f(h) = F(\bar{g}) + F(\bar{h})$;
• $F(\bar{g}\,\bar{h}) = F(\overline{gh}) = f(gh) = f(g) \cdot f(h) = F(\bar{g}) \cdot F(\bar{h})$.
We also have
$(F \circ \pi)(g) = F(\bar{g}) = f(g)$,
so F ◦ π = f. Because of this we have that F is surjective. We next show F is injective. Suppose
that $F(\bar{g}) = 0_S$; then f(g) = 0_S and so g ∈ I. Thus, $\bar{g} = 0_{R/I}$. Hence Ker(F) = {0} and F is injective by Proposition 18.3.
Example 20.6. We consider again the homomorphism Z → Z/nZ. It is a surjective ring homomorphism with kernel (n) and we conclude that
Z/(n) ≅ Z/nZ,
a fact we have noticed somewhat informally before.
Example 20.7. We have R[x]/(x² + 1) ≅ C. To show that, define a ring homomorphism
R[x] → C
by
$$\sum_{j=0}^{n} a_j x^j \mapsto \sum_{j=0}^{n} a_j i^j.$$
This is a well defined function taking 1 to 1. It is easy to verify it is a homomorphism. In fact, recall that C[x] → C, f ↦ f(i), is a homomorphism. Our map is the restriction of the evaluation-at-i homomorphism to the subring R[x] and so is also a
homomorphism. It is also clear that this homomorphism is surjective.
The kernel I definitely contains x² + 1, and so all its multiples. That is, I contains the
ideal (x² + 1). Now, I = (f) for some polynomial f. Because x² + 1 ∈ (f), f | (x² + 1). Since x² + 1 is irreducible over R, either f ∼ 1 or f ∼ x² + 1. If f ∼ 1 we have (f) = R[x] and so any
polynomial is in the kernel, which is clearly not the case (for example, 1 is not in the kernel).
Thus f ∼ x² + 1 and I = (x² + 1); by the first isomorphism theorem we have
R[x]/(x² + 1) ≅ C.
20.3. The Chinese Remainder Theorem.
Theorem 20.8. Let m, n be positive integers such that (m, n) = 1. Then
Z/mnZ ≅ Z/mZ × Z/nZ.
Proof. We define a function
f : Z → Z/mZ × Z/nZ, f(a) = (a (mod m), a (mod n)).
This function is a ring homomorphism:
• f(1) = (1 (mod m), 1 (mod n)) = (1_{Z/mZ}, 1_{Z/nZ});
• f(a + b) = (a + b (mod m), a + b (mod n)) = (a (mod m), a (mod n)) + (b (mod m), b
(mod n)) = f(a) + f(b);
• f(ab) = (ab (mod m), ab (mod n)) = (a (mod m), a (mod n)) · (b (mod m), b (mod n)) =
f(a)f(b).
The kernel of the map is the set {a : m | a, n | a} = {a : mn | a} (using that (m, n) = 1), that is, the
kernel is the principal ideal (mn). That means that the integers 0, 1, . . . , mn − 1 all have different
images in the target. Since the target has mn elements, we conclude that f is surjective. By the
first isomorphism theorem
Z/(mn) ≅ Z/mZ × Z/nZ.
This theorem is very useful. It says that to solve an equation modulo mn, (m, n) = 1, is the same
as solving it modulo m and modulo n. That is, for given integers a0 , . . . , an and an integer A we
have an An + · · · + a1 A + a0 ≡ 0 (mod mn) if and only if we have an An + · · · + a1 A + a0 ≡ 0
(mod m) and an An + · · · + a1 A + a0 ≡ 0 (mod n). Here is an example:
Example 20.9. Solve the equation 5x + 2 = 0 modulo 6.
We consider the equation modulo 2 and get x = 0 (mod 2); we consider it modulo 3 and
get 2x + 2 = 0 (mod 3), that is, x = 2 (mod 3). There is an x ∈ Z such that x (mod 2) = 0, x
(mod 3) = 2 and in fact x is unique modulo 6 (this is the CRT). We can guess that x = 2 will
do in this case, but it raises the general problem of finding the inverse isomorphism to
Z/mnZ → Z/mZ × Z/nZ.
20.3.1. Inverting Z/mnZ → Z/mZ × Z/nZ. Suppose we know how to find elements e_1, e_2 such
that e_1 = 1 (mod m), e_1 = 0 (mod n) and e_2 = 0 (mod m), e_2 = 1 (mod n); then
we would have solved our problem. Indeed, given now two congruence classes a (mod m), b
(mod n) take the element ae_1 + be_2.
Since (m, n) = 1 we may find u, v such that 1 = um + vn. Put
e_1 = 1 − um, e_2 = 1 − vn.
Example 20.10. Solve the equation 56x + 23 = 0 (mod 323).
We have 323 = 17 · 19.
• Solution modulo 17.
We have the equation 5x + 6 = 0 (mod 17). Or x = −6 · 5−1 = 11 · 5−1 . To ﬁnd 5−1
we look for u, v such that 1 = u5 + v 17.
17 = 3 · 5 + 2, 5 = 2 · 2 + 1 so 1 = 5 − 2 · 2 = 5 − 2 · (17 − 3 · 5) = 7 · 5 − 2 · 17 and
so 7 · 5 = 1 (mod 17). We conclude that x = 11 · 7 = 77 = 9 (mod 17).
• Solution modulo 19.
We have the equation −x + 4 = 0 so x = 4 (mod 19) is a solution.
• Finding e1 , e2 .
We have 19 = 17 + 2, 17 = 8 · 2 + 1 so 1 = 17 − 8 · 2 = 9 · 17 − 8 · 19. It follows
that e1 = 1 − 9 ∗ 17 = −152, e2 = 1 + 8 ∗ 19 = 153.
• We conclude that the solution to the equation 56x + 23 = 0 (mod 323) is 9 ∗ e1 + 4 ∗ e2 =
−1368 + 612 = −756 and modulo 323 this is 213.
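The computation of Example 20.10 can be retraced in a few lines. This is a minimal sketch: the function names `extended_gcd`, `solve_linear` and `crt_pair` are illustrative, and the values of u, v returned by the extended Euclidean algorithm may differ from those found by hand in the example, although the final answer modulo mn is the same.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with g = gcd(a, b) and g == u*a + v*b."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v


def solve_linear(c, d, m):
    """Solve c*x + d = 0 (mod m), assuming gcd(c, m) = 1."""
    g, u, _ = extended_gcd(c % m, m)
    if g != 1:
        raise ValueError("c must be invertible modulo m")
    return (-d * u) % m  # u is an inverse of c modulo m


def crt_pair(a, m, b, n):
    """Return the x modulo m*n with x = a (mod m) and x = b (mod n), for gcd(m, n) = 1."""
    g, u, v = extended_gcd(m, n)
    if g != 1:
        raise ValueError("moduli must be relatively prime")
    e1 = 1 - u * m  # e1 = 1 (mod m), e1 = 0 (mod n)
    e2 = 1 - v * n  # e2 = 0 (mod m), e2 = 1 (mod n)
    return (a * e1 + b * e2) % (m * n)


x17 = solve_linear(56, 23, 17)
x19 = solve_linear(56, 23, 19)
x = crt_pair(x17, 17, x19, 19)
print(x17, x19, x, (56 * x + 23) % 323)
```

The two local solutions come out as 9 and 4, and gluing them gives 213, with 56 · 213 + 23 divisible by 323.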
Example 20.11. Solve the equation x² = 118 (mod 323).
As before we reduce to solving x² = 118 = 16 (mod 17) and x² = 118 = 4 (mod 19). There are two solutions in each case, given by
x = ±4 (mod 17) and x = ±2 (mod 19). We conclude that overall we have 4 solutions given by
±4(−152) ± 2(153) (mod 323).
One can then reduce those numbers to standard representatives and find that 21, 55, 268, 302
(mod 323) are the four solutions.
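The four solutions of Example 20.11 can be cross-checked against a direct search. The sketch below reuses the numbers e_1 = −152 and e_2 = 153 found in Example 20.10; the variable names are illustrative.

```python
m, n = 17, 19
e1, e2 = -152, 153  # e1 = (1, 0) and e2 = (0, 1) in Z/17Z x Z/19Z

# Square roots of 118 modulo each prime, found by brute force.
roots_m = [x for x in range(m) if (x * x - 118) % m == 0]
roots_n = [x for x in range(n) if (x * x - 118) % n == 0]

# Glue every pair of local roots into a root modulo 323.
combined = sorted((a * e1 + b * e2) % (m * n) for a in roots_m for b in roots_n)

# Independent check: search all residues modulo 323 directly.
brute = [x for x in range(m * n) if (x * x - 118) % (m * n) == 0]

print(roots_m, roots_n, combined, brute)
```

Both lists agree with the representatives 21, 55, 268, 302 stated in the example.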
One can be more precise about the connection between solutions mod mn and solutions mod m
and mod n. First, let us generalize the Chinese Remainder Theorem:
Theorem 20.12. Let m_1, . . . , m_k be relatively prime nonzero integers (that is, (m_i, m_j) = 1 for
i ≠ j). Then there is an isomorphism
Z/m_1 m_2 . . . m_k Z ≅ Z/m_1 Z × Z/m_2 Z × · · · × Z/m_k Z,
given by
a (mod m_1 m_2 . . . m_k) ↦ (a (mod m_1), a (mod m_2), . . . , a (mod m_k)).
The theorem is not hard to prove by induction on k. The main case, k = 2, is the one we
proved above. Now, let g (x) = an xn + . . . a1 x + a0 be a polynomial with integer coeﬃcients. Let
S be the solutions of g in Z/m1 m2 . . . mk Z and Si the solutions of g in Z/mi Z. Then we have a
bijection
S ↔ S 1 × S 2 × · · · × Sk , given by a (mod m1 m2 . . . mk ) → (a (mod m1 ), a (mod m2 ), . . . , a (mod mk )). Indeed, the expression g (a) (mod m1 m2 . . . mk ) is mapped to
(g (a) (mod m1 ), g (a) (mod m2 ), . . . , g (a) (mod mk )) and g (a) ≡ 0 (mod m1 m2 . . . mk ) if and
only if for every i we have g (a) ≡ 0 (mod mi ). This shows that we have a map
S → S 1 × S 2 × · · · × Sk .
But, conversely, given solutions ri to g (x) mod mi , there is a unique r (mod m1 m2 · · · mk ) such
that r ≡ ri (mod mi ) and g (r) ≡ 0 (mod m1 m2 · · · mk ) because it is true modulo every mi .
In particular, we may draw the following conclusion.
Corollary 20.13. Let m1 , . . . , mk be relatively prime integers. Let s be the number of solutions
to the equation an xn + . . . a1 x + a0 = 0 (mod m1 m2 · · · mk ) and let si be the number of solutions
modulo mi . Then
s = s1 s2 · · · sk .
Example 20.14. The equation x² = 1 has 8 solutions modulo 2 · 3 · 5 · 7 = 210, because it has
one solution mod 2, and 2 solutions mod each of 3, 5 and 7.
Example 20.15. The equation 34x = 85 (mod 17 · 19) has 17 solutions, because it has 17
solutions modulo 17 (it is then the equation 0 · x = 0 (mod 17)) and has a unique solution
modulo 19 (it is then the equation 4x = 10 (mod 19) and x = 12 is the unique solution).
On the other hand, the equation 34x = 5 (mod 17 · 19) has no solutions, because it has no
solutions modulo 17.
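The counts in Examples 20.14 and 20.15 are small enough to confirm by exhaustive search. A minimal sketch follows; `count_solutions` is a hypothetical helper that simply tries every residue, which is only sensible for small moduli.

```python
from math import prod


def count_solutions(coeffs, modulus):
    """Count x in {0, ..., modulus - 1} with a_n x^n + ... + a_0 = 0 (mod modulus).

    coeffs lists the integer coefficients, highest degree first.
    """
    def value(x):
        acc = 0
        for c in coeffs:  # Horner's rule, reducing at every step
            acc = (acc * x + c) % modulus
        return acc

    return sum(1 for x in range(modulus) if value(x) == 0)


# Example 20.14: x^2 - 1 modulo 2*3*5*7 = 210.
local_counts = [count_solutions([1, 0, -1], p) for p in (2, 3, 5, 7)]
print(local_counts, prod(local_counts), count_solutions([1, 0, -1], 210))

# Example 20.15: 34x = 85 and 34x = 5 modulo 17*19 = 323.
print(count_solutions([34, -85], 323), count_solutions([34, -5], 323))
```

The product of the local counts matches the count modulo 210, as Corollary 20.13 predicts, and the two linear equations have 17 and 0 solutions respectively.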
There remains the question how to calculate a solution mod m1 m2 · · · mk from solutions mod
m1 , mod m2 , . . . , mod mk . That is, how to ﬁnd explicitly the inverse to the map
Z/m1 m2 . . . mk Z → Z/m1 Z × Z/m2 Z × · · · × Z/mk Z.
We explain how to do that for 3 numbers m1 , m2 , m3 , though the method is general.
We first find integers ε_1, ε_2 such that
ε_1 ≡ 1 (mod m_1), ε_1 ≡ 0 (mod m_2 m_3),
and
ε_2 ≡ 0 (mod m_1), ε_2 ≡ 1 (mod m_2 m_3).
This we know how to do because we are only dealing with two relatively prime numbers, that is,
m_1 and m_2 m_3. Then, find λ_2, λ_3 such that
λ_2 ≡ 1 (mod m_2), λ_2 ≡ 0 (mod m_3),
and
λ_3 ≡ 0 (mod m_2), λ_3 ≡ 1 (mod m_3).
Then, the numbers
µ_1 = ε_1, µ_2 = ε_2 λ_2, µ_3 = ε_2 λ_3,
are congruent to (1, 0, 0), (0, 1, 0) and (0, 0, 1) respectively in the ring Z/m_1 Z × Z/m_2 Z × Z/m_3 Z.
To ﬁnd an integer mod m1 m2 m3 mapping to (a, b, c) in Z/m1 Z × Z/m2 Z × Z/m3 Z take aµ1 +
bµ2 + cµ3 :
aµ1 + bµ2 + cµ3 → (a, b, c) ∈ Z/m1 Z × Z/m2 Z × Z/m3 Z.
Let us illustrate all this with a numerical example. Example 20.16. Find the solutions to the equation x2 + x + 2 = 0 (mod 7 · 11 · 23).
The solutions modulo 7 are S1 = {3}; the solutions modulo 11 are S2 = {4, 6}; the solutions
modulo 23 are S3 = {9, 13}. (Those are found by brute computation.) To ﬁnd the corresponding
solutions modulo 7 · 11 · 23 we ﬁrst ﬁnd the µi above.
First 11 · 23 = 36 · 7 + 1 and so ε_1 = 253, ε_2 = −252 satisfy ε_1 ≡ 1 (mod 7), ε_1 ≡ 0 (mod 253),
ε_2 ≡ 0 (mod 7), ε_2 ≡ 1 (mod 253). To find λ_2, λ_3 we note that 1 = 23 − 2 · 11 and so λ_2 = 23,
λ_3 = −22 satisfy λ_2 ≡ 1 (mod 11), λ_2 ≡ 0 (mod 23) and λ_3 ≡ 0 (mod 11), λ_3 ≡ 1 (mod 23).
The numbers µ_1, µ_2, µ_3 are then 253, −252 · 23, −252 · (−22). They should be understood as
numbers modulo 7 · 11 · 23 = 1771 and so we can replace −252 · 23 by 1288 (which is congruent
to it modulo 1771) and −252 · (−22) by 231. Then to get a number congruent to (a, b, c) in
Z/7Z × Z/11Z × Z/23Z, we take a · 253 + b · 1288 + c · 231. In particular, the solution (3, 4, 9)
produces 3 · 253 + 4 · 1288 + 9 · 231 = 7990 which we can replace by the number 906 which
is congruent to it modulo 1771. By our theory 906 is one of the 4 solutions to the equation
x2 + x + 2 = 0 (mod 1771) and this can be veriﬁed.
Do you think you could have found this solution in an easier way?! (I can’t see how).
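For comparison, here is a sketch that produces all four solutions of Example 20.16 at once. It builds the numbers µ_i by a slightly different route than the nested construction above: each µ_i is taken to be (M/m_i) times an inverse of M/m_i modulo m_i, where M is the product of the moduli. Modulo M this yields the same residues. The names `idempotents` and `solutions` are illustrative, and Python 3.8 or later is assumed for `pow(x, -1, m)` and `math.prod`.

```python
from itertools import product
from math import prod


def idempotents(moduli):
    """Return [mu_1, ..., mu_k] with mu_i = 1 (mod m_i) and mu_i = 0 (mod m_j) for j != i."""
    M = prod(moduli)
    mus = []
    for m in moduli:
        cofactor = M // m               # divisible by every modulus except m
        inverse = pow(cofactor, -1, m)  # exists because the moduli are pairwise coprime
        mus.append((cofactor * inverse) % M)
    return mus


def solutions(poly, moduli):
    """All roots of poly modulo the product of the moduli, glued from the local roots."""
    M = prod(moduli)
    mus = idempotents(moduli)
    local = [[x for x in range(m) if poly(x) % m == 0] for m in moduli]
    return sorted(
        sum(r * mu for r, mu in zip(combo, mus)) % M for combo in product(*local)
    )


def g(x):
    return x * x + x + 2


print(idempotents([7, 11, 23]))
print(solutions(g, [7, 11, 23]))
```

The first line printed reproduces the numbers 253, 1288, 231 of the example, and the list of solutions modulo 1771 contains 906.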
21. Prime and maximal ideals
(planned)
Part 6. Groups
22. First definitions and examples
22.1. Deﬁnitions and some formal consequences.
Deﬁnition 22.1. A group G is a nonempty set with an operation
G × G → G, (a, b) → ab, such that the following axioms hold:
(1) (ab)c = a(bc). (Associativity)
(2) There exists an element e ∈ G such that eg = ge = g for all g ∈ G. (Identity)
(3) For every g ∈ G there exists an element d ∈ G such that dg = gd = e. (Inverse)
Here are some formal consequences of the deﬁnition:
(1) e is unique. Say ẽ has the same property; then ẽ = eẽ, using the property of e, but
also eẽ = e, using the property of ẽ. Thus, e = ẽ.
(2) d appearing in (3) is unique (therefore we shall call it "the inverse of g" and denote it
by g^{-1}). Say d̃ also satisfies d̃g = gd̃ = e. Then
d̃ = d̃e = d̃(gd) = (d̃g)d = ed = d.
(3) Cancelation: ab = cb ⇒ a = c, and ba = bc ⇒ a = c.
If ab = cb then (ab)b^{-1} = (cb)b^{-1} and so a = a(bb^{-1}) = c(bb^{-1}) = c. The second implication follows in the same way, multiplying by b^{-1} on the left.
(4) (ab)^{-1} = b^{-1}a^{-1}.
To show that we need to show that b−1 a−1 “functions as the inverse of ab”. We
have (ab)(b−1 a−1 ) = a(bb−1 )a−1 = aea−1 = aa−1 = e. Similarly, (b−1 a−1 )(ab) =
b−1 (a−1 a)b = b−1 eb = b−1 b = e.
(5) (a−1 )−1 = a.
This is because aa−1 = a−1 a = e also shows that a is the inverse of a−1 .
(6) Define a^0 = e, a^n = a^{n−1}a for n > 0 and a^n = (a^{-1})^{−n} for n < 0. Then we have
a^m a^n = a^{m+n}, (a^m)^n = a^{mn}.
22.2. Examples.
Example 22.2. The trivial group G is a group with one element e and multiplication law ee = e.
Example 22.3. If R is a ring, then R with addition only is a group. It is a commutative group.
The operation in this case is of course written g + h. In general a group is called commutative or
abelian if for all g, h ∈ G we have gh = hg . It is customary in such cases to write the operation
in the group as g + h and not as gh, but this is not a must.
These examples thus include Z, Q, R, C, F, F[ε], M_2(F), Z/nZ with the addition operation.
Example 22.4. Let R be a ring. Recall that the units R× of R are defined as
{u ∈ R : ∃v ∈ R, uv = vu = 1}.
This is a group. If u_1, u_2 ∈ R× with inverses v_1, v_2, respectively, then, as above, one checks
that v_2 v_1 is an inverse for u_1 u_2 and so R× is closed under the product operation. The associative
law holds because it holds in R; 1_R serves as the identity. If R is not commutative there is no
reason for R× to be commutative, though in certain cases it may be.
Thus we get the examples of Z× = {±1}, Q× = Q − {0}, R× = R − {0}, C× = C − {0}, and more generally, F× = F − {0}. We also have GL_2(F) = {M ∈ M_2(F) : det(M) ≠ 0}, F[ε]× = {a + bε :
a ≠ 0}.
Proposition 22.5. Let n > 1 be an integer. The group (Z/nZ)× is precisely
{1 ≤ a ≤ n : (a, n) = 1}.
Proof. If ā is invertible then ab = 1 (mod n) for some integer b; say ab = 1 + kn for some k ∈ Z.
If d | a, d | n then d | 1. Therefore (a, n) = 1.
Conversely, suppose that (a, n) = 1; then for some u, v we have 1 = ua + vn and so ua = 1
(mod n).
One defines Euler's φ function on positive integers by
$\varphi(n) = \begin{cases} 1 & n = 1, \\ |(\mathbb{Z}/n\mathbb{Z})^\times| & n > 1. \end{cases}$
One can prove that this is a multiplicative function, namely, if $(n, m) = 1$ then $\varphi(nm) = \varphi(n)\varphi(m)$. I invite you to try and prove that based on the Chinese Remainder Theorem.
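Before attempting the proof, one may want to check the statement numerically. The following is a minimal Python sketch, not a proof; the function names `units`, `phi` and `is_multiplicative_up_to` are our own choices for illustration. It lists the invertible classes of Z/nZ using the criterion (a, n) = 1 of Proposition 22.5 and tests multiplicativity for small coprime pairs.

```python
from math import gcd

def units(n):
    """Representatives 1 <= a <= n of the invertible classes in Z/nZ."""
    return [a for a in range(1, n + 1) if gcd(a, n) == 1]

def phi(n):
    """Euler's phi function: 1 for n = 1, otherwise the number of units of Z/nZ."""
    return 1 if n == 1 else len(units(n))

def is_multiplicative_up_to(bound):
    """Check phi(nm) = phi(n) phi(m) for all coprime n, m below the bound."""
    return all(phi(n * m) == phi(n) * phi(m)
               for n in range(1, bound)
               for m in range(1, bound)
               if gcd(n, m) == 1)

print([phi(n) for n in range(2, 10)])   # [1, 2, 2, 4, 2, 6, 4, 6]
print(is_multiplicative_up_to(30))      # True
```

Of course a finite check like this only supports the claim; the argument via the Chinese Remainder Theorem is still needed.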
Here are some specific examples:

n    (Z/nZ)^×                φ(n)
2    {1}                     1
3    {1, 2}                  2
4    {1, 3}                  2
5    {1, 2, 3, 4}            4
6    {1, 5}                  2
7    {1, 2, 3, 4, 5, 6}      6
8    {1, 3, 5, 7}            4
9    {1, 2, 4, 5, 7, 8}      6

Example 22.6. If G, H are groups then G × H is a group with the operation
$(g_1, h_1)(g_2, h_2) = (g_1 g_2, h_1 h_2)$.
The identity is $(e_G, e_H)$ and $(g, h)^{-1} = (g^{-1}, h^{-1})$.
22.3. Subgroups. Let G be a group. A subset H ⊆ G is called a subgroup if the following holds:
(1) eG ∈ H ;
(2) a, b ∈ H ⇒ ab ∈ H ;
(3) $a \in H \Rightarrow a^{-1} \in H$.
Clearly then H is a group in its own right.
Example 22.7. The subset $S^1$ of $\mathbb{C}^\times$ consisting of all complex numbers of absolute value 1 is a subgroup. Indeed $1 \in S^1$. If $s_1, s_2 \in S^1$ then $|s_1 s_2| = |s_1| |s_2| = 1$ so $s_1 s_2 \in S^1$. If z is any nonzero complex number then $1 = |1| = |z z^{-1}| = |z| \cdot |z^{-1}|$ and so $|z^{-1}| = 1/|z|$. If $z \in S^1$ it therefore follows that $z^{-1} \in S^1$.
Let n ≥ 1 be an integer. The subset $\mu_n$ of $\mathbb{C}^\times$ consisting of all complex numbers x such that $x^n = 1$ is a subgroup of $\mathbb{C}^\times$, and in fact of $S^1$, having n elements. It is called the group of nth roots of unity. The proof is left as an exercise.
Example 22.8. Let F be a field and
$H = \left\{ \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix} : a \in F \right\}$.
Then H is a subgroup of $GL_2(F)$.
Definition 22.9. Let G be a group. G is called cyclic if there is an element $g \in G$ such that $G = \{g^n : n \in \mathbb{Z}\}$; that is, any element of G is a power of g. The element g is then called a generator of G.
Example 22.10. Let G be any group. Let $g \in G$ and define
$\langle g \rangle := \{g^n : n \in \mathbb{Z}\}$.
This is a cyclic subgroup of G (it may be finite or infinite).
Example 22.11. The group $\mathbb{Z}$ is cyclic. As a generator we may take 1 (or −1).
Example 22.12. The group $(\mathbb{Z}/5\mathbb{Z})^\times = \{1, 2, 3, 4\}$ is cyclic. The elements 2, 3 are generators. The group $(\mathbb{Z}/8\mathbb{Z})^\times = \{1, 3, 5, 7\}$ is not cyclic. One can check that the square of any element is 1, so no element generates the whole group.

23. The permutation and dihedral groups
23.1. Permutation groups.
Definition 23.1. A permutation of a set T is a bijective function f : T → T. We shall denote the set of permutations of T by $S_T$. If T = {1, 2, ..., n} then we shall denote $S_T$ as $S_n$. It is called "the symmetric group on n letters".
Proposition 23.2. For every nonempty set T, $S_T$ is a group under composition of functions. The cardinality of $S_n$ is n!.
Proof. The product of two permutations f, g is their composition f ◦ g; it is again a permutation. We have $[(f \circ g) \circ h](t) = (f \circ g)(h(t)) = f(g(h(t))) = f((g \circ h)(t)) = [f \circ (g \circ h)](t)$. Thus, as functions, we have (f ◦ g) ◦ h = f ◦ (g ◦ h) and so our operation is associative.
The identity is just the identity function. The inverse of a permutation f is the inverse function $f^{-1}$, which satisfies $f \circ f^{-1} = f^{-1} \circ f = \mathrm{Id}_T$.
Finally, to define a permutation f on {1, 2, ..., n} we can choose the image of 1 arbitrarily (n choices), the image of 2 could be any element different from f(1) (n − 1 choices), the image of 3 can be any element different from the images of 1 and 2 (n − 2 choices), and so on. Altogether, we have n · (n − 1) · (n − 2) ··· 2 · 1 = n! choices.
Example 23.3.
(1) For n = 1, $S_1$ consists of a single element and so is the trivial group.
(2) For n = 2 we have two permutations.
(a) Id. Id(1) = 1, Id(2) = 2.
(b) σ . σ (1) = 2, σ (2) = 1.
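We may also represent those permutations as tables:
$\mathrm{Id} = \begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}, \qquad \sigma = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$.
(3) For n = 3 we have 6 permutations. One of them is σ given by σ(1) = 2, σ(2) = 3, σ(3) = 1, or in table form
$\sigma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$.
The table form is a better notation and we list all elements of $S_3$ in that form.
$\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix},$
$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}.$
The first line is in fact a cyclic subgroup of $S_3$. It is the subgroup generated by $\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$.
The groups $S_n$ are not commutative for n ≥ 3. For example:
$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}.$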
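The two products in S3 above can be checked mechanically. Here is a minimal Python sketch, assuming we store a permutation as the bottom row of its table, that is, as a tuple whose i-th entry is the image of i. The helper name `compose` is ours and is only meant as an illustration of the convention that in a product f ◦ g the permutation g is applied first.

```python
def compose(f, g):
    """Return f∘g for permutations of {1, ..., n} stored as bottom rows of tables.

    The right-hand factor g is applied first, then f.
    """
    return tuple(f[g[i] - 1] for i in range(len(g)))

a = (2, 1, 3)   # the table with bottom row 2 1 3
b = (2, 3, 1)   # the table with bottom row 2 3 1

print(compose(a, b))   # (1, 3, 2)
print(compose(b, a))   # (3, 2, 1)
```

The two outputs differ, which is exactly the failure of commutativity in S3.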
Here is another example of multiplication, in $S_5$ this time:
$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 2 & 1 & 4 & 5 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 5 & 2 & 4 & 3 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 5 & 2 & 4 & 1 & 3 \end{pmatrix}.$

23.2. Cycles. There is a still more efficient notation for permutations in $S_n$. Fix n ≥ 1. A cycle (in $S_n$) is an expression of the form
$(a_1 \ a_2 \ \cdots \ a_t)$,
where $a_i \in \{1, 2, \dots, n\}$ are distinct elements. This expression is understood as the permutation σ given by
$\sigma(a) = \begin{cases} a_{i+1} & a = a_i, \ i < t, \\ a_1 & a = a_t, \\ a & \text{else.} \end{cases}$
For example, the permutation $\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$ is the cycle (1 2 3) and the permutation $\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}$ is the cycle (1 2). A cycle with two elements (i j) (not necessarily consecutive) is called a transposition.
Definition 23.4. Let G be a group. The order of G, denoted $|G|$, or $\sharp G$, is the number of elements of G (written ∞ if not finite).
Let $g \in G$. The order of g is defined as $|\langle g \rangle|$, the order of the cyclic group generated by g. It is also denoted by o(g), or ord(g).
Lemma 23.5. Let $g \in G$. Then o(g) is the minimal positive integer k such that $g^k = e$.
Proof. Let k be the minimal positive integer such that $g^k = e$ (∞ if no such integer exists). Suppose first that o(g) is finite, say equal to r. Then the r + 1 elements $e, g, g^2, \dots, g^r$ cannot be distinct and so $g^i = g^j$ for some 0 ≤ i < j ≤ r. It follows that $g^{j-i} = e$ and so k ≤ j − i ≤ r. In particular k is also finite. So r finite implies k finite and k ≤ r.
Suppose now that k is finite. Let n be an integer and write n = ak + b where 0 ≤ b < k. Then $g^n = (g^k)^a g^b = e^a g^b = g^b$. We conclude that $\langle g \rangle \subseteq \{e, g, \dots, g^{k-1}\}$, and so k finite implies that r is finite and r ≤ k.
Example 23.6. Let $(a_1 \ a_2 \ \cdots \ a_t)$ be a cycle. Its order is t.
Two cycles σ, τ are called disjoint if they contain no common elements. In this case, clearly στ = τσ. Moreover, $(\sigma\tau)^n = \sigma^n \tau^n$ and, since $\sigma^n$ and $\tau^n$ move disjoint sets of elements, $(\sigma\tau)^n = \mathrm{Id}$ if and only if $\sigma^n = \mathrm{Id}$ and $\tau^n = \mathrm{Id}$. Thus $o(\sigma) \mid n$ and $o(\tau) \mid n$, and we deduce that o(στ) (namely, the least n such that $(\sigma\tau)^n = \mathrm{Id}$) is lcm(o(σ), o(τ)). Arguing in the same way a little more generally we obtain:
Lemma 23.7. Let $\sigma_1, \dots, \sigma_n$ be disjoint permutations of orders $r_1, \dots, r_n$, respectively. Then the order of the permutation $\sigma_1 \circ \sigma_2 \circ \cdots \circ \sigma_n$ is $\mathrm{lcm}(r_1, r_2, \dots, r_n)$.
Combining this lemma with the following proposition allows us to calculate the order of every
permutation very quickly.
Proposition 23.8. Every permutation is a product of disjoint cycles.
We shall not provide a formal proof of this proposition, but illustrate it by examples.
Example 23.9. Consider the permutation $\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 4 & 1 & 5 & 2 \end{pmatrix}$. To write it as a product of cycles we begin by (1 and check where 1 goes to. It goes to 3. So we write (13 and check where 3 goes to. It goes to 1 and so we have (13). The first number we didn't consider is 2. 2 goes to 4 and so we write (13)(24 and 4 goes to 5 and so we write (13)(245. Now, 5 goes to 2 and so we have σ = (13)(245). The order of σ is lcm(2, 3) = 6.
Example 23.10. Consider the permutation
$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 3 & 4 & 2 & 1 & 10 & 6 & 9 & 5 & 7 & 8 \end{pmatrix}$.
This is written as a product of disjoint cycles as follows: (1324)(79)(5 10 8). To find this expression, we did the same procedure described above: We start with (1, continue with (13, because 1 goes to 3, and then with (132, because 3 goes to 2. Then we find that 2 goes to 4 which goes to 1 and we have found (1324). The first number not in this list is 5 which goes to 10 and so we have (1324)(5 10. Since 10 goes to 8 and 8 to 5 we get now (1324)(5 10 8). The first number not in this list is 6 that goes to 6 and that gives (1324)(5 10 8)(6). We then continue with 7. Since 7 goes to 9 which goes to 7 we have (1324)(5 10 8)(6)(79). We have considered all numbers and so σ = (1324)(5 10 8)(6)(79) = (1324)(5 10 8)(79). The order of σ is lcm(4, 3, 2) = 12.
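The procedure used in the last two examples is easy to mechanize. The following Python sketch is one possible way to do it, assuming again that a permutation is stored as the bottom row of its table; the names `cycles` and `order` are ours. It follows each number until the cycle closes, drops cycles of length 1, and computes the order as the lcm of the cycle lengths, as in Lemma 23.7.

```python
from functools import reduce
from math import gcd

def cycles(perm):
    """Disjoint cycles of a permutation given as the bottom row of its table.

    Fixed points (cycles of length 1) are omitted.
    """
    seen, result = set(), []
    for start in range(1, len(perm) + 1):
        if start in seen:
            continue
        cycle, j = [], start
        while j not in seen:          # follow start -> perm(start) -> ...
            seen.add(j)
            cycle.append(j)
            j = perm[j - 1]
        if len(cycle) > 1:
            result.append(tuple(cycle))
    return result

def order(perm):
    """Order of the permutation: the lcm of the lengths of its disjoint cycles."""
    return reduce(lambda a, b: a * b // gcd(a, b),
                  (len(c) for c in cycles(perm)), 1)

sigma = (3, 4, 2, 1, 10, 6, 9, 5, 7, 8)
print(cycles(sigma))   # [(1, 3, 2, 4), (5, 10, 8), (7, 9)]
print(order(sigma))    # 12
```

The cycles come out in the order in which their smallest element is met, so (5 10 8) is listed before (79); since disjoint cycles commute this is the same permutation.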
Example 23.11. Suppose we want to ﬁnd a permutation of order 10 in S7 . We simply take
(12345)(67). If we want to ﬁnd a permutation of order 10 in S10 we can take either (12345)(67)
or (123456789 10) (and all variants on this).
Finally we remark on the computation of σ −1 for a permutation σ . If σ is given in the form of a
table, for example:
$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 3 & 4 & 2 & 1 & 10 & 6 & 9 & 5 & 7 & 8 \end{pmatrix}$,
then because $\sigma(i) = j \Leftrightarrow \sigma^{-1}(j) = i$, the table describing $\sigma^{-1}$ is the same table but read from the bottom to the top. That is
$\sigma^{-1} = \begin{pmatrix} 3 & 4 & 2 & 1 & 10 & 6 & 9 & 5 & 7 & 8 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \end{pmatrix}$.
Only that we follow our convention and write the columns in the conventional order and so we get
$\sigma^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 4 & 3 & 1 & 2 & 8 & 6 & 9 & 10 & 7 & 5 \end{pmatrix}$.
If σ is a cycle, say $\sigma = (i_1 \ i_2 \ \dots \ i_{k-1} \ i_k)$, then $\sigma^{-1}$ is easily seen to be $(i_k \ i_{k-1} \ \dots \ i_2 \ i_1)$. So,
$(i_1 \ i_2 \ \dots \ i_{k-1} \ i_k)^{-1} = (i_k \ i_{k-1} \ \dots \ i_2 \ i_1)$.
Now, if σ is a product of disjoint cycles, $\sigma = \sigma_1 \sigma_2 \cdots \sigma_r$, then $\sigma^{-1} = \sigma_r^{-1} \cdots \sigma_2^{-1} \sigma_1^{-1}$ (by a generalization of the rule $(ab)^{-1} = b^{-1}a^{-1}$), but, since those cycles are disjoint and so commute, we can also write this as $\sigma^{-1} = \sigma_1^{-1} \sigma_2^{-1} \cdots \sigma_r^{-1}$. (This last manipulation is wrong if the cycles are not disjoint!) Thus, for example, the inverse of σ = (1324)(5 10 8)(79) is (4231)(8 10 5)(97),
which we can also write, if we wish, as (1423)(5 8 10)(79).

23.3. The Dihedral group. Consider a regular polygon with n sides in the plane, n ≥ 3, symmetric around (0, 0). The dihedral group $D_n$ is defined as the group of symmetries of the polygon. Let us number the vertices of the polygon by 1, 2, ..., n in the clockwise direction and say the first vertex 1 lies on the x-axis. One sees that every symmetry must permute the vertices and in fact either maintains or reverses their order. In fact, if σ is a symmetry then σ(1) = j and σ(2) = j + 1 or j − 1 (where we understand n + 1 as 1 and 1 − 1 as n) and σ is uniquely determined by these conditions. For example, the permutation y given by the cycle (1 2 3 ··· n) is an element of the dihedral group rotating the polygon by angle 360°/n in the clockwise direction (so if t is a point on the boundary of the polygon such that the line from (0, 0) to t forms an angle θ with the x-axis, then y(t) is the point forming an angle θ − 360°/n). Another symmetry, x, is reflection through the x-axis. The symmetry x is given as a permutation by the product $(2 \ n)(3 \ n-1) \cdots (n/2 \ \ 2 + n/2)$ if n is even and $(2 \ n)(3 \ n-1) \cdots ((n+1)/2 \ \ 1 + (n+1)/2)$ if n is odd. In terms of angles, x changes an angle θ to −θ.
Theorem 23.12. The elements of the dihedral group are
$D_n = \{e, y, \dots, y^{n-1}, x, yx, y^2x, \dots, y^{n-1}x\}$,
and the relations $x^2 = y^n = 1$ and $xyxy = 1$ hold. In particular, $D_n$ has 2n elements.
Proof. It is enough to show that for any vertex j ∈ {1, 2, ..., n} there is a unique element of the set $\{e, y, \dots, y^{n-1}, x, yx, y^2x, \dots, y^{n-1}x\}$ that takes 1 to j and 2 to j + 1 and there is a unique element taking 1 to j and 2 to j − 1. This shows both that every element of $D_n$ is in the list and that all elements of the list are different.
We calculate that
$y^a(1) = a + 1, \quad y^a(2) = a + 2$,
and
$y^a x(1) = a + 1, \quad y^a x(2) = y^a(n) = a$
(all values understood modulo n, as above). This proves our claims.
The relations $x^2 = y^n = 1$ are evident. We check that $xyxy = 1$ by checking that $xyxy(j) = j$ for j = 1, 2. We have $xyxy(1) = xyx(2) = xy(n) = x(1) = 1$ and $xyxy(2) = xyx(3) = xy(n-1) = x(n) = 2$.
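The statements of the theorem can also be verified by machine for small n. The sketch below is only an illustration under our own conventions: permutations are tuples of images of 1, ..., n, y is the cycle (1 2 ··· n), and x is taken to be the vertex permutation i ↦ 2 − i (mod n), which fixes 1 and sends 2 to n as the reflection in the text does. The names `compose` and `dihedral` are hypothetical helpers, not notation from the notes.

```python
def compose(f, g):
    """Return f∘g (apply g first, then f) for permutations stored as tuples."""
    return tuple(f[g[i] - 1] for i in range(len(g)))

def dihedral(n):
    """Return (e, x, y, elements), with elements = [y^a] + [y^a x] for a = 0..n-1."""
    e = tuple(range(1, n + 1))
    y = tuple(i % n + 1 for i in range(1, n + 1))         # the cycle (1 2 ... n)
    x = tuple((1 - i) % n + 1 for i in range(1, n + 1))   # fixes 1, sends 2 to n
    rotations = [e]
    for _ in range(n - 1):
        rotations.append(compose(y, rotations[-1]))       # next power of y
    reflections = [compose(r, x) for r in rotations]      # y^a x
    return e, x, y, rotations + reflections

for n in range(3, 9):
    e, x, y, elements = dihedral(n)
    xy = compose(x, y)
    # number of distinct elements, x^2 = e, xyxy = e
    print(n, len(set(elements)), compose(x, x) == e, compose(xy, xy) == e)
```

For each n from 3 to 8 the list has 2n distinct entries and both relations hold, in agreement with the theorem.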
The nature of the symmetries $1, y, \dots, y^{n-1}$ is clear: $y^j$ rotates clockwise by angle j · 360°/n.
Proposition 23.13. Let 0 ≤ j < n. The element $y^j x$ is a reflection through the line forming an angle −j · 360°/2n with the x-axis.
Proof. The symmetry $y^j x$ is not trivial. If it fixes an angle θ it must be reflection through the line with that angle. Note that $y^j x$ sends the angle θ to −θ and then adds −j · 360°/n, so the equation is θ = −θ − j · 360°/n (mod 360°). That is, θ = −j · 360°/2n.

24. The theorem of Lagrange
24.1. Cosets. Let H < G be a subgroup of G. A left coset of H in G is a subset of the form
gH := {gh : h ∈ H }, for some g ∈ G. The set gH is called the left coset of g ; g is called a representative of the
coset gH .
Example 24.1. Consider the subgroup H of S3 given by {1, (123), (132)}. Here are some
cosets: H = 1H = (123)H = (132)H , (12)H = (13)H = (23)H = {(12), (23), (13)}. We leave the
veriﬁcation to the reader.
Lemma 24.2. Let H be a subgroup of G.
(1) Two left cosets are either equal or disjoint.
(2) Let $g_1H, g_2H$ be two left cosets. The following are equivalent: (i) $g_1H = g_2H$; (ii) $g_1 \in g_2H$; (iii) $g_2^{-1}g_1 \in H$.
Proof. Suppose that $g_1H \cap g_2H \neq \emptyset$, so for some $h_1, h_2 \in H$ we have $g_1h_1 = g_2h_2$. We prove that $g_1H \subseteq g_2H$. By symmetry we also have $g_2H \subseteq g_1H$ and so $g_1H = g_2H$.
Let $h \in H$. Then $g_1h = ((g_2h_2)h_1^{-1})h = g_2(h_2h_1^{-1}h) \in g_2H$.
We now prove the equivalence of the assertions (i) - (iii). Suppose (i) holds. Then $g_1 = g_1e \in g_1H = g_2H$ and (ii) holds. Suppose (ii) holds; say $g_1 = g_2h$. Then $g_2^{-1}g_1 = h \in H$ and (iii) holds. Suppose that (iii) holds; $g_2^{-1}g_1 = h$ for some $h \in H$. Then $g_1 = g_2h$ and so $g_1H \cap g_2H \neq \emptyset$. By what we have proved in the first part, $g_1H = g_2H$.
Remark 24.3. The Lemma and its proof should be compared with Lemma 19.3. In fact, since R is an abelian group and an ideal I is a subgroup, that lemma is a special case of the lemma above.
Corollary 24.4. G is a disjoint union of cosets of H. Let $\{g_i : i \in I\}$ be a set of elements of G such that each coset has the form $g_iH$ for a unique $g_i$. That is, $G = \coprod_{i \in I} g_iH$. Then the $\{g_i : i \in I\}$ are called a complete set of representatives.
In the same manner one defines a right coset of H in G to be a subset of the form $Hg = \{hg : h \in H\}$ and Lemma 24.2 holds for right cosets with the obvious modifications. Two right cosets are either equal or disjoint and the following are equivalent: (i) $Hg_1 = Hg_2$; (ii) $g_1 \in Hg_2$; (iii) $g_1g_2^{-1} \in H$. Thus, the Corollary holds true for right cosets as well.
We remark that the intersection of a left coset and a right coset may be nonempty, yet not a
coset itself. For example, take H = {1, (12)} in $S_3$. We have the following table.

g        gH                 Hg
1        {1, (12)}          {1, (12)}
(12)     {(12), 1}          {(12), 1}
(13)     {(13), (123)}      {(13), (132)}
(23)     {(23), (132)}      {(23), (123)}
(123)    {(123), (13)}      {(123), (23)}
(132)    {(132), (23)}      {(132), (13)}

The table demonstrates that indeed any two left (resp. right) cosets are either equal or disjoint, but the intersection of a left coset with a right coset may be nonempty and properly contained in both.
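The coset table can be recomputed directly. Below is a short Python sketch, under the assumption that elements of S3 are stored as tuples of images of 1, 2, 3 (so (12) is the tuple (2, 1, 3)); the helper names `compose`, `left_coset` and `right_coset` are ours. It collects the distinct left and right cosets of H = {1, (12)} and records the sizes of all intersections of a left coset with a right coset.

```python
from itertools import permutations

def compose(f, g):
    """Return f∘g (apply g first, then f) for permutations stored as tuples."""
    return tuple(f[g[i] - 1] for i in range(len(g)))

S3 = list(permutations((1, 2, 3)))
H = [(1, 2, 3), (2, 1, 3)]            # the subgroup {1, (12)}

def left_coset(g, subgroup):
    return frozenset(compose(g, h) for h in subgroup)

def right_coset(g, subgroup):
    return frozenset(compose(h, g) for h in subgroup)

left = {left_coset(g, H) for g in S3}
right = {right_coset(g, H) for g in S3}

# Sizes of intersections of a left coset with a right coset.
mixed_sizes = sorted({len(L & R) for L in left for R in right})
print(len(left), len(right), mixed_sizes)
```

An intersection of size 1 is nonempty but cannot be a coset of H, since every coset has |H| = 2 elements.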
24.2. Lagrange’s theorem.
Theorem 24.5. Let G be a finite group and H a subgroup of G. Then
$|H| \ \big| \ |G|$.
Moreover, let $\{g_i : i \in I\}$ be a complete set of representatives for the cosets of H; then $|I| = |G| / |H|$. In particular, the cardinality of I does not depend on the choice of a complete set of representatives. It is called the index of H in G.
Proof. We have
$G = \coprod_{i \in I} g_iH$.
Let a, b ∈ G. We claim that the function
$f : aH \to bH, \quad x \mapsto ba^{-1}x$,
is a well defined bijection. First, x = ah for some h ∈ H and so $ba^{-1}x = bh \in bH$ and so the map is well defined. It is surjective, because given an element y ∈ bH, say y = bh, it is the image of ah. The map is also injective: if $ba^{-1}x_1 = ba^{-1}x_2$ then multiplying both sides on the left by $ab^{-1}$ we get $x_1 = x_2$.
We conclude that each coset $g_iH$ has the same number of elements, which is exactly the number of elements in H = eH. We get therefore that
$|G| = |H| \cdot |I|$.
That completes the proof.
Here are some applications of Lagrange's theorem:
(1) Let G be a finite group of prime order p. Then G is cyclic; in fact, every element of G
that is not the identity generates G.
Indeed, let g ≠ e. Then $H = \langle g \rangle$ is a nontrivial subgroup. So |H| > 1 and |H| divides p. It follows that |H| = p = |G| and so that $\langle g \rangle = G$.
(2) In a similar vein, we conclude that a group of order 6, say, cannot have elements of order 4, or 5, or any order not dividing 6. This follows immediately from Lagrange's theorem, keeping in mind that $\mathrm{ord}(g) = |\langle g \rangle|$.

25. Homomorphisms and isomorphisms
25.1. Homomorphisms of groups.
Definition 25.1. Let G, H be groups and
f : G → H
a function. The function f is called a group homomorphism if
$f(g_1g_2) = f(g_1)f(g_2), \quad \forall g_1, g_2 \in G$.
In that case, we define the kernel of f as:
$\mathrm{Ker}(f) = \{g \in G : f(g) = e_H\}$.
Lemma 25.2. Let f : G → H be a group homomorphism. Then:
(1) $f(e_G) = e_H$;
(2) $f(g^{-1}) = f(g)^{-1}$;
(3) The image of f is a subgroup of H.
Proof. We have $f(e_G) = f(e_Ge_G) = f(e_G)f(e_G)$. Multiplying (in H) both sides by $f(e_G)^{-1}$ we find $e_H = f(e_G)$. Now, $e_H = f(e_G) = f(gg^{-1}) = f(g)f(g^{-1})$, which shows that $f(g^{-1}) = f(g)^{-1}$.
Finally, we show that Im(f) is a subgroup of H. Note that $e_H = f(e_G) \in \mathrm{Im}(f)$. If $h_1, h_2 \in \mathrm{Im}(f)$, say $h_i = f(g_i)$, then $h_1h_2 = f(g_1g_2)$ and $h_1^{-1} = f(g_1^{-1})$. This shows that $h_1h_2, h_1^{-1} \in \mathrm{Im}(f)$.
Proposition 25.3. Let f : G → H be a group homomorphism. Ker(f) is a subgroup of G. The homomorphism f is injective if and only if $\mathrm{Ker}(f) = \{e_G\}$.
Proof. First, we proved that $f(e_G) = e_H$ and so $e_G \in \mathrm{Ker}(f)$. Next, if $g_1, g_2 \in \mathrm{Ker}(f)$ then $f(g_1) = f(g_2) = e_H$ and so $f(g_1g_2) = f(g_1)f(g_2) = e_He_H = e_H$. Therefore, $g_1g_2 \in \mathrm{Ker}(f)$. Finally, if $g \in \mathrm{Ker}(f)$ then $f(g^{-1}) = f(g)^{-1} = e_H^{-1} = e_H$ and so $g^{-1} \in \mathrm{Ker}(f)$ as well.
Suppose f is injective. Then, since $f(e_G) = e_H$, $e_G$ is the only element mapping to $e_H$ and so $\mathrm{Ker}(f) = \{e_G\}$. Conversely, suppose $\mathrm{Ker}(f) = \{e_G\}$ and $f(g_1) = f(g_2)$. Then $e_H = f(g_1)^{-1}f(g_2) = f(g_1^{-1})f(g_2) = f(g_1^{-1}g_2)$. That means that $g_1^{-1}g_2 \in \mathrm{Ker}(f)$ and so $g_1^{-1}g_2 = e_G$. That is, $g_1 = g_2$.

25.2. Isomorphism.
Deﬁnition 25.4. A group homomorphism f : G → H is called an isomorphism if it is bijective.
As in the case of rings, one veriﬁes that if f is an isomorphism, the inverse function g = f −1
is automatically a homomorphism and so an isomorphism as well. Also, one easily checks that a
composition of group homomorphisms is a group homomorphism. It follows that being isomorphic
is an equivalence relation on groups. Cf. §20.1.
Example 25.5. Let n be a positive integer. Any two cyclic groups of order n are isomorphic.
Indeed, suppose that $G = \langle g \rangle$, $H = \langle h \rangle$ are cyclic groups of order n. Define, for any integer a,
$f(g^a) = h^a$.
This is well defined; if $g^a = g^b$ then $g^{a-b} = e_G$ and so $n \mid (a - b)$. Thus, a = b + kn and $f(g^a) = h^a = h^b(h^n)^k = h^b = f(g^b)$. Obviously f is a surjective homomorphism; f is also injective, because $f(g^a) = h^a = e_H$ implies that $n \mid a$ and so $g^a = e_G$.
In particular, we conclude that any cyclic group of order n is isomorphic to the group Z/nZ
with addition.
Example 25.6. Let p be a prime number then any two groups of order p are isomorphic. Indeed,
we’ve seen that such groups are necessarily cyclic.
Theorem 25.7. (Cayley) Let G be a finite group of order n. Then G is isomorphic to a subgroup of $S_n$.
Proof. Let $g \in G$ and let
$\sigma_g : G \to G, \quad \sigma_g(a) = ga$.
We claim that $\sigma_g$ is a permutation. It is injective, because $\sigma_g(a) = \sigma_g(b) \Rightarrow ga = gb \Rightarrow a = b$. It is surjective, because for any b ∈ G, $\sigma_g(g^{-1}b) = b$.
Identifying the permutations of G with $S_n$ (just call the elements of G 1, 2, 3, ..., n), we get a map
$G \to S_n, \quad g \mapsto \sigma_g$.
This map is a homomorphism of groups: $\sigma_{gh}(a) = gha = \sigma_g(\sigma_h(a))$. That is, $\sigma_{gh} = \sigma_g \circ \sigma_h$. This homomorphism is injective: if $\sigma_g$ is the identity permutation then $\sigma_g(e) = e$ and that implies ge = e, that is g = e. We get that G is isomorphic to its image, which is a subgroup of $S_n$, under this homomorphism.
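To see the construction in a concrete case, here is a small Python sketch for the group (Z/8Z)^× = {1, 3, 5, 7} under multiplication modulo 8. The choice of this group, the labelling of its elements by 1, 2, 3, 4 in the listed order, and the names `label`, `mul`, `sigma` and `compose` are all assumptions made for the illustration.

```python
elements = [1, 3, 5, 7]                               # the group (Z/8Z)^x
label = {a: k + 1 for k, a in enumerate(elements)}    # call the elements 1, 2, 3, 4

def mul(g, h):
    """The group law: multiplication modulo 8."""
    return (g * h) % 8

def sigma(g):
    """The permutation a -> g*a of the group, written on the labels 1..4."""
    return tuple(label[mul(g, a)] for a in elements)

def compose(f, g):
    """Return f∘g (apply g first, then f) for permutations stored as tuples."""
    return tuple(f[g[i] - 1] for i in range(len(g)))

for g in elements:
    print(g, sigma(g))

# The map g -> sigma(g) respects products and is injective.
is_hom = all(sigma(mul(g, h)) == compose(sigma(g), sigma(h))
             for g in elements for h in elements)
is_injective = len({sigma(g) for g in elements}) == len(elements)
print(is_hom, is_injective)   # True True
```

The four permutations printed form a subgroup of S4 isomorphic to (Z/8Z)^×, as the theorem predicts.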
Remark 25.8. We were somewhat informal about identifying the permutations of G with Sn . A
more rigorous approach is the following.
Lemma 25.9. Let T, Z be sets and f : T → Z a bijection. The groups of permutations of T and Z are isomorphic.
Proof. Let $\sigma \in S_T$, a permutation of T. Then $f \circ \sigma \circ f^{-1}$ is a function from Z to itself, and being a composition of bijections is a bijection itself. We shall write more simply $f\sigma f^{-1}$ for $f \circ \sigma \circ f^{-1}$. We therefore get a function
$S_T \to S_Z, \quad \sigma \mapsto f\sigma f^{-1}$.
We claim that $\sigma \mapsto f\sigma f^{-1}$ is a homomorphism. Indeed, given $\sigma_1, \sigma_2 \in S_T$ we have $f\sigma_1\sigma_2f^{-1} = (f\sigma_1f^{-1})(f\sigma_2f^{-1})$. Moreover, it is easy to write an inverse to this homomorphism,
$S_Z \to S_T, \quad \tau \mapsto f^{-1}\tau f$.
Therefore, we found a bijective homomorphism $S_T \to S_Z$, which shows those two permutation groups are isomorphic.
26. Group actions on sets
26.1. Basic definitions. Let G be a group and let S be a nonempty set. We say that G acts on S if we are given a function
$G \times S \to S, \quad (g, s) \mapsto g \star s$,
such that:
(i) $e \star s = s$ for all s ∈ S;
(ii) $(g_1g_2) \star s = g_1 \star (g_2 \star s)$ for all $g_1, g_2 \in G$ and s ∈ S.
Given an action of G on S we can define the following sets. Let s ∈ S. Define the orbit of s
$\mathrm{Orb}(s) = \{g \star s : g \in G\}$.
Note that Orb(s) is a subset of S, equal to all the images of the element s under the action of the elements of the group G. We also define the stabilizer of s to be
$\mathrm{Stab}(s) = \{g \in G : g \star s = s\}$.
Note that Stab(s) is a subset of G. In fact, it is a subgroup, as Lemma 26.1 states.
26.2. Basic properties.
Lemma 26.1.
(1) Let $s_1, s_2 \in S$. We say that $s_1$ is related to $s_2$, i.e., $s_1 \sim s_2$, if there exists g ∈ G such that $g \star s_1 = s_2$. This is an equivalence relation. The equivalence class of $s_1$ is its orbit $\mathrm{Orb}(s_1)$.
(2) Let s ∈ S. The set Stab(s) is a subgroup of G.
(3) Suppose that both G and S have finitely many elements. Then
$|\mathrm{Orb}(s)| = \frac{|G|}{|\mathrm{Stab}(s)|}$.
Proof. (1) We need to show that this relation is reflexive, symmetric and transitive. First, we have $e \star s = s$ and hence s ∼ s, meaning the relation is reflexive. Second, if $s_1 \sim s_2$ then for a suitable g ∈ G we have $g \star s_1 = s_2$. Therefore, $g^{-1} \star (g \star s_1) = g^{-1} \star s_2$ and $(g^{-1}g) \star s_1 = g^{-1} \star s_2$. It follows that $e \star s_1 = g^{-1} \star s_2$ and so $s_1 = g^{-1} \star s_2$, which implies that $s_2 \sim s_1$.
It remains to show the relation is transitive. If $s_1 \sim s_2$ and $s_2 \sim s_3$ then for suitable $g_1, g_2 \in G$ we have $g_1 \star s_1 = s_2$ and $g_2 \star s_2 = s_3$. Therefore, $(g_2g_1) \star s_1 = g_2 \star (g_1 \star s_1) = g_2 \star s_2 = s_3$, and hence $s_1 \sim s_3$.
Moreover, by the very definition, the equivalence class of an element $s_1$ of S is all the elements of the form $g \star s_1$ for some g ∈ G, namely, $\mathrm{Orb}(s_1)$.
(2) Let H = Stab(s). We have to show that: (i) e ∈ H, (ii) if $g_1, g_2 \in H$ then $g_1g_2 \in H$, and (iii) if g ∈ H then $g^{-1} \in H$.
First, by definition of group action, we have $e \star s = s$. Therefore, e ∈ H. Next, suppose that $g_1, g_2 \in H$, i.e., $g_1 \star s = s$ and $g_2 \star s = s$. Then, $(g_1g_2) \star s = g_1 \star (g_2 \star s) = g_1 \star s = s$, which proves that $g_1g_2 \in H$. Finally, if g ∈ H then $g \star s = s$ and so $g^{-1} \star (g \star s) = g^{-1} \star s$. That is, $(g^{-1}g) \star s = g^{-1} \star s$ and so $e \star s = g^{-1} \star s$, or $s = g^{-1} \star s$, and therefore $g^{-1} \in H$.
(3) Let H = Stab(s). We claim that there exists a bijection between the left cosets of H and the orbit of s. If we show that, then by Lagrange's theorem,
$|\mathrm{Orb}(s)| = \text{no. of left cosets of } H = \text{index of } H = |G|/|H|$.
Define a function φ by
$\{\text{left cosets of } H\} \to \mathrm{Orb}(s), \quad \varphi(gH) = g \star s$.
We claim that φ is a well-defined bijection.
Well-defined: Suppose that $g_1H = g_2H$. We need to show that the rule φ would give the same result whether we take the representative $g_1$ or the representative $g_2$ for the coset, that is, we need to show $g_1 \star s = g_2 \star s$. Note that $g_1^{-1}g_2 \in H$, i.e., $(g_1^{-1}g_2) \star s = s$. We get $g_1 \star s = g_1 \star ((g_1^{-1}g_2) \star s) = (g_1(g_1^{-1}g_2)) \star s = g_2 \star s$.
φ is surjective: Let t ∈ Orb(s); then $t = g \star s$ for some g ∈ G. Thus, $\varphi(gH) = g \star s = t$, and we get that φ is surjective.
φ is injective: Suppose that $\varphi(g_1H) = \varphi(g_2H)$. We need to show that $g_1H = g_2H$. Indeed, $\varphi(g_1H) = \varphi(g_2H)$ implies $g_1 \star s = g_2 \star s$ and so that $g_2^{-1} \star (g_1 \star s) = g_2^{-1} \star (g_2 \star s)$; that is, $(g_2^{-1}g_1) \star s = (g_2^{-1}g_2) \star s$ and so $(g_2^{-1}g_1) \star s = e \star s = s$. Therefore, $g_2^{-1}g_1 \in \mathrm{Stab}(s) = H$ and hence $g_1H = g_2H$.
Corollary 26.2. The set S is a disjoint union of orbits.
Proof. The orbits are the equivalence classes of the equivalence relation ∼ deﬁned in Lemma 26.1.
Any equivalence relation partitions the set into disjoint equivalence classes.
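The orbit-stabilizer relation of Lemma 26.1 (3) and the partition into orbits can be observed on a small example. The following Python sketch is an illustration under our own assumptions: G = Z/6Z acts on the set S of all 0/1 colorings of 6 sectors by cyclic rotation, and the names `act`, `orbit` and `stabilizer` are hypothetical helpers.

```python
from itertools import product

n = 6
G = range(n)                              # the group Z/6Z, written additively
S = list(product((0, 1), repeat=n))       # all 0/1 colorings of 6 sectors

def act(a, s):
    """Rotate the coloring s by a steps; act(a, act(b, s)) equals act(a + b, s)."""
    a %= n
    return s[n - a:] + s[:n - a]

def orbit(s):
    return {act(a, s) for a in G}

def stabilizer(s):
    return [a for a in G if act(a, s) == s]

# |Orb(s)| * |Stab(s)| = |G| for every s, and the orbits partition S.
ok = all(len(orbit(s)) * len(stabilizer(s)) == n for s in S)
orbits = {frozenset(orbit(s)) for s in S}
print(ok, sum(len(o) for o in orbits) == len(S))   # True True
```

Different colorings can have orbits of different sizes (1, 2, 3 or 6 here), but the product with the size of the stabilizer is always 6.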
26.3. Some examples.
Example 26.3. Let G be the group of real numbers $\mathbb{R}$. The group operation is addition. Let S be the set of points on the sphere in $\mathbb{R}^3$ of radius 1 about the origin. The group $\mathbb{R}$ acts by rotating around the z-axis. An element $r \in \mathbb{R}$ rotates by angle r (in radians). For every point s ∈ S different from the poles, the stabilizer is $2\pi\mathbb{Z}$. For the poles the stabilizer is $\mathbb{R}$. The orbit of every point is the latitude line on which it lies.
Example 26.4. Let G be a group and H a subgroup of G. Then H acts on G by
$H \times G \to G, \quad (h, g) \mapsto hg$.
Here H plays the role of the group and G the role of the set in the definition. This is indeed a group action: $e_Hg = g$ for all g ∈ G, because by definition $e_H = e_G$. Also, $h_1(h_2g) = (h_1h_2)g$ is nothing but the associative law.
The orbit of g ∈ G is
$\mathrm{Orb}(g) = \{hg : h \in H\} = Hg$.
That is, the orbits are the right cosets of H. We have that G is a disjoint union of orbits, namely, a disjoint union of cosets. The stabilizer of any element g ∈ G is {e}. The formula we have proven, $|\mathrm{Orb}(g)| = |H|/|\mathrm{Stab}(g)|$, gives us $|Hg| = |H|$ for any g ∈ G, and we see that we have another point of view on Lagrange's theorem.
Example 26.5. We consider a roulette with n sectors and write n = i1 + · · · + ik , for some
positive (and ﬁxed) integers i1 , . . . , ik . We suppose we have diﬀerent colors c1 , . . . , ck and we
color i1 sectors of the roulette by the color c1 , i2 sectors by the color c2 and so on. The sectors
can be chosen as we wish and so there are many possibilities. We get a set S of colored roulettes.
Now, we turn the roulette a steps clockwise, say, then we get another colored roulette, usually
with a different coloring. Nonetheless, it is natural to view the two colorings as the same, since
“they only depend on your point of view”. We may formalize this by saying that the group Z/nZ
acts on S , a acts on a colored roulette by turning it a steps clockwise, and by saying that we are
interested in the number of orbits for this action.
Example 26.6. Let G be the dihedral group $D_8$. Recall that G is the group of symmetries of a regular octagon in the plane,
$G = \{e, y, y^2, \dots, y^7, x, yx, y^2x, \dots, y^7x\}$,
where y is rotation clockwise by angle 2π/8 and x is reflection through the x-axis. We have the relations
$x^2 = y^8 = e, \quad xyxy = e$.
We let S be the set of colorings of the octagon (= necklaces laid on the table) having 4 red vertices (rubies) and 4 green vertices (sapphires). The group G acts on S by its action on the octagon.
For example, the coloring $s_0$ consisting of alternating green and red is certainly preserved under x and under $y^2$. Therefore, the stabilizer of $s_0$ contains at least the set of eight elements
(26.1) $\{e, y^2, y^4, y^6, x, y^2x, y^4x, y^6x\}$.
Remember that the stabilizer is a subgroup and, by Lagrange's theorem, of order dividing 16 = |G|. On the other hand, $\mathrm{Stab}(s_0) \neq G$ because $y \notin \mathrm{Stab}(s_0)$. It follows that the stabilizer has exactly 8 elements and is equal to the set in (26.1).
Let H be the stabilizer of $s_0$. According to Lemma 26.1 the orbit of $s_0$ is in bijection with the left cosets of $H = \{e, y^2, y^4, y^6, x, y^2x, y^4x, y^6x\}$. By Lagrange's theorem there are two cosets. For example, H and yH are distinct cosets. The proof of Lemma 26.1 tells us how to find the orbit: it is the set $\{s_0, y \star s_0\}$, which is of course quite clear if you think about it.

27. The Cauchy-Frobenius Formula
Theorem 27.1. (CFF)^11 Let G be a finite group acting on a finite set S. Let N be the number of orbits of G in S. Define
$I(g) = |\{s \in S : g \star s = s\}|$
(the number of elements of S fixed by the action of g). Then^12
$N = \frac{1}{|G|} \sum_{g \in G} I(g)$.
Remark 27.2. Note that I(g) is the number of fixed points for the action of g on S. Thus, the CFF can be interpreted as saying that the number of orbits is the average number of fixed points (though this does not make the assertion more obvious).
Proof. We define a function
$T : G \times S \to \{0, 1\}, \quad T(g, s) = \begin{cases} 1 & g \star s = s, \\ 0 & g \star s \neq s. \end{cases}$
Note that for a fixed g ∈ G we have
$I(g) = \sum_{s \in S} T(g, s)$,
and that for a fixed s ∈ S we have
$|\mathrm{Stab}(s)| = \sum_{g \in G} T(g, s)$.
Let us fix representatives $s_1, \dots, s_N$ for the N disjoint orbits of G in S. Now,
$\sum_{g \in G} I(g) = \sum_{g \in G} \sum_{s \in S} T(g, s) = \sum_{s \in S} \sum_{g \in G} T(g, s)$
$= \sum_{s \in S} |\mathrm{Stab}(s)| = \sum_{s \in S} \frac{|G|}{|\mathrm{Orb}(s)|}$
$= \sum_{i=1}^{N} \sum_{s \in \mathrm{Orb}(s_i)} \frac{|G|}{|\mathrm{Orb}(s)|} = \sum_{i=1}^{N} \sum_{s \in \mathrm{Orb}(s_i)} \frac{|G|}{|\mathrm{Orb}(s_i)|}$
$= \sum_{i=1}^{N} \frac{|G|}{|\mathrm{Orb}(s_i)|} \cdot |\mathrm{Orb}(s_i)| = \sum_{i=1}^{N} |G|$
$= N \cdot |G|$.

^11 This is also sometimes called Burnside's formula.
^12 The sum appearing in the formula means just that: If you write $G = \{g_1, \dots, g_n\}$ then $\sum_{g \in G} I(g)$ is $\sum_{i=1}^{n} I(g_i) = I(g_1) + I(g_2) + \cdots + I(g_n)$. The double summation $\sum_{g \in G} \sum_{s \in S} T(g, s)$ appearing in the proof means that if we write $S = \{s_1, \dots, s_m\}$ then the double sum is $T(g_1, s_1) + T(g_1, s_2) + \cdots + T(g_1, s_m) + T(g_2, s_1) + T(g_2, s_2) + \cdots + T(g_2, s_m) + \cdots + T(g_n, s_1) + T(g_n, s_2) + \cdots + T(g_n, s_m)$.
Remark 27.3. If N, the number of orbits, is equal to 1 we say that G acts transitively on S. It means exactly that: For every $s_1, s_2 \in S$ there exists g ∈ G such that $g \star s_1 = s_2$. Note that if G and S are finite and G acts transitively then the number of elements in S divides the number of elements in G,
$|S| \ \big| \ |G|$,
because, if S = Orb(s) then $|S| = |G|/|\mathrm{Stab}(s)|$.
Corollary 27.4. Let G be a finite group acting transitively on a finite set S. Suppose that |S| > 1. Then there exists g ∈ G without fixed points.
Proof. By contradiction. Suppose that every g ∈ G has a fixed point in S. That is, suppose that for every g ∈ G we have
$I(g) \ge 1$.
Since I(e) = |S| > 1 we have that $\sum_{g \in G} I(g) > |G|$. By the Cauchy-Frobenius formula, the number of orbits N is greater than 1. Contradiction.

27.1. Some applications to Combinatorics.
Example 27.5. How many roulettes with 11 wedges painted 2 blue, 2 green and 7 red are there
when we allow rotations?
Let S be the set of painted roulettes. Let us enumerate the sectors of a roulette by the numbers 1, ..., 11. The set S is a set of $\binom{11}{2}\binom{9}{2} = 1980$ elements (choose which 2 are blue, and then choose out of the nine left which 2 are green).
Let G be the group Z/11Z. It acts on S by rotations. The element 1 rotates a painted roulette by angle 2π/11 anticlockwise. The element n rotates a painted roulette by angle 2nπ/11 anticlockwise. We are interested in N, the number of orbits for this action. We use CFF.
The identity element always fixes the whole set. Thus I(0) = 1980. We claim that if 1 ≤ i ≤ 10 then i doesn't fix any element of S. We use the following fact that we have proved before: Let G be a finite group of prime order p. Let g ≠ e be an element of G. Then $\langle g \rangle = G$.
Suppose that 1 ≤ i ≤ 10 and i fixes s. Then so does every element of $\langle i \rangle = \mathbb{Z}/11\mathbb{Z}$ (the stabilizer is a subgroup). But any coloring fixed under rotation by 1 must be single colored! Contradiction.
Applying CFF we get
$N = \frac{1}{11} \sum_{n=0}^{10} I(n) = \frac{1}{11} \cdot 1980 = 180$.
Example 27.6. How many roulettes with 12 wedges painted 2 blue, 2 green and 8 red are there when we allow rotations?
Let S be the set of painted roulettes. Let us enumerate the sectors of a roulette by the num
12
10
bers 1, . . . , 12. The set S is a set of
= 2970 elements (choose which 2 are blue, and
2
2 then choose out of the ten left which 2 are green).
Let G be the group Z/12Z. It acts on S by rotations. The element 1 rotates a painted roulette
by angle 2π /12 anticlockwise. The element n rotates a painted roulette by angle 2nπ /12 anticlockwise. We are interested in N – the number of orbits for this action. We use CFF.
The identity element always ﬁxes the whole set. Thus I (0) = 2970. We claim that if 1 ≤ i ≤ 11
and i = 6 then i doesn’t ﬁx any element of S . Indeed, suppose that i ﬁxes a painted roulette.
Say in that roulette the rth sector is blue. Then so must be the i + r sector (because the rth
sector goes under the action of i to the r + ith sector). Therefore so must be the r + 2i sector.
But there are only 2 blue sectors! The only possibility is that the r + 2i sector is the same as
the r sector, namely, i = 6.
If i is equal to 6 and we enumerate the sectors of a roulette by the numbers 1, . . . , 12 we may
write i as the permutation
(1 7)(2 8)(3 9)(4 10)(5 11)(6 12).
In any coloring fixed by i = 6 the colors of the pairs (1 7), (2 8), (3 9), (4 10), (5 11) and (6 12)
must be the same. We may choose one pair for blue, one pair for green. The rest would be red.
Thus there are 30 = 6 · 5 possible choices. We summarize:
element g               I(g)
0                       2970
1 ≤ i ≤ 11, i ≠ 6       0
i = 6                   30
Applying CFF we get that there are
\[ N = \frac{1}{12}(2970 + 30) = 250 \]
different roulettes.
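The count in Example 27.6 can be checked by brute force. The following is a minimal sketch, not part of the original notes: the function names (`roulettes`, `rotate`, `fixed_counts`, `orbits_directly`) and the encoding of a roulette as a tuple of letters are assumptions made for illustration. It computes the table I(k) and compares the CFF count with a direct enumeration of orbits.

```python
from itertools import combinations

def roulettes(sectors=12, blue=2, green=2):
    """All colorings with the given numbers of blue and green wedges; the rest are red."""
    out = []
    for b in combinations(range(sectors), blue):
        rest = [s for s in range(sectors) if s not in b]
        for g in combinations(rest, green):
            coloring = ['R'] * sectors
            for s in b:
                coloring[s] = 'B'
            for s in g:
                coloring[s] = 'G'
            out.append(tuple(coloring))
    return out

def rotate(coloring, k):
    """Rotate by k sectors: the color in sector j moves to sector j + k (mod n)."""
    n = len(coloring)
    return tuple(coloring[(j - k) % n] for j in range(n))

def fixed_counts(colorings):
    """I(k) for k = 0, ..., n-1: the number of colorings fixed by rotation k."""
    n = len(colorings[0])
    return [sum(1 for c in colorings if rotate(c, k) == c) for k in range(n)]

def orbits_directly(colorings):
    """Count orbits by collecting each coloring together with all of its rotations."""
    n = len(colorings[0])
    seen, count = set(), 0
    for c in colorings:
        if c not in seen:
            count += 1
            seen.update(rotate(c, k) for k in range(n))
    return count

S = roulettes()
I = fixed_counts(S)
print(len(S), I, sum(I) // 12, orbits_directly(S))
```

Under these assumptions the only nonzero entries of `I` are at k = 0 and k = 6, and both ways of counting agree on 250.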
Example 27.7. In this example S is the set of necklaces made of four rubies and four sapphires
laid on the table. We ask how many necklaces there are when we allow rotations and flipping over.
We may talk of S as the colorings of a regular octagon, four vertices are green and four are
red. The group G = D8 acts on S and we are interested in the number of orbits for the group G.
The results are the following:
element g                       I(g)
e                               70
$y, y^3, y^5, y^7$              0
$y^2, y^6$                      2
$y^4$                           6
$xy^i$ for i = 0, . . . , 7     6
We explain how the entries in the table are obtained:
The identity always fixes the whole set S. The number of elements in S is $\binom{8}{4} = 70$ (choosing
which 4 would be green).
The element y cannot fix any coloring, because any coloring fixed by y must have all sections
of the same color (because y = (1 2 3 4 5 6 7 8)). If $y^r$ fixes a coloring $s_0$ so does $(y^r)^r = y^{(r^2)}$
because the stabilizer is a subgroup. Apply that for r = 3, 5, 7 to see that if $y^r$ fixes a coloring so
does y, which is impossible (for instance, $y^{(3^2)} = y^9 = y$ because $y^8 = e$, etc.).
Now, $y^2$, written as a permutation, is (1 3 5 7)(2 4 6 8). We see that if, say 1 is green so
are 3, 5, 7 and the rest must be red. That is, all the freedom we have is to choose whether the
cycle (1 3 5 7) is green or red. This gives us two colorings fixed by $y^2$. The same rationale applies
to $y^6$ = (8 6 4 2)(7 5 3 1).
Consider now $y^4$. It may be written in permutation notation as (1 5)(2 6)(3 7)(4 8). In any
coloring fixed by $y^4$ each of the cycles (1 5), (2 6), (3 7) and (4 8) must be single colored. There are
thus $\binom{4}{2} = 6$ possibilities (choosing which 2 out of the four cycles would be green).
It remains to deal with the elements $xy^i$. We recall that these are all reflections. There are
two kinds of reflections. One may be written using permutation notation as
$(i_1\ i_2)(i_3\ i_4)(i_5\ i_6)$
(with the other two vertices being fixed; for example, x = (2 8)(3 7)(4 6) is of this form). The
other kind is of the form
$(i_1\ i_2)(i_3\ i_4)(i_5\ i_6)(i_7\ i_8)$
(for example, xy = (1 8)(2 7)(3 6)(4 5) is of this sort). Whichever is the case, one uses similar
reasoning to deduce that there are 6 colorings preserved by a reflection.
One needs only apply CFF to get that there are
\[ N = \frac{1}{16}(70 + 2 \cdot 2 + 6 + 8 \cdot 6) = 8 \]
distinct necklaces.
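The table of Example 27.7 can be reproduced numerically. The sketch below is an illustration under stated assumptions, not the notes' own method: vertices are labelled 0, ..., 7 instead of 1, ..., 8, rotations are modelled as j ↦ j + k and reflections as j ↦ k − j (mod 8), and the helper names are hypothetical.

```python
from itertools import combinations

def dihedral(n):
    """The 2n symmetries of a regular n-gon, each as a tuple p with p[j] = image of vertex j."""
    rotations = [tuple((j + k) % n for j in range(n)) for k in range(n)]
    reflections = [tuple((k - j) % n for j in range(n)) for k in range(n)]
    return rotations + reflections

def act(p, coloring):
    """Apply the symmetry p: the color at vertex j moves to vertex p[j]."""
    out = [None] * len(coloring)
    for j, color in enumerate(coloring):
        out[p[j]] = color
    return tuple(out)

def necklaces(n=8, green=4):
    """All colorings of n vertices with exactly `green` green vertices, the rest red."""
    return [tuple('G' if j in chosen else 'R' for j in range(n))
            for chosen in map(set, combinations(range(n), green))]

def fixed_counts(group, colorings):
    """I(g) for every g in the group, in the order the group is listed."""
    return [sum(1 for c in colorings if act(p, c) == c) for p in group]

D8 = dihedral(8)
I = fixed_counts(D8, necklaces())
print(I[:8])                  # rotations y^0, ..., y^7
print(I[8:])                  # the eight reflections
print(sum(I) // len(D8))      # number of orbits by CFF
```

The first eight entries correspond to the rows e, y, ..., y^7 of the table and the last eight to the reflections; the final line is the CFF average.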
Example 27.8. Consider a tetrahedron with faces marked 1, 2, 3, 4. It takes a little thinking,
but one can see that each symmetry is determined by its action on the faces. Let us consider
symmetries of the tetrahedron that preserve orientation. For example, (123) is such a symmetry; it rotates the tetrahedron relative to the plane on which the face 4 lies. On the other
hand, (12) is not such a symmetry. We conclude that the symmetries that preserve orientation are a subgroup of S4 that is not equal to S4 . Let us call this subgroup A4 (later on,
we shall deﬁne the groups An in general and our notation is consistent). Clearly A4 contains
{1, (123), (132), (234), (243), (134), (143), (124), (142)} and since it is closed under multiplication
(being a subgroup) also (132)(134) = (12)(34) and similarly, (13)(24) and (14)(23), are elements
of A4. We have already identified 12 elements in A4 and since the order of A4 divides the order
of S4, which is 24, and A4 is not equal to S4, A4 is in fact equal to
{1, (123), (132), (234), (243), (134), (143), (124), (142), (12)(34), (13)(24), (14)(23)}.
Let us count how many colorings of the faces of the tetrahedron there are using 4 distinct colors,
each once. The number of colorings is 4! = 24 (choose for each face its color). No symmetry but
the identity preserves a coloring and so by CFF we get that the number of colorings up to A4
identifications is 24/12 = 2.
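As a sanity check on Example 27.8, here is a small sketch that builds the rotation group from the two 3-cycles (123) and (234) and applies CFF. Everything in it is an assumption made for illustration: faces are relabelled 0, 1, 2, 3, permutations are stored as tuples of images, and the names `generate`, `act` and `orbits_by_cff` are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """The permutation 'apply q first, then p' (tuples of images)."""
    return tuple(p[q[j]] for j in range(len(q)))

def generate(gens):
    """Close a set of permutations under composition; for a finite set this is a subgroup."""
    group = set(gens)
    frontier = list(gens)
    while frontier:
        g = frontier.pop()
        for h in list(group):
            for prod in (compose(g, h), compose(h, g)):
                if prod not in group:
                    group.add(prod)
                    frontier.append(prod)
    return group

def act(p, coloring):
    """The color on face j moves to face p[j]."""
    out = [None] * len(coloring)
    for face, color in enumerate(coloring):
        out[p[face]] = color
    return tuple(out)

def orbits_by_cff(group, colorings):
    """Average number of fixed colorings over the group."""
    total_fixed = sum(1 for p in group for c in colorings if act(p, c) == c)
    return total_fixed // len(group)

ROT_123 = (1, 2, 0, 3)   # the 3-cycle (123) with faces relabelled 0..3
ROT_234 = (0, 2, 3, 1)   # the 3-cycle (234)

A4 = generate({ROT_123, ROT_234})
four_colors = list(permutations('RGBY'))
print(len(A4), orbits_by_cff(A4, four_colors))
```

The same function accepts any list of colorings, for instance `sorted(set(permutations('RRBB')))` for two red and two blue faces.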
Suppose now we want to color with 2 colors, say red and blue, painting two faces red and two
faces blue. The total number of colorings is $\binom{4}{2} = 6$. In this case, a three cycle cannot fix a
coloring, while each permutation of the type (ab)(cd) fixes exactly two colorings (choose if the
faces a, b are both red or both blue). Therefore, the number of colorings up to symmetries is
\[ N = \frac{1}{12}(6 + 3 \times 2) = 1. \]
28. Cauchy’s theorem: a wonderful proof
One application of group actions is to provide a simple proof of an important theorem in the
theory of ﬁnite groups. Every other proof I know is very complicated.
Theorem 28.1. (Cauchy) Let G be a ﬁnite group of order n and let p be a prime dividing n.
Then G has an element of order p.
Proof. (Some details are left out. Each point where a detail is left out is flagged "(detail omitted)", and
the interested reader should fill it in.) Let S be the set consisting of p-tuples $(g_1, \dots, g_p)$ of
elements of G, considered up to cyclic permutations. Thus if T is the set of p-tuples $(g_1, \dots, g_p)$
of elements of G, S is the set of orbits for the action of Z/pZ on T by cyclic shifts (detail omitted).
One may therefore apply CFF and get
\[ |S| = \frac{n^p - n}{p} + n \]
(detail omitted). Note that $n \nmid |S|$ (detail omitted).
Now define an action of G on S. Given g ∈ G and $(g_1, \dots, g_p)$ ∈ S we define
\[ g(g_1, \dots, g_p) = (gg_1, \dots, gg_p). \]
This is a well defined action (detail omitted).
Since the order of G is n, since $n \nmid |S|$, and since S is a disjoint union of orbits of G, there must
be an orbit Orb(s) whose size is not n. However, the size of an orbit is |G|/|Stab(s)|, and we
conclude that there must be an element $(g_1, \dots, g_p)$ in S with a nontrivial stabilizer. This means
that for some g ∈ G, such that g ≠ e, the tuple $(gg_1, \dots, gg_p)$ is equal to $(g_1, \dots, g_p)$ up to a
cyclic shift. This means that for some i we have
\[ (gg_1, \dots, gg_p) = (g_{i+1}, g_{i+2}, g_{i+3}, \dots, g_p, g_1, g_2, \dots, g_i). \]
Therefore, $gg_1 = g_{i+1}$, $g^2 g_1 = g g_{i+1} = g_{2i+1}$, . . . , $g^p g_1 = \cdots = g_{pi+1} = g_1$ (we always read the
indices mod p). That is, there exists g ≠ e with
\[ g^p = e. \]
Let k be the order of g. Then 2 ≤ k ≤ p. Write p = qk + r with 0 ≤ r < k. Then
\[ g^r = g^p \cdot (g^k)^{-q} = e \cdot e^{-q} = e. \]
Since k is the minimal positive power such that $g^k = e$ we must have r = 0, that is, k | p. This implies
k = p and therefore we found an element of order p (namely, g).
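The counting step of the proof can be tested on a small case. The sketch below takes G = S3 (so n = 6) and p = 2, 3; this choice of group and the helper names `necklace_count` and `order` are assumptions made for illustration. It enumerates p-tuples up to cyclic shift, compares the count with $(n^p - n)/p + n$, checks that n does not divide it, and confirms that an element of order p exists.

```python
from itertools import permutations, product

def compose(p, q):
    """The permutation 'apply q first, then p' (tuples of images)."""
    return tuple(p[q[j]] for j in range(len(q)))

def order(g, identity):
    """Smallest k >= 1 with g^k equal to the identity."""
    k, x = 1, g
    while x != identity:
        x = compose(x, g)
        k += 1
    return k

def necklace_count(group, p):
    """Number of p-tuples of group elements up to cyclic shift, by direct enumeration."""
    classes = set()
    for t in product(group, repeat=p):
        # the smallest rotation serves as a canonical representative of the class of t
        classes.add(min(t[i:] + t[:i] for i in range(p)))
    return len(classes)

G = list(permutations(range(3)))      # S_3, of order n = 6
n, identity = len(G), tuple(range(3))

for p in (2, 3):
    size = necklace_count(G, p)
    formula = (n ** p - n) // p + n
    has_order_p = any(order(g, identity) == p for g in G)
    print(p, size, formula, size % n != 0, has_order_p)
```

Only the order n of the group enters the count of S; the group structure matters in the second half of the proof, where G acts on S.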
29. The first isomorphism theorem for groups
29.1. Normal subgroups.
Deﬁnition 29.1. Let G be a group and H a subgroup of G. H is called a normal subgroup if
for every g ∈ G we have
gH = Hg.
Note that gH = Hg if and only if $gHg^{-1} = H$, where $gHg^{-1} = \{ghg^{-1} : h \in H\}$. Thus, we could
also define a normal subgroup H to be a subgroup such that $gHg^{-1} = H$ for all g ∈ G,
equivalently, $\forall g \in G, \forall h \in H, ghg^{-1} \in H$.
Lemma 29.2. Let H be a subgroup of a group G. Then H is normal if and only if
$gHg^{-1} \subset H, \ \forall g \in G$.
Proof. Clearly if H is normal, $gHg^{-1} \subset H, \ \forall g \in G$. Suppose then that $gHg^{-1} \subset H, \ \forall g \in G$.
Given g ∈ G we have then $gHg^{-1} \subset H$ and also $g^{-1}H(g^{-1})^{-1} \subset H$. The last inclusion is
just $g^{-1}Hg \subset H$, which is equivalent to $H \subset gHg^{-1}$. We conclude that $gHg^{-1} = H$.
Our main example of a normal subgroup is the kernel of a homomorphism.
Proposition 29.3. Let f : G → H be a group homomorphism. Then Ker(f) is a normal subgroup
of G.
Proof. We proved already that Ker(f) is a subgroup. Let g ∈ G, h ∈ Ker(f); we need to show
that $ghg^{-1} \in \mathrm{Ker}(f)$, that is $f(ghg^{-1}) = e_H$. We calculate $f(ghg^{-1}) = f(g)f(h)f(g^{-1}) =
f(g)e_H f(g^{-1}) = f(g)f(g^{-1}) = f(g)f(g)^{-1} = e_H$.
Example 29.4. For any group G, $\{e_G\}$ and G are normal subgroups. If G is a commutative
group, any subgroup of G is a normal subgroup.
29.2. Quotient groups. Similar to the construction of a quotient ring, we construct quotient
groups.
Let G be a group and H a normal subgroup of G. We let the quotient group G mod H ,
denoted G/H , be the collection of left cosets of H . We deﬁne multiplication by
(aH )(bH ) = abH.
We claim that this is well defined, namely, if $aH = a_1H$, $bH = b_1H$ then $abH = a_1b_1H$. Indeed,
we have $a = a_1h$ for some h ∈ H and $b = b_1h'$ for some h' ∈ H. Also, $hb_1 \in Hb_1 = b_1H$
and so $hb_1 = b_1h''$ for some h'' ∈ H. Then, $abH = a_1hb_1h'H = a_1b_1h''h'H = a_1b_1H$ (if t ∈ H
then tH = H).
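We now verify the group axioms. We use the notation $\bar{a}$ for aH. Then the group law is
\[ \bar{a}\,\bar{b} = \overline{ab}. \]
We have $(\bar{a}\,\bar{b})\bar{c} = \overline{ab}\,\bar{c} = \overline{(ab)c} = \overline{a(bc)} = \bar{a}\,\overline{bc} = \bar{a}(\bar{b}\,\bar{c})$. Thus, this is an associative operation.
We have $\bar{a}\,\overline{e_G} = \overline{ae_G} = \bar{a}$ and $\overline{e_G}\,\bar{a} = \overline{e_G a} = \bar{a}$. So there is an identity element $e_{G/H}$ and it is
equal to $\overline{e_G} = H$. Finally, $\bar{a}\,\overline{a^{-1}} = \overline{aa^{-1}} = \overline{e_G} = e_{G/H}$ and $\overline{a^{-1}}\,\bar{a} = \overline{a^{-1}a} = \overline{e_G} = e_{G/H}$. Thus,
every $\bar{a}$ is invertible and its inverse is $\overline{a^{-1}}$ (that is, $(aH)^{-1} = a^{-1}H$).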
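The role of normality in the well-definedness argument above can be seen on a small case. The following sketch is illustrative only: the group S3, the subgroups `A3` and `H2`, and the function names are choices made here, not taken from the notes. It tests whether the product of cosets depends on the chosen representatives.

```python
from itertools import permutations

def compose(p, q):
    """The permutation 'apply q first, then p' (tuples of images)."""
    return tuple(p[q[j]] for j in range(len(q)))

def left_coset(a, H):
    """The left coset aH as a frozenset."""
    return frozenset(compose(a, h) for h in H)

def coset_product_is_well_defined(G, H):
    """Check that (aH)(bH) = abH does not depend on the representatives a, b."""
    for a in G:
        for b in G:
            target = left_coset(compose(a, b), H)
            for a1 in left_coset(a, H):
                for b1 in left_coset(b, H):
                    if left_coset(compose(a1, b1), H) != target:
                        return False
    return True

S3 = list(permutations(range(3)))
A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # the identity and the two 3-cycles
H2 = [(0, 1, 2), (1, 0, 2)]              # the identity and one transposition

print(coset_product_is_well_defined(S3, A3))   # normal subgroup
print(coset_product_is_well_defined(S3, H2))   # subgroup that is not normal
```

For the normal subgroup the check succeeds and the cosets form a group with two elements; for the subgroup of order 2 the product depends on the representatives, so the construction breaks down.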
29.3. The ﬁrst isomorphism theorem.
Theorem 29.5. Let f : G → H be a surjective group homomorphism. The canonical map
π : G → G/Ker(f) is a homomorphism with kernel Ker(f). There is an isomorphism F :
G/Ker(f) → H, such that the following diagram commutes:
\[ \begin{array}{ccc} G & \xrightarrow{\ f\ } & H \\ {\scriptstyle \pi}\downarrow & \nearrow{\scriptstyle F} & \\ G/\mathrm{Ker}(f) & & \end{array} \]
Proof. First we check that π : G → G/Ker(f) is a homomorphism, where $\pi(a) = \bar{a} = a\mathrm{Ker}(f)$.
Indeed, this is just the formula $\overline{ab} = \bar{a}\,\bar{b}$. The kernel is {a ∈ G : aKer(f) = Ker(f)} = Ker(f).
Let us define
\[ F : G/\mathrm{Ker}(f) \to H, \quad F(\bar{a}) = f(a). \]
This is well defined: if $\bar{a} = \bar{b}$ then $b^{-1}a \in \mathrm{Ker}(f)$, so $f(b) = f(b)f(b^{-1}a) = f(b(b^{-1}a)) = f(a)$.
Clearly F ◦ π = f.
F is a homomorphism: $F(\bar{a}\,\bar{b}) = F(\overline{ab}) = f(ab) = f(a)f(b) = F(\bar{a})F(\bar{b})$. Furthermore, F is
surjective, since given h ∈ H we may find a ∈ G such that f(a) = h and so $F(\bar{a}) = h$. Finally, F
is injective, because $F(\bar{a}) = f(a) = e_H$ means that a ∈ Ker(f) so $\bar{a} = e_{G/\mathrm{Ker}(f)}$.
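One concrete instance of the theorem can be verified by machine. The sketch below takes f to be the determinant on 2 × 2 invertible matrices over Z/3Z; the choice of the prime 3, the encoding of a matrix as a 4-tuple (a, b, c, d), and the names `det`, `mul`, `coset` and `F` are assumptions made for illustration.

```python
from itertools import product

P = 3  # work over the field Z/3Z; any prime could be used here

def det(m):
    """Determinant of the matrix (a, b, c, d) = [[a, b], [c, d]], reduced mod P."""
    a, b, c, d = m
    return (a * d - b * c) % P

def mul(m, n):
    """Matrix product mod P."""
    a, b, c, d = m
    e, f, g, h = n
    return ((a * e + b * g) % P, (a * f + b * h) % P,
            (c * e + d * g) % P, (c * f + d * h) % P)

GL2 = [m for m in product(range(P), repeat=4) if det(m) != 0]
SL2 = [m for m in GL2 if det(m) == 1]          # the kernel of det

def coset(m):
    """The left coset of m modulo the kernel."""
    return frozenset(mul(m, k) for k in SL2)

# F sends the coset of m to det(m); collect every value seen on each coset
F = {}
for m in GL2:
    F.setdefault(coset(m), set()).add(det(m))

print(len(GL2), len(SL2), len(F))
print(all(len(values) == 1 for values in F.values()))   # F is well defined
```

Each coset carries a single determinant value and distinct cosets carry distinct values, so F is a bijection from the quotient onto the nonzero elements of Z/3Z.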
Example 29.6. Let F be a field. Recall the group of matrices GL2(F),
\[ GL_2(\mathbb{F}) = \left\{ M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in \mathbb{F},\ \det(M) = ad - bc \neq 0 \right\}. \]
We have also noted that the determinant is multiplicative
\[ \det(MN) = \det(M)\det(N). \]
We may now view this fact as saying that the function
\[ \det : GL_2(\mathbb{F}) \to \mathbb{F}^{\times} \]
is a group homomorphism. It is a surjective group homomorphism, because given any $a \in \mathbb{F}^{\times}$
the matrix $\begin{pmatrix} a & 0 \\ 0 & 1 \end{pmatrix}$ has determinant a. The kernel is called SL2(F), it is equal to the matrices
with determinant 1. It is a normal subgroup of GL2(F) and by the first isomorphism theorem
$GL_2(\mathbb{F})/SL_2(\mathbb{F}) \cong \mathbb{F}^{\times}$.
Example 29.7. The homomorphic images of S3. We wish to identify all the homomorphic
images of S3. If f : S3 → G is a group homomorphism then Ker(f) is a normal subgroup of S3.
We begin therefore by ﬁnding all normal subgroups of S3 .
We know that every nontrivial subgroup of S3 is of the form ⟨(ij)⟩ for some transposition (ij)
or the subgroup A3 := ⟨(123)⟩. That there are no other subgroups follows from the following
observation: if H ⊂ K ⊂ G are groups and G is finite, then |G|/|K| divides |G|/|H|, because the
quotient is |K|/|H|. In our situation, for a nontrivial subgroup H we have |S3|/|H| is either 2
or 3 and those are prime. It follows that either |K| = |G| or |K| = |H| and so that either K = G
or K = H.
The subgroups of order 2 are not normal. For example, (13)(12)(13)−1 = (13)(12)(13) =
(23), which shows that {1, (12)} is not normal, etc. On the other hand, the subgroup A3 :=
{1, (123), (132)} is normal. This follows from it being of index 2 (see assignments); another
argument appears below. Since S3 /A3 has order 2, it must be isomorphic to Z/2Z.
We conclude that there are three options:
(1) Ker(f ) = {1}. In this case, S3 is isomorphic to its image.
(2) Ker(f) = S3. In this case S3/Ker(f) = S3/S3 ≅ {1} is the trivial group.
(3) Ker(f) = A3. In this case S3/A3 is a group of 2 elements, obviously cyclic. Thus S3/A3 ≅ Z/2Z.
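The classification of subgroups of S3 used in Example 29.7 is small enough to confirm by exhaustive search. This is a minimal sketch with hypothetical helper names; it relies on the standard fact that a finite subset containing the identity and closed under multiplication is a subgroup, and uses Lagrange's theorem only to skip impossible sizes.

```python
from itertools import permutations, combinations

def compose(p, q):
    """The permutation 'apply q first, then p' (tuples of images)."""
    return tuple(p[q[j]] for j in range(len(q)))

def inverse(p):
    """The inverse permutation."""
    inv = [0] * len(p)
    for j, pj in enumerate(p):
        inv[pj] = j
    return tuple(inv)

S3 = list(permutations(range(3)))
E = (0, 1, 2)

def is_subgroup(H):
    """Finite subset containing the identity and closed under composition."""
    return E in H and all(compose(a, b) in H for a in H for b in H)

def is_normal(H, G):
    """Check g h g^{-1} in H for all g in G and h in H."""
    return all(compose(compose(g, h), inverse(g)) in H for g in G for h in H)

def subgroups(G):
    found = []
    for size in range(1, len(G) + 1):
        if len(G) % size:          # by Lagrange, the order of a subgroup divides |G|
            continue
        for subset in combinations(G, size):
            H = frozenset(subset)
            if is_subgroup(H):
                found.append(H)
    return found

all_subgroups = subgroups(S3)
print(sorted(len(H) for H in all_subgroups))
print(sorted(len(H) for H in all_subgroups if is_normal(H, S3)))
```

The first line lists the orders of all subgroups and the second the orders of the normal ones, which are the possible kernels.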
29.4. Groups of low order.
29.4.1. Groups of order 1. There is a unique group of order 1, up to isomorphism. It consists of
its identity element alone. There is only one way to deﬁne a homomorphism between two groups
of order 1 and it is an isomorphism.
29.4.2. Groups of order 2, 3, 5, 7. Recall that we proved that every group G of prime order is
cyclic, and, in fact, any nontrivial element is a generator. This implies that any subgroup of G
diﬀerent from {eG } is equal to G. We also proved that any two cyclic groups having the same
order are isomorphic. We therefore conclude:
Corollary 29.8. Every group G of prime order p is isomorphic to Z/pZ; it has no subgroups
apart from the trivial subgroups $\{e_G\}$, G.
In particular, this corollary applies to groups of order 2, 3, 5, 7.
29.4.3. Groups of order 4. Let G be a group of order 4.
First case: G is cyclic.
In this case we have G ≅ Z/4Z. Its subgroups are {0}, Z/4Z and H = ⟨2⟩ = {0, 2}. There are
no other subgroups because if a subgroup J contains an element g it contains the cyclic subgroup
generated by g. In our case, the elements 1 and 3 are generators, so any subgroup not equal to
G is contained in {0, 2}.
Since G is abelian, H is normal. G/H has order |G|/|H| = 4/2 = 2 and so G/H ≅ Z/2Z.
Second case: G is not cyclic.
Claim: Every element of G different from $e_G$ has order 2.
Proof: we have ord(g) = |⟨g⟩| and it divides |G|. So, in our case, ord(g) = 1, 2 or 4. If ord(g) = 4,
we get that G is cyclic and if ord(g) = 1 then g = $e_G$. Thus, we must have ord(g) = 2.
Claim: Let G be a group in which every element diﬀerent from the identity has order 2. Then G
is commutative.
Proof : Note ﬁrst that if a ∈ G has order 2 (or is the identity) then aa = eG and so a−1 = a. Now,
we need to show that for every a, b ∈ G we have ab = ba. But this is equivalent to ab = b−1 a−1 .
Multiply both sides by ab and we see that we need to prove that abab = eG . But, abab = (ab)2
and so is equal to eG , by assumption.
One example of a group of order 4 satisfying all these properties is Z/2Z × Z/2Z. We claim
that G ≅ Z/2Z × Z/2Z. Pick two distinct elements $g_1, g_2$ of G that aren’t the identity either.
Define a map
\[ f : \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z} \to G, \quad f(a, b) = g_1^{a} g_2^{b}. \]
This is well defined: if (a, b) = (a', b') then a = a' + 2c, b = b' + 2d and we get
\[ f(a, b) = g_1^{a} g_2^{b} = g_1^{a'} (g_1^{2})^{c} g_2^{b'} (g_2^{2})^{d} = g_1^{a'} g_2^{b'} = f(a', b'). \]
The map is also a homomorphism:
\[ f((a_1, b_1) + (a_2, b_2)) = f(a_1 + a_2, b_1 + b_2) = g_1^{a_1 + a_2} g_2^{b_1 + b_2} = g_1^{a_1} g_1^{a_2} g_2^{b_1} g_2^{b_2}. \]
Because G is commutative we can rewrite this as
\[ f(a_1 + a_2, b_1 + b_2) = g_1^{a_1} g_2^{b_1} g_1^{a_2} g_2^{b_2} = f(a_1, b_1) \cdot f(a_2, b_2). \]
The image of f is a subgroup with at least 3 elements, namely, $e_G, g_1, g_2$. By Lagrange the
image then must be G. It follows that f is surjective and so, since both groups have 4 elements,
is also injective.
The nontrivial subgroups of Z/2Z × Z/2Z are all cyclic. They are {(0, 0), (0, 1)}, {(0, 0), (1, 0)}
and {(0, 0), (1, 1)}. Since the group is commutative they are all normal and the quotient in every
case has order 2, hence isomorphic to Z/2Z.
29.4.4. Groups of order 6. We know three candidates already: Z/6Z, Z/2Z × Z/3Z and S3. Now,
in fact, Z/6Z ≅ Z/2Z × Z/3Z (say by CRT). And since S3 is not commutative it is not isomorphic
to Z/6Z. In fact, every group of order 6 is isomorphic to either Z/6Z or S3. We don’t prove it
here.
The subgroups of Z/6Z: Let n be a positive integer. We have a surjective group homomorphism
π : Z → Z/nZ. Similar to the situation with rings one can show that this gives a bijection
between subgroups H of Z that contain nZ and subgroups K of Z/nZ. The bijection is given by
H ↦ π(H), K ↦ $\pi^{-1}(K)$.
The subgroups of Z are all cyclic, having the form mZ for some m (same proof as for ideals,
really). We thus conclude that the subgroups of Z/nZ are cyclic and generated by the elements
m such that m | n. Thus, for n = 6 we find the cyclic subgroups generated by 1, 2, 3, 6. Those are
the subgroups Z/6Z, {0, 2, 4}, {0, 3}, {0}. They are all normal and the quotients are isomorphic
respectively to {0}, Z/2Z, Z/3Z, Z/6Z.
The subgroups of S3 : Those were classiﬁed above.
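The description of the subgroups of Z/nZ by divisors of n can be compared with an exhaustive search when n is small. The sketch below is illustrative; the function names are hypothetical, and the brute-force routine relies on the fact that a finite subset containing 0 and closed under addition is a subgroup.

```python
from itertools import combinations

def cyclic_subgroup(m, n):
    """The subgroup of Z/nZ generated by the class of m."""
    return frozenset((m * k) % n for k in range(n))

def subgroups_from_divisors(n):
    """One cyclic subgroup for each positive divisor m of n."""
    return {m: cyclic_subgroup(m, n) for m in range(1, n + 1) if n % m == 0}

def subgroups_by_brute_force(n):
    """Every subset of Z/nZ that contains 0 and is closed under addition (small n only)."""
    found = set()
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            H = frozenset(subset)
            if 0 in H and all((a + b) % n in H for a in H for b in H):
                found.add(H)
    return found

by_divisor = subgroups_from_divisors(6)
for m, H in by_divisor.items():
    # generator, subgroup, order of the quotient
    print(m, sorted(H), 6 // len(H))
print(set(by_divisor.values()) == subgroups_by_brute_force(6))
```

For n = 6 the generators 1, 2, 3, 6 give the four subgroups listed in the text, with quotients of orders 1, 2, 3, 6, and the exhaustive search finds no others.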
29.5. Odds and evens. Let n ≥ 2 be an integer. One can show that there is a way to assign a
sign, ±1, to any permutation in Sn such that the following properties hold:
• sgn(στ) = sgn(σ) · sgn(τ).
• sgn((ij)) = −1 for i ≠ j.
We do not prove that here, but we shall prove that next term in MATH 251. Note that since
any permutation is a product of transpositions, the two properties together determine the sign
of any permutation. Here are some examples: sgn((12)) = −1, sgn((123)) = sgn((13)(12)) =
sgn((13)) · sgn((12)) = 1, sgn((1234)) = sgn((14)(13)(12)) = $(-1)^3$ = −1.
The property sgn(στ) = sgn(σ) · sgn(τ) could be phrased as saying that the function
sgn : Sn → {±1}
is a surjective group homomorphism. We define An as the kernel of the homomorphism sgn. It
is called the alternating group on n letters and its elements are called even permutations. The
elements of Sn \ An are called odd permutations. The group An is a normal subgroup of Sn, being
a kernel of a homomorphism. Its cardinality is n!/2. Here are some examples:
• A2 = {1};
• A3 = {1, (123), (132)};
• A4 = {1, (12)(34), (13)(24), (14)(23), (123), (132), (234), (243), (124), (142), (134), (143)}.
(Easy to check those are distinct 12 even permutations, so the list must be equal to A4 ).
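The two defining properties of the sign can be checked experimentally. The notes do not construct sgn; the sketch below uses the parity of the number of inversions, which is one standard construction and is an assumption here, as are the helper names and the relabelling of the letters 1, ..., n as 0, ..., n − 1.

```python
from itertools import permutations

def compose(p, q):
    """The permutation 'apply q first, then p' (tuples of images)."""
    return tuple(p[q[j]] for j in range(len(q)))

def sgn(p):
    """Sign of p computed from the parity of its number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def transposition(i, j, n):
    """The transposition (i j) on 0..n-1 as a tuple of images."""
    p = list(range(n))
    p[i], p[j] = p[j], p[i]
    return tuple(p)

def alternating_group(n):
    """The kernel of sgn on S_n."""
    return [p for p in permutations(range(n)) if sgn(p) == 1]

S4 = list(permutations(range(4)))
print(all(sgn(compose(s, t)) == sgn(s) * sgn(t) for s in S4 for t in S4))
print(all(sgn(transposition(i, j, 4)) == -1 for i in range(4) for j in range(i + 1, 4)))
print([len(alternating_group(n)) for n in (2, 3, 4)])
```

The last line gives the orders of A2, A3 and A4, matching n!/2.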
29.6. Odds and Ends.
Example 29.9. We prove that in Z/pZ any element is a sum of two squares.
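Before the argument, the statement can be checked numerically for small primes. This is a brute-force sketch only and plays no part in the proof; the function name `two_square_decomposition` and the list of primes tried are assumptions made for illustration.

```python
def two_square_decomposition(a, p):
    """Return (x, y) with x*x + y*y congruent to a mod p, or None if there is none."""
    root_of = {}
    for x in range(p):
        root_of.setdefault(x * x % p, x)      # remember one square root per square
    for s, x in root_of.items():
        t = (a - s) % p
        if t in root_of:
            return x, root_of[t]
    return None

def number_of_squares(p):
    """How many residue classes mod p are squares (0 included)."""
    return len({x * x % p for x in range(p)})

for p in (2, 3, 5, 7, 11, 13):
    decompositions = {a: two_square_decomposition(a, p) for a in range(p)}
    print(p, number_of_squares(p), all(v is not None for v in decompositions.values()))
```

For each odd prime tried, the number of squares is (p + 1)/2, which is the count the argument below relies on.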
Clearly this holds for p = 2, so we assume p > 2. To begin with, $\mathbb{Z}/p\mathbb{Z}^{\times} = \{1, \dots, p-1\}$ is a
group under multiplication; it has p − 1 elements. Consider the homomorphism:
\[ sq : \mathbb{Z}/p\mathbb{Z}^{\times} \to \mathbb{Z}/p\mathbb{Z}^{\times}, \quad sq(x) = x^2. \]
Let H be its image, a subgroup of $\mathbb{Z}/p\mathbb{Z}^{\times}$. The kernel of sq is the solutions to $x^2 = 1$,
which are precisely ±1. Note that 1 ≠ −1. It follows that $H \cong \mathbb{Z}/p\mathbb{Z}^{\times}/\{\pm 1\}$ is a group
with (p − 1)/2 elements consisting precisely of the nonzero congruence classes that are squares.
Let $H^* = H \cup \{0\}$; it is a subset of Z/pZ with (p + 1)/2 elements consisting of all squares.
Let a ∈ Z/pZ; then the two sets $H^*$ and $a - H^* := \{a - h : h \in H^*\}$ have size (p + 1)/2 and
so must intersect (because Z/pZ has $p < 2 \cdot \frac{p+1}{2}$ elements). That is, there are two squares
$x^2, y^2$ such that $a - x^2 = y^2$ and so $a = x^2 + y^2$.
We next tie together the notions of homomorphism and group action.
Lemma 29.10. Let G be a group and T a nonempty set. To give an action of G on T is
equivalent to giving a homomorphism $\rho : G \to S_T$.
Proof. Suppose that we are given an action of G on T. Pick an element g ∈ G. We claim that
the function
\[ T \to T, \quad t \mapsto g \ast t, \]
is a permutation of T. Indeed, if $g \ast t_1 = g \ast t_2$ then $g^{-1} \ast (g \ast t_1) = g^{-1} \ast (g \ast t_2)$, so $(g^{-1}g) \ast t_1 = (g^{-1}g) \ast t_2$;
that is, $e \ast t_1 = e \ast t_2$ and so $t_1 = t_2$. Also, given t ∈ T we have $g \ast (g^{-1} \ast t) = (gg^{-1}) \ast t = e \ast t = t$, showing
surjectivity. Let us denote then this function by ρ(g), $\rho(g)(t) = g \ast t$. We have a function
\[ \rho : G \to S_T. \]
We claim that this function is a homomorphism. We need to show that $\rho(g_1g_2) = \rho(g_1) \circ \rho(g_2)$,
i.e., that for every t ∈ T we have $\rho(g_1g_2)(t) = (\rho(g_1) \circ \rho(g_2))(t)$. Indeed, $\rho(g_1g_2)(t) = (g_1g_2) \ast t =
g_1 \ast (g_2 \ast t) = g_1 \ast (\rho(g_2)(t)) = \rho(g_1)(\rho(g_2)(t)) = (\rho(g_1) \circ \rho(g_2))(t)$.
Conversely, suppose that
\[ \rho : G \to S_T \]
is a group homomorphism. Define an action of G on T by
\[ g \ast t := \rho(g)(t). \]
We claim this is a group action. Since ρ is a homomorphism we have $\rho(e) = \mathrm{Id}_T$ and so $e \ast t =
\rho(e)(t) = \mathrm{Id}_T(t) = t$. Now, $g_1 \ast (g_2 \ast t) = \rho(g_1)(\rho(g_2)(t)) = (\rho(g_1) \circ \rho(g_2))(t) = \rho(g_1g_2)(t) =
(g_1g_2) \ast t$.
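The correspondence in Lemma 29.10 can be made concrete on a toy example. In the sketch below the group is Z/4Z acting on four labels by rotation; the set `T`, the action `act` and the dictionary encoding of permutations are all hypothetical choices made for illustration.

```python
T = ['a', 'b', 'c', 'd']     # four labels, thought of as the corners of a square
G = range(4)                 # Z/4Z under addition mod 4

def act(g, t):
    """A hypothetical action of Z/4Z on T: move the label g places around the square."""
    return T[(T.index(t) + g) % 4]

def rho(g):
    """The map t -> g * t, stored as a dictionary."""
    return {t: act(g, t) for t in T}

def compose(s1, s2):
    """Composition of maps T -> T: apply s2 first, then s1."""
    return {t: s1[s2[t]] for t in T}

def action_from(rho_map):
    """Recover an action from a homomorphism, as in the converse direction of the lemma."""
    return lambda g, t: rho_map(g)[t]

print(all(sorted(rho(g).values()) == T for g in G))                      # each rho(g) permutes T
print(all(rho((g1 + g2) % 4) == compose(rho(g1), rho(g2))
          for g1 in G for g2 in G))                                      # rho is a homomorphism
star = action_from(rho)
print(all(star(g, t) == act(g, t) for g in G for t in T))                # the round trip agrees
```

Going from the action to ρ and back returns the original action, which is the sense in which the two kinds of data are equivalent.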
Index
(r), 61 Orb(s), 89 (r1 , r2 , . . . , rn ), 63
Dn , 84
In , 59
Mn (F), 59 Stab(s), 89
sgn, 102
×, 8
{ }, 6
e, 78 R/I , 68 eiθ , 23
o(g ), 82 R[x], 47
Sn , 80
C, 7
F[], 59
Fp , 42 associate (polynomial), 51
binomial
coeﬃcient, 43
theorem, 43
Burnside’s formula, 92 Im(z ), 20
N, 6
N+ , 20
Q, 6
R, 7
Re(z ), 20
⇒, 9
Z, 6
Z/nZ, 40
$\bar{z}$, 21
∩, 7
$\cap_{i \in I}$, 7
≅, 73
∪, 7
∪i∈I , 7
≡ (mod n), 40
∃, 15
∀, 7, 15
∈, 7
g , 80 r1 , r2 , . . . , rn , 63
→, 24
, 61
∈, 7
φ, 7
\, 8
∼, 39
, 9
⊂, 7
⊆, 7 Cantor’s diagonal argument, 19
cardinality, 17
Cauchy-Frobenius formula, 92
Cayley’s theorem, 88
Chinese remainder theorem, 74
complete set of representatives, 40, 85
complex conjugate, 21
congruence, 40
coset, 68, 85
cycle, 82
disjoint, 82
direct product
rings, 60
division with residue, 30, 48
Euclid’s Theorem, 35
Euclidean algorithm, 32, 50
Euler’s constant, 38
Fermat’s little theorem, 43
ﬁbre, 16
ﬁeld, 27
First isomorphism theorem, 73, 99
function, 14
bijective, 16
composition, 16
graph, 15
identity, 14
injective (one-one), 15
inverse, 16
surjective (onto), 15
Fundamental Theorem of Algebra, 24
Fundamental Theorem of Arithmetic, 34
generator, 80
Goldbach’s conjecture, 36
greatest common divisor, 55
greatest common divisor (gcd), 31, 37, 49, 55
group, 78
abelian, 78
action, 89
transitive, 93
cyclic, 80
dihedral, 84
homomorphism, 87
isomorphism, 88
order, 82
quotient, 98
symmetric, 80
trivial, 78
units, 79
homomorphism, 63, 87
ideal, 61
nonprincipal, 63
principal, 61
sum, 62
trivial, 61
index, 86
induction, 11
isomorphism, 72, 88
kernel, 64, 87
Lagrange’s theorem, 86
necklace, 91
number
complex, 6, 7, 20
integer, 6
irrational, 37
natural, 6
prime, 33
rational, 20
real, 7
operation, 26
orbit, 89
order
element, 82
group, 82
permutation, 80
even, 102
odd, 102
pigeonhole principle, 13
polar representation, 22
polynomial
complex, 24
constant, 47
irreducible, 51
monic, 47
rational, 24
real, 24
zero, 24, 47
preimage, 16
prime
number, 33
Prime Number Theorem, 36
principal ideal ring, 61
proof
by contradiction, 10
contrapositive, 10
induction, 11
pigeonhole, 13
prove or disprove, 13
quotient group, 98
quotient ring, 68
relation, 39
congruence, 40
equivalence, 39
ring, 26
commutative, 27
division, 27
dual numbers, 59
Euclidean, 30
homomorphism, 63
integral domain, 30
isomorphism, 72
matrices, 58
polynomial, 47
subring, 28
root, 24
of unity, 24
roulette, 91
RSA, 45
set, 6
countable, 18
diﬀerence, 8
intersection, 7
product, 8
union, 7
sieve of Eratosthenes, 33
sign (of a permutation), 102
stabilizer, 89
subgroup, 80
normal, 98
subring, 28
Theorem
Euclidean algorithm, 32
Fundamental Theorem of Algebra, 24
Fundamental Theorem of Arithmetic, 34
Prime Number, 36
transposition, 82
Twin Prime conjecture, 36
unique factorization
integers, 34
polynomials, 52
rationals, 37
unit, 66
zero, 24
zero divisor, 42