Abstract. We presented a series of four lectures to the
Department of Mathematics at Ohio State University,
entitled “Adventures in the Foundations of Mathematics”.
Lecture 1, “The Structure of Proofs”, April 28, 2003.
Lecture 2, “The Axioms for Mathematics”, April 30, 2003.
Lecture 3, “Do We Need Less Axioms?”, May 5, 2003. Lecture
4, “Do We Need More Axioms?”, May 7, 2003. These are the
edited notes for the lectures.
VIGRE LECTURES
ADVENTURES IN THE FOUNDATIONS OF MATHEMATICS
LECTURE 1
THE STRUCTURE OF PROOFS
Harvey M. Friedman
Department of Mathematics
Ohio State University
April 28, 2003
http://www.math.ohiostate.edu/~friedman/
friedman@math.ohiostate.edu
In mathematics, we back up our discoveries with rigorous
deductive proofs. Mathematicians develop a keen instinctive
sense of what makes a proof rigorous. In logic, we strive
for a *theory* of rigorous proofs.
In this first lecture, we give an idea of what this theory
looks like in any deductive context.
There are many unexplored issues surrounding this purely
deductive aspect of mathematics.
1. PROPOSITIONAL CALCULUS.
But before we talk about proofs, we need to talk about the
form of mathematical statements to be considered. After
all, a proof is to be a proof of a mathematical statement.
We start with a discussion of a greatly simplified form of
mathematical statement. This is usually called the
Propositional Calculus, and has many variant names such as
the Sentential Calculus, etc.
We will abbreviate the Propositional Calculus as PROP (not
a standard abbreviation).
From the point of view of PROP, all mathematical statements
are built up from the five connectives
not (¬)
and (∧)
or (∨)
implies (→)
iff (↔)
Any other part of a mathematical statement is taken as
“primitive”, or “irreducible”.
We need some symbols for these primitive or irreducible
parts of mathematical statements.
We use the infinite list p_1, p_2, ..., called atoms or
propositional variables. In practice, we use p,q,r,...
until we run out of appropriate letters.
Here are some examples of “formulas” of PROP.
p → p
p ∨ ¬p
p ∧ ¬p
p ∧ q
q ∧ p
(p → q) ∨ (q → p)
¬p ∧ q
Notice that in the last example, ¬p ∧ q, there is an
ambiguity. We might mean either of
¬(p ∧ q)
(¬p) ∧ q
which are quite different. One has to use enough
parentheses. However, too many parentheses are a nuisance,
and so conventions have arisen that support the
use of fewer parentheses. The whole business has been
carefully worked out in great generality in the theory of
formal grammar and parsing algorithms, in theoretical
computer science.
All of this is a warmup to the following inductive
definition of the formulas of PROP.
1. every p_n, n ≥ 1, is a formula of PROP.
2. if A,B are formulas of PROP then so are (¬A), (A ∧ B),
(A ∨ B), (A → B), (A ↔ B).
3. formulas of PROP are obtained only by 1,2.
Strictly speaking, none of our seven examples have as many
parentheses as is required by this definition. We will not
go further into such grammatical issues.
PROP does not adequately reflect statements in real
mathematics. E.g., the use of “for all” and “there exists”
is an essential component of mathematical statements. We
will get to the real thing, predicate calculus, later in
the lecture.
These formulas are not meaningless strings of symbols. They
have meaning. Let us reexamine the seven:
p → p: true no matter what p is
p ∨ ¬p: true no matter what p is
p ∧ ¬p: false no matter what p is
p ∧ q
q ∧ p: this and the previous one have the same status no
matter what p,q are
(p → q) ∨ (q → p): true no matter what p,q are
(¬p) ∧ q: it matters what p,q are
We want a rigorous definition to support such judgments.
The truth values are the items T and F. A truth assignment
is a function from the set of atoms into the set of truth
values.
Let f be a truth assignment and A be a PROP formula. We
inductively define SAT(A,f), which means “A is true under
f”.
1. SAT(p_n,f) iff f(p_n) = T.
2. SAT(¬A,f) iff not SAT(A,f).
3. SAT(A ∧ B,f) iff SAT(A,f) and SAT(B,f).
4. SAT(A ∨ B,f) iff SAT(A,f) or SAT(B,f).
5. SAT(A → B,f) iff either not SAT(A,f) or SAT(B,f).
6. SAT(A ↔ B,f) iff SAT(A,f),SAT(B,f) both hold, or both
fail.
We now return to those seven examples.
p → p. SAT under every truth assignment
p ∨ ¬p. SAT under every truth assignment
p ∧ ¬p. SAT under no truth assignment
p ∧ q.
q ∧ p. this and the previous are SAT under the same truth
assignments
(p → q) ∨ (q → p). SAT under every truth assignment
(¬p) ∧ q. SAT under some but not all truth assignments
If SAT(A,f) holds under all truth assignments, then A is
said to be a tautology.
We say that A,B are tautologically equivalent iff for all
truth assignments f,
SAT(A,f) iff SAT(B,f).
We say that A tautologically implies B iff for all truth
assignments f,
if SAT(A,f) then SAT(B,f).
2. HOW TO EARN ONE MILLION DOLLARS OFF OF PROP.
This is generally more than graduate student stipends, even
here.
How do we tell whether or not a PROP formula is a
tautology?
The obvious way is to try out all truth assignments.
Of course there are infinitely many truth assignments. But
there is an easy theorem to the effect that for a given
PROP formula, you only have to consider restricted truth
assignments which assign only to the atoms that actually
occur in the given PROP formula.
If the number of atoms is n, there are exactly 2^n such
restricted truth assignments. The PROP formula might well
have, say, n distinct atoms, and about n^2 total occurrences
of atoms, which is far smaller than 2^n, the number of
restricted truth assignments to be looked at.
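The obvious method can be sketched as follows: collect the atoms that occur, then try all 2^n restricted truth assignments. This is an illustrative sketch only; the tuple encoding of formulas and the helper names are assumptions, and the running time is exponential in the number of atoms, which is exactly the difficulty under discussion.

```python
from itertools import product

# Assumed encoding: atoms are strings; compound formulas are tuples
# ("not", A), ("and", A, B), ("or", A, B), ("implies", A, B), ("iff", A, B).

def atoms(formula):
    """The set of atoms that actually occur in the formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(atoms(sub) for sub in formula[1:]))

def sat(formula, f):
    """Truth value of the formula under the restricted truth assignment f."""
    if isinstance(formula, str):
        return f[formula]
    op, *subs = formula
    vals = [sat(sub, f) for sub in subs]
    if op == "not":
        return not vals[0]
    a, b = vals
    return {"and": a and b, "or": a or b,
            "implies": (not a) or b, "iff": a == b}[op]

def is_tautology(formula):
    """Check all 2^n restricted truth assignments, n = number of atoms."""
    names = sorted(atoms(formula))
    return all(sat(formula, dict(zip(names, values)))
               for values in product([True, False], repeat=len(names)))

print(is_tautology(("or", "p", ("not", "p"))))      # p ∨ ¬p
print(is_tautology(("and", ("not", "p"), "q")))     # (¬p) ∧ q
```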
We would like to have a reasonably efficient computer
algorithm that correctly tells us whether we are looking at
a tautology. E.g., that runs in computer time bounded by a
polynomial in the size of the input.
This kind of talk is made fully rigorous in computational
complexity theory. Furthermore, so many problems of this
type have been shown to be equivalent to each other that there
is a simple name for all such problems, including this one.
This problem is called
P = NP?
It is believed that there is no tautology testing algorithm
of the kind we seek, and equivalently,
P ≠ NP.
This problem is among the seven Clay Institute problems,
worth 1 million dollars each. See
http://www.claymath.org/Millennium_Prize_Problems/
3. PROOFS IN PROP.
We now come to the principal topic in logic: proofs.
We want to discuss proofs of PROP formulas.
Obviously, whatever “proof” means, if a PROP formula has a
proof, then it should be a tautology. This is because, in a
proof of the kind we are interested in, nothing
(substantive) is assumed at the outset, and therefore the
atoms could be any assertions whatsoever.
Obviously, the ideal would be for the converse to hold:
every tautology has a proof.
We can have this state of affairs in a stupid way: just
list all tautologies, and call them axioms. However, we
want our solution to reflect only legitimate moves made in
actual proofs.
Getting a formal representation that completely and
naturally reflects all of the moves made in actual
mathematical proofs is still very much an open question.
But we don’t have to do this. We merely have to reflect
enough of them so that every tautology has a proof.
There are almost as many solutions to this problem as there
are logic texts. We present one solution.
In this approach, a not too large number of not too
complicated tautologies are listed as Axioms. Then the
following two rules of inference are provided:
1. Substitution. If a formula A has been proved, then we
may consider any substitution instance of A proved. I.e.,
we can replace equal letters by equal formulas.
2. Modus Ponens. If formulas A and A → B have been proved,
then we may consider B proved.
We can think of these proofs as a finite sequence of PROP
formulas, each of which is
i) an axiom; or
ii) a PROP formula B, where some A, A → B are previous
entries; or
iii) a substitution instance of a previous entry.
THEOREM. (Completeness). Every PROP formula that is an
entry in such a proof is a tautology, and every tautology
is an entry in such a proof.
We can even be more stringent than i)-iii) and get the same
completeness result, requiring each entry to be
a) a substitution instance of an axiom; or
b) a PROP formula B, where some A, A → B are previous
entries.
This is the most elegant solution to the completeness
problem, if one is to forget about the fact that the finite
list of axioms used is not that small, and some formulas
are downright ugly. For an example of such a list of
axioms, see the Appendix.
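Proofs in the stricter sense a), b) can be checked mechanically. The sketch below is an illustration under stated assumptions: formulas are nested tuples using only the connectives "not" and "implies", the axiom list is the three-axiom system given in the Appendix, and the function names are hypothetical. The sample proof is the standard five-line derivation of p → p.

```python
def imp(a, b):
    return ("implies", a, b)

def neg(a):
    return ("not", a)

# The three axioms listed in the Appendix, with atoms "p", "q", "r".
AXIOMS = [
    imp("p", imp("q", "p")),
    imp(imp("p", imp("q", "r")), imp(imp("p", "q"), imp("p", "r"))),
    imp(imp(neg("q"), neg("p")), imp(imp(neg("q"), "p"), "q")),
]

def match(pattern, formula, binding):
    """Extend binding so that replacing atoms of pattern by formulas yields formula."""
    if isinstance(pattern, str):
        if pattern in binding:              # equal letters get equal formulas
            return binding[pattern] == formula
        binding[pattern] = formula
        return True
    if isinstance(formula, str) or pattern[0] != formula[0]:
        return False
    return all(match(p, g, binding) for p, g in zip(pattern[1:], formula[1:]))

def is_axiom_instance(formula):
    return any(match(axiom, formula, {}) for axiom in AXIOMS)

def check_proof(lines):
    """Each entry is a substitution instance of an axiom, or follows by modus ponens."""
    for i, b in enumerate(lines):
        earlier = lines[:i]
        by_modus_ponens = any(imp(a, b) in earlier for a in earlier)
        if not (is_axiom_instance(b) or by_modus_ponens):
            return False
    return True

pp = imp("p", "p")
proof_of_pp = [
    imp(imp("p", imp(pp, "p")), imp(imp("p", pp), pp)),  # axiom 2, q := p → p, r := p
    imp("p", imp(pp, "p")),                              # axiom 1, q := p → p
    imp(imp("p", pp), pp),                               # modus ponens
    imp("p", pp),                                        # axiom 1, q := p
    pp,                                                  # modus ponens
]
print(check_proof(proof_of_pp))
```

Even this small tautology needs five lines, which gives some feeling for why such systems do not closely reflect the moves made in actual proofs.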
PROBLEM. What is the least n such that there is a finite
set of PROP axioms that works under this approach, where
each formula in the finite set has at most n occurrences of
atoms? Put more generally, how “small” can the finite set
of axioms be for completeness?
Normally one wants more from a deductive system. We want to
accommodate assumptions.
For this purpose, let S be any set of PROP formulas. An
S-proof is a finite sequence of PROP formulas, each of which
is
a) a substitution instance of an axiom; or
b) a PROP formula B, where some A, A → B are previous
entries; or
c) an element of S.
We say that S tautologically implies a PROP formula A if
and only if every truth assignment that satisfies all
elements of S also satisfies A.
THEOREM. (Relative completeness). Every PROP formula that
is an entry in an S-proof is tautologically implied by S,
and every PROP formula tautologically implied by S is an
entry in an S-proof.
There is a purely mathematical consequence of this Theorem,
that doesn’t mention proofs (i.e., is purely semantic).
THEOREM. (Compactness). If S tautologically implies A then
some finite subset of S tautologically implies A.
The Compactness theorem can be looked at topologically. The
mathematical essence of this situation is equivalent to the
fact that the Cantor space is compact. The space of all
truth assignments is a copy of the Cantor space.
4. PREDICATE CALCULUS WITHOUT QUANTIFIERS.
All of this is warmup for the much more subtle Predicate
Calculus, written PRED.
The predicate calculus supports “quantification”. The two
quantifiers are
∀ (for all, the universal quantifier)
∃ (there exists, the existential quantifier).
But “for all what?” “there exists what?”
We clearly need an apparatus that refers to objects. In
PROP we only have the apparatus that refers to statements, i.e., the atoms.
We introduce the apparatus necessary for referring properly
to objects, and then postpone dealing seriously with
quantifiers until later in the lecture.
Let us start with the example of groups. Groups are based
on 0, a binary operation +, a unary operation −, and =. The
usual group axioms are
G1. x + 0 = x.
G2. x + (−x) = 0.
G3. (x + y) + z = x + (y + z).
Ordered Abelian groups have more structure. They are based
on 0,+,−,<,=, with the axioms
OAG1. x + 0 = x.
OAG2. x + (−x) = 0.
OAG3. (x + y) + z = x + (y + z).
OAG4. x + y = y + x.
OAG5. (0 < x ∧ 0 < y) → 0 < x + y.
OAG6. ¬ 0 < 0.
OAG7. 0 < x ∨ 0 < −x ∨ x = 0.
OAG8. x < y ↔ 0 < y + (−x).
Here 0 is a constant symbol, + is a binary function symbol,
− is a unary function symbol, < is a binary relation
symbol, and = is the special binary relation symbol whose
meaning is fixed.
Note that these statements are interpreted universally,
with the universal quantifiers suppressed.
This motivates the so called free variable predicate
calculus, which we will write as FVPRED.
The vocabulary of FVPRED is
1. ¬, ∧, ∨, →, ↔.
2. variables x_n, n ≥ 1.
3. constant symbols c_n, n ≥ 1.
4. function symbols F^n_m, n,m ≥ 1.
5. relation symbols R^n_m, n,m ≥ 1.
6. special relation symbol =.
7. comma and parentheses.
As in PROP, we wish to define the FVPRED formulas. But
before this, we first have to define the FVPRED terms.
E.g., in the group axioms, we see the terms
x, x + 0, x + (−x), 0, (x + y) + z, x + (y + z).
The terms of FVPRED are defined inductively:
i. every variable is a term.
ii. every constant symbol is a term.
iii. for n,m ≥ 1, if t_1,...,t_n are terms then F^n_m(t_1,...,t_n)
is a term.
The formulas of FVPRED are defined inductively:
a. if s,t are terms then s = t is a formula.
b. for n,m ≥ 1, if t_1,...,t_n are terms then R^n_m(t_1,...,t_n) is
a formula.
c. if A,B are formulas then (¬A), (A ∧ B), (A ∨ B), (A →
B), (A ↔ B) are formulas.
Formulas falling under a,b are called atomic formulas.
Let us come back to groups. It is well known that the group
axioms imply
G4. 0 + x = x.
G5. (−x) + x = 0.
G6. −(−x) = x.
G7. −0 = 0.
We want to support such a claim in FVPRED.
Accordingly, let S be a set of formulas of FVPRED, and let
A be a formula of FVPRED. We want to define “S universally
implies A”. As a special case, we want to verify that “the
group axioms {G1,G2,G3} universally imply each of
G4,G5,G6,G7”.
Obviously, this situation is much more involved than the
truth assignments of PROP.
The strategy is to first define a structure (as in
algebraic structure, such as an ordered group). Then to
define variable assignments. Then to define SAT of a
formula under a variable assignment in a structure. Then to
define universal SAT of a set of formulas in a structure.
Finally, define S universally implies A.
A relational structure consists of a nonempty set D called
the domain, together with interpretations of all constant,
function, and relation symbols. Constant symbols are
interpreted as elements of D. Function symbols with
superscript n are interpreted as n-ary functions from D
into D. Relation symbols with superscript n are interpreted
as n-ary relations on D.
= is understood to be special, always being interpreted as
equality on D.
Typically, we only care about the interpretations of just
some of the constant, relation, and function symbols. E.g.,
an ordered group (G,0,+,−,<) is a relational structure
where we only care about the interpretations of one
constant symbol, one binary function symbol, one unary
function symbol, and one binary relation symbol. We take =
for granted in FVPRED, with its usual meaning.
Let M be a relational structure. A variable assignment in
M, or an M-assignment, is a function f from the set of
variables into the domain of M, dom(M).
Let A be a formula. We wish to define SAT(M,A,f). This is
read “M satisfies A under f”, or “A holds in M under f”.
Before we can define SAT(M,A,f), we must first define the
values of terms.
We define VAL(M,t,f), “the value in M of t under f”,
inductively as follows.
1. VAL(M,x_n,f) = f(x_n).
2. VAL(M,c_n,f) is the interpretation of c_n in M.
3. VAL(M,F^n_m(t_1,...,t_n),f) = F(VAL(M,t_1,f),...,VAL(M,t_n,f)),
where F is the interpretation of F^n_m in M.
We are now prepared to inductively define SAT(M,A,f) as
follows.
a. SAT(M,s = t,f) iff VAL(M,s,f) = VAL(M,t,f).
b. SAT(M,R^n_m(t_1,...,t_n),f) iff R(VAL(M,t_1,f),...,VAL(M,t_n,f)),
where R is the interpretation of R^n_m in M.
c. SAT(M,(¬A),f) iff not SAT(M,A,f).
d. SAT(M,(A ∧ B),f) iff SAT(M,A,f) and SAT(M,B,f).
e. SAT(M,(A ∨ B),f) iff SAT(M,A,f) or SAT(M,B,f).
f. SAT(M,(A → B),f) iff either not SAT(M,A,f) or
SAT(M,B,f).
g. SAT(M,(A ↔ B),f) iff either SAT(M,A,f),SAT(M,B,f) both
hold, or SAT(M,A,f),SAT(M,B,f) both fail.
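For a finite structure, the definitions of VAL and SAT can be run directly. The sketch below is illustrative only: the dict layout of a structure, the tuple encoding of terms and formulas, and the choice of the integers mod 5 as the example structure are all assumptions, not part of the lectures. It checks that this structure satisfies G1, G2, G3 under every assignment to the variables that occur.

```python
from itertools import product

# Assumed encoding: a variable is a string; ("const", name) is a constant symbol;
# ("func", name, t1, ..., tn) applies a function symbol to terms.
# Formulas: ("=", s, t), ("rel", name, t1, ..., tn), and the five connectives.

def val(M, t, f):
    """VAL(M,t,f): the value in M of the term t under the assignment f."""
    if isinstance(t, str):                      # clause 1: a variable
        return f[t]
    kind, name, *args = t
    if kind == "const":                         # clause 2: a constant symbol
        return M["const"][name]
    # clause 3: a function symbol applied to terms
    return M["func"][name](*(val(M, a, f) for a in args))

def sat(M, A, f):
    """SAT(M,A,f) for formulas without quantifiers."""
    op = A[0]
    if op == "=":
        return val(M, A[1], f) == val(M, A[2], f)
    if op == "rel":
        return M["rel"][A[1]](*(val(M, t, f) for t in A[2:]))
    if op == "not":
        return not sat(M, A[1], f)
    a, b = sat(M, A[1], f), sat(M, A[2], f)
    return {"and": a and b, "or": a or b,
            "implies": (not a) or b, "iff": a == b}[op]

def sat_all(M, A, variables):
    """A holds in M under every assignment to the listed variables (finite domain only)."""
    return all(sat(M, A, dict(zip(variables, values)))
               for values in product(M["domain"], repeat=len(variables)))

# Example structure (an assumption for illustration): the integers mod 5.
Z5 = {"domain": range(5), "const": {"0": 0},
      "func": {"+": lambda a, b: (a + b) % 5, "-": lambda a: (-a) % 5},
      "rel": {}}

zero = ("const", "0")
def plus(s, t): return ("func", "+", s, t)
def minus(s): return ("func", "-", s)

G1 = ("=", plus("x", zero), "x")
G2 = ("=", plus("x", minus("x")), zero)
G3 = ("=", plus(plus("x", "y"), "z"), plus("x", plus("y", "z")))
print(all(sat_all(Z5, G, ["x", "y", "z"]) for G in (G1, G2, G3)))
```

Only the values of f on the variables occurring in a formula matter, which is why enumerating assignments to a finite list of variables suffices here.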
We now define SAT(M,A). Here we have suppressed the f. This
means that for all M-assignments f, SAT(M,A,f).
More generally, let S be a set of formulas and f be an M-assignment.
We take SAT(M,S,f) to mean that SAT(M,A,f) holds for all A
∈ S. We take SAT(M,S) to mean that SAT(M,A) holds for all A
∈ S.
We come back to groups and ordered Abelian groups. A group
is simply a structure M such that SAT(M,{G1,G2,G3}). In M,
we suppress the interpretations of all but 0,+,−.
An ordered Abelian group is a structure M such that
SAT(M,{OAG1,OAG2,OAG3,OAG4,OAG5,OAG6,OAG7,OAG8}). In M, we
suppress the interpretations of all but 0,+,−,<.
Finally, we define “S universally implies A”. This means
that for all structures M, if SAT(M,S) then SAT(M,A).
Consider the correct statement “in every group G, 0 + x = x
holds universally”. This is obviously equivalent to
“{G1,G2,G3} universally implies 0 + x = x”.
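For the record, here is one standard equational derivation of this fact, written in LaTeX. It first obtains (−x) + x = 0 and then 0 + x = x. Each step uses a substitution instance of G1, G2 or G3 (with −x or −(−x) substituted for a variable where indicated) together with the usual properties of equality; the layout is ours, not the lecture's.

```latex
\begin{align*}
(-x) + x &= ((-x) + x) + 0                        && \text{G1} \\
         &= ((-x) + x) + ((-x) + (-(-x)))         && \text{G2 with } -x \text{ for } x \\
         &= (((-x) + x) + (-x)) + (-(-x))         && \text{G3} \\
         &= ((-x) + (x + (-x))) + (-(-x))         && \text{G3} \\
         &= ((-x) + 0) + (-(-x))                  && \text{G2} \\
         &= (-x) + (-(-x))                        && \text{G1} \\
         &= 0                                     && \text{G2 with } -x \text{ for } x \\[1ex]
0 + x    &= (x + (-x)) + x                        && \text{G2} \\
         &= x + ((-x) + x)                        && \text{G3} \\
         &= x + 0                                 && \text{by the first derivation} \\
         &= x                                     && \text{G1}
\end{align*}
```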
5. PROOFS IN FREE VARIABLE PREDICATE CALCULUS.
There is a fundamental theorem to the effect that for all
sets of formulas S in FVPRED and formulas A in FVPRED, S
universally implies A iff there is a proof of A from S in
some appropriate deductive system.
We will make use of an obvious extension of the notion of
tautology to the context of FVPRED.
We have already defined the formulas of FVPRED. An atomic
formula is a formula of FVPRED that has no connectives.
A truth assignment for FVPRED is a function from the set of
atomic formulas into the set of truth values {T,F}.
Using this definition, we define SAT(A,h) for truth
assignments h, exactly as we did for PROP.
We say that A is a tautology in FVPRED iff SAT(A,h) holds
for all truth assignments h.
Do not confuse SAT(A,h) with our earlier notion SAT(M,A,f).
We now present a proof system for FVPRED. It will be
convenient to leverage off of PROP.
The equality axioms consist of the following formulas.
1. x = x, where x is a variable.
2. x = y → t = t’, where x,y are variables and t’ is the
result of replacing zero or more occurrences of x in t by
y.
Let S be a set of formulas of FVPRED. An S-proof is a
finite sequence of formulas of FVPRED such that every entry
is
i. a tautology; or
ii. an equality axiom; or
iii. an element of S; or
iv. a term substitution instance of a previous entry; or
v. a formula B where some A, A → B are earlier entries.
Here term substitution instances result in the replacement
of variables by terms, the same variables being replaced by
the same terms.
We can be more restrictive as follows:
i. a tautology; or
ii. an equality axiom; or
iii’. a term substitution instance of an element of S; or
v. a formula B where some A, A → B are earlier entries.
THEOREM. (Relative completeness). S universally implies A iff
A is an entry in some S-proof (using i-v or using
i,ii,iii’,v).
As in PROP, we have the following consequence, which does
not involve proofs (i.e., is purely semantic).
THEOREM. (Compactness theorem). If S universally implies A
then some finite subset of S universally implies A.
Will you get a check for a million dollars if you find an
efficient procedure for testing whether A universally
implies B in FVPRED?
Definitely, since this is known to be impossible.
THEOREM. There is no algorithm for testing whether A
universally implies B, in FVPRED.
6. PREDICATE CALCULUS.
We now discuss the full Predicate Calculus, PRED.
Let L(x,y) be “x loves y”. We need to support such
statements as
1. Everybody loves somebody.
2. Somebody loves everybody.
3. Everybody is loved by somebody.
4. Somebody is loved by everybody.
We want to know, rigorously, what the logical relationships
are here.
2 logically implies 3. 4 logically implies 1. There are no
other logical implications.
1. Everybody loves somebody. (∀x)(∃y)(L(x,y)).
2. Somebody loves everybody. (∃x)(∀y)(L(x,y)).
3. Everybody is loved by somebody. (∀x)(∃y)(L(y,x)).
4. Somebody is loved by everybody. (∃x)(∀y)(L(y,x)).
The vocabulary of PRED is the same as for FVPRED except
that we also have the two quantifiers ∀,∃, read “for all”,
“there exists”.
The terms of PRED are the same as the terms of FVPRED. In
the definition of formulas, we have to add this additional
clause for the quantifiers:
d. If A is a formula and x is a variable, then (∀x)(A),
(∃x)(A) are formulas.
We define relational structures, M-assignments (variable
assignments), and VAL(M,t,f) as before.
The inductive definition of SAT(M,A,f) needs two additional
clauses to take care of the quantifiers.
h. SAT(M,(∀x_n)(A),f) iff for all d ∈ D, SAT(M,A,f[x_n/d]).
i. SAT(M,(∃x_n)(A),f) iff there exists d ∈ D such that
SAT(M,A,f[x_n/d]).
Here f[x_n/d] is the same as f except that it is forced to be
d at the argument x_n.
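Over a finite domain, clauses h and i become loops over D. The sketch below is an illustration, not part of the lectures: it evaluates the four “loves” statements in every structure with the two-element domain {0, 1} (an arbitrary choice) and reports which implications between them survive. The tuple encoding and all names are assumptions. A failed implication is a genuine countermodel; a surviving one is only evidence, since validity quantifies over all structures.

```python
from itertools import product

def sat(D, L, A, f):
    """SAT for formulas built from L(x,y) and the two quantifiers, over a finite domain D."""
    op = A[0]
    if op == "L":                                   # atomic formula L(x, y)
        return (f[A[1]], f[A[2]]) in L
    if op == "forall":                              # clause h, with f[x/d] as {**f, x: d}
        return all(sat(D, L, A[2], {**f, A[1]: d}) for d in D)
    if op == "exists":                              # clause i
        return any(sat(D, L, A[2], {**f, A[1]: d}) for d in D)
    raise ValueError(f"unknown operator: {op}")

STATEMENTS = {
    1: ("forall", "x", ("exists", "y", ("L", "x", "y"))),   # everybody loves somebody
    2: ("exists", "x", ("forall", "y", ("L", "x", "y"))),   # somebody loves everybody
    3: ("forall", "x", ("exists", "y", ("L", "y", "x"))),   # everybody is loved by somebody
    4: ("exists", "x", ("forall", "y", ("L", "y", "x"))),   # somebody is loved by everybody
}

D = [0, 1]
pairs = list(product(D, D))
# All 16 binary relations on D, each as a set of pairs.
relations = [{p for p, keep in zip(pairs, bits) if keep}
             for bits in product([False, True], repeat=len(pairs))]

surviving = sorted(
    (i, j) for i in STATEMENTS for j in STATEMENTS if i != j
    and all(sat(D, L, STATEMENTS[j], {})
            for L in relations if sat(D, L, STATEMENTS[i], {})))
print(surviving)   # [(2, 3), (4, 1)]
```

The two surviving pairs are exactly the implications stated above, so two-element structures already refute every other implication among the four statements.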
We define SAT(M,A) iff for all M-assignments f, SAT(M,A,f).
We define SAT(M,S,f) iff for all A ∈ S, SAT(M,A,f). We
define SAT(M,S) iff for all A ∈ S, SAT(M,A). Finally, we
say that A is valid iff SAT(M,A) holds for all relational
structures M.
Note that all of the definitions in the previous paragraph
read exactly the same as those made for FVPRED, but take
into account a wider class of formulas (same structures and
same assignments).
We say that S universally implies A iff for all relational
structures M, if SAT(M,S) then SAT(M,A). This also reads
the same as for FVPRED.
The tautologies of PRED are defined the same way as the
tautologies of FVPRED, except that instead of just using
the atomic formulas as the “basis”, we also use the
formulas of PRED that begin with a quantifier. So the h’s
now have domain the set of all formulas of PRED that are either
atomic or begin with a quantifier.
A formula A of PRED is said to be valid if and only if for
all structures M, SAT(M,A). This notion is crucially
important in the development of PRED, but not in FVPRED.
This is because in FVPRED, a formula is valid if and only
if it is a tautology.
7. PROOFS IN PREDICATE CALCULUS.
Of course, there are almost as many solutions to
Completeness as there are logic books. Here is one well
known solution.
A proof is a finite sequence of formulas such that each
entry is
1. a tautology; or
2. an equality axiom; or
3. of the form (∀x)(A) → A[x/t]; or
4. of the form A[x/t] → (∃x)(A); or
5. of the form A → (∀x)(B) where A → B is a previous
entry; or
6. of the form (∃x)(A) → B where A → B is a previous
entry; or
7. a formula B, where some A, A → B are previous entries.
Here A,B are formulas of PRED and A[x/t] is the result of
replacing all free (i.e., unquantified) occurrences of x in
A by the term t.
There are technical restrictions needed on 3-6. We give
these restrictions in the Appendix.
Here are some illustrative examples as to what can go very
wrong without these technical restrictions on 3-6.
In 3, (∀x)(∃y)(R(x,y)) → (∃y)(R(y,y)) is bad, using t = y.
In 4, (∀y)(R(y,y)) → (∃x)(∀y)(R(x,y)) is bad, using t = y.
In 5, R(x) → (∀x)(R(x)) from R(x) → R(x) is bad.
In 6, (∃x)(R(x)) → R(x) from R(x) → R(x) is bad.
THEOREM. (Completeness). A formula in PRED is valid iff it
is an entry in some proof.
Let S be a set of formulas of PRED. An S-proof is the same
as a proof except that an entry is allowed to be any
element of S.
THEOREM. (Relative completeness). S universally implies A
in PRED iff A is an entry in some S-proof.
THEOREM. (Compactness). In PRED, S universally implies A
iff some finite subset of S universally implies A.
8. SOME OPEN ISSUES.
The complete and relatively complete systems we have
presented fall into the category of what are called Hilbert
systems. Although they serve the limited purposes we have
discussed quite well, they are not set up to closely
reflect what is going on, logically, in actual mathematical
proofs, polished or otherwise.
Other breeds of systems, called sequent calculi and natural
deduction systems, are at least partially intended to more
closely reflect actual logical reasoning.
A number of important results concerning these alternative
systems establish that certain of the rules can be
eliminated. The focus has been on “cut elimination” and the
related normalization. However, I am confident that a
systematic study of just what rules can or cannot be
eliminated, and at what cost, driven by the examination of
actual mathematical proofs, will be quite interesting.
Another issue is more pragmatic. There is the problem of
developing a deductive system that makes it easy to read
and write mathematical proofs. There is a lot to do in this
direction, even if we restrict our attention to the
rigorous presentation of mathematical assertions, before
we even consider proofs.
It is commonly believed among logicians that completely
formal proofs can be constructed for even the deepest and
most complex state of the art mathematics. Put differently,
that completely formal proofs of the entire mathematical
corpus, put on paper in a normal sized font, could fit into
a large hall.
Mathematicians are perhaps skeptical, with some underlying
feeling that since proofs have to be “felt” to be
understood, there may be substantial “jumps” made that are
clear to humans, but which, when fully unraveled, could
exponentiate the size of a formal proof.
I side with the logicians on this, but developing the tools
to make this convincing is an interesting problem.
APPENDIX
A well known complete formal system for PROP based on only
the two connectives ¬,→, is as follows. We follow standard
conventions concerning the elimination of unnecessary
parentheses.
Axioms.
p → (q → p)
(p → (q → r)) → ((p → q) → (p → r))
(¬q → ¬p) → ((¬q → p) → q).
Rules.
from A and A → B, derive B.
from A derive any substitution instance of A.
Alternatively,
from A and A → B, derive B.
from any axiom derive any substitution instance.
This system was taken from Elliot Mendelson, Introduction
to Mathematical Logic, Van Nostrand, 1964.
The remaining connectives ∧, ∨, ↔, can be introduced as
abbreviations in the well known way, or we can expand the
set of axioms to accommodate them. A crude but systematic
expansion of the set of axioms is as follows.
p → (q → p)
(p → (q → r)) → ((p → q) → (p → r))
(¬q → ¬p) → ((¬q → p) → q)
(p ∧ q) → ¬(p → ¬q)
¬(p → ¬q) → (p ∧ q)
(p ∨ q) → ((¬p) → q)
((¬p) → q) → (p ∨ q)
(p ↔ q) → ¬((p → q) → ¬(q → p))
¬((p → q) → ¬(q → p)) → (p ↔ q).
Again we can use either of the two versions of the rules.
We give the restrictions that have to be made in the
following complete axiomatization for PRED.
1. a tautology; or
2. an equality axiom; or
3. of the form (∀x)(A) → A[x/t]; or
4. of the form A[x/t] → (∃x)(A); or
5. of the form A → (∀x)(B) where A → B is a previous
entry; or
6. of the form (∃x)(A) → B where A → B is a previous
entry; or
7. a formula B, where some A, A → B are previous entries.
Here A[x/t] is the result of replacing each free occurrence
of x in A by the term t.
We require the following.
a. In 3,4, t is substitutable for x in A, in the sense that
no free occurrence of x lies within the scope of a
quantifier that uses a variable appearing in t.
b. In 5, x is not free in A.
c. In 6, x is not free in B.
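Restrictions a, b, c are purely syntactic and can be checked by recursion on formulas. The sketch below is one possible reading, under assumptions made for illustration: a variable is a string, any other term is a tuple (symbol, t1, ..., tn), an atomic formula is a tuple (relation symbol, t1, ..., tn), and quantified formulas are ("forall", x, A) and ("exists", x, A). The function names are hypothetical.

```python
CONNECTIVES = {"not", "and", "or", "implies", "iff"}
QUANTIFIERS = {"forall", "exists"}

def term_vars(t):
    """Variables appearing in a term."""
    if isinstance(t, str):
        return {t}
    return set().union(*(term_vars(s) for s in t[1:]))

def free_vars(A):
    """Variables with at least one free occurrence in the formula A."""
    op = A[0]
    if op in QUANTIFIERS:
        return free_vars(A[2]) - {A[1]}
    if op in CONNECTIVES:
        return set().union(*(free_vars(B) for B in A[1:]))
    return set().union(*(term_vars(s) for s in A[1:]))      # atomic formula

def substitutable(t, x, A, bound=frozenset()):
    """Restriction a: no free occurrence of x in A lies within the scope of a
    quantifier that uses a variable appearing in t."""
    op = A[0]
    if op in QUANTIFIERS:
        if A[1] == x:                   # x is bound below here: no free occurrences
            return True
        return substitutable(t, x, A[2], bound | {A[1]})
    if op in CONNECTIVES:
        return all(substitutable(t, x, B, bound) for B in A[1:])
    occurs = any(x in term_vars(s) for s in A[1:])
    clash = bool(bound & term_vars(t))
    return not (occurs and clash)

A3 = ("exists", "y", ("R", "x", "y"))   # the A in the bad example for 3
A4 = ("forall", "y", ("R", "x", "y"))   # the A in the bad example for 4
print(substitutable("y", "x", A3), substitutable("z", "x", A3))
print("x" in free_vars(("R", "x")))     # why the examples for 5 and 6 are bad
```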
For the definitions of free variable, free occurrence,
scope, and substitutability (of terms for variables),
consult any logic text or any logician.
LECTURE 2
THE AXIOMS FOR MATHEMATICS
Harvey M. Friedman
Department of Mathematics
Ohio State University
April 30, 2003
http://www.math.ohiostate.edu/~friedman/
friedman@math.ohiostate.edu
In logic, mathematics is viewed as proceeding by rigorous
deduction starting with certain axioms for mathematics.
Such axioms are needed in order to support the varied
constructions of mathematical objects that occur throughout
mathematics. The most effective way known to achieve such
unified foundations for mathematics is through the axioms
for set theory, and the set theoretic interpretation of
mathematics. In this second lecture, we discuss the
standard ZFC axioms (Zermelo-Fraenkel set theory with the
axiom of choice). On one hand, ZFC seems like overkill,
since so little of its power is really used in typical
mathematical contexts. On the other hand, ZFC is known to
be insufficient in certain kinds of mathematical contexts.
The set theoretic interpretation of mathematics uses what
is called pure set theory, where absolutely every object is
a set and the only relations between sets are that of
membership and equality. Even natural numbers must be
defined as certain sets.
1. HEREDITARILY FINITE SETS.
We begin with a discussion of the finite part of (pure) set
theory. In this section we will proceed purely
mathematically, taking the natural numbers as given, as
well as standard mathematical concepts such as “finite”.
However, in set theory, everything is a set, including
natural numbers. All concepts must be defined in terms of
sets. In section 2, we fill in these gaps.
A particularly fundamental principle of set theory asserts
that two sets are equal iff they have the same elements.
When we discuss axioms (later), this will be called the
Axiom of Extensionality.
The set with no elements is called the empty set, and is
denoted by ∅. If x1,...,xn are any objects whatsoever, we
write {x1,...,xn} for the set whose elements are exactly
x1,...,xn. By extensionality, repetitions don’t count.
We are now going to define an infinite sequence of
particular sets, indexed along the natural numbers, by
recursion.
V(0) = ∅.
V(n+1) is the set of all subsets of V(n).
We use S for the power set. Thus we can rewrite the above
as
V(0) = ∅.
V(n+1) = S(V(n)).
The V(n)’s are rather intricate, and enjoy a number of
interesting properties which can be verified by induction
on n.
We “compute” a few of these V(n)’s as follows.
V(1) = {∅}.
V(2) = {∅, {∅}}.
V(3) = {∅, {∅}, {{∅}}, {∅, {∅}}}.
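These computations can be reproduced with Python's immutable frozenset standing in for a pure set. This is only a sketch for experimentation; the helper names are ours. The sizes grow as 0, 1, 2, 4, 16, 65536, so V(6) is already far out of reach.

```python
from itertools import combinations

def power_set(s):
    """The set of all subsets of the finite set s."""
    items = list(s)
    return frozenset(frozenset(c)
                     for r in range(len(items) + 1)
                     for c in combinations(items, r))

def V(n):
    """V(0) is the empty set; V(n+1) is the power set of V(n)."""
    level = frozenset()
    for _ in range(n):
        level = power_set(level)
    return level

def is_transitive(s):
    """Every element of an element of s is an element of s."""
    return all(z in s for y in s for z in y)

print([len(V(n)) for n in range(5)])                # [0, 1, 2, 4, 16]
print(V(3) in V(4), V(3) <= V(4), is_transitive(V(4)))
```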
We call a set transitive iff every element of an element is
an element. We use |x| for the cardinality of the finite
set x.
EXERCISE. For all nonnegative integers n,
i) |V(n+1)| = 2^|V(n)|;
ii) V(n) ∈ V(n+1);
iii) V(n) is transitive;
iv) V(n) ⊆ V(n+1);
v) V(n) ∉ V(n);
vi) every element of V(n) is finite.
A set is said to be hereditarily finite iff it is a member
of some V(n).
We will henceforth abbreviate the adjective “hereditarily
finite” by HF. (HF is my initials, but I can assure you
that this is totally accidental).
Let x be a set. We say that y is an ∈-minimal element of x
iff
a) y lies in x;
b) no element of x lies in y.
We say that y is an ∈-maximal element of x iff
c) y lies in x;
d) no element of x has y in it.
EXERCISE. Every nonempty HF set has an ∈-minimal element.
EXERCISE. Every nonempty HF set has an ∈-maximal element.
EXERCISE. Every element of an HF set is HF. Every subset of
an HF set is HF. Every finite set of HF sets is HF.
The existence of the set of all HF sets is a major step
that we don’t take just yet. We will come back to this
later.
EXERCISE. The set of all HF sets, if it exists, is not an
HF set.
There is a way of characterizing the HF sets without
building the V(n)’s. The transitive closure of a set is the
least transitive superset of that set. (We won’t worry
about the existence of the transitive closure right now).
EXERCISE. A set is HF iff its transitive closure is finite.
Finally, recall the common set theoretic operations:
x ∪ y = {z: z ∈ x or z ∈ y}.
x ∩ y = {z: z ∈ x and z ∈ y}.
x\y = {z: z ∈ x and z ∉ y}.
EXERCISE. If x,y are HF, then so are x ∪ y, x ∩ y, x\y.
2. FINITE MATHEMATICS IN HEREDITARILY FINITE SET THEORY.
Recall that in the (set theoretic) foundations of
mathematics, everything is going to be a set, and the only
relations between sets are going to be that of membership
and equality.
Correspondingly, in the (set theoretic) foundations of
finite mathematics, everything is an HF set, and the only
relations between such sets are that of membership and
equality.
We now discuss several of the constructions that are used
for the foundations of finite mathematics in the HF sets.
First, we need a workable way to interpret the natural
numbers as HF sets. A very convenient way was discovered by
von Neumann.
Here are the first five von Neumann natural numbers:
∅
{∅}
{∅, {∅}}
{∅, {∅}, {∅, {∅}}}
{∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}}.
These five are denoted by 0,1,2,3,4.
We can define these “numbers” by recursion on the natural
numbers from ordinary mathematics, as follows.
0* = ∅.
(n+1)* = n* ∪ {n*}.
We say that x is a NN (natural number) if and only if x is
some n*.
EXERCISE. For all n,
i) |n*| = n;
ii) n* = {0*,1*,...,(n−1)*};
iii) n* is transitive;
iv) n* ⊆ (n+1)*;
v) n* ∉ n*;
vi) n* is HF.
EXERCISE. For all n,m, n < m ↔ n* ∈ m*. For all n,m, n* ∈
m* ∨ m* ∈ n* ∨ n* = m*.
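The von Neumann numbers are easy to build and test for small n, again with frozenset standing in for a pure set. This is a sketch for experimentation, with hypothetical helper names; checking finitely many cases is of course no substitute for the inductive proofs the exercises ask for.

```python
def nn(n):
    """Return n*, built by 0* = empty set and (n+1)* = n* ∪ {n*}."""
    x = frozenset()
    for _ in range(n):
        x = x | frozenset({x})
    return x

def is_transitive(x):
    """Every element of an element of x is an element of x."""
    return all(z in x for y in x for z in y)

# Spot-check the exercises for n, m below 6.
for n in range(6):
    assert len(nn(n)) == n                                  # |n*| = n
    assert nn(n) == frozenset(nn(k) for k in range(n))      # n* = {0*,...,(n-1)*}
    assert is_transitive(nn(n))
    assert nn(n) <= nn(n + 1) and nn(n) not in nn(n)
    for m in range(6):
        assert (n < m) == (nn(n) in nn(m))                  # n < m iff n* ∈ m*

print(nn(3) == frozenset({nn(0), nn(1), nn(2)}))
```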
Note that not every subset of a NN is a NN. E.g., {1*} is
not a NN.
EXERCISE. The set of all NN, if it exists, is not HF.
There is a way of characterizing the NN without building
them by induction.
We say that a set x is ∈-connected iff
for all y,z ∈ x, y ∈ z or z ∈ y or y = z.
EXERCISE. A set is a NN iff it is transitive, ∈-connected,
and finite.
Ordered pairs are essential. For HF x,y, we define
<x,y> = {{x},{x,y}}.
The crucial fact that makes this work is this.
EXERCISE. For HF x,y,z,w, <x,y> = <z,w> ↔ (x = z ∧ y = w).
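The crucial fact can be spot-checked by brute force on a few small HF sets. The sketch below uses the four elements of V(3) as an arbitrarily chosen sample, with frozenset standing in for a pure set; it illustrates the statement and does not prove it.

```python
from itertools import product

def pair(x, y):
    """The ordered pair <x,y> = {{x},{x,y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

empty = frozenset()
one = frozenset({empty})
# The four elements of V(3): ∅, {∅}, {{∅}}, {∅, {∅}}.
sample = [empty, one, frozenset({one}), frozenset({empty, one})]

# <x,y> = <z,w> iff x = z and y = w, for all choices from the sample.
ok = all((pair(x, y) == pair(z, w)) == (x == z and y == w)
         for x, y, z, w in product(sample, repeat=4))
print(ok)                           # True
print(len(pair(empty, empty)))      # 1, since <x,x> = {{x}}
```

The second line shows the degenerate case: repetitions don't count, so <x,x> collapses to {{x}}, and the characteristic property still holds.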
Now we can introduce functions. Let x be HF. We say that x
is a function if and only if
i) every element of x is an ordered pair;
ii) if <y,z>,<y,w> ∈ x then z = w.
For HF x, we define
dom(x) = {y: (∃z)(<y,z> ∈ x)}.
For HF x, we define
rng(x) = {y: (∃z)(<z,y> ∈ x)}.
EXERCISE. If x is HF, then dom(x),rng(x) are HF.
We define
x(y) = z ↔ <y,z> ∈ x.
We write
x:y → z ↔ (x is a function ∧
dom(x) = y ∧ rng(x) ⊆ z).
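The definitions of function, dom and rng can be sketched in Python on top of the ordered pair {{x},{x,y}}. All function names here (pair, unpair, is_function, dom, rng) are assumptions of this sketch, and HF sets are modelled as nested frozensets.

```python
def pair(x, y):
    """Ordered pair <x,y> = {{x},{x,y}}."""
    return frozenset([frozenset([x]), frozenset([x, y])])

def unpair(p):
    """Return (x, y) if the HF set p is an ordered pair <x,y>, else None."""
    if len(p) == 1:                       # the case <x,x> = {{x}}
        (s,) = p
        if len(s) == 1:
            (x,) = s
            return (x, x)
        return None
    if len(p) == 2:
        a, b = sorted(p, key=len)
        if len(a) == 1 and len(b) == 2 and a <= b:
            (x,) = a
            (y,) = b - a
            return (x, y)
    return None

def is_function(f):
    """Clauses i) and ii): all elements are pairs, and values are unique."""
    pairs = [unpair(p) for p in f]
    if any(q is None for q in pairs):
        return False
    return all(z == w for (y, z) in pairs for (v, w) in pairs if y == v)

def dom(f):
    """Set of first coordinates of the ordered pairs in f."""
    return frozenset(q[0] for q in map(unpair, f) if q is not None)

def rng(f):
    """Set of second coordinates of the ordered pairs in f."""
    return frozenset(q[1] for q in map(unpair, f) if q is not None)

zero = frozenset()
one = frozenset([zero])
f = frozenset([pair(zero, one), pair(one, one)])   # the constant function 1 on {0,1}
print(is_function(f), dom(f) == frozenset([zero, one]), rng(f) == frozenset([one]))
```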
Let x,y be HF. We define
x × y = {<z,w>: z ∈ x ∧ w ∈ y}.
EXERCISE. Let x,y be HF. x × y is HF. The set of all
functions from x into y is HF.
Let x be HF. We say that x is a finite sequence iff x is a
function whose domain is a NN.
We need the theory of cardinality for HF sets.
EXERCISE. Let x be HF. There is a one-one function from x
onto a unique NN. We write |x| for this unique NN.
EXERCISE. Let x be HF. Every function from x onto x is
one-one, and every one-one function from x into x is onto.
We now indicate how we develop the usual number systems
with the usual arithmetic operations and order. In finite
mathematics, we have, principally, the ordered ring of
integers, and the ordered field of rationals.
The essence of this is addition and multiplication on the
NN.
We make the following definitions by recursion. Define
S(n*) = n* ∪ {n*}. It is easy to see that S(n*) = (n+1)*,
S(n*) = S(m*) → n* = m*, and S(n*) ≠ 0* = ∅. Also we
define
n* + 0* = n*.
n* + S(m*) = S(n* + m*).
n* · 0* = 0*.
n* · S(m*) = (n* · m*) + n*.
Some serious argument is needed to support these two
definitions by recursion.
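To see the recursion equations at work, here is a minimal Python sketch that computes sums and products directly on von Neumann numerals, modelled as nested frozensets. The names S, numeral, pred, add and mul are assumptions of this sketch. Python's own recursion stands in for the existence and uniqueness argument, which the sketch does not replace.

```python
EMPTY = frozenset()

def S(x):
    """Successor: S(n*) = n* U {n*}."""
    return x | frozenset([x])

def numeral(k):
    """Build k* by applying S to the empty set k times."""
    x = EMPTY
    for _ in range(k):
        x = S(x)
    return x

def pred(x):
    """Predecessor of a nonzero numeral: its element with the most elements."""
    return max(x, key=len)

def add(n, m):
    """n + 0 = n;  n + S(m) = S(n + m)."""
    return n if m == EMPTY else S(add(n, pred(m)))

def mul(n, m):
    """n * 0 = 0;  n * S(m) = (n * m) + n."""
    return EMPTY if m == EMPTY else add(mul(n, pred(m)), n)

print(add(numeral(2), numeral(3)) == numeral(5))
print(mul(numeral(3), numeral(2)) == numeral(6))
```

The numerals grow quickly as nested structures, so the sketch is only practical for small arguments.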
For the first pair of equations, we prove that for all n*,
there exists a unique function f_n with domain NN such that
f_n(0*) = n* and each f_n(S(m*)) = S(f_n(m*)). To prove this,
fix n. Then prove by induction that for all r, there is such a
unique function that works for m* from 0 through r. Then
these functions can be put together into the desired single
function.
Using these unique functions f_n, we prove that for all
n*,m*, there exists a unique function g with domain n* × m*
such that g(n*,m*) = f_n(m*).
The second pair of equations is handled similarly, with the
help of g from the previous paragraph.
Armed with these equations, we can proceed by induction to
prove many facts about the NN under addition and
multiplication, including the commutativity and
associativity of addition and multiplication, and
distributivity.
We will now leave the *’s off. We already have the order ∈
on NN. We write < for ∈ between NN’s. We write n ≤ m for (n
< m ∨ n = m). By induction, we have
(n < m ∧ r ≤ s) → n + r < m + s
(n < m ∧ 0 < r ≤ s) → n · r < m · s
n + m = n + r ↔ m = r
n ≠ 0 → (n · m = n · r ↔ m = r)
< is a linear ordering on NN
n ≤ m ↔ (∃k)(n + k = m).
We can now move to the ring of integers. An integer is an
ordered pair (i,n), where i = 0 or 1 (0 is ∅, 1 is {∅}),
and n is a NN. We stipulate that (0,0) is an integer, but
not (1,0).
We define addition and multiplication explicitly for
integers, as well as the order.
EXERCISE. Addition, multiplication, and order on the
integers form a discrete ordered ring. Furthermore, every
nonempty finite set of integers has a smallest and largest
element.
EXERCISE. Let n,m be integers, m nonzero. We can find
unique n’,m’ with no common divisors other than ±1, m’ >
0, where for some integer d, we have n = d·n’, m = d·m’.
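The reduction in this exercise can be sketched in Python with ordinary machine integers in place of the set theoretic integers. The function name lowest_terms is an assumption of this sketch. It relies on the standard library function math.gcd, which returns a nonnegative greatest common divisor.

```python
from math import gcd

def lowest_terms(n, m):
    """Return (n1, m1, d) with n = d*n1, m = d*m1, m1 > 0, and n1, m1 coprime."""
    if m == 0:
        raise ValueError("m must be nonzero")
    d = gcd(n, m)          # positive, because m is nonzero
    if m < 0:
        d = -d             # choose the sign of d so that m1 = m/d is positive
    return n // d, m // d, d

print(lowest_terms(6, -4))   # the pair (-3, 2) together with d = -2
```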
We now define the rationals. These are ordered pairs (n,m)
such that m > 0 and there are no common divisors of n,m
other than ±1.
We then define addition, multiplication using the last
Exercise. We also define <.
EXERCISE. Addition, multiplication, and order on the
rationals form a densely ordered field.
3. AXIOMS FOR HEREDITARILY FINITE SET THEORY.
We have seen how to develop hereditarily finite set theory
within ordinary mathematics, taking a number of concepts
from mathematics for granted. We now want to develop
hereditarily finite set theory without any prior
mathematics.
We do this by writing down a number of axioms about HF
sets, which use only the concepts of HF set, ∈, and =.
These axioms are presented in PRED (predicate calculus with
equality), discussed in Lecture 1.
As mathematicians, we can recognize the truth of each
axiom in the structure consisting of all HF sets where ∈,=
have their usual mathematical meaning.
These axioms about the HF sets are very powerful. They are
sufficient for us to redo everything we have done up to now
about the HF sets, without resorting to any other concepts
or reasoning.
So let’s start over completely from scratch when thinking
about the HF sets. To make these axioms easier to read, we
freely use some standard abbreviations.
Also, in the English presentations, “sets” always means “HF
sets”.
1. EXTENSIONALITY. If two sets have the same elements then
they are equal. (∀z)(z ∈ x ↔ z ∈ y) ↔ x = y.
2. PAIRING. There is a set consisting of any two given
sets. (∃z)(∀w)(w ∈ z ↔ (w = x ∨ w = y)).
3. UNION. There is a set consisting of the elements of the
elements of any given set. (∃y)(∀z)(z ∈ y ↔ (∃w ∈ x)(z ∈
w)).
4. POWER SET. There is a set consisting of all subsets of
any given set. (∃y)(∀z)(z ∈ y ↔ z ⊆ x).
5. CHOICE. For every set of pairwise disjoint nonempty sets
there is a set which has exactly one element in common with
each of the nonempty sets. (∀y,z ∈ x)(y ≠ ∅ ∧ (y ≠ z → y ∩ z
= ∅)) → (∃z)(∀y ∈ x)(∃!w)(w ∈ y ∩ z).
6. FOUNDATION. Every nonempty set has an ∈-minimal element.
x ≠ ∅ → (∃y ∈ x)(∀z ∈ x)(z ∉ y).
7. SEPARATION. There is a set consisting of the elements of
any given set that satisfy some given condition. (∃y)(∀z)(z
∈ y ↔ (z ∈ x ∧ A)), where A is any formula of PRED using
only ∈,=, in which y does not appear.
8. REPLACEMENT. If every element of a given set is related
to exactly one set by a given condition, then there is a
set consisting of the sets such that some element of the
given set is related to it. (∀y ∈ x)(∃!z)(A) → (∃w)(∀z)(z
∈ w ↔ (∃y ∈ x)(A)), where A is any formula of PRED using
only ∈,=, in which w does not appear.
9. FINITENESS. Every nonempty set has an ∈-maximal element.
x ≠ ∅ → (∃y ∈ x)(∀z ∈ x)(y ∉ z).
Specifically note that 7 and 8 are made up of infinitely
many axioms, as there are infinitely many formulas of PRED
using only ∈,=.
It is easy to see that Replacement implies Separation, and
so we can remove Separation. However, it is standard to
include both.
All of these axioms are useful. Mathematicians want them
for immediate use.
THEOREM 3.1. Extensionality, pairing, union, foundation,
separation, finiteness logically imply the remaining
ones: power set, choice, replacement. However, if we drop
finiteness, then we cannot obtain any of these remaining
three.
We can do finite mathematics entirely within these axioms.
We define x is a NN ↔ (x is ∈-connected ∧ x is transitive),
where
x is ∈-connected ↔ (∀y,z ∈ x)(y ∈ z ∨ z ∈ y ∨ y = z).
x is transitive ↔ (∀y ∈ x)(∀z ∈ y)(z ∈ x).
We can prove all of the basic facts about the NN within
these axioms.
EXERCISE. We can prove from 1-9 that ∅ is a NN, and for all
NN x,y,
i) x ∈ y ∨ y ∈ x ∨ x = y;
ii) x ⊆ y ∨ y ⊆ x;
iii) x ∪ {x} is a NN;
iv) (∀z ∈ x)(z is a NN);
v) x ≠ ∅ → (∃!z)(x = z ∪ {z});
vi) every transitive set of NN’s is a NN.
Let n be a NN. We define n+1 = n » {n}.
We want to make sense of the original definition of the HF
sets:
V(0) = ∅,
V(n+1) = S(V(n)).
Here S(u) denotes the power set of u, which is the set of
all subsets of u.
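For intuition, the first few levels V(n) can be computed outright. The following Python sketch models HF sets as nested frozensets; the names powerset and V are assumptions of this sketch. Only small n are feasible, because the size of V(n+1) is 2 to the power of the size of V(n).

```python
from itertools import combinations

def powerset(u):
    """Set of all subsets of the finite set u."""
    items = list(u)
    return frozenset(frozenset(c)
                     for r in range(len(items) + 1)
                     for c in combinations(items, r))

def V(n):
    """V(0) is empty; V(n+1) is the power set of V(n)."""
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level

print([len(V(n)) for n in range(5)])   # sizes of V(0) through V(4)
```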
EXERCISE. We can prove this in 1-9. Let x be a NN. There is
a unique function f such that
i) dom(f) = x;
ii) f(∅) = ∅;
iii) (∀y ∈ x)(f(y ∪ {y}) = S(f(y))).
Using this exercise, for NN x, we define V(x) as f(x),
where f is the unique function satisfying these three
clauses for the NN x+1.
EXERCISE. We can prove from 1-9 that (∀x)(∃y)(y is a NN ∧ x
∈ V(y)). I.e., we can prove from 1-9 that “every set is
hereditarily finite”.
EXERCISE. We can prove from 1-9 that every x is in one-one
correspondence with a unique NN.
We can continue the development of the HF sets entirely
within 1-9.
4. ZERMELO FRAENKEL SET THEORY WITH THE AXIOM OF CHOICE.
This system is denoted by ZFC, and is the gold standard for
mathematical proofs. In fact, it is such a widely accepted
gold standard that it is practically never pulled out to
settle disputes. In any case, disputes are invariably based
on some mistake or misunderstanding, and not misuses of the
axioms, or failure to stay within the axioms.
ZFC is the same as 1-9 except that 9 is dropped and
replaced by a new 9’. Thus ZFC is
1. EXTENSIONALITY.
2. PAIRING.
3. UNION.
4. POWER SET.
5. CHOICE.
6. FOUNDATION.
7. SEPARATION.
8. REPLACEMENT.
9’. INFINITY. (∃x)(∅ ∈ x ∧ (∀y ∈ x)(y ∪ {y} ∈ x)).
The usual definition of finite set in ZFC is a set which is
in one-one correspondence with an element of ω.
Recall the definition of NN in 1-9 given in the previous
section:
x is a NN ↔ (x is ∈-connected ∧ x is transitive).
This definition is not operative in 1-9’. Instead, the
right side is the definition of ordinal in ZFC. I.e., we
define
x is an ordinal ↔ (x is ∈-connected ∧ x is transitive).
EXERCISE. (ZFC). If x,y are ordinals, then
i) x ∈ y ∨ y ∈ x ∨ x = y;
ii) x ⊆ y ∨ y ⊆ x;
iii) x ∪ {x} is an ordinal;
iv) every element of an ordinal is an ordinal;
v) every transitive set of ordinals is an ordinal;
vi) every set of ordinals is a subset of an ordinal;
vii) every nonempty set of ordinals has a unique least
element;
viii) the ordinals are exactly the transitive sets of
transitive sets.
x ∪ {x} is called the successor of x, written x+1. A limit
ordinal is a nonempty ordinal which is not the successor of
any ordinal.
The natural numbers are developed in ZFC as follows.
Using power set, separation, and infinity, we obtain a
least set x obeying the property in infinity. I.e., least
in the sense that it is a subset of any set x satisfying
the property in 9’.
This least x is normally denoted by ω and forms the set
theoretic version of the set of all natural numbers.
EXERCISE. (ZFC). ω is an ordinal. ω is the least limit
ordinal.
We can easily develop the real number system in ZFC, but
not in 1-9. Using the approach due to Dedekind, we define a
real number as a left cut of rationals; i.e.,
x is a real number ↔ (x ⊆ Q ∧ x ≠ ∅ ∧ x ≠ Q ∧
(∀y,z ∈ Q)(y < z ∈ x → y ∈ x) ∧ x has no greatest element).
Using power set, we obviously have the set of all real
numbers, denoted by ℝ.
Addition, multiplication, minus, reciprocal, and order can
be appropriately defined on ℝ to form the ordered field of
real numbers.
A critical property of ℝ is the least upper bound property.
This asserts that every nonempty x ⊆ ℝ with an upper bound
has a least upper bound. This least upper bound is simply
the union of x, in the sense of the union axiom.
The Cauchy sequence construction is definitely more
involved from our viewpoint, but it has compensating
advantages that are well known to analysts.
Obviously ordinals come in three varieties. There is the
least one, ∅. There are the successor ordinals, which are
of the form a ∪ {a}, normally written a+1. Finally, there
are the limit ordinals, which are the remainder of the
ordinals.
It is customary to write a < b for ordinals a,b where a ∈
b.
We want to build the V’s on all ordinals by the following
transfinite recursion:
V(0) = ∅,
V(a+1) = S(V(a)),
for limit ordinals l, V(l) = ∪{V(a): a ∈ l}.
How do we justify this definition?
EXERCISE. (ZFC). Let a be an ordinal. There is a unique
function f such that
i) dom(f) = a;
ii) f(∅) = ∅, assuming a ≠ ∅;
iii) (∀b < a)(f(b+1) = S(f(b)));
iv) (∀ limits l < a)(f(l) = ∪{f(b): b ∈ l}).
The above uses power set and replacement in essential ways.
Using this exercise, for ordinals a, we define V(a) as
f(a), where f is the unique function satisfying these four
clauses for the ordinal a+1.
These V(a)’s are of crucial importance in set theory, and
form what is called the cumulative hierarchy.
EXERCISE. (ZFC). (∀x)(∃a)(x ∈ V(a)).
In ZFC, we define the HF sets as the elements of V(ω). Note
that we have the set of all HF sets available to us in ZFC,
something we didn’t have in 1-9.
THEOREM. (ZFC). Every set is in one-one correspondence with
an ordinal.
This theorem uses the Axiom of Choice in an essential way.
For every set x, the least ordinal in one-one
correspondence with x is called the cardinal of x.
5. BIG ISSUES.
There is a remarkable relationship between axioms 1-9 and
ZFC. Axioms 1-9 form an appealing and useful set of axioms
that a mathematician immediately recognizes as holding of
the HF sets. ZFC is exactly the same as 1-9 except that the
one axiom that obviously focuses on finiteness, axiom 9, is
replaced by an axiom of infinity, 9’.
This naturally suggests that we should be able to say just
which kinds of statements true of the hereditarily finite
sets lift to the full set theoretic universe.
Obviously, not every kind of statement can be so lifted.
Look at axiom 9, which is false of ω; i.e., ω has no
∈-maximal element. Nobody has been able to find such a
transfer principle from HF sets to arbitrary sets.
The identification of natural numbers with any particular
sets is going to have to be ad hoc. Same with real numbers.
We can acknowledge this state of affairs by saying that the
usual set theoretic foundations for mathematics form an
interpretation of mathematics in set theory.
We can ask for a foundation for mathematics that is more
literal. For instance, mathematicians often take the
concept of natural number as primitive, and only care about
its properties. Or mathematicians may only consider natural
number systems, instead of natural numbers. One can ask for
a foundation for mathematics that is directly sensitive to
these viewpoints.
There have been some attempts to create an alternative
foundation for mathematics along these lines. One approach
is called categorical foundations of mathematics. However,
there are serious difficulties involved in making this an
autonomous foundation for mathematics, since one tends to
define categories set theoretically. Attempts to rid
categorical foundations of set theory entirely wind up
slavishly transporting the axioms of set theory into a
categorical context, resulting in an equivalent but more
cumbersome form of set theoretic foundations. These issues
are controversial.
Two other big issues form the topics of the next two
lectures. These regard the senses in which ZFC is overkill
(too powerful), and senses in which ZFC is underkill (too
weak).
LECTURE 3
DO WE NEED LESS AXIOMS?
Harvey M. Friedman
Department of Mathematics
Ohio State University
May 5, 2003
http://www.math.ohiostate.edu/~friedman/
friedman@math.ohiostate.edu
In the first two lectures, we have laid the basic
groundwork that is needed for an appreciation of the work
in the foundations of mathematics that began in the 1930’s
and continues to this day.
The material presented in the first two lectures concerned
rather spectacular results from the point of view of
philosophy and history of mathematics.
In these last two lectures, we will relate this basic
material to focused topics in mathematics.
1. PURELY ADDITIVE NUMBER THEORY.
By purely additive number theory, we will mean the study of
the integers and rationals and reals, where we allow only
the order and addition. We explicitly exclude
multiplication.
This is an interesting, yet extremely well understood
mathematical context, where, in a very precise sense,
absolutely every question can be answered.
In fact, in this context, absolutely everything can be
answered within a very weak fragment of ZFC.
How do we convey this? Consider the structure, in the sense
of the first lecture,
(ℝ,Q,Z,<,+,-,0,1).
Here the domain is the set of all reals, ℝ. We have a 1-ary
relation Q, a 1-ary relation Z. We have the binary relation
< on ℝ, and the binary function + from ℝ into ℝ. We have
the unary function - from ℝ into ℝ. We have the constants 0
and 1.
What are we allowed to say? I.e., what problems are we
allowed to pose?
As in the first lecture, we use PRED. We can say anything
grammatical involving:
1. ¬, ∧, ∨, →, ↔. (Not, and, or, implies, iff).
2. ∀, ∃. (For all reals, there exists a real).
3. Q(_), Z(_). (Being a rational, being an integer).
4. <. (One real less than another).
5. +. (The addition of two reals).
6. -. (The negative of a real).
7. 0, 1. (Two specific reals).
8. =. (Two reals are equal).
Here are some examples of allowable sentences.
(∀x,y)(x < y → (∃z)(x < z < y ∧ Q(z))).
(∀x)(∃y)(x = y + y).
(∀x,y)((Z(x) ∧ Z(y)) → Z(x + y)).
(∀x)(∃y)(x = y + y).
(∀x)(Q(x) → (∃y)(Q(y) ∧ x = y + y)).
(∀x)(Z(x) → (∃y)(Z(y) ∧ x = y + y)).
All but the last are true in (ℝ,Q,Z,<,+,-,0,1). Here is a
fix for the last one.
(∀x)(Z(x) → (∃y)(Z(y) ∧ (x = y + y ∨ x = (y + y) + 1))).
THEOREM 1.1. Every sentence in PRED about (ℝ,Q,Z,<,+,-,0,1)
can be proved or refuted within a very small fragment of
ZFC.
So we need far fewer axioms than ZFC in this limited
mathematical context.
Things are rather delicate, because if we throw in
multiplication, then things go haywire. In particular,
things are very bad for (Z,+,·) and also for (Q,+,·).
THEOREM 1.2. There are sentences in PRED about (Z,+,·) that
cannot be proved or refuted within ZFC. The same holds for
(Q,+,·).
Actually, things are much worse than even Theorem 1.2
suggests. Recall from the second lecture that ZFC was
axiomatized by finitely many axioms and axiom schemes. It
is clear intuitively what an axiom scheme is, and there is
a comprehensive definition of what an axiom scheme is that
we do not have time to go into here.
We say that a set of axioms of PRED is consistent iff it
does not prove a contradiction. By the fundamental
completeness theorems from the first lecture, this is the
same as the existence of a structure in which the set of
axioms holds.
THEOREM 1.3. Let T be any consistent extension of ZFC by
finitely many new axioms and axiom schemes. There is a
sentence in PRED about (Z,+,·) that cannot be proved or
refuted in T. The same holds for (Q,+,·).
There is a very special but familiar family of sentences
about (Z,+,·) known as Diophantine problems. A Diophantine
problem is to determine the existence of a solution in
integers to an equation of the form
P(x1,...,xn) = 0
where P is a polynomial in the n variables shown with
integer coefficients.
THEOREM 1.4. Let T be any consistent extension of ZFC by
finitely many new axioms and axiom schemes. There is a
Diophantine problem with a “no” answer which cannot be
proved in T to have a “no” answer.
Theorem 1.4 is essentially the same as the negative
solution of Hilbert’s 10th problem. The original problem
concerns algorithms and not formal systems such as ZFC.
THEOREM 1.5. (Negative solution to Hilbert’s 10th problem).
There is no computer algorithm that correctly answers any
given Diophantine problem.
However, do not get discouraged by this kind of negative
result. Here are some reasons for keeping a happy face.
1. All statements in PRED about the structure
(ℝ,Q,Z,<,+,-,0,1) and many other important structures can
be proved or refuted in even very weak fragments of ZFC. We
will give more examples later.
2. Nobody knows whether all Diophantine problems for
(Q,+,·) can be answered (with proofs!) in ZFC or even in a
very weak fragment of ZFC.
3. Even for Diophantine problems in (Z,+,·), nobody has
been close to being able to give a reasonable example that
has “logical difficulties”; i.e., cannot be answered in ZFC
or even weak fragments of ZFC.
The following result comes immediately from work of number
theorists.
THEOREM 1.6. Every quadratic Diophantine problem for
(Z,+,·) and for (Q,+,·) can be answered within a very small
fragment of ZFC.
It is open whether Theorem 1.6 holds for cubics in two
variables over Z. Same with Theorem 1.5.
Where do Theorems 1.5 and 1.6 kick in, with respect to the
number of variables and the degree?
This problem has been studied intensively over (N,+,·)
rather than (Z,+,·), where the negative results kick in
earlier than they do for (Z,+,·).
For (N,+,·), some known results are that Diophantine
problems of degree n in m variables cannot be solved and
cannot all be answered within ZFC, where, e.g.,
i) n = 1.6 × 10^45, m = 9;
ii) n = 4, m = 58;
iii) n = 24, m = 26.
To convert these to (Z,+,·), it is easy to see that it
suffices to multiply n by 2 and m by 4 using Lagrange’s
theorem that every natural number is the sum of four
squares of integers.
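The role of Lagrange’s theorem here is that each natural number variable can be replaced by a sum of four squares of integer variables, which quadruples the number of variables and doubles the degree. As a small illustration of the theorem itself, the following Python sketch finds one such representation by exhaustive search. The function name four_squares is an assumption of this sketch, and the search is not meant to be efficient.

```python
from math import isqrt

def four_squares(n):
    """Return (a, b, c, d) with a*a + b*b + c*c + d*d == n, for n >= 0."""
    for a in range(isqrt(n) + 1):
        for b in range(a, isqrt(n - a * a) + 1):
            for c in range(b, isqrt(n - a * a - b * b) + 1):
                r = n - a * a - b * b - c * c   # nonnegative by the loop bounds
                d = isqrt(r)
                if d * d == r:
                    return (a, b, c, d)
    return None   # not reached for n >= 0, by Lagrange's theorem

a, b, c, d = four_squares(2003)
print(a * a + b * b + c * c + d * d)
```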
We now come back to (ℝ,Q,Z,<,+,-,0,1).
We can say much more about the principles needed to answer
any question in PRED about this structure.
We can list some simple and obvious statements in PRED that
are obviously true in this structure, and show that every
sentence in PRED about this structure can be proved or
refuted from these simple and obvious statements.
1. The ordered Abelian group axioms for (ℝ,<,+,-,0).
2. 0 < 1.
3. Z(0) ∧ Z(1).
4. Z(x) → Q(x).
5. Z(x) → (x ≤ 0 ∨ x ≥ 1).
6. (Q(x) ∧ Q(y)) → Q(x + y).
7. (Z(x) ∧ Z(y)) → Z(x + y).
8. Z(x) → Z(-x).
9. Q(x) → Q(-x).
10. (∀x,y)(x < y → (∃z,w)(Q(z) ∧ Z(w) ∧ x < z < y < w)).
11. (∀x)(∃y)(y + ... + y = x), where there are any finite
number of one or more y’s.
12. (∀x)(Q(x) → (∃y)(Q(y) ∧ y + ... + y = x)), where there
are any finite number of one or more y’s.
13. (∀x)(Z(x) → (∃y)(∃r)(Z(y) ∧ Z(r) ∧ x = y + ... + y + r
∧ 0 ≤ r < 1 + ... + 1)), where there are d y’s and d 1’s, d
≥ 1.
13 is the formalization of the quotient remainder theorem,
and consists of infinitely many axioms.
It can be shown that 1-13 cannot be given a finite
axiomatization; i.e., is not logically equivalent to a
finite set of axioms in PRED.
THEOREM 1.7. A sentence of PRED is true in
(ℝ,Q,Z,<,+,-,0,1) iff it has a proof from 1-13. We can
algorithmically decide whether a sentence of PRED is true in
(ℝ,Q,Z,<,+,-,0,1).
The method of proof for such results is what is called
quantifier elimination.
We say that a set T of axioms (like 1-13 above) admits
quantifier elimination (QE) if and only if for every
formula A in PRED using only the symbols used in T, there
is a quantifier free formula B in PRED using only the
symbols used in T, such that
T proves A ↔ B.
An important related concept is that of a structure M
admitting QE. This means that for every formula A in PRED
using only the symbols used in M, there is a quantifier
free formula B in PRED using only the symbols used in M,
such that
SAT(M,A ↔ B).
In the case of the set of axioms 1-13 and the structure
(ℝ,Q,Z,<,+,-,0,1), we don’t have quantifier elimination as
things stand. However, if we add the new unary relation
symbols P1,P2,... with the axioms
Pi(x) ↔ (∃y)(Z(y) ∧ y + ... + y = x), where there are i y’s,
to 1-13, then this expansion of 1-13 admits QE. Also if we
expand (ℝ,Q,Z,<,+,-,0,1) to (ℝ,Q,Z,<,+,-,0,1,P1,P2,...),
where each Pi is the unary relation
Pi(x) iff x is an integer divisible by i
then (ℝ,Q,Z,<,+,-,0,1,P1,P2,...) admits QE.
We now give a very easy example of QE, and show how it is
used to obtain results like Theorem 1.7. Let us consider
the structure (Z,0).
QE for (Z,0) means that every formula A in PRED about (Z,0)
is equivalent in (Z,0) to a quantifier free formula B in
PRED about (Z,0). (Recall that = is a freebie in PRED).
So what? If we have any sentence about (Z,0), then it is
equivalent to a quantifier free formula. Hence by obvious
trickery, it is equivalent to a quantifier free sentence
(just plug in 0 for all variables).
But the quantifier free sentences are trivial. They are
just combinations of ¬, ∧, ∨, →, ↔, and 0 = 0. These can
be proved or refuted. Hence the original sentence about
(Z,0) can be proved or refuted.
We now write down axioms that support the QE. The usual
axioms for equality are taken for granted, as in Lecture 1.
The only other axioms that we will need are the axioms
*) (∃x1)...(∃xn)(x1,...,xn are all unequal)
where n is any positive integer. The inside is a
conjunction of inequalities of quadratic length.
Now how does the QE work? A big induction shows that we
just have to eliminate a single quantifier. We can also
tidy things up a bit using propositional calculus. So
things boil down to looking at formulas of the form
(∃x)(y1 = z1 ∧ ... ∧ yn = zn ∧ w1 ≠ u1 ∧ ... ∧ wm ≠ um)
where the y’s, z’s, w’s, u’s are variables or 0, some of
which may be x.
By purely logical reasoning in PRED, we can move all of the
equations and inequations that don’t mention x outside the
scope of the quantifier:
(∃x)(x = a1 ∧ ... ∧ x = ap ∧ x ≠ b1 ∧ ... ∧ x ≠ bq) ∧ A
where p,q ≥ 0 and A is a conjunction of equations and
inequations that don’t mention x. We can also assume that
none of the a’s and none of the b’s are x. There are still
degenerate cases here, but they cause no trouble.
If p > 0 then we can simply remove the quantifier and
replace every occurrence of x by a1. Then QE is done.
So we can assume that we have
(∃x)(x ≠ b1 ∧ ... ∧ x ≠ bq) ∧ A
with A as before, and none of the b’s are x.
However, the first conjunct is obviously provable from our
axioms *). Hence we are left with A, and again QE is done.
This argument establishes QE for the axioms *) and for the
structure (Z,0). Moreover, the QE has been algorithmically
achieved. One can easily draw the conclusions of Theorem
1.7 for *) and (Z,0).
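The single quantifier step of this argument can be sketched in Python. The representation is an assumption of this sketch: a literal is a triple (op, s, t) with op either "=" or "!=", and s, t are variable names or the constant "0". The names FALSE, eliminate_exists and holds are likewise hypothetical. The degenerate literals x = x and x ≠ x are handled explicitly.

```python
FALSE = ("!=", "0", "0")   # a quantifier free sentence that is always false

def eliminate_exists(x, literals):
    """Eliminate an existential quantifier on x from a conjunction of literals.

    Returns a list of literals, read as a conjunction, not mentioning x.
    """
    rest, eqs, neqs = [], [], []
    for op, s, t in literals:
        if s != x and t != x:
            rest.append((op, s, t))       # part of A: does not mention x
        elif s == x and t == x:
            if op == "!=":
                return [FALSE]            # x != x can never be satisfied
            # x = x always holds, so it is dropped
        else:
            other = t if s == x else s
            (eqs if op == "=" else neqs).append(other)
    if eqs:                               # case p > 0: replace x by a1
        a1 = eqs[0]
        rest += [("=", a1, a) for a in eqs[1:]]
        rest += [("!=", a1, b) for b in neqs]
    return rest                           # case p = 0: a witness exists by *)

def holds(literals, env):
    """Evaluate a conjunction of literals under an assignment of integers."""
    def val(t):
        return 0 if t == "0" else env[t]
    return all((val(s) == val(t)) == (op == "=") for op, s, t in literals)

print(eliminate_exists("x", [("=", "x", "y"), ("!=", "x", "z"), ("=", "y", "0")]))
```

The full procedure would apply this step repeatedly, innermost quantifier first, after putting the formula in disjunctive form. That outer loop is not shown.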
2. THE REAL AND COMPLEX FIELDS.
The method of QE has been applied to the structure
(ℝ,<,+,-,·,0,1), the ordered field of real numbers. This
result of Tarski has been historically viewed as the
beginning of real algebraic geometry.
It was also successfully applied by Tarski to the structure
(C,+,-,·,0,1), and this is easier than the real case.
These QE results immediately give the following.
THEOREM 2.1. Every sentence of PRED about (ℝ,<,+,-,·,0,1)
is provable or refutable in a weak fragment of ZFC. The
same holds for (C,+,-,·,0,1).
The axioms for (C,+,-,·,0,1) are as follows.
1. (C,+,-,·,0,1) is a commutative field with unit.
2. 1 + ... + 1 ≠ 0, where there are any finite nonzero
number of 1’s. (Characteristic zero).
3. (∃z)(z^n + c_{n-1}z^{n-1} + ... + c_1z + c_0 = 0), where
n ≥ 1, and the c’s are variables other than z.
(Algebraically closed).
Note that in 3, we have used powers of z to abbreviate the
result of multiplying z with itself many times. There are
obviously infinitely many axioms in 3. We can use one for
every positive n.
THEOREM 2.2. 1-3 above and (C,+,-,·,0,1) admit QE. Every
sentence in PRED holds in (C,+,-,·,0,1) iff it is provable
from 1-3.
COROLLARY 2.3. Every sentence of PRED that holds about one
algebraically closed field of characteristic zero holds
about all algebraically closed fields of characteristic
zero.
Actually, these results also hold in any finite
characteristic. The characteristic of a field is the least
p ≥ 1 such that 1 + ... + 1 = 0, with p 1’s. If none exists,
the characteristic is considered 0. If p exists, then p can
be shown to be a prime.
We use these axioms, for any particular prime p.
1(p). (C,+,-,·,0,1) is a commutative field with unit.
2(p). 1 + ... + 1 = 0, where there are p 1’s.
3(p). 1 + ... + 1 ≠ 0, where there are fewer than p 1’s.
4(p). (∃z)(z^n + c_{n-1}z^{n-1} + ... + c_1z + c_0 = 0),
where n ≥ 1, and the c’s are variables other than z.
(Algebraically closed).
THEOREM 2.4. Let p be a prime. 1(p)-4(p) admits QE. Every
sentence in PRED that holds in some algebraically closed
field of characteristic p holds in all algebraically closed
fields of characteristic p, and is provable from 1(p)-4(p).
We now come to the ordered field of reals, (ℝ,<,+,-,·,0,1).
The axioms for (ℝ,<,+,-,·,0,1) are as follows.
i. (ℝ,<,+,-,·,0,1) is a commutative ordered field with
unit.
ii. x > 0 → (∃y)(y·y = x). Positives have square roots.
iii. (∃x)(x^n + c_{n-1}x^{n-1} + ... + c_1x + c_0 = 0),
where n ≥ 1 is odd, and the c’s are variables other than x.
Note that there are infinitely many axioms in iii. We can
take one for each odd n.
THEOREM 2.5. i-iii above and (ℝ,<,+,-,·,0,1) admit QE.
Every sentence in PRED holds in (ℝ,<,+,-,·,0,1) iff it is
provable from i-iii.
There are many other important axiomatizations for this
vitally important structure, that are logically equivalent.
Here are two, which we present informally.
i. (ℝ,<,+,-,·,0,1) is a commutative ordered field with
unit.
iv. Intermediate value for polynomials.
Here is the other.
i. (ℝ,<,+,-,·,0,1) is a commutative ordered field with
unit.
v. Least upper bound principle for all formulas in PRED
about (ℝ,<,+,-,·,0,1).
There has been considerable work on the structure
(ℝ,<,+,-,·,exp,0,1), where exp(x) = e^x. It is not known
whether every sentence true about this structure is provable
or refutable in ZFC or in a weak fragment of ZFC.
This is known if we can prove Schanuel’s Conjecture (SCT).
Alternatively, we know that this is true for ZFC + SCT.
SCT asserts the following. If z_1,...,z_n ∈ C are linearly
independent over Q, then some n of the 2n numbers
z_1,...,z_n,e^{z_1},...,e^{z_n} are algebraically independent.
SCT implies the (unknown) algebraic independence of e and π
by taking z_1 = 1 and z_2 = πi. (Note e^{πi} = -1).
THEOREM 2.6. Every sentence of PRED about
(ℝ,<,+,-,·,exp,0,1) is true if it is provable in a weak
fragment of ZFC together with SCT.
3. FINITE MATHEMATICS.
Many mathematicians think that the essence of mathematics
is finite, and that infinite objects are around just for
convenience. How can we construct foundations for such a
point of view?
In Lecture 2, we gave an axiom system 1-9 for the
hereditarily finite sets, or HF sets. We developed a fair
amount of basic finite mathematics in this system.
Obviously there can be no direct treatment of infinitary
objects such as real numbers in a system like 1-9, which
proves “every set is finite”.
Can we support the mathematician who wishes to use real
analysis but fundamentally believes only in finite objects?
The current approach to this issue is to create weak
fragments of ZFC that allow for much real analysis to be
done without major modification.
We then show that any theorem of such a fragment of ZFC
that is just about finite objects can already be proved in
1-9. This allows the mathematician who only believes in
finite objects to engage in real analysis and other kinds
of higher mathematics with confidence.
I will give one example of such a result. Recall 1-9, which
is generally acceptable to the mathematician who only
believes in finite objects:
1. EXTENSIONALITY.
2. PAIRING.
3. UNION.
4. POWER SET.
5. CHOICE.
6. FOUNDATION.
7. SEPARATION.
8. REPLACEMENT.
9. FINITENESS.
We now introduce another axiom system that accommodates
infinite sets. I don’t mean the replacement of 9 with 9’
(infinity), as this is ZFC, and is much too strong for what
we want to do.
We say that x is an elemental set if and only if x is an
element of some set. Of course, using Pairing we can prove
that every set is an elemental set.
But in this system I am about to introduce, the conceptual
framework is different, and we will only have a modified
Pairing axiom. We should now think of the sets as
the sets that consist entirely of HF sets.
Under this conceptual framework, the elements are just the
HF sets. This is not all sets in this new conceptual
framework, since the set of all HF sets will be among the
sets in this new conceptual framework.
Here are the axioms.
1. EXTENSIONALITY.
2*. ELEMENTAL PAIRING. There is an elemental set consisting
of any two given elemental sets.
3*. ELEMENTAL UNION. There is an elemental set consisting
of the elements of the elements of any given elemental set.
4*. ELEMENTAL POWER SET. There is an elemental set
consisting of all subsets of any given elemental set.
5. CHOICE.
6. FOUNDATION.
7*. WEAK SEPARATION. There is a set consisting of the
elements of any given set that satisfy some given
condition, provided that in that given condition, all
quantifiers range over elemental sets only.
8*. ELEMENTAL REPLACEMENT. If every element of a given
elemental set is related to exactly one elemental set by a
given condition, then there is an elemental set consisting
of the elemental sets such that some element of the given
set is related to it, provided that in that given
condition, all quantifiers range over elemental sets only.
9*. ELEMENTAL FINITENESS. Every nonempty elemental set has
an ∈-maximal element.
Let us call this system (1-9)*. Note that we have left
axioms 1,5,6 untouched.
Note that (1-9)* proves the existence of lots of infinite
sets, and in fact the existence of a largest set. For by
7*, we have the set of all elemental sets, which is clearly
the largest set.
It can be seen that (1-9)* nicely accommodates much
elementary analysis because of its reasonably flexible
ability to construct infinite sets.
THEOREM 3.1. Let A be a sentence in PRED with ∈,= only, in
which all quantifiers range over elemental sets only. Then
A is provable in (1-9)* if and only if A is provable in 1-9.
Some interesting examples have surfaced of theorems about
finite objects, that cannot be proved in 19 (and therefore
not even in (19)*). Moreover, these examples cannot be
proved in rather far reaching strengthenings of 19. In
fact, such theorems have surfaced about finite objects,
where, in some appropriate sense, all proofs must involve
consideration of uncountably many objects.
The original examples concern embeddings between finite
trees. This is a classical topic in combinatorics, and one
of the early results is Kruskal’s theorem.
There are several ways to define finite trees, and we will
take the partial ordering approach. A finite tree is a pair
(T,≤), where
i) T is a nonempty finite set;
ii) x ≤ x;
iii) (x ≤ y ∧ y ≤ x) → x = y;
iv) (x ≤ y ∧ y ≤ z) → x ≤ z;
v) x,y ≤ z → (x ≤ y ∨ y ≤ x);
vi) there is a ≤ least element, called the root.
In the mind’s eye, a tree is visualized so that its root is
at the bottom.
Note that in a finite tree (T,≤), any two elements have a
greatest lower bound, called the inf.
Let T and T’ be finite trees. We abuse notation by
suppressing the ≤’s.
An inf preserving embedding h:T → T’ is a one-one function
h:dom(T) → dom(T’) such that for all x,y ∈ dom(T),
h(x inf y) = h(x) inf h(y).
This is essentially the same as having a topological
embedding in the sense of topological spaces, where line
segments connect appropriate vertices.
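The definitions of finite tree, inf, and inf preserving embedding can be checked mechanically on small examples. The following is only a sketch: the function names are mine, and the choice to represent vertices as binary strings ordered by "is a prefix of" is an assumption made for illustration, not part of the definitions.

```python
from itertools import product

def is_finite_tree(T, leq):
    """Check conditions i)-vi) for a finite set T and a relation leq(x, y)."""
    if not T:                                                  # i) nonempty
        return False
    pairs = list(product(T, repeat=2))
    triples = list(product(T, repeat=3))
    reflexive = all(leq(x, x) for x in T)                      # ii)
    antisymmetric = all(x == y or not (leq(x, y) and leq(y, x))
                        for x, y in pairs)                     # iii)
    transitive = all(leq(x, z) or not (leq(x, y) and leq(y, z))
                     for x, y, z in triples)                   # iv)
    below_is_chain = all(leq(x, y) or leq(y, x) or not (leq(x, z) and leq(y, z))
                         for x, y, z in triples)               # v)
    has_root = any(all(leq(r, x) for x in T) for r in T)       # vi)
    return reflexive and antisymmetric and transitive and below_is_chain and has_root

def inf(T, leq, x, y):
    """Greatest lower bound of x and y in the finite tree (T, leq)."""
    lower = [z for z in T if leq(z, x) and leq(z, y)]
    # The common lower bounds form a chain containing the root; take its top.
    return next(z for z in lower if all(leq(w, z) for w in lower))

def is_inf_preserving_embedding(h, T, leq_T, U, leq_U):
    """h is a dict from the vertices of T to the vertices of U."""
    one_one = len({h[x] for x in T}) == len(T)
    return one_one and all(
        h[inf(T, leq_T, x, y)] == inf(U, leq_U, h[x], h[y])
        for x, y in product(T, repeat=2))

# Illustrative trees: binary strings under the prefix ordering (an assumption).
prefix = lambda x, y: y.startswith(x)
small = {"", "0", "1"}                  # a root with two children
big = {"", "0", "00", "01", "1"}
good = {"": "0", "0": "00", "1": "01"}  # copies small onto the subtree above "0"
bad = {"": "", "0": "0", "1": "00"}     # one-one, but sends the two children into a chain
```

Here good is an inf preserving embedding of small into big, while bad fails because the inf of the images "0" and "00" is "0" rather than the image of the root.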
THEOREM 3.2. Let T1,T2,... be finite trees. There exists i <
j such that Ti is inf preserving embeddable into Tj.
It is known that, in an appropriate sense, the only way to
prove Theorem 3.2 is to work with infinite sequences of
finite objects as objects themselves.
The statement of Theorem 3.2 is not fully in finite
mathematics. But there is a finite form of Theorem 3.2 that
asserts the following:
In any tree with vertices labeled from an r element set,
which is fully k splitting up to a sufficiently large
maximum uniform height, there are two uniform height
truncations, where the lower one is inf and label
preserving embeddable into the higher one, sending the top
level of the lower one into the top level of the higher
one.
It is known that even to prove this finite statement, one
must go way beyond 1-9 in far reaching ways into infinite
mathematics.
On the other hand, I have conjectured that all of the
famous number theory to date can be done within even weak
fragments of 1-9. This is far from being established.
4. COUNTABLE MATHEMATICS.
Countable mathematics is another significant, and much more
liberal, kind of mathematics. Here all objects are to be
countable.
Real numbers are to be viewed as countable objects - sets
of rationals, and rationals are finite objects.
The real line itself is uncountable, and it seems that in
countable mathematics, one cannot manipulate, say, complete
separable metric spaces as objects.
However, from the logical point of view, any complete
separable metric space is really a countable object. That
is because it can be coded as a countable metric space; i.e.,
a metric space on one of its countable dense sets. We
then define the notion of a point in its completion without
having the completion itself: points are Cauchy sequences
in the countable metric space. Of course, generally every
Cauchy sequence is equivalent to many others, but that is
not a problem. E.g., we can take preferred Cauchy
sequences, using the enumeration of the countable dense
set, or simply ignore the problem, choosing instead to
define equality.
Now once we have an entirely countable presentation of
Polish spaces and their “points”, we can go on to define
the Borel measurable sets and functions in and between
Polish spaces.
Recall that the Borel measurable subsets of a Polish space
form the least σ algebra containing the open sets. We don’t
even want to try to use this definition in countable
mathematics.
Instead, we use the well known development of Borel sets in
terms of countable well founded trees. The terminal nodes
are labeled by open sets.
Each nonterminal node is labeled by “complement” or “union”
or “intersection”. Of course, if “complement” is the label
then there must be no splitting.
We can represent open subsets of Polish spaces by using the
set of all pairs (x,q), where x lies in the countable dense
subset and q is a positive rational such that the open ball
about x with radius q is a subset of the open subset being
coded. Under this representation, open subsets of Polish
spaces are countable objects, and this allows us to use
them as labels of the terminal nodes in the countable well
founded trees discussed above.
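To make the coding of open sets concrete, here is a rough sketch for the real line, with the rationals as the countable dense set. It is a simplification and not the coding itself: I use a finite list of pairs (x,q) whose balls cover the open set, rather than the set of all pairs whose balls lie inside it, and I only test membership for rational points. The function name is hypothetical.

```python
from fractions import Fraction

def in_coded_open_set(point, code):
    """point: a rational; code: pairs (center, radius) with rational center and radius > 0.

    The coded open set is the union of the open balls, so a point belongs
    to it exactly when it lies in at least one of them.
    """
    return any(abs(point - center) < radius for center, radius in code)

# The open set (-1,1) ∪ (5/2,7/2), coded by two balls with rational data.
code = [(Fraction(0), Fraction(1)), (Fraction(3), Fraction(1, 2))]
```

For example, 1/2 and 13/4 lie in the coded set, while the endpoints 1 and 7/2 do not.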
Borel measurable functions can be handled by working with
the inverse images of open balls with center from the
countable dense set, with positive rational radius.
There is a convenient fragment of ZFC that easily supports
the development of countable mathematics as outlined above.
I will put a big fat DELETE sign against the single axiom
that we are going to delete from ZFC for this purpose.
Recall the axioms of ZFC from lecture 2:
1. EXTENSIONALITY.
2. PAIRING.
3. UNION.
4. POWER SET. DELETE!
5. CHOICE.
6. FOUNDATION.
7. SEPARATION.
8. REPLACEMENT.
9. INFINITY.
Incidentally, this is far more than enough to do Kruskal’s
tree theorem from the previous section.
In fact, this system, ZFC without the power set axiom,
written ZFC\P, is strong enough to do an overwhelming
preponderance of mathematics in the uniform, direct, and
natural way indicated above.
On the other hand, there are now a number of interesting
theorems of ZFC whose statements lie in countable
mathematics, but where we know that there is no proof in
ZFC\P.
We will now discuss some examples concerning Borel
measurable functions.
Historically, the first example was the Borel
diagonalization theorem.
Cantor’s theorem (there are uncountably many reals) can be
put in the form: in any infinite sequence of reals, some
real is missing. We want to talk about “finding” a missing
real.
THEOREM 4.1. There is a Borel measurable function F from
the Polish space ℝ^∞ into ℝ such that for all x ∈ ℝ^∞, F(x)
is missing from x.
Note that the value of this “diagonalizer” F at a sequence
may vary if we use another sequence with the same range.
THEOREM 4.2. For all range independent Borel F:ℝ^∞ → ℝ there
exists x ∈ ℝ^∞ such that F(x) is a coordinate of x.
Theorem 4.2 can be proved in ZFC but not in ZFC\P. The
proof uses the Baire category theorem.
The Baire category theorem applied to Polish spaces is not a
problem in ZFC\P. However, in the proof of Theorem 4.2, the
Baire category theorem is applied to a very nonseparable
space - namely, ℝ^∞, where ℝ is given the discrete topology!
We have to work with the Borel measurable subsets in this
crazy topology, and such sets cannot be given countable
substitutes as we were able to do for Borel measurable
subsets of Polish spaces.
The proof of Theorem 4.2 doesn’t need too much more than
ZFC\P. For instance the following is enough:
1. EXTENSIONALITY.
2. PAIRING.
3. UNION.
4. POWER SET. REPLACE BY “POWER SET OF SOME INFINITE SET
EXISTS”.
5. CHOICE.
6. FOUNDATION.
7. SEPARATION.
8. REPLACEMENT.
9. INFINITY.
We now move on to even greater uses of ZFC. There is an
important fragment of ZFC called ZC (Zermelo set theory
with the axiom of choice). Here it is:
1. EXTENSIONALITY.
2. PAIRING.
3. UNION.
4. POWER SET.
5. CHOICE.
6. FOUNDATION. DELETE.
7. SEPARATION.
8. REPLACEMENT. DELETE.
9. INFINITY.
ZC is sufficient for the direct, coding free treatment of
the vast preponderance of mathematics.
Although there is no logical implication between ZC and
ZFC\P, ZC is much stronger in various senses. E.g., any
statement about Borel measurable sets/functions in and
between Polish spaces provable in ZFC\P, is provable in ZC,
but not vice versa.
THEOREM 4.3. Let E be a Borel subset of ℝ^2 which is
symmetric about the line y = x. There is a Borel function
F:ℝ → ℝ such that either (∀x)((x,F(x)) ∈ E) or
(∀x)((x,F(x)) ∉ E).
The hypothesis means that (x,y) ∈ E ↔ (y,x) ∈ E.
Theorem 4.3 is provable in ZFC but not in ZC. The proof
necessarily uses every countable initial segment of the
cumulative hierarchy.
LECTURE 4
DO WE NEED MORE AXIOMS?
Harvey M. Friedman
Department of Mathematics
Ohio State University
May 7, 2003
http://www.math.ohiostate.edu/~friedman/
friedman@math.ohiostate.edu
The ZFC axioms are known to be insufficient to settle a
variety of mathematical questions. In this fourth and final
lecture, we survey a number of examples, and give some
indication of the techniques used to establish such
independence results.
Most of the examples are of a distinctly more “set
theoretic” flavor than what is usually encountered in
mathematics.
However, there are now some new kinds of examples involving
Borel measurable sets/functions, and even sets/functions in
the natural numbers.
1. THE AXIOM OF CHOICE.
At one time, the axiom of choice was regarded as extremely
controversial, since it differs so much from the other
axioms of ZFC. Now it is generally regarded as roughly on a
par with the other axioms of ZFC.
The axiom of choice asserts the following: given any set of
pairwise disjoint nonempty sets, there is a set which has
exactly one element in common with each (of these pairwise
disjoint nonempty sets).
Recall the axioms of ZFC yet again:
1. EXTENSIONALITY.
2. PAIRING.
3. UNION.
4. POWER SET.
5. CHOICE.
6. FOUNDATION.
7. SEPARATION.
8. REPLACEMENT.
9. INFINITY.
Note that every axiom except extensionality and foundation
is a set existence axiom.
In all of the set existence axioms, with the sole exception
of the axiom of choice, we have an explicit description of
the set that is being constructed.
So from the beginning, the axiom of choice stood out as
rather special.
The axiom of choice is used in order to get any kind of
decent theory of cardinality:
THEOREM 1.1. (ZF). Choice is equivalent to “given two sets,
there is a one-one map from the first into the second, or a
one-one map from the second into the first”.
How do we get this comparability from choice? This is
through the crucial notion of a well ordering. A well
ordering of a set A is a strict linear ordering on A in
which every nonempty subset of A has a least element. I.e.,
i) (∀x,y ∈ A)(¬(x < x) ∧ (x < y ∨ y < x ∨ x = y));
ii) (∀x,y,z ∈ A)((x < y ∧ y < z) → x < z);
iii) (∀x ⊆ A)(x ≠ ∅ → (∃y ∈ x)(∀z ∈ x)(¬(z < y))).
THEOREM 1.2. (ZF). Choice is equivalent to “every set can
be well ordered”.
A countable set A is a set such that there exists a one-one
function f:A → ω. Recall that ω = {0,1,...} is the least
limit ordinal.
Obviously, if A is countable then we have a well ordering
of A by
x <’ y iff f(x) < f(y).
So countable sets can be well ordered without using choice.
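As a small illustration of this transfer of the ordering, take A to be the set of all integers. The particular one-one map f below (zigzag coding) is my own choice for the example, not something fixed by the text; any one-one map into {0,1,...} would do.

```python
def f(z):
    """A one-one map from the integers into {0,1,...}: 0,-1,1,-2,2,... go to 0,1,2,3,4,..."""
    return 2 * z if z >= 0 else -2 * z - 1

def less(x, y):
    """The induced ordering: x <' y iff f(x) < f(y)."""
    return f(x) < f(y)

def least(subset):
    """The <' least element of a nonempty finite set of integers."""
    return min(subset, key=f)
```

Under <' the integers are listed as 0, -1, 1, -2, 2, ..., and every nonempty set of integers has a <' least element, which the usual ordering of the integers does not provide. For instance, least({-5, 3, -2}) is -2.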
But how do we well order the real line, ℝ?
THEOREM 1.3. ZF does not prove that ℝ can be well ordered
(assuming ZF is consistent).
There are much stronger and more dramatic results than
Theorem 1.3.
THEOREM 1.4. ZF does not prove that “ℝ cannot be decomposed
into a countable union of countable sets”. I.e., ZF + “ℝ
can be decomposed into a countable union of countable sets”
is consistent (assuming ZF is consistent).
EXERCISE. Show in ZF that “ℝ can be well ordered” implies
“ℝ cannot be decomposed into a countable union of countable
sets”.
An infinite set is a set that is not finite; i.e., not in
one-one correspondence with any bounded initial segment of
N.
THEOREM 1.5. ZF does not prove that every infinite set of
reals contains a countably infinite subset (assuming ZF is
consistent).
EXERCISE: Show in ZF that a) every infinite set of reals
has a limit point (maybe not in the set); b) every infinite
set of reals can be split into an infinite sequence of
distinct pairwise disjoint infinite sets of reals.
If we go beyond just sets of reals, we might have an
infinite set whose subsets are completely impoverished:
THEOREM 1.6. ZF does not prove that every infinite set can
be split into two disjoint infinite subsets (assuming ZF is
consistent).
Of course, we cannot have such an impoverished infinite set
that is a set of reals. But how simple can such a set be?
THEOREM 1.7. ZF does not prove that every infinite set of countable
sets of reals can be split into two disjoint infinite
subsets (assuming ZF is consistent).
Enough of this. The main reason why working on ZF is no
longer fashionable is that there is an unorganized jungle
of consistent pathology, and nobody has been able to put
any overarching structure into the subject.
For instance, one would like a good answer to the question:
what does it mean to say that the axiom of choice is
violated whenever possible or as much as possible? This
would be interesting even in restricted contexts.
2. CONSISTENCY OF THE AXIOM OF CHOICE.
In the days when the axiom of choice was controversial, a
major issue was whether the adjoining of the axiom of
choice to ZF to form ZFC was dangerous. Dangerous - not in
the sense that it would cause illness or death - but in the
far worse sense that it might allow a contradiction.
THEOREM 2.1. If ZF is consistent then ZFC is consistent.
This is called a relative consistency result, and it is
ideal for such a result to be proved using only the most
unimpeachable of reasoning. In fact, the proof of Theorem
2.1 was ultimately couched in terms of rudimentary symbol
manipulation.
We explain the ideas behind the proof of Theorem 2.1.
Recall the cumulative hierarchy of sets:
V(0) = ∅,
V(α+1) = S(V(α)),
V(λ) = ∪{V(β): β < λ}.
Here α,β are ordinals, λ is a limit ordinal, and S is used
for the power set. In particular, S(V(α)) is the set of all
subsets of V(α).
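The first few finite levels V(0), V(1), V(2), ... can be computed directly, which gives some feeling for how quickly the power set construction grows. This is only a sketch for finite stages; the names power_set and V are mine, and limit stages are of course out of reach of such a computation.

```python
from itertools import combinations

def power_set(s):
    """The set of all subsets of the finite set s, as a frozenset of frozensets."""
    items = list(s)
    return frozenset(frozenset(c)
                     for r in range(len(items) + 1)
                     for c in combinations(items, r))

def V(n):
    """The n-th level of the cumulative hierarchy, for a finite n."""
    level = frozenset()          # V(0) is the empty set
    for _ in range(n):
        level = power_set(level) # successor step: take the power set
    return level
```

The sizes of V(0),...,V(5) come out as 0, 1, 2, 4, 16, 65536, and each level is a subset of the next, which is why the hierarchy is called cumulative.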
In ZF, we can prove that every set is an element of some
V(α). So this cumulative hierarchy of sets is exhaustive,
demonstrably in ZF.
The thing about the cumulative hierarchy that prevents us
from really getting a handle on the structure of sets
within ZF, is the monstrous power set construction, S.
Actually, this is the cause of our lack of understanding
even with the axiom of choice, AxC.
(As we go more deeply into set theory, we start to complain
about the ordinals, also. Of course, once Einstein went
deeply into physics, he started to complain about classical
space/time.)
So we need a thinner, more manageable, more down to earth
substitute for the cumulative hierarchy that avoids use of
the power set construction.
L(0) = ∅,
L(α+1) = FODO(L(α)),
L(λ) = ∪{L(β): β < λ}.
Here FODO means “first order definable over”. In
particular, FODO(L(α)) is the set of all subsets of L(α) of
the form
{x ∈ L(α): A(x) holds in (L(α),∈)}
where A is a formula of PRED using ∈,= only, with
parameters from L(α) allowed.
It turns out that this is a nicely behaved hierarchy of
sets, even from the point of view of ZF.
We call a set constructible iff it appears in this
hierarchy; i.e., is an element of some L(α).
So, we are sitting in ZF, without the axiom of choice, and
we make this hierarchical construction of the constructible
sets.
We don’t know in ZF whether every set is constructible. In
fact, many years later it was shown that even in ZFC you
can’t tell if every set is constructible.
Now sitting in ZF, and looking at the constructible sets,
it turns out that we can prove that every single axiom of
ZF (taken individually) holds in the constructible sets. In
other words, if we take any axiom of ZF, and restrict the
quantifiers in it to constructible sets only, then it
becomes a theorem of ZF.
Now here’s the clincher. Let us look at the axiom of
choice, something that we have no reason to be able to
think that we can prove in ZF. (In fact, many years later
it was shown that the axiom of choice cannot be proved in
ZF).
It turns out that when we relativize the axiom of choice to
the constructible sets, then we also get a theorem of ZF!
Putting this together, we can now see how to convert any
contradiction in ZFC = ZF + AxC into a contradiction in ZF.
How? Just relativize all quantifiers in the supposed
contradiction in ZFC to the constructible sets, and you get
a contradiction in ZF (by filling in the missing steps with
the proofs of the relativizations of the axioms of ZFC to
the constructible sets).
So Theorem 2.1 has been established.
3. THE CONSISTENCY OF THE CONTINUUM HYPOTHESIS.
The two featured problems left open by Cantor as he created
(developed, discovered) set theory were
A. The Axiom of Choice (AxC).
B. The Continuum Hypothesis (CH).
CH asserts the following. Every uncountable set of reals is
in one-one correspondence with the set of all reals.
The CH is normally asked in a context where one assumes the
AxC (although it is also interesting otherwise). We will
follow this norm.
As I said earlier, Gödel showed that every axiom of ZFC
becomes a theorem of ZF if we relativize all quantifiers to
the constructible sets.
Gödel also tackled CH. With yet more clever ideas, he also
showed that CH becomes a theorem of ZF when all quantifiers
are relativized to the constructible sets.
This establishes the consistency of ZFC + CH, assuming ZF
is consistent.
THEOREM 3.1. If ZF is consistent then ZFC + CH is
consistent.
The generalized continuum hypothesis, GCH, asserts the
following. For any sets A,B, either there is a one-one map
from A into B or a one-one map from S(B) into A.
It is easy to see that GCH for B = ω is just CH.
Gödel also showed that GCH becomes a theorem of ZF when all
quantifiers are relativized to the constructible sets.
THEOREM 3.2. If ZF is consistent then ZFC + GCH is
consistent.
4. ADJOINING ELEMENTS TO A MODEL OF SET THEORY.
We want to sketch some of the ideas surrounding P.J.
Cohen’s proof that if ZF is consistent then ZF + ¬AxC and
ZFC + ¬CH are consistent.
The method is called forcing, but the basic information I
am going to tell you is quite memorable and easy to
understand, and does not involve a discussion of forcing.
Of course, a modern polished version of forcing is what is
normally used to prove this basic information. But don’t
worry about that.
There are lots of kinds of models of set theory; here we
will use only this kind:
(L(κ),∈)
where κ is a countable ordinal, in which ZFC holds.
There are lots of L(κ) = (L(κ),∈), κ countable, satisfying
ZFC, assuming a bit more set theory than ZF has available
to us.
If we stay within ZF, we can prove in ZF that there are
lots of L(κ), κ countable, satisfying any given specified
finite fragment of ZF. This will turn out to be good enough
for our purposes.
We would like to adjoin a new set x to L(κ) to get something
we call L(κ,x), and hope that L(κ,x) remains a model of ZFC.
What should L(κ,x) be? Recall the relevant part of the
constructible hierarchy (up through κ):
L(0) = ∅,
L(α+1) = FODO(L(α)),
L(λ) = ∪{L(β): β < λ}
where λ is a limit ordinal, α,λ ≤ κ.
Now let x be a set. What should L(κ,x) be?
L(0,x) = {x},
L(α+1,x) = FODO(L(α,x)),
L(λ,x) = ∪{L(β,x): β < λ}
where λ is a limit ordinal, α,λ ≤ κ.
But how do we know that L(κ,x) satisfies ZFC? It is not hard
to find even x ⊆ ω such that L(κ,x) does not satisfy ZFC.
It turns out that there are lots of x ⊆ ω that work, but
you can’t get your hands on a good description of any!
THEOREM 4.1. Let L(κ) satisfy ZFC, κ countable, and γ < κ be
infinite. Then {x ⊆ γ: L(κ,x) satisfies ZFC} is of full
category and full measure in the Cantor space S(γ).
Why is S(γ) a Cantor space? Because γ < κ is a countably
infinite ordinal.
Just to get one x ∉ L(κ) to work, we apparently need to get
lots of them.
5. UNPROVABILITY OF THE CONTINUUM HYPOTHESIS IN ZFC.
Again let L(κ) be a model of ZFC, where κ is countable.
There are a number of sharpened versions of Theorem 4.1.
THEOREM 5.1. Let L(κ) satisfy ZFC, κ countable, and assume
that L(κ) thinks that γ is uncountable. Then {x ⊆ γ: L(κ,x)
satisfies ZFC + ¬CH} is of full category and full measure
in the Cantor space S(γ).
So we now have the following.
THEOREM 5.2. If there exists L(κ) satisfying ZFC, κ
countable, then ZFC + ¬CH is consistent.
Unfortunately, the hypothesis of Theorem 5.2 cannot be
obtained from “ZF is consistent”. So we work with finite
fragments of ZF.
LEMMA 5.3. There exists L(κ) satisfying any given specified
finite fragment of ZFC, with κ countable. This is provable
in ZF for any specific finite fragment of ZFC.
LEMMA 5.4. Let T be a finite fragment of ZFC. Let L(κ)
satisfy a sufficiently large finite fragment of ZFC, where κ
is countable, and assume that L(κ) thinks that γ < κ is
uncountable. Then {x ⊆ γ: L(κ,x) satisfies T + ¬CH} is of
full category and full measure in the Cantor space S(γ).
THEOREM 5.5. If ZF is consistent then ZFC + ¬CH is
consistent.
6. UNPROVABILITY OF THE AXIOM OF CHOICE IN ZF.
Again let L(κ) satisfy ZFC, κ countable. Let x1,x2,... be an
infinite sequence of sets. We define L(κ,x1,...) as
expected:
L(0,x1,...) = {x1,...},
L(α+1,x1,...) = FODO(L(α,x1,...)),
L(λ,x1,...) = ∪{L(β,x1,...): β < λ}
where λ is a limit ordinal, α,λ ≤ κ.
THEOREM 6.1. Let L(κ) satisfy ZFC, κ countable. Then
{(x1,...) ∈ S(ω)^∞: L(κ,x1,...) satisfies ZF + “ℝ is not well
ordered”} is of full category and full measure in the
Cantor space S(ω)^∞.
Arguing as before, using finite fragments of ZFC, we get
the following.
THEOREM 6.2. If ZF is consistent then ZF + “ℝ is not well
ordered” is consistent. In particular, ZF + ¬AxC is
consistent (if ZF is consistent).
7. APPLICATIONS OF FORCING.
The method of forcing is what is behind these category/measure
results. They can be proved without resorting to
forcing. However, it turns out to be too cumbersome to
directly do category/measure arguments in more complicated
situations without the machinery of forcing.
The technique is exquisitely applicable for a great many
but not all set theoretic problems. The area is largely
mined out, but what is missing is organizational results of
an imaginative and striking character.
Let me mention just a small sample of attractive statements
that are known to be neither provable nor refutable in ZFC
(assuming ZF is consistent). (Item i needs a bit more than
ZF is consistent).
i. All sets of reals that are set theoretically definable
from a real and an ordinal are Lebesgue measurable
(alternatively have the property of Baire; i.e., differ
from an open set by a meager set).
ii. All sets of reals of cardinality less than that of ℝ are
of measure zero (alternatively meager).
iii. Every dense linear ordering in which every set of
pairwise disjoint open intervals is countable has a
countable dense subset. (Souslin’s hypothesis).
iv. All homomorphisms from the Banach algebra of continuous
complex valued functions on [0,1] are continuous. (A
conjecture of Kaplansky).
v. Every set of reals, all of whose homeomorphic images are
of measure zero, is countable. (A conjecture of E. Borel).
8. LIMITATIONS OF FORCING.
Although CH is a monumentally important question for set
theory, it has not proved to be so vital for mathematics.
One key reason is that the problem has been answered
positively for “nice” sets of reals, long ago:
THEOREM 8.1. Every uncountable Borel measurable set of
reals is in one-one correspondence with the set of all
reals. In fact, the one-one correspondence can be taken to
be Borel measurable.
The Borel measurable world - i.e., Borel measurable
sets/functions in and between Polish spaces - is more than
sufficient for the vast preponderance of current
mathematical activity. So when a mathematician who is
couching things in unusual generality, gets into logical
trouble, he/she will generally find comfort in hiding out
within (what amounts to) the Borel measurable world (or
less). For instance, the world of finitely generated
algebraic structures lies well within the Borel measurable
world.
Let’s see what happens to our five other examples of
statements that are known to be neither provable nor
refutable in ZFC.
i. All sets of reals that are set theoretically definable
from a real and an ordinal are Lebesgue measurable
(alternatively have the property of Baire). FOR BOREL SETS,
OBVIOUSLY PROVABLE, SINCE BOREL SETS ARE MEASURABLE AND
HAVE THE PROPERTY OF BAIRE.
ii. All sets of reals of cardinality less than that of ℝ are
of measure zero (alternatively meager). FOR BOREL SETS
OBVIOUSLY PROVABLE SINCE THEY MUST BE COUNTABLE.
iii. Every dense linear ordering in which every set of
pairwise disjoint open intervals is countable has a
countable dense subset. (Souslin’s hypothesis). IF THE
ORDERING IS BOREL THEN THIS HAS BEEN PROVED.
iv. All homomorphisms from the Banach algebra of continuous
complex valued functions on [0,1] are continuous. (A
conjecture of Kaplansky). IN VERY GENERAL SETTINGS, EVERY
BOREL HOMOMORPHISM IS CONTINUOUS.
v. Every set of reals, all of whose homeomorphic images are
of measure zero, is countable. (A conjecture of E. Borel).
SINCE EVERY UNCOUNTABLE BOREL SET HAS A PERFECT SUBSET, IT
HAS A HOMEOMORPHIC IMAGE OF POSITIVE MEASURE. HENCE
PROVABLE.
We now mention some general limitations to forcing.
First of all, an ultimate for a logician, something totally
out of reach, is to show that some specific notorious
standard looking statements in very finite mathematics
cannot be proved or refuted in ZFC.
EXAMPLE: Show that “π^π^π^π is rational” is neither provable
nor refutable in ZFC, assuming ZF is consistent.
Forcing is powerless to deal with a problem like this. The
same is true of Gödel’s inner model technique
(constructible sets).
Of course, I am powerless to deal with a problem like this.
But in the case of forcing and constructible sets, we know
by the following general Remark just why.
REMARK. For any model of ZFC, all of its forcing extensions
and all of its inner models have isomorphic rings of
integers, and so obey the same sentences of finite set
theory.
The situation is much worse than this. Consider sentences
of the form
(∀x ⊆ N)(∃y ⊆ N)(A(x,y))
where A involves only quantification over N. Concrete
mathematics tends to involve only quantification over N.
And when it goes beyond this, it tends to involve only one
quantifier over subsets of N. Here we have two such
quantifiers.
THEOREM 8.2. For any model of ZFC, all of its forcing
extensions and all of its inner models obey the same
sentences with at most two quantifiers over subsets of N
(as above).
9. AN EXAMPLE FROM THE BOREL UNIVERSE.
Let S be a set of ordered pairs and A be a set. We say that
f is a selection for S on A iff dom(f) = A and for all x ∈
A, (x,f(x)) ∈ S.
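For finite sets the definition of a selection can be checked directly. The sketch below represents f as a dictionary and S as a set of pairs; the function name and the sample relation (pairs (y*y, y) for a few integers y) are my own choices for illustration.

```python
def is_selection(f, S, A):
    """f: dict; S: set of ordered pairs; A: set.

    f is a selection for S on A iff dom(f) = A and (x, f(x)) is in S for all x in A.
    """
    return set(f) == set(A) and all((x, f[x]) in S for x in A)

# Sample relation: x is related to each of its integer square roots in -3..3.
S = {(y * y, y) for y in range(-3, 4)}
A = {0, 1, 4, 9}
f = {0: 0, 1: -1, 4: 2, 9: 3}   # picks one square root for each element of A
g = {0: 0, 1: 1, 4: 2, 9: 4}    # fails at 9, since (9, 4) is not in S
```

Here f is a selection for S on A and g is not. There is no selection for S on {2}, since 2 is not related to anything.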
PROPOSITION. Let S ⊆ ℝ × ℝ be Borel and E ⊆ ℝ be Borel. If
there is a Borel selection for S on every compact subset of
E, then there is a Borel selection for S on E.
THEOREM 9.1. The Proposition cannot be proved or refuted in
ZFC, assuming a bit more than ZF is consistent.
This Proposition is due to some functional analysts with a
great deal of expertise in descriptive set theory at Paris
VII, Debs and Saint Raymond. They were knowledgeable enough
to have proved it using some well known candidate axioms
going well beyond ZFC.
10. AN EXAMPLE FROM DISCRETE MATHEMATICS.
We have just seen that there are statements about the Borel
universe being considered by analysts in the natural course
of their research, that cannot be proved or refuted in ZFC.
There is a great deal of skepticism that there are
statements in discrete mathematics being considered by
mathematicians in the natural course of their research,
that cannot be proved or refuted in ZFC - or at least, that
logicians have any serious chance of showing cannot be
proved or refuted in ZFC.
However, a new area of discrete mathematics, with a clear
thematic purpose, is emerging. The new area has attractive
statements, delicate proofs, connections with various parts
of mathematics, and a vast array of deep open problems.
But some of the sharp results in the area cannot be proved
or refuted in ZFC, but can be proved using well known
candidates for new axioms.
This new area of discrete mathematics is called BOOLEAN
RELATION THEORY (BRT).
In fact, BRT is not at all restricted to discrete
mathematics. It makes sense as a project in virtually any
interesting mathematical context. We view BRT as a new kind
of mathematical investigation.
We begin with a description of BRT in its most elemental
forms. We will use N for the set of all nonnegative
integers; i.e., ω.
In what we are going to call elemental BRT, one first
identifies an interesting class V of multivariate
functions, as well as an interesting class K of sets.
Let f be a multivariate function and A be a set. We write
fA for the set of all values of f at arguments drawn from
A. Thus fA is a convenient notation for the forward image
of a multivariate function on a set.
E.g., let f be binary addition from N into N and A be the
set of odd elements of N. Then fA is the set of even
elements of N without 0.
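The forward image is easy to compute when A is finite. Here is a sketch that checks the addition example on the odd numbers below 20; the function name forward_image and the explicit arity argument are my own conventions, and truncating at 20 is just to keep everything finite.

```python
from itertools import product

def forward_image(f, arity, A):
    """fA: the set of all values of f at arguments drawn from the finite set A."""
    return {f(*args) for args in product(A, repeat=arity)}

odds = set(range(1, 20, 2))                            # the odd numbers below 20
sums = forward_image(lambda x, y: x + y, 2, odds)      # values of binary addition on odds
below_20 = {v for v in sums if v < 20}                 # compare only below the truncation
```

Below 20, the image consists of exactly the even numbers 2, 4, ..., 18, in agreement with the example: the even elements of N without 0.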
Now one considers statements of the following form.
For all f ∈ V there exists A ∈ K such that some given
Boolean relation holds between A and fA.
Let’s fix on two kinds of “Boolean relations”. A Boolean
equation (inequation) is an equation (inequation) between
Boolean combinations of A and fA.
To take complements, we need a universal set, which we take
to be the union of the elements of K. (Remember, your
mother told you never to take an unrestricted complement of
a set! It’s way too big!! Don’t put more food on your plate
than you can eat, your eyes are bigger than your stomach,
don’t go out without a jacket or you’ll catch a cold,
etc.).
Standard Boolean algebra (or PROP) tells us that the number
of formal Boolean equations (inequations), up to formal
equivalence, in the two variables A,fA, is 16.
The simplest example that is rather important and deep is
the following. Let MF(N) be the class of all functions of
several variables from N into N (range contained in N), and
INF(N) be the class of all infinite subsets of N.
THIN SET THEOREM. For all f ∈ MF(N) there exists A ∈ INF(N)
such that fA ≠ N.
Unless you are adept at just the right branch of
combinatorics, you will find this very difficult to prove.
All 16 statements in elemental inequational and equational
BRT for MF(N) and INF(N) have been completely analyzed.
For an interesting example of elemental equational BRT, we
use the class SD(N) of all strictly dominating elements of
MF(N). I.e., we require that for all x ∈ dom(f),
f(x) > max(x).
COMPLEMENTATION THEOREM. For all f ∈ SD(N) there exists A ∈
INF(N) such that fA = N\A. Furthermore, A is unique.
This is a fixed point theorem that can be obtained from
scratch, or by applying the contraction mapping theorem.
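The "from scratch" argument can be sketched as a computation. Since f(x) > max(x), whether n lies in fA depends only on the part of A below n, so we can decide membership in A for n = 0,1,2,... in turn: n goes into A exactly when it is not a value of f at arguments already in A. The code below carries this out up to a bound; the function names and the two sample functions are my own, and I treat f as having one fixed arity for simplicity.

```python
from itertools import product

def complementation_set(f, arity, bound):
    """The elements below bound of the unique A with fA = N\\A, for strictly dominating f."""
    A = []
    image = set()                     # values of f at tuples from the part of A found so far
    for n in range(bound):
        if n not in image:            # n is not in fA, so n must be in A
            A.append(n)
            # Record the new values of f at tuples that use n; all of them exceed n.
            for args in product(A, repeat=arity):
                if n in args:
                    image.add(f(*args))
    return A

def check(f, arity, A, bound):
    """Verify fA = N\\A below bound, recomputing the image from scratch."""
    image = {f(*args) for args in product(A, repeat=arity)}
    return all((n in image) != (n in A) for n in range(bound))

evens = complementation_set(lambda x, y: x + y + 1, 2, 20)   # f(x,y) = x+y+1 > max(x,y)
other = complementation_set(lambda x: 2 * x + 1, 1, 10)      # f(x) = 2x+1 > x
```

For f(x,y) = x+y+1 the set A comes out as the even numbers, whose image is the odd numbers. For f(x) = 2x+1 it begins 0, 2, 3, 4, 6, 8.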
All 16 statements in elemental equational and inequational
BRT for SD(N) and INF(N) have been completely analyzed.
In (full blown) BRT we consider all statements of the
following form:
For all f1,...,fn ∈ V there exist A1,...,Am ∈ K such that
some given Boolean relation holds between the A’s and their
forward images under the f’s.
The number of such statements is 2^(2^(m(n+1))). Even for n = 1
and m = 2 (one function and two sets) this amounts to 2^16.
The analysis of all these statements has yet to be done.
We have analyzed some special corner of equational BRT with
2 functions and 3 sets, and there is a big surprise. (The
number of statements with n = 2 and m = 3 is 2^512, quite a
large number!).
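The counts can be checked by direct arithmetic, reading the formula as 2^(2^(m(n+1))). My understanding of where it comes from, offered as a gloss rather than as part of the text: each of the m sets appears together with its n forward images, giving m(n+1) set terms, hence 2^(m(n+1)) Boolean atoms, and a Boolean equation amounts to saying which atoms are empty. The function name is mine.

```python
def number_of_statements(n, m):
    """2^(2^(m(n+1))) for n functions and m sets."""
    return 2 ** (2 ** (m * (n + 1)))

elemental = number_of_statements(1, 1)      # one function, one set
one_two = number_of_statements(1, 2)        # one function, two sets
two_three = number_of_statements(2, 3)      # two functions, three sets
```

This gives 16 for the elemental case, 2^16 = 65536 for one function and two sets, and 2^512 for two functions and three sets, a number with 155 decimal digits.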
In this corner of BRT, we use a natural subclass of MF(N)
called the functions of expansive linear growth, ELG(N).
Here we require that there exist constants c,d > 1 such
that for all but finitely many x ∈ dom(f),
c|x| ≤ f(x) ≤ d|x|.
Here |x| is just max(x).
We use the standard disjoint union notation. We write X ∪.
Y (the disjoint union of X,Y) for X ∪ Y together with the
commitment that X,Y are disjoint. For example,
X ∪. Y ⊆ Z ∪. W
means
X ∪ Y ⊆ Z ∪ W ∧ X ∩ Y = ∅ ∧ Z ∩ W = ∅.
We have analyzed all statements of the following form:
PROPOSITION. For all f,g ∈ ELG(N) there exist A,B,C ∈
INF(N) such that
X ∪. fY ⊆ Z ∪. gW
S ∪. fT ⊆ U ∪. gV
where X,Y,Z,W,S,T,U,V are among the letters A,B,C.
Obviously there are exactly 3^8 = 6561 such statements.
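The count is just the number of ways to fill eight slots with three letters, and the statements can be enumerated mechanically. In this sketch the ASCII rendering ("U." for disjoint union, "<=" for inclusion) and the names render and choices are my own conventions.

```python
from itertools import product

LETTERS = "ABC"

def render(choice):
    """Write out the two clauses for a choice of letters (X,Y,Z,W,S,T,U,V)."""
    X, Y, Z, W, S, T, U, V = choice
    return (f"{X} U. f{Y} <= {Z} U. g{W}",
            f"{S} U. f{T} <= {U} U. g{V}")

# Every assignment of A, B, C to the eight slots.
choices = list(product(LETTERS, repeat=8))
```

The list choices has 3^8 = 6561 entries, one per statement, before any identification by symmetry.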
It turns out that, up to obvious symmetry, all but ONE of
these 6561 statements can be proved or refuted within a
very weak fragment of ZFC. The ONE exception, up to
symmetry, cannot be proved or refuted in ZFC.
The ONE exception, up to symmetry, can be proved using some
well known candidate axiom - the existence of Mahlo
cardinals of every finite order.
Here is the ONE exception, up to symmetry.
PROPOSITION*. For all f,g ∈ ELG(N) there exist A,B,C ∈
INF(N) such that
A ∪. fA ⊆ C ∪. gB
A ∪. fB ⊆ C ∪. gC.