University of Pennsylvania
Department of Mathematics
Hans Rademacher Lectures
September 17-20, 2002
4:30 p.m., TTh Sep 17-19
4:00 p.m., F Sep 20
All lectures in room A6 of the David Rittenhouse Laboratory, first floor,
33rd and Walnut Streets, Philadelphia.
DEMONSTRABLY NECESSARY USES OF ABSTRACTION
Harvey M. Friedman
University Professor
Ohio State University
RADEMACHER SERIES ABSTRACT.
There are many familiar theorems whose proofs use methods which are in some
appropriate sense substantially more "abstract" than their statements. Some
particularly well known examples come from the use of complex variables in
number theory. Sometimes such abstraction can be removed (for example, by the
"elementary proof of the prime number theorem"), and sometimes no appropriate
removal is known. The interest in removing abstraction typically varies, with no
agreed upon criteria for appropriateness. E.g., the removal might sacrifice
naturalness or intelligibility, or the result of the removal might be criticized as
being merely a thinly disguised form of the original.
These Rademacher lectures focus on cases of demonstrable unremovability of
abstraction, primarily (but not solely) in the context of discrete mathematics.
These cases rely on sharp, fully formalized criteria for removal, where a proof
of unremovability has been found. The issue of "natural" removal is finessed, as
there is no removal, natural or otherwise.
More specifically, in each case we begin with a theorem whose known proofs use
methods that are unexpectedly abstract relative to its statement. Next, we delineate
flexible and comprehensive methods of lower abstraction. Then we present the
result that the original theorem cannot be proved using only these
methods of lower abstraction.
LECTURE 1. DEMONSTRABLY NECESSARY USES OF ABSTRACTION.
In the first lecture, we introduce all of the examples of demonstrable
unremovability of abstraction under discussion. No formalisms will be presented.
The identification of methods and their levels of abstraction will be given only at
the informal mathematical level, without the use of axiomatic systems. The
examples include
1. Minimization in norm of integral polynomials.
2. Termination of lexicographic descent in the natural numbers.
3. Hilbert basis theorem.
4. Degrees of algebraic approximations to sets.
5. Comparison of blocks within finite sequences of natural numbers.
6. Comparison in sequences of finite trees, and within large finite trees.
7. Graph minors in sequences of finite graphs.
8. Continuous comparison of countable sets of reals.
9. Borel diagonalization for infinite sequences of reals.
10. Borel selection/antiselection in symmetric Borel sets.
11. Borel selection in Borel sets.
12. 6561 cases of Boolean relation theory.
The climax of the series is item 12, where the demonstrably necessary level of
abstraction is so immense that it goes well beyond the usual accepted axioms for
mathematics. This is despite the fact that the context is that of functions on the
natural numbers.
LECTURE 2. POLYNOMIALS, TERMINATION, HILBERT BASES,
DEGREES.
We will present an in depth discussion of the demonstrably necessary uses of
abstract methods in the minimization in norm of integral polynomials, in the
termination of lexicographic descent in the natural numbers, in the Hilbert basis
theorem, and in the degrees of algebraic approximations to sets.
LECTURE 3. COMPARISON OF BLOCKS, TREES, GRAPHS, COUNTABLE
POINTSETS.
We will present an in depth discussion of the demonstrably necessary uses of
abstract methods in the comparison of blocks within finite sequences of natural
numbers, in the comparison of terms in sequences of finite trees, in the
comparison of subtrees within large finite trees, in graph minors within sequences
of finite graphs, and in the continuous comparison of countable sets of reals.
LECTURE 4. BOREL DIAGONALIZATION, BOREL SELECTION,
BOOLEAN RELATION THEORY.
We will present an in depth discussion of the demonstrably necessary uses of
abstract methods in Borel diagonalization for infinite sequences of reals, in Borel
selection/antiselection in symmetric Borel sets of reals, in Borel selection in Borel
sets of reals, and finally in 6561 cases of Boolean relation
theory. In this last case, the demonstrably necessary level of abstraction is so
immense that it goes well beyond the usual accepted axioms for mathematics.
This is despite the fact that the context is that of functions on the natural numbers.
HANS RADEMACHER LECTURES
Philadelphia, Pennsylvania
LECTURE 1
DEMONSTRABLY NECESSARY USES OF ABSTRACTION
Harvey M. Friedman
The Ohio State University
http://www.math.ohio-state.edu/~friedman/
September 17, 2002
In order to prove a Theorem, we sometimes use ideas and
constructions that are, in some sense, out of character
with the statement of the Theorem. This is a perfectly
normal situation. It is one of those things that makes
mathematics so intriguing.
We are all familiar with cases where substantial machinery
of an exotic nature seems to be needed to prove a
comparatively mundane assertion: there are no positive
integers x,y,z,n such that x^{n+2} + y^{n+2} = z^{n+2}.
But can we give examples where we can prove that all proofs
of a given Theorem are comparatively exotic?
Before we can prove such a result, we need to be able to
state it mathematically. Normally, finding an appropriate
formulation is highly nontrivial. I will begin with a
familiar example where this turns out to be utterly
straightforward.
Imagine that you are working in an ancient mathematical
context where numerical quantities can be added,
subtracted, multiplied, divided, and compared in size. Then
one day you encounter
x^2 = 2.
Or, in more rudimentary terms,
x•x = 1+1.
The problem of the existence of a solution baffles
everyone.
Then some clever person comes up with the following
startling idea:
If an expression without division takes on both negative
and positive values, then it takes on zero.
This intermediate value principle for polynomials is beyond
the mathematics of the time, which consists of
manipulations of equalities and inequalities. Eventually it
becomes an accepted way of thinking about “numerical
quantities”, through a combination of introspection and
utility.
Of course, the actual history is much more subtle than
this. But in this case it is particularly easy to give a
satisfactory formulation of the phenomena in question.
The ancient mathematical context is formalized by the usual
ordered field axioms. The result is that
in some ordered fields, x^2 = 2 has no solution.
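The rational numbers are one such ordered field. The following Python sketch is an illustration only, not part of the lecture material; the function name bisect_sqrt2 is hypothetical. It bisects the expression x•x - 2 on the interval [1,2] using exact rational arithmetic. The expression takes a negative value at one endpoint and a positive value at the other, yet no rational midpoint ever gives zero, so the sign change alone does not produce a solution without an additional principle.

```python
from fractions import Fraction

def bisect_sqrt2(steps):
    """Bisect p(x) = x*x - 2 on [1, 2] with exact rationals (illustrative sketch)."""
    lo, hi = Fraction(1), Fraction(2)      # p(lo) < 0 < p(hi)
    for _ in range(steps):
        mid = (lo + hi) / 2
        value = mid * mid - 2
        if value == 0:                     # never taken: no rational squares to 2
            return mid, mid
        if value < 0:
            lo = mid
        else:
            hi = mid
    return lo, hi

lo, hi = bisect_sqrt2(20)
print(float(lo), float(hi))                # two rationals bracketing the sign change
```

The bracket shrinks by half at every step but never collapses to a point of the field.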
The additional principle used to prove existence has more
than one formalization. An “expression” could mean a
polynomial with or without variable coefficients.
The first interpretation suffices to prove the existence of
a square root of 2.
However it does not give the existence of a square root of
every nonnegative number. E.g., there is an ordered field
containing all algebraic real numbers, and π, but without
any square root of π.
The second interpretation yields the full ordered real
closed field axioms.
Of course this is all very familiar, and in particular, no
mathematical logic is needed to give a satisfactory
formulation for a number of such situations, including many
that arise from Euclidean geometry.
Nevertheless, we want to give a mathematical logic
formulation, even if it is less informative than the
preceding algebraic formulation.
1. Write down the ordered field axioms, not as algebraic
conditions, but as a formal system in first order predicate
calculus. Argue by example that this captures a significant
kind of mathematical proof.
2. Show that (∃x)(x^2 = 2) cannot be proved in this formal
system. This follows from the stronger result that there is
an ordered field in which 2 has no square root.
3. Write down the above intermediate value principle and
add it to the formal system. Show that (∃x)(x^2 = 2) can now
be proved in this extended system.
In less elementary mathematical contexts, the mathematical
logic viewpoint is typically required in order to have a
chance of a successful formulation. At least as far as we
know now.
In these lectures, we will adhere to the above plan 1-3.
1. Minimization in norm of integral polynomials.
THEOREM 1.1. Every polynomial of several variables with
integer coefficients achieves a value of least magnitude
over the integers.
This clearly follows from the least number principle; i.e.,
any property that holds of a nonnegative integer holds of a
least nonnegative integer.
One can see by example that most theorems living in the
integers have constructive proofs.
From experience we see that constructivity is a
significant, interesting, and natural condition on proofs.
There are some well known theorems in the integers (or
algebraics) where no constructive proofs are known, and
there is great interest in finding constructive proofs.
E.g., Roth’s theorem on approximability, and Faltings’
proof of Mordell’s conjecture.
A well known development in mathematical logic provides
powerful formalizations of constructive reasoning that
correspond well to informal ideas.
There is an array of results demonstrating that any theorem
proved constructively must have certain algorithmic
properties.
In the case of Theorem 1.1, if it is to be proved
constructively, then it must have the following algorithmic
property:
There must be an algorithm that produces a value of least
magnitude over the integers of any given integral
polynomial. However, by using the solution to Hilbert’s
10th problem (no algorithm for deciding the existence of
solutions to Diophantine equations over the integers),
there is no such algorithm.
THEOREM 1.2. Theorem 1.1 is not constructively provable.
It would be nice to have a supply of deeper theorems in the
integers which we know cannot be proved constructively.
So far, we could be content with avoiding mathematical
logic (proof theory), and think solely in terms of
algorithms. However, there is another aspect of Theorem
1.1, which we take up in the second lecture, regarding the
nature of the inductive argument that is involved.
Mathematical logic is used for the formulation in an
apparently essential way.
2. Termination of lexicographic descent in the natural
numbers.
The lex ordering on N^k is the dictionary ordering.
THEOREM 2.1. Every sequence from N^k that is decreasing in
the lex ordering terminates.
In the case k = 1, it is obvious that the sequence must
stop, and the number of steps is at most the first term.
Things get interesting with k = 2. Here one argues by
existential numerical induction. We will give a sense in
which this is required.
For fixed k, we can also argue by existential numerical
induction, but we repeat the argument roughly k times.
The full Theorem 2.1 is also proved by induction, but
cannot be proved by existential numerical induction. It can
be proved by universal/existential numerical induction.
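To see why k = 2 already differs from k = 1, here is a small Python sketch (an illustration only, not part of the lecture material; the names descend and refill are hypothetical). It walks down N^2 in the lex ordering: the second coordinate counts down, and each time the first coordinate drops, the second coordinate is reset to an arbitrary value chosen by refill. The walk always terminates, but its length is not bounded in terms of the first term alone.

```python
def descend(start, refill):
    """Return one lex-decreasing sequence in N^2 from `start` down to (0, 0).

    `refill(a)` is a hypothetical rule choosing the new second coordinate
    each time the first coordinate drops to a.
    """
    a, b = start
    seq = [(a, b)]
    while (a, b) != (0, 0):
        if b > 0:
            b -= 1                 # decrease the second coordinate
        else:
            a -= 1                 # decrease the first coordinate ...
            b = refill(a)          # ... and reset the second one freely
        seq.append((a, b))
    return seq

seq = descend((2, 0), lambda a: 1000)
# Python compares tuples lexicographically, so this checks lex descent.
print(len(seq), all(s > t for s, t in zip(seq, seq[1:])))
```

Replacing 1000 by a larger value makes the sequence longer while the first term stays (2,0).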
The same results hold for the following sharper theorem.
For x,y ∈ N^k, write x ≤_c y iff for all i, x_i ≤ y_i.
THEOREM 2.2. Every infinite sequence from N^k has a finite
initial segment such that every term is ≥_c some term in that
finite initial segment.
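For a finite list of tuples, the initial segment in Theorem 2.2 can be computed directly. The Python sketch below is an illustration only (the names dominates and covering_prefix are hypothetical). It returns the least n such that every term of the list is ≥_c one of the first n terms. The theorem itself concerns infinite sequences, for which no such direct computation is available.

```python
def dominates(x, y):
    """True iff x >=_c y, i.e. x_i >= y_i for every coordinate i."""
    return all(a >= b for a, b in zip(x, y))

def covering_prefix(seq):
    """Least n such that every term of `seq` is >=_c one of its first n terms."""
    kept = []   # terms that are not >=_c any earlier term
    n = 0
    for index, x in enumerate(seq):
        if not any(dominates(x, m) for m in kept):
            kept.append(x)
            n = index + 1          # the prefix must reach at least this term
    return n

example = [(3, 1), (2, 2), (1, 5), (4, 4), (2, 3), (0, 7), (5, 5)]
print(covering_prefix(example))    # → 6
```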
Here it is more difficult to be economical about axioms
used in the proofs, but this has been done.
In lecture 2 we will discuss how to obtain finite versions
with quantitative information.
3. Hilbert basis theorem.
For the purpose of logical analysis, one good formulation
of the Hilbert basis theorem is as follows.
THEOREM 3.1. Let P_1,P_2,... be an infinite sequence of
polynomials from the polynomial ring in k variables over a
(countable) field. There exists n such that every P_i lies in
the ideal generated by P_1,P_2,...,P_n.
In an appropriate sense, HBT is equivalent to Theorem 2.2.
The idea of the connection can be seen by the following
simple construction.
For each x ∈ N^k let #(x) be the monomial in k variables,
where the exponent of the ith variable is the ith
coordinate of x. Let x_1,x_2,... ∈ N^k, and consider the
monomials #(x_1),#(x_2),... . By HBT, let every #(x_i) be ideal
generated by #(x_1),...,#(x_n). Then obviously every x_i is ≥_c
at least one of x_1,...,x_n.
We will discuss finite versions of the Hilbert basis
theorem and associated quantitative information. The
quantitative information is “exotic”.
We also consider sequences of algebraic sets and obtain the
same results.
4. Degrees of algebraic approximations to sets.
The logically exotic nature of HBT and its finite forms
generally does not spill over to results in commutative
algebra/algebraic geometry.
Often using HBT is easiest, but hard work will avoid HBT
and lead to decent estimates. E.g., the decomposition of
algebraic sets into irreducible components. My impression
is that there are some important situations where removal
of HBT has not yet been achieved.
We have found a structural result which is proved using
HBT, but where the quantitative information is exotic, as
in the finite versions of HBT.
Let F be a field and k ≥ 1. Let S ⊆ F[x_1,...,x_k]. The nth
algebraic approximation to S is the least superset of S of
presentation degree ≤ n.
For any set S ⊆ F[x_1,...,x_k], we look at the series of
algebraic approximations S[0] ⊇ S[1] ⊇ ... . By HBT, this
terminates. Of course, for algebraic S, this trivially
terminates with S itself.
We say that S ⊆ F[x_1,...,x_k] is rich iff S[0] ⊋ S[1] ⊋ ... ⊋
S[n] = S, for some n ≥ 0.
THEOREM 4.1. For each k ≥ 1 and field F, there is a bound to
the presentation degrees of the rich S ⊆ F[x_1,...,x_k]. In
fact, there is a bound h(k) that depends only on k and not
on F.
We have shown that the best possible bounds are exotic.
5. Comparison of blocks within finite sequences of natural
numbers.
The block subsequence theorem involves a single finite
string in k letters. The binary case is elementary. This
challenge is given to gifted high school students in Paul
Sally’s program. At least one student solved it.
THEOREM 5.1. There is a longest finite string x_1,...,x_n in
two letters such that no consecutive block x_i,...,x_{2i} is a
subsequence of a later consecutive block x_j,...,x_{2j}.
The longest length is 11, with 12221111111 and 21112222222
only.
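The binary case is small enough to check by exhaustive search. The Python sketch below is an illustration only (the names is_subsequence, has_property, and longest_strings are hypothetical). It reads "block x_i,...,x_{2i}" with 1-based indices and 2i ≤ n, and "subsequence" as not necessarily consecutive. Since the property passes to initial segments, the search extends good strings one letter at a time until no extension survives.

```python
def is_subsequence(short, long):
    """True iff `short` occurs in `long` as a (not necessarily consecutive) subsequence."""
    it = iter(long)
    return all(ch in it for ch in short)

def has_property(s):
    """True iff no block x_i..x_{2i} is a subsequence of a later block x_j..x_{2j}."""
    n = len(s)
    # 1-based block x_i..x_{2i} is the Python slice s[i-1 : 2*i]
    blocks = [s[i - 1:2 * i] for i in range(1, n // 2 + 1)]
    return not any(is_subsequence(blocks[i], blocks[j])
                   for i in range(len(blocks))
                   for j in range(i + 1, len(blocks)))

def longest_strings(alphabet):
    """All strings of maximal length over `alphabet` having the property."""
    current = [""]
    while True:
        longer = [s + c for s in current for c in alphabet if has_property(s + c)]
        if not longer:
            return current
        current = longer

witnesses = longest_strings("12")
print(len(witnesses[0]), witnesses)
```

The same search is hopeless for three letters, in line with the exotic size of the longest length in Theorem 5.2.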
THEOREM 5.2. There is a longest finite string x_1,...,x_n in
three letters such that no consecutive block x_i,...,x_{2i} is a
subsequence of a later consecutive block x_j,...,x_{2j}.
Theorem 5.2 merely states the existence of a natural number
with a specific “testable” property. Nevertheless, the
simplest way to prove this appears to involve not only
infinite sequences but also defining infinite sequences by
reference to all infinite sequences (impredicativity). In
this context, such impredicativity can be avoided with
considerable difficulty. However, there still has to be
something exotic about the proof. The longest length is
also exotic. The exotic nature of all proofs and the exotic
nature of the longest length are closely related.
THEOREM 5.3. There is a longest finite string x_1,...,x_n in
any given finite alphabet such that no consecutive block
x_i,...,x_{2i} is a subsequence of a later consecutive block
x_j,...,x_{2j}.
As expected, the proof of Theorem 5.3 and the associated
quantitative information is yet more exotic. This is
explained in the third lecture.
6. Comparison in sequences of finite trees, and within
large finite trees.
We use the partial ordering definition of finite trees.
I.e., a tree is a finite poset with a least element (root),
where the predecessors of any point are linearly ordered.
Note that there is an obvious inf operation on the vertices
of any finite tree.
J.B. Kruskal works with inf preserving embeddings between
finite trees.
THEOREM 6.1. In any infinite sequence of finite trees, one
tree is inf preserving embeddable into a later one.
He also considers finite trees whose vertices are labeled
from a finite set (and more generally).
THEOREM 6.2. In any infinite sequence of finite trees with
vertices labeled from a finite set, one tree is inf and
label preserving embeddable into a later one.
Kruskal’s proof of Theorem 6.1, as well as the simplest
known proof, due to Nash-Williams, are exotic. Nash-Williams introduced
the impredicative “minimal bad sequence” argument. We
showed that the construction of infinite sequences by
reference to all infinite sequences is unavoidable
(impredicativity).
We gave a series of finite forms of Kruskal’s theorem.
The finite forms and associated quantitative information
are also shown to be exotic.
These finite forms involve looking at long finite sequences
of finite trees. As was the case with HBT, we got
interested in structural information about a single
sufficiently large finite tree.
THEOREM 6.3. In any sufficiently tall full tree labeled
from a finite set, one truncation is inf, label, and
terminal preserving embeddable into a higher truncation.
We prove that this finite form and its associated
quantitative information is as exotic as the finite forms
involving finite sequences.
7. Graph minors in sequences of finite graphs.
Finite graphs are pairs (V,E), where for each element of E,
we assign a set of vertices of cardinality 1 or 2
(incidence).
G is minor included in H iff G can be obtained from H by
successively deleting a single edge, contracting a single
edge, or removing an isolated vertex.
Minor inclusion is normally taken up to isomorphism.
Here is the graph minor theorem of Robertson/Seymour.
THEOREM 7.1. In any infinite sequence of finite graphs, one
graph is minor included in a later graph.
At a critical place, the proof of Theorem 7.1 uses an
iterated form of the infinite bad sequence argument. A
single uniterated use of the infinite bad sequence argument
is used to prove Kruskal’s tree theorem. However, such
iterations are known to be more powerful as the length of
the iteration increases.
Before the graph minor theorem was proved, we proved a
strengthening of Kruskal’s theorem called the extended
Kruskal theorem using a finitely iterated minimal bad
sequence argument. We had also proved that, in an
appropriate sense, these iterations are unavoidable.
We asked Robertson/Seymour to explicitly derive our EKT from
their GMT.
Robertson/Seymour succeeded in doing this. Therefore the
GMT is at least as exotic as the EKT.
We gave finite forms of EKT with associated numerical
information.
Some of these were converted to finite forms of GMT, also
with associated numerical information. The exotic nature of
EKT and GMT is retained.
8. Continuous comparison of countable sets of reals.
The following is in the classical folklore.
THEOREM 8.1. For any two closed sets of real numbers, one
is continuously embeddable into the other.
The proof uses the Cantor-Bendixson countably transfinite
decomposition of closed sets.
Theorem 8.1 can be made to follow from the following main
case:
THEOREM 8.2. For any two countable closed sets of real
numbers, one is continuously embeddable into the other.
The following is also from the classical folklore.
THEOREM 8.3. For any two countable compact metric spaces,
one is continuously embeddable into the other.
By a more careful argument, we have shown the following.
THEOREM 8.4. For any two countable sets of real numbers,
one is continuously embeddable into the other. For any two
countable metric spaces, one is continuously embeddable
into the other.
Here is the most rudimentary form of this result.
THEOREM 8.5. For any two sets of rational numbers, one is
continuously embeddable into the other.
All of these results must, in an appropriate sense, use
arguments involving arbitrary countable ordinals.
We will take this up later, including a way of saying
“must” using descriptive set theory rather than
mathematical logic.
9. Borel diagonalization for infinite sequences of reals.
Here is one form of Cantor’s theorem.
THEOREM 9.1. For any infinite sequence of real numbers,
some real number is not a coordinate of the sequence.
There is a reasonable way of getting a real number that is
off the sequence, from the point of view of descriptive set
theory.
THEOREM 9.2. There is a Borel measurable function F: ℝ^∞ → ℝ
such that for all x ∈ ℝ^∞, F(x) is not a coordinate of x.
The construction of F is by diagonalization, and there is
every reason to believe that the value of F depends on the
order in which the arguments are given.
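The order dependence is easy to see on a finite truncation. The Python sketch below is an illustration only and not the construction of F from the lectures; the name diagonal_digits and the sample digit strings are hypothetical. Each row holds the first few decimal digits of one term of the sequence, and the diagonal output differs from row i at digit i. Listing the same rows in a different order changes the output.

```python
def diagonal_digits(rows):
    """Digits of a number that differs from rows[i] at its i-th digit.

    Only the digits 4 and 5 are produced, which avoids the ambiguity
    between expansions ending in all 0s and all 9s.
    """
    return "".join("4" if row[i] == "5" else "5" for i, row in enumerate(rows))

rows = ["1415", "7182", "4142", "5555"]
reordered = ["5555", "1415", "7182", "4142"]   # same terms, different order

print(diagonal_digits(rows))        # → 5554
print(diagonal_digits(reordered))   # → 4555
```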
THEOREM 9.3. There is no Borel measurable function F: ℝ^∞ → ℝ
obeying rng(x) = rng(y) → F(x) = F(y), such that for all x
∈ ℝ^∞, F(x) is not a coordinate of x.
Or put positively,
THEOREM 9.4. Let F: ℝ^∞ → ℝ be Borel measurable, where for
all x,y ∈ ℝ^∞, rng(x) = rng(y) → F(x) = F(y). There exists x
∈ ℝ^∞ such that F(x) is a coordinate of x.
The proof uses a Baire category argument on the highly
nonseparable space ℝ^∞, where ℝ is given the discrete
topology. In fact, we discuss a family of such results in
lecture 4, including results to the effect that the
nonseparable arguments cannot be replaced by separable
arguments.
The necessary use of machinery becomes much more dramatic
when we consider Borel equivalence relations on ℝ. I.e.,
equivalence relations E ⊆ ℝ × ℝ on ℝ which are Borel
measurable.
THEOREM 9.5. Let E be a Borel equivalence relation on ℝ.
Let F: ℝ^∞ → ℝ be Borel, where if x,y ∈ ℝ^∞ have E-equivalent
coordinates then F(x),F(y) are E-equivalent. Then there
exists x ∈ ℝ^∞ such that F(x) is E-equivalent to a
coordinate of x.
In order to prove Theorem 9.5, we must not only use ℝ, but
also S(ℝ), SS(ℝ), SSS(ℝ), SSSS(ℝ), and even more than
this. We must use all countably transfinite iterations of
the power set operation.
10. Borel selection/antiselection in symmetric Borel sets.
Necessary uses of countably transfinite iterations of the
power set operation are rather dramatic, and we have been
interested in trying to relate this to standard situations
in classical analysis. We have been able to do this in the
context of Borel selection.
Let E ⊆ ℝ × ℝ. We say that E is symmetric iff (x,y) ∈ E ↔
(y,x) ∈ E. We say that f is a selection for E on ℝ if and
only if for all x ∈ ℝ, (x,f(x)) ∈ E.
Here is some background regarding selection.
THEOREM 10.1. There is a Borel set E ⊆ ℝ × ℝ such that i)
for all x ∈ ℝ there exists y ∈ ℝ such that (x,y) ∈ E; ii)
there is no Borel selection for E on ℝ. However, if i)
holds then there is a Lebesgue measurable selection for E
on ℝ (for Borel E).
THEOREM 10.2. Let E ⊆ ℝ × ℝ be a symmetric Borel set. Then
E or ℝ\E has a Borel selection on ℝ.
The proof of Theorem 10.2 uses all countable transfinite
iterations of the power set operation in a demonstrably
essential way.
The number of iterations needed corresponds to the level of
E in the Borel hierarchy.
We proved Theorem 10.2 using a theorem of infinite game
theory due to Donald Martin, called Borel determinacy. This
theorem was first proved by Martin in the mid 1960’s using
large cardinals going way beyond the usual ZFC axioms. In
1968 we proved that every proof of Borel determinacy must
use all countably transfinite iterations of the power set
operation. In 1974, Martin proved Borel determinacy using
exactly all such.
In 1981 we also proved that Theorem 10.2 requires use of
all countably transfinite iterations of the power set
operation.
11. Borel selection in Borel sets.
We recently became acquainted with a series of joint papers
of two functional analysts Debs and Saint Raymond of U.
Paris concerning selection theorems (they use different
terminology).
We need a more general notion of selection. Let S be a set
of ordered pairs, A a set. f is a selection for S on A iff
dom(f) = A and (∀x ∈ A)((x,f(x)) ∈ S).
THEOREM 11.1. Let S ⊆ ℝ × ℝ be Borel and E ⊆ ℝ be Borel
with empty interior. If there is a continuous selection for
S on every compact subset of E, then there is a continuous
selection for S on E.
A proof of Theorem 11.1 using BD is implicit in Debs/Saint
Raymond. We have shown that if we use only a transfinite
iteration of the power set operation up to a single
countable ordinal, then we cannot prove Theorem 11.1.
The following result is also implicit in Debs/Saint
Raymond.
PROPOSITION 11.2. Let S ⊆ ℝ × ℝ and E ⊆ ℝ be Borel. If
there is a Borel selection for S on every compact subset of
E, then there is a Borel selection for S on E.
We say “Proposition” instead of “Theorem” because
Debs/Saint Raymond use an axiom that goes beyond ZFC to
prove this. What they use is still fairly innocent as far
as extensions of ZFC go. This will be discussed in the
fourth lecture.
We have shown that Proposition 11.2 cannot be proved in
ZFC.
12. 6561 cases of Boolean relation theory.
We have discovered a general class of mathematical problems
which make good sense in a great variety of contexts, but
which present severe logical difficulties even in concrete
contexts.
Boolean Relation Theory (BRT) concerns the Boolean
relations between sets and their images under multivariate
functions.
More specifically, let f be a multivariate function and A
be a set. We define
fA = {f(x_1,...,x_k): k is the arity of f and x_1,...,x_k ∈ A}.
We find it very convenient to suppress the arity of f and
use the notation fA.
Let f be a multivariate function from N into N. We say that
f is strictly dominating if and only if
for all x ∈ dom(f), f(x) > max(x).
Here are two simple examples of Boolean relation theory.
1. For all strictly dominating f there exists infinite A ⊆
N such that fA = N\A.
2. For all strictly dominating f,g there exist infinite
A,B,C ⊆ N such that C ∩ fA = C ∩ gB = fA ∩ gB = ∅.
Statement 1 is called the Complementation Theorem and plays
a special role in BRT. We leave the proof of both
statements to the audience.
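For readers who want to experiment with statement 1, here is a Python sketch (an illustration only, under the stated assumptions; the name complementation_set and the sample function f(x,y) = x+y+1 are hypothetical). It uses one natural recursion: since f(x) > max(x), whether n lies in fA depends only on the elements of A below n, so membership in A can be decided for n = 0, 1, 2, ... in turn.

```python
from itertools import product

def complementation_set(f, arity, bound):
    """Return the elements below `bound` of a set A with fA = N \\ A.

    Assumes f is strictly dominating: f(x) > max(x) for every tuple x.
    """
    A = []
    for n in range(bound):
        # f(x) == n forces every coordinate of x to be below n,
        # so only the part of A built so far can matter.
        in_image = any(f(*xs) == n for xs in product(A, repeat=arity))
        if not in_image:
            A.append(n)            # n is not in fA, so it must be in A
    return A

f = lambda x, y: x + y + 1         # a sample strictly dominating function
print(complementation_set(f, 2, 10))   # → [0, 2, 4, 6, 8]
```

For this sample f, the set A is the even numbers and fA is the odd numbers.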
The first example involves only one function and one set.
We have called this “baby BRT”. We do not know of any
interesting concrete contexts where BRT with one function
and one set leads to severe logical difficulties.
The second example involves two functions and three sets.
Here we know of interesting concrete contexts where BRT
with two functions and three sets leads to severe logical
difficulties.
Let f be a multivariate function from N into N. We say that
f is of expansive linear growth if and only if there exist
c,d > 1 such that for all but finitely many x ∈ dom(f),
c|x| ≤ f(x) ≤ d|x|
where |x| is the maximum coordinate of the tuple x.
We use X ∪. Y for X ∪ Y together with the commitment that
X,Y are disjoint. E.g.,
X ∪. Y ⊆ Z ∪. W
means
X ∪ Y ⊆ Z ∪ W ∧ X ∩ Y = ∅ ∧ Z ∩ W = ∅.
PROPOSITION 12.1. For all f,g of expansive linear growth,
there exist infinite A,B,C ⊆ N such that
A ∪. fA ⊆ C ∪. gB
A ∪. fB ⊆ C ∪. gC.
We have given a proof of Proposition 12.1 using certain
large cardinals that go well beyond the usual axioms of
ZFC. We have also shown that ZFC alone does not suffice. In
fact, a little less potent large cardinals than are used in
the proof do not suffice.
It is clear that Proposition 12.1 has a particularly simple
structure compared to a typical statement in Boolean
relation theory.
In fact, the two clauses in Proposition 12.1 have the form
X ∪. fY ⊆ Z ∪. gW
S ∪. fT ⊆ U ∪. gV
where X,Y,Z,W,S,T,U,V are among the three letters A,B,C.
This amounts to a particular set of instances of Boolean
relation theory of cardinality 3^8 = 6561.
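The count can be confirmed by listing the instances. The Python sketch below is an illustration only (the name statement is hypothetical). It fills the eight positions X,Y,Z,W,S,T,U,V with letters from A,B,C in every possible way and checks that Proposition 12.1 is among the results.

```python
from itertools import product

def statement(choice):
    """Render one instance as a pair of clauses, given 8 letters from A, B, C."""
    X, Y, Z, W, S, T, U, V = choice
    return (f"{X} ∪. f{Y} ⊆ {Z} ∪. g{W}", f"{S} ∪. f{T} ⊆ {U} ∪. g{V}")

all_statements = [statement(c) for c in product("ABC", repeat=8)]
proposition_12_1 = ("A ∪. fA ⊆ C ∪. gB", "A ∪. fB ⊆ C ∪. gC")

print(len(all_statements), proposition_12_1 in all_statements)
```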
We have been able to show that all of these 6561 statements
are provable or refutable using the same large cardinal
axioms that we use to prove Proposition 12.1. Obviously, we
need the large cardinal axioms since 12.1 is among the
6561.
Furthermore, the logical difficulties associated with
Proposition 12.1 are not dependent on the wildness of
arbitrary multivariate functions from N into N or arbitrary
infinite subsets of N. The logical difficulties remain even
if we restrict the functions and sets to concrete countable
families. These matters will be discussed in the fourth
lecture.
We think that BRT is mathematically interesting enough that
mathematicians will want to develop it, despite the
necessary rethinking of the foundations for mathematics.
This remains to be seen.
LECTURE 2
POLYNOMIALS, TERMINATION, HILBERT BASES, DEGREES
Harvey M. Friedman
The Ohio State University
http://www.math.ohio-state.edu/~friedman/
September 18, 2002
In the remaining three lectures, we discuss the 12
topics from the first lecture in order.
The unavoidably exotic nature of the proof methods becomes
successively clearer and stronger from lecture 2 to lecture
4. In this lecture, the exotic nature is at times quite
subtle, involving the nature of induction statements and
constructivity. However, you should find the associated
numerical information more dramatic.
1. Minimization in norm of integral polynomials.
THEOREM 1.1. Every polynomial of several variables with
integer coefficients achieves a value of least magnitude
over the integers.
This is an obvious consequence of the least number
principle: any property that holds of a nonnegative
integer holds of a least nonnegative integer.
The nonconstructive nature of the obvious proof is
apparent: let n be any value. If it is not least, then pass
to a lower value. Keep going on at most n times till you
get to the least value. But you can’t tell when you have
arrived at the least value, or how to get to a lower value.
There is a purely algorithmic way of looking at the
nonconstructivity. The logical form of the statement is
(∀n)(∃m)(∀r)(A(n,m,r)).
Here n,m,r range over nonnegative integers. The n codes the
given polynomial of several variables. The m,r code vector
arguments.
A(n,m,r) asserts that for the P coded by n, |P(r’)| ≥
|P(m’)|, where r codes the vector r’ and m codes the vector
m’. This is typical of how coding is used to sort out
logical issues.
According to any reasonable idea of constructivity, if a
statement of the form ∀∃∀ or even ∀∃∀... is provable
constructively, then there must be an algorithm which
produces an instance of the first existential quantifier
given an instance of the first universal quantifier.
In our case, this is just a computable f: N → N such that
(∀n)(∀r)(A(n,f(n),r)).
I.e. an algorithm which, when presented with an integral
polynomial P, produces an argument where P achieves a value
of least magnitude.
It is a bit more convenient to use the following
consequence of this. Namely, an algorithm which, when
presented with an integral polynomial P, produces a value
of P of least magnitude.
Hilbert’s 10th problem asks for an algorithm that decides
whether an integral polynomial has a zero over the
integers. This was answered negatively by Matiyasevich, J.
Robinson, Davis, Putnam.
Suppose there is an algorithm which, when presented with an
integral polynomial P, produces a value of P of least
magnitude over Z. Then we get an algorithm which, when
presented with an integral polynomial P, decides whether P
has a zero, since P has a zero over Z iff any value of P
over Z of least magnitude is 0. Thus there is no such
algorithm.
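The reduction in the preceding paragraph is a one-line wrapper around the supposed algorithm. The Python sketch below is an illustration only; the names evaluate, toy_least_magnitude, and has_integer_zero are hypothetical. The parameter least_magnitude_value stands for the algorithm that cannot exist for polynomials of several variables. To make the sketch runnable, it is instantiated with a toy stand-in that handles only polynomials in one variable, where a bounded search suffices.

```python
def evaluate(coeffs, x):
    """Horner evaluation; coeffs[i] is the integer coefficient of x**i."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def toy_least_magnitude(coeffs):
    """Least |P(x)| over the integers, for ONE-variable P only (toy stand-in).

    Outside |x| <= bound, |P(x)| exceeds |P(0)|, so the search is exhaustive.
    """
    bound = 2 * sum(abs(c) for c in coeffs) + 1
    return min(abs(evaluate(coeffs, x)) for x in range(-bound, bound + 1))

def has_integer_zero(poly, least_magnitude_value):
    """P has an integer zero iff its value of least magnitude is 0."""
    return least_magnitude_value(poly) == 0

print(has_integer_zero([-4, 0, 1], toy_least_magnitude))   # x^2 - 4
print(has_integer_zero([-2, 0, 1], toy_least_magnitude))   # x^2 - 2
```

With a genuine algorithm in place of the toy stand-in, has_integer_zero would decide Hilbert’s 10th problem, which is impossible.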
This establishes the nonconstructivity of Theorem 1.1 from
the algorithmic point of view. However, there is a well
understood proof theoretic approach to constructivity. We
will give some details in the case of formal arithmetic.
The system PA (Peano Arithmetic) may be familiar, but
perhaps not HA (Heyting Arithmetic). HA is the constructive
form of PA.
In all systems of logic discussed here, we use connectives
¬, ∧, ∨, →, ↔, quantifiers ∀, ∃, variables, and =. The usual
axioms and rules of logic are understood. In the case of
PA, we use 0,S,+,•,=. The axioms are:
1. ¬S(x) = 0, S(x) = S(y) → x = y.
2. x+0 = x, x+(S(y)) = S(x+y).
3. x•0 = 0, x•(S(y)) = x•y + x.
4. (φ[x/0] ∧ (∀x)(φ → φ[x/S(x)])) → φ, where φ is any
formula in the language.
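The equations for + and • in axioms 2 and 3 are recursion equations: they determine sums and products of numerals from 0 and S alone. The Python sketch below is an illustration only (the names S, add, and mul are hypothetical Python stand-ins for the symbols S, +, and •). Each function unfolds the corresponding equations on ordinary nonnegative integers.

```python
def S(n):
    """Successor."""
    return n + 1

def add(x, y):
    """x + 0 = x,  x + S(y) = S(x + y)."""
    return x if y == 0 else S(add(x, y - 1))

def mul(x, y):
    """x • 0 = 0,  x • S(y) = x • y + x."""
    return 0 if y == 0 else add(mul(x, y - 1), x)

print(add(2, 3), mul(3, 4))   # → 5 12
```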
PA is a very strong system, and it is believed that all
core mathematical theorems to date that are appropriately
expressed in finite terms can be proved in PA. We discuss
some exceptions involving trees, graphs in lecture 3 (are
they in the "core"?).
HA is the same as PA except that we use the axioms and
rules of constructive (intuitionistic) logic. This amounts
to dropping the law of excluded middle, or, equivalently,
dropping the rule that allows the deduction of A from a
contradiction from ¬A. Fortunately, constructive logic is
rather robust and unmistakable.
There are some very pleasing well known facts about PA vs.
HA.
1. Every ∀ statement provable in PA is provable in HA.
2. Every ∀∃ statement provable in PA is provable in HA.
3. If (∀x)(∃y)(φ(x,y)) is provable in HA then there is a
recursive f:N → N such that (∀x)(φ(x,f(x))).
4. If (∀x)(∃y)(φ(x,y)) is provable in HA then there is an
algorithm f such that (∀x)(φ(x,f(x))) is provable in HA.
We say that a formula is in class n ≥ 0 if and only if it
starts with ≤ n quantifiers and is followed by a formula
with bounded quantifiers only.
For each n ≥ 0, write PA_n for PA where the induction scheme
is restricted to class n formulas.
With some effort, PA_n can be shown equivalent to:
i) using ∀∃... of length n in the induction scheme;
ii) using ∃∀... of length n in the induction scheme;
iii) using either of the above in the least number
principle scheme instead of the induction scheme.
It is now clear that Theorem 1.1 is provable in PA1.
PA0 is too weak for many purposes. PA0 is more robust if we
introduce exponentiation into the language. Thus we add
axiom
3.5. x^0 = S(0), x^S(y) = x^y•x.
This does not change PA_n, n ≥ 1, or what we said in i)-iii). To avoid confusion, we write EFA (exponential
function arithmetic) for PA0 using exponentiation.
THEOREM 1.2. Theorem 1.1 is provable in PA1 but not in EFA
and not in HA. In fact, EFA proves that Theorem 1.1 is
equivalent to PA1.
CONJECTURE. All celebrated number theory to date is
provable in EFA, even constructively.
It is at least believed that the existing proofs of all
celebrated number theory need only minor modification to
conform to PA.
But no proof of Faltings’ theorem (Mordell’s conjecture)
within HA is known. That would yield effective bounds. No
proof of Roth’s theorem on rational approximations in HA is
known either, because that would yield effective bounds. No
proof of Roth’s theorem in EFA is known since there is a
crucial induction whose induction statement has an
unbounded quantifier.
2. Termination of lexicographic descent in the natural
numbers.
For k ≥ 1 and x,y ∈ N^k, write x <lex y iff at the first coordinate at which x,y differ, x is less than y.
THEOREM 2.1. Every sequence from N^k that is decreasing in
the lex ordering terminates.
We discuss what is needed to prove Theorem 2.1 by adding to
PA and HA a symbol for an unknown function F:N → N. Call
these PA(F), HA(F), PA_n(F), HA_n(F), EFA(F). These are nice,
robust systems.
Take Theorem 2.1 as “if F is an infinite sequence from N^k
that is decreasing in the lex ordering until it reaches
zero, then it reaches zero”.
THEOREM 2.2. Theorem 2.1 for k = 1 is provable in EFA(F).
For any fixed k ≥ 2, it is provable in HA1(F) but not in
EFA(F).
But what about the full Theorem 2.1 for all k?
THEOREM 2.3. Theorem 2.1 is provable in HA2(F) but not in
PA1(F).
A sharper theorem of this character is as follows. For x,y
∈ N^k, write x ≤c y iff for all i, x_i ≤ y_i.
THEOREM 2.4. Every infinite sequence from N^k has a finite
initial segment such that every term is ≥c some term in that
finite initial segment.
It is trickier to be economical about the proof of Theorem
2.4.
The following can be obtained from a paper of Steve
Simpson.
THEOREM 2.5. Theorem 2.4 for each fixed k is provable in
PA1(F). Theorem 2.4 is provable in PA2(F) but not in PA1(F).
Even for k = 1, Theorem 2.4 is not provable in HA(F). The
equivalence of Theorems 2.1 and 2.4 can be proved in PA1(F).
The exotic nature of Theorems 2.1 and 2.4 becomes clearer as
we move to finite forms and associated finite information.
THEOREM 2.6. For all k ≥ 1 there is a longest sequence x1
>lex x2 >lex ... >lex xn from N^k such that each max(x_i) ≤ i.
To prove this, fix k ≥ 1. The space of all finite
sequences from N^k with these two properties forms a
finitely branching tree. There are no infinite
paths by Theorem 2.1. Hence the tree is finite, and we are
done.
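For very small k the longest length in Theorem 2.6 can be found by exhaustive search. The following Python sketch is an illustration only; the function name is made up here, and the search is feasible only for k = 1 and k = 2, since the lengths grow extremely fast. It uses the fact that Python compares tuples lexicographically, which matches the lex ordering above.

```python
from itertools import product


def longest_lex_descent(k):
    """Length of a longest x1 >lex x2 >lex ... from N^k with max(x_i) <= i."""

    def extend(prev, i):
        # Candidates for x_i: k-tuples with entries in 0..i lying lex-below prev.
        best = 0
        for x in product(range(i + 1), repeat=k):
            if prev is None or x < prev:
                best = max(best, 1 + extend(x, i + 1))
        return best

    # The recursion terminates because lex descent terminates (Theorem 2.1).
    return extend(None, 1)


print(longest_lex_descent(1))
print(longest_lex_descent(2))
```

For k = 1 the only longest sequence is 1, 0. For k = 2 a longest sequence is (1,1), (1,0), (0,3), (0,2), (0,1), (0,0).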
Theorem 2.6 is a ∀∃ statement in arithmetic.
THEOREM 2.7. Theorem 2.6 is provable in HA2 but not in PA1.
We now discuss the associated numerical information in
Theorem 2.6.
We introduce (a version of) the Ackermann hierarchy of
functions. We define strictly increasing functions A_k:Z+ →
Z+, where k ≥ 1, as follows. A1(n) = 2n, A_{k+1}(n) = A_k A_k ...
A_k(1), where there are n A_k’s.
A2(n) = 2^n, A3(n) is an exponential tower of 2’s of height
n. The Ackermann function is A(n) = A_n(n).
A3(1) = 2. A3(2) = 4. A3(3) = 16. A3(4) = 2^16 = 65,536. A3(5)
= 2^{65,536}. A4(1) = 2. A4(2) = A3A4(1) = A3(2) = 4. A4(3) =
A3A4(2) = A3(4) = 2^16 = 65,536. A4(4) = A3A4(3) = A3(65,536),
which is an exponential tower of 2’s of height 65,536.
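The definition of the hierarchy can be transcribed directly into code. The sketch below is an illustration, with a made-up function name; it reproduces the small values listed above and should not be called on anything like A4(4), which is far too large to compute.

```python
def ackermann_level(k, n):
    """A_k(n) with A_1(n) = 2n and A_{k+1}(n) = A_k applied n times to 1."""
    if k == 1:
        return 2 * n
    value = 1
    for _ in range(n):
        value = ackermann_level(k - 1, value)
    return value


print([ackermann_level(2, n) for n in range(1, 6)])  # powers of 2
print([ackermann_level(3, n) for n in range(1, 5)])  # towers of 2's
print([ackermann_level(4, n) for n in range(1, 4)])
```

The identity A_{k+1}(n+1) = A_k(A_{k+1}(n)) used in the hand computations above is visible in the loop: each pass applies A_k once more to the running value.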
A4(4) is ridiculously large, but not incomprehensibly so.
However, if we go much further, then a profound level of
incomprehensibility emerges. These higher levels of
largeness blur. We think of A5(5) as incomprehensibly large.
We use it as a kind of benchmark.
Recall:
THEOREM 2.6. For all k ≥ 1 there is a longest sequence x1
>lex x2 >lex ... >lex xn from N^k such that each max(x_i) ≤ i.
Let h(k) = longest length.
THEOREM 2.8. In Theorem 2.6, h(k) is roughly the Ackermann
function. I.e., there is a small c such that for all k,
A(k-c) ≤ h(k) ≤ A(k+c).
3. Hilbert basis theorem.
One common way of formulating the Hilbert basis theorem is
this. We say that a ring R is Noetherian iff for all x1,x2,
..., there exists n such that all x’s are ideal generated
by x1,...,xn.
THEOREM 3.1. If R is a (countable) Noetherian ring then so
is R[x].
However, this is too innocent a formulation in order to
illustrate our theme, with exotic finite forms and
associated numerical information.
For our purposes, we consider the following formulation of
the Hilbert basis theorem.
THEOREM 3.2. Let P1,P2,... be an infinite sequence of
polynomials from the polynomial ring in k variables over a
(countable) field. There exists n such that all P’s are
ideal generated by P1,P2,...,Pn.
Let us review a proof of this form of HBT.
Order the monomials in k variables lexicographically. First
let Q1,Q2, ... enumerate all polynomials ideal generated by
the P’s. For each i, look at the leading monomial M_i of Q_i.
Apply Theorem 2.4 to the sequence M1,M2,..., obtaining n
such that all M’s are multiples of at least one of M1,...,
Mn. This gives us m such that the leading coefficient of
every Q is a multiple of the leading coefficient of at
least one of Q1,...,Qm. Then every Q is ideal generated by
Q1,...,Qm, using iterated division with remainder.
As indicated in the first lecture, we can derive Theorem
2.4 from Theorem 3.2 by considering sequences of monomials.
THEOREM 3.3. Theorems 2.4 and 3.2 are provably equivalent
in EFA(F).
We now consider the following immediate consequence of
Theorem 3.2. It should be viewed as a form of the standard
application of HBT to decreasing chains of algebraic sets.
THEOREM 3.4. Let P1,P2,... be an infinite sequence of
polynomials from the polynomial ring in k variables over a
(countable) field. There exists n such that every
simultaneous zero of P1,...,Pn is a zero of all P’s.
It is harder to show that Theorem 3.4 implies Theorem 2.4.
The construction of the P’s is trickier. Of course, this
implication cannot be done at all if we assume that the P’s
represent irreducible algebraic sets, by Krull’s theorem
for chains of prime ideals. So we are immersed in reducible
algebraic sets.
Fix the dimension k and an infinite field F. Let T be a
finite tree with at least one vertex, where every path
excluding the root is of length ≤ k, and where the vertices
other than the root are labeled with different elements of
the field F. We call these the k-good trees.
The algebraic meaning of a vertex at the ith level above
the root with label c is the equation x_i = c. The algebraic
meaning of a path is the conjunction of the algebraic
meaning of the vertices along that path other than the
root. The algebraic meaning of the tree T is the
disjunction of the algebraic meanings of the paths of T.
Take [T] to be this union of intersections. Rewrite as
intersection of unions. Each union is the zero set of a
polynomial obtained by multiplying the relevant x_i - c. [T]
becomes an algebraic subset of F^k, given by polynomials of
degree ≤ #T = the number of terminal vertices of T.
We need to have a sufficient criterion for [T] to properly
contain [T’].
LEMMA 3.5. Let T,T’ be k-good trees. Suppose T’ is obtained
from T by adding one or more children to a terminal vertex.
Or suppose T’ is obtained from T by deleting one of the
children of a vertex that has at least two children (and of
course all vertices above the one deleted). Then [T]
properly contains [T’].
Now all we have to do is to deal with the combinatorics of
these two tree operations.
There is a nice way of assigning ordinals < ω^k to k-good
trees. For each terminal node x of height 1 ≤ i ≤ k, assign
the ordinal ω^{i-1}. Now take the sum of the ordinals assigned
to the terminal nodes, in decreasing (≥) order. This is
ord(T).
The two tree operations lower the ordinal. Also, ord(T) is onto
the ordinals < ω^k. Even more is true and useful. Given α <
ord(T), there exists T’ obtained from T by successive
applications of the two tree operations in some
combination, such that ord(T’) = α.
Recall that the lexicographic ordering used in Theorems 2.1
and 2.4 is just ω^k.
We have just provided a way of assigning an algebraic set
to ordinals < ω^k so that if the algebraic set decreases then
the ordinal lowers.
THEOREM 3.6. Theorems 3.4 and 2.4 are provably equivalent in
EFA(F).
We now come to finite forms. In F[x1,...,xk], the degree of
an ideal means the least d such that the ideal is generated
by its elements of degree ≤ d.
THEOREM 3.7. Let k ≥ 1 and F be a field. There is a bound on
the length of chains of ideals I_1 ⊊ ... ⊊ I_n ⊆
F[x1,...,xk], where each I_j is of degree ≤ j. Furthermore,
the bound can be taken to depend on k only, and not on F.
Seidenberg proved Theorem 3.7, and more generally with any
bounds on the degrees of the ideals. He called it the
constructive form of Hilbert’s basis theorem. In
particular, he considered:
THEOREM 3.8. Let k,c ≥ 1 and F be a field. There is a bound
on the length of chains of ideals I_1 ⊊ ... ⊊ I_n ⊆
F[x1,...,xk], where each I_j is of degree ≤ j+c. Furthermore,
the bound can be taken to depend on k,c only, and not on F.
Seidenberg established a primitive recursive bound only for
k ≤ 2, and doubted whether one exists for k = 3.
In fact, there is a nice proof of Theorems 3.7, 3.8, using
the compactness theorem for predicate calculus. The proof
of Theorem 3.8 for each k ≥ 1 is naturally formalized in a
system called RCA0 + WKL0, for which there is a metatheorem
to the effect that every ∀∃ sentence provable there has a
primitive recursive realization. PA1 is the major source of
power of RCA0 + WKL0. Thus we can put Seidenberg’s doubt to
rest using machinery from mathematical logic.
The lower bounds for Theorems 3.7, 3.8 involving the
Ackermann hierarchy are clear since the derivations of the
finite forms of lex descent discussed earlier are easy.
Seidenberg did not consider Theorems 3.7, 3.8 for
descending chains of algebraic sets:
THEOREM 3.9. Let k ≥ 1 and F be a field. There is a bound on
the length of chains of algebraic sets A_1 ⊋ ... ⊋ A_n ⊆
F[x1,...,xk], where each A_i is of presentation degree ≤ i.
Furthermore, the bound can be taken to depend on k only,
and not on F.
We can show that the lower bounds in Theorem 3.9 are
(roughly) at least those for our finite form of lex descent
using the above way of assigning algebraic sets to
ordinals.
4. Degrees of algebraic approximations to sets.
The logically exotic nature of HBT and its finite forms
generally does not spill over to its applications to
commutative algebra/algebraic geometry. Often HBT is the
simplest way to obtain a result, but harder work will avoid
HBT and lead to decent estimates.
For example, the decomposition of algebraic sets into
irreducible components has a treatment with decent bounds,
avoiding HBT. My impression is that there are some
important situations where removal of HBT has not yet been
achieved.
One should look further into the possible role of exotic
functions in algebraic geometry and commutative algebra.
The idea is to recognize that sequences of algebraic sets
or ideals, even if of finite length, are too arbitrary.
Instead, one wants any such sequence to come out of more
ordinary data; e.g., as a construction that is used in a
proof.
Along these lines, the unique decomposition of algebraic
sets into irreducible components is just perfect, except
that HBT is completely avoided in favor of explicit
arguments, and the associated numerical information is the
opposite of exotic.
Digression on Hilbert functions.
The theory of Hilbert functions and Hilbert polynomials has
a certain exotic combinatorial underpinning. We have
uncovered a simple fact about Hilbert functions which
relates to the Ackermann function.
A polynomial in F[x1,...,xk] is called homogeneous if and
only if all of its monomials are of the same degree.
Let M be an ideal in F[x1,...,xk]. We define M_n to be the
set of all homogeneous polynomials in M of degree n.
The Hilbert function of F[x1,...,xk]/M is the function H:N →
N defined by
H(n) = the dimension of F[x1,...,xk]/M_n as a vector space
over F, which is the same as
P(k,n) minus the dimension of M_n as a vector space over F,
where P(k,n) is the number of k-tuples of nonnegative
integers whose sum is n. It is well known that P(k,n) is a
polynomial of degree k-1 with rational coefficients.
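The count P(k,n) has the standard closed form C(n+k-1, k-1), a binomial coefficient, which for fixed k is a polynomial in n of degree k-1. The following Python sketch, with made-up function names, compares that closed form against a direct enumeration of k-tuples for small values; it is an illustration and not part of the lecture.

```python
from itertools import product
from math import comb


def P(k, n):
    """Number of k-tuples of nonnegative integers with sum n (k >= 1)."""
    return comb(n + k - 1, k - 1)


def P_brute(k, n):
    """The same count by direct enumeration; only sensible for small k, n."""
    return sum(1 for t in product(range(n + 1), repeat=k) if sum(t) == n)


for k in range(1, 5):
    print(k, [P(k, n) for n in range(6)])
```

P(k,n) is also the number of monomials of degree n in k variables, which is why it appears as the dimension of the space of homogeneous polynomials of degree n.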
THEOREM 4.1. (Hilbert). H(n) agrees with a unique
polynomial of degree ≤ k-1 with rational coefficients for
all sufficiently large n. This unique polynomial is called
the Hilbert polynomial of F[x1,...,xk]/M.
Let us call the k dimensional Hilbert functions the Hilbert
functions of the F[x1,...,xk] /M, where M is an ideal in
F[x1,...,xk]. These functions depend only on k and not on
the field F.
Various characterizations of these functions have been
established, mainly due to Macaulay.
Let us define the k dimensional Hilbert function agreement
sets as those sets of the form {n: f(n) = g(n)}, where f,g
are k dimensional Hilbert functions.
THEOREM 4.2. For all k ≥ 1 there is a finite subset of N
which is not a k dimensional Hilbert function agreement
set. The least size of such a set, as a function of k,
grows like the Ackermann function.
End of Digression on Hilbert functions.
We now propose a structural theorem about algebraic S ⊆
F[x1,...,xk].
The nth algebraic approximation S[n] to S is the least
algebraic superset of S of presentation degree ≤ n. This
always exists, even if S is not algebraic.
If ≤ n is replaced by n then we get the same definition. For
algebraic S, we have S[0] ⊇ S[1] ⊇ ... ⊇ S[d] = S, where
deg(S) = d.
We call S rich iff the S[i], 0 ≤ i ≤ d, are distinct. This
is the same as: for all 0 ≤ i ≤ d, deg(S[i]) = i.
Note that this series fits squarely into Theorem 3.9 about
bounds on lengths of chains of algebraic sets. So
THEOREM 4.3. For all k ≥ 1 and field F, there is a bound to
the degrees of the rich algebraic subsets of F[x1,...,xk].
The bound can be taken to depend on k and not on F.
From what we saw earlier, upper bounds are given by the
Ackermann function.
The lower bounds are also approximately the Ackermann
function, with a more involved construction. The lower
bounds are exotic even if only finite rich sets in finite
fields are considered.
LECTURE 3
COMPARISON OF BLOCKS, TREES, GRAPHS, COUNTABLE POINTSETS
Harvey M. Friedman
The Ohio State University
http://www.math.ohiostate.edu/~friedman/ September 19, 2002
In this third lecture, the levels of abstraction in the
proof methods are more exotic than in the second lecture.
In the fourth lecture, they are far more exotic still.
5. Comparison of blocks within finite sequences of natural
numbers.
Let us start with the following simple problem, which Paul
Sally uses in his gifted high school program.
THEOREM 5.1. There is a longest finite sequence x1,x2,...,xn
from {1,2} in which no consecutive block x_i,...,x_{2i} is a
subsequence of any later consecutive block x_j,...,x_{2j}.
Let us call this property of finite sequences property *.
One can easily show that the longest length of a sequence
from {1,2} with property * is 11, and that the only
examples are 12221111111 and 21112222222. One of the gifted
high school students was able to show this.
THEOREM 5.2. There is a longest finite sequence from
{1,2,3} with property *.
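The claim about sequences from {1,2} is small enough to check by exhaustive search. The following Python sketch is an illustration, not part of the lecture, and its function names are made up here. It relies on the observation that property * passes to initial segments, so a depth-first search over one-letter extensions visits every sequence with the property. It should not be run on three letters, where the longest length is astronomically large.

```python
def is_subsequence(a, b):
    """True iff a can be obtained from b by deleting terms."""
    remaining = iter(b)
    return all(term in remaining for term in a)


def has_property_star(seq):
    """No block x_i..x_{2i} is a subsequence of a later block x_j..x_{2j}."""
    n = len(seq)
    # With 1-based indexing, the block x_i..x_{2i} is the slice seq[i-1:2i].
    blocks = [seq[i - 1:2 * i] for i in range(1, n // 2 + 1)]
    return not any(
        is_subsequence(blocks[i], blocks[j])
        for i in range(len(blocks))
        for j in range(i + 1, len(blocks))
    )


def longest_star_sequences(letters):
    """All longest sequences over `letters` with property *, by depth-first search."""
    best = [()]
    stack = [()]
    while stack:
        seq = stack.pop()
        for letter in letters:
            longer = seq + (letter,)
            if has_property_star(longer):
                stack.append(longer)
                if len(longer) > len(best[0]):
                    best = [longer]
                elif len(longer) == len(best[0]):
                    best.append(longer)
    return best


witnesses = longest_star_sequences((1, 2))
print(len(witnesses[0]))
print(sorted("".join(map(str, w)) for w in witnesses))
```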
This is no longer a gifted high school theorem. The
simplest known proof of this is truly exotic compared with
the statement. With some considerable trouble, it can be
replaced with a considerably less exotic proof, but still
rather exotic compared to the statement.
We sketch this simplest known proof, which uses the
Nash-Williams minimal bad sequence argument in this context.
First we shift context to infinite sequences of finite
sequences.
THEOREM 5.3. Let k ≥ 1 and x1,x2,... be an infinite sequence
of finite sequences from {1,...,k}. There exists i < j such
that x_i is a subsequence of x_j.
Suppose this is false. Call an infinite sequence bad if it
is a counterexample. Let x1 be of least length so that it
starts an infinite bad sequence. Let x2 be of least length
so that x1,x2 starts a bad sequence. Continue in this way,
getting a “minimal” bad sequence x1,x2,... . There is an
infinite subsequence x_{i1},x_{i2},..., all of which start with the
same number. Note that x_{i1}’,x_{i2}’,... is bad,
where the primes mean “chop off the first term” (no x can
be empty). Hence x1,...,x_{i1-1},x_{i1}’,x_{i2}’,... is also bad. But
x_{i1}’ is shorter than x_{i1}, contradicting the choice of x_{i1}.
QED
We now prove that there is a longest finite sequence from
{1,2,3} with property *.
Suppose there are arbitrarily long such. Build the finitely
branching tree of such. Let x1,x2,... be an infinite branch,
which therefore has property *. Consider the infinite
sequence
x1,x2
x2,x3,x4
x3,x4,x5,x6
...
By Theorem 5.3, one is a subsequence of a later one. This
contradicts property *.
Obviously we did not use that there are only three letters.
THEOREM 5.4. (Block subsequence theorem). For all k ≥ 1,
there is a longest finite sequence x1,...,xn in k letters
in which no consecutive block x_i,...,x_{2i} is a subsequence of
a later consecutive block x_j,...,x_{2j}.
In order to tame the proof of the block subsequence
theorem, we need to tame Theorem 5.3. I.e., we need to
replace the minimal bad sequence argument with something
more concrete.
The sharpest way to do this is to effectively assign (names
for) ordinals < ω^{ω^k} to finite bad sequences from
{1,...,k+1}, where if one is extended to another, then the
corresponding ordinal decreases (due to Simpson).
THEOREM 5.5. The block subsequence theorem is provable in
PA3.
This is because PA3 suitably handles the ordinal ω^{ω^ω}. For
each fixed k, PA2 suitably handles the ordinal ω^{ω^k}.
We have shown how to reverse this process in order to show
how ω^{ω^ω} can be suitably handled in EFA + the block
subsequence theorem.
Using well known information from the proof theory of
fragments of PA, we obtain
THEOREM 5.6. The block subsequence theorem is not provable
in PA2.
Back to the block subsequence theorem with 3 letters. To
obtain an exotic lower bound, we make some rather special
arguments for this context.
We begin with a somewhat long finite bad sequence x1,...,x_{2p}
from {1,2,3} such that for all 1 ≤ i ≤ p, the block
x_i,...,x_{2i} has a 1, and the last p terms are 133...3. We
were able to create one of these so called “special”
sequences x1,...,x_{216} by hand.
The long bad sequence from {1,2,3} has the form x1,...,
x_{216},y, where y is a long sequence of 2’s and 3’s.
None of the first relevant 108 blocks will be subsequences
of later relevant blocks, since they have a 1 in them.
This applies to many of the next 108 relevant blocks since
they have so many 3’s in them, and we won’t ever have so
many consecutive 3’s, and the first term of y is 2.
We use 23252 as markers that occur at successively
incrementally greater spacing apart, where what goes in
between these markers is of critical importance.
These “in between” blocks of 2’s, 3’s have exactly 8 3’s.
A series of technical lemmas shows that the resulting
sequence has property * if no in between block is a
subsequence of any later in between block.
So we just have to create a very long sequence of
incrementally length growing sequences of finite 2’s, 3’s
with exactly 8 3’s, no one of which is a subsequence of a
later one.
This is a much easier combinatorial challenge, and one can
create A7(184) of them in this context.
THEOREM 5.7. The longest length of a bad sequence from
{1,2,3} is > A7(184).
“7” was derived from the length 216 of the special
sequence, which acts as a seed.
Randall Dougherty wrote some software that looks for
special sequences. He was able to find one of length
187,196. Plugging this into our machinery, we obtain:
THEOREM 5.8. The longest length of a bad sequence from
{1,2,3} is > A_{7198}(158,386).
For those not here yesterday, the Ackermann hierarchy of
functions is defined by
A1(n) = 2n, A_{k+1}(n) = A_k A_k ... A_k(1), where there are n A_k’s.
Define A(n) = A_n(n). A(4) is an exponential stack of 2’s of
height 65,536. A(5) is incomprehensibly large.
As for an upper bound, we haven’t worked this out, but are
confident that A(A(5)) is a crude upper bound.
If we consider 4 letters, then the numbers grow
considerably more exotic. The longest length is greater
than AA...A(1), where there are A(5) A’s.
Let J(k) be the longest length of a sequence with property
* in k letters.
Then J grows faster than all multiply recursive functions.
By comparison, the Ackermann function is a puny little
doubly recursive function.
6. Comparison in sequences of finite trees, and within
large finite trees.
A poset is a pair (D,≤) where D is a nonempty set and ≤ is a
reflexive transitive relation obeying
(x ≤ y ∧ y ≤ x) → x = y.
A tree is a poset T = (V,≤) where there is a minimum element
called the root, and where for each x ∈ T, {y: y ≤ x} is
linearly ordered by ≤.
The elements of V = V(T) are called the vertices of T. A
tree is said to be finite if it has finitely many vertices.
For finite trees, we define the inf operation on V, where x
inf y is the greatest z such that z ≤ x ∧ z ≤ y.
Let T1 and T2 be finite trees. h is an inf preserving embedding from T1 into T2 iff
i) h:V(T1) → V(T2) is one-one;
ii) for all x,y ∈ V(T1), h(x inf y) = h(x) inf h(y).
Here is the most rudimentary form of J.B. Kruskal’s
theorem.
THEOREM 6.1. In any infinite sequence of finite trees, one
tree is inf preserving embeddable into a later tree.
The Nash-Williams infinite bad sequence argument was
invented to give the simplest proof known of Theorem 6.1.
Before we prove this, there is a Lemma needed due to Graham
Higman. The simplest proof of this Lemma also uses the
minimal bad sequence argument, although it can be avoided
in favor of something less exotic. However, the minimal bad
sequence argument for the main body of the proof of
Kruskal’s theorem is largely unavoidable (it can be tamed a
little bit).
A quasi ordering is a reflexive transitive relation. A wqo
(well quasi ordering) is a quasi ordering ≤ such that in any
infinite sequence from its field, one term is ≤ a later
term.
Theorem 6.1 can be restated: The finite trees ordered by
one-one inf preserving embeddability form a wqo.
Let Q be a wqo. Write FIN(Q) for the quasi ordering of all
finite subsets from Q, where A ≤* B iff there exists one-one
h:A → B such that for all x ∈ A, x ≤ h(x).
THEOREM 6.2. (Higman’s Theorem). If Q is a wqo then FIN(Q)
is a wqo.
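The ordering on FIN(Q) can be made concrete for small finite sets. The sketch below is an illustration with made-up names; it takes Q to be the natural numbers under the usual ordering by default (an assumption made only for the example) and decides A ≤* B by brute force over all one-one maps from A into B, which is adequate only for tiny sets.

```python
from itertools import permutations


def fin_leq(A, B, leq=lambda x, y: x <= y):
    """True iff some one-one h: A -> B satisfies leq(x, h(x)) for all x in A."""
    A, B = list(A), list(B)
    if len(A) > len(B):
        return False
    # Each length-|A| arrangement of distinct elements of B is a one-one image of A.
    return any(
        all(leq(a, b) for a, b in zip(A, image))
        for image in permutations(B, len(A))
    )


print(fin_leq({1, 2}, {2, 3}))
print(fin_leq({1, 5}, {2, 3}))
```

A bad sequence in FIN(Q), in the sense used in the proof of Higman's theorem, is one in which fin_leq fails for every earlier and later pair of terms.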
We first need a general fact about wqo’s, which is an easy
application of Ramsey’s theorem for pairs.
LEMMA. If (Q,≤) is a wqo then every infinite sequence from Q
has an infinite increasing (≤) subsequence.
To prove Higman’s theorem, let Q be a wqo, and suppose
FIN(Q) is not a wqo. Let x1 ∈ FIN(Q) be of least cardinality
that starts an infinite bad sequence in FIN(Q). Let x2 be of
least cardinality such that x1,x2 starts an infinite bad
sequence in FIN(Q). Continue in this way, getting a minimal
bad x1,x2,... in FIN(Q).
Look at the first terms of the x’s. By the general fact,
there is an infinite subsequence x_{i1},x_{i2},... whose first
terms are increasing in Q. Hence x_{i1}’,x_{i2}’,... is bad, where
the primes indicate throwing away an element from each
(none of the x’s can be empty). Therefore x1,...,x_{i1-1},x_{i1}’,x_{i2}’,... is also bad in FIN(Q). Since x_{i1}’ is of lesser
cardinality than x_{i1}, we have a contradiction.
Now we prove KT.
THEOREM 6.1. (Kruskal’s theorem). In any infinite sequence
of finite trees, one tree is inf preserving embeddable into
a later tree.
Suppose there is an infinite bad sequence T1,T2, ... of
finite trees. Let T1 have the least number of vertices such
that it starts an infinite bad sequence of finite trees.
Let T2 have the least number of vertices such that T1,T2
starts an infinite bad sequence of finite trees. Continue
in this way to form a minimal bad sequence T1,T2,... of
finite trees.
If we chop off the root of a finite tree, we get a finite
set of trees. Let V be the union of all trees obtained in
this way from the T’s. We claim that this set V of trees is
a wqo.
To see this, suppose there is an infinite bad sequence in
V. Let this be S1,S2,..., where the S’s come from
successively later T’s, and where S1 comes from T_i. Then
T1,...,T_{i-1},S1,S2,... is an infinite bad sequence of trees,
with S1 smaller than T_i. This is a contradiction.
So V is a wqo. By Higman’s Theorem, FIN(V) is a wqo. Now go
back to T1,T2,... and chop off the roots, getting a sequence
T1’,T2’,... from FIN(V). Take i < j with T_i’ ≤* T_j’. Then T_i is inf
preserving embeddable into T_j, which is the desired
contradiction. QED
The exotic nature of the proof of Kruskal’s theorem is much
stronger than the necessary uses of induction in the second
lecture. We will discuss how we classify the power of such proof
methods later.
But first we discuss the finite forms of Kruskal’s theorem.
Our original ones correspond to the ones we gave for
lexicographic descent, and the first way we did this with
the Hilbert basis theorem. We look at finite sequences with
bounded growth rate.
THEOREM 6.3. Let k ≥ 1 and T1,...,Tn be a sufficiently long
finite sequence of finite trees, where each T_i has at most
k+i vertices. There exist i < j such that T_i is inf
preserving embeddable into T_j.
To prove this, let k ≥ 1 and assume false. Form the finitely
branching tree of finite bad sequences subject to the
inequality. (Use a concrete representation of finite trees
so that isomorphic finite trees are identical). This tree
is infinite by hypothesis. Hence it has an infinite path,
which forms an infinite bad sequence of finite trees. This
contradicts Kruskal’s theorem. QED
Note that Theorem 6.3 is a purely finite statement. To
greatly understate the truth, Theorem 6.3 is unprovable in
Peano Arithmetic.
Theorem 6.3 is not the first example of a serious finite
theorem that is unprovable in PA. The first came out of
Ramsey theory, which we discuss later. However, Theorem 6.3
is incomparably more exotic than those first examples, in
the sense of being unprovable in far stronger systems and
also having far higher associated finite information.
We found a structural theorem about a single large finite
tree that is equally exotic.
A full finite tree is a finite tree whose terminal vertices
have the same height, where all nonterminal vertices have
the same valence.
An r-labeled tree is a tree together with a mapping of the
vertices into {1,...,r}.
THEOREM 6.4. Let k,r ≥ 1 and T be a sufficiently tall full
finite r-labeled tree of valence k. Then some truncation of
T is inf, label, and terminal preserving embeddable into a
higher truncation of T.
We have shown that Theorem 6.4 is unprovable in PA, and
also much stronger systems, even for k = 2.
Theorems 6.3, 6.4 also have exotic associated finite
information.
7. Graph minors in sequences of finite graphs.
Finite graphs are pairs (V,E), where for all x ∈ E, we
assign a set of vertices of cardinality 1 or 2.
G is minor included in H iff G can be obtained from H by
successively deleting a single edge, contracting a single
edge, or removing an isolated vertex.
Minor inclusion is normally taken up to isomorphism.
Here is the graph minor theorem of Robertson/Seymour.
THEOREM 7.1. In any infinite sequence of finite graphs, one
graph is minor included in a later graph.
At a critical place, the proof of Theorem 7.1 uses an
iterated form of the infinite bad sequence argument. We
used a single infinite bad sequence argument for Kruskal’s
tree theorem (the second one used for Higman’s Theorem can
be avoided).
Such iterations are known to be more powerful as the
iteration length increases.
Before the graph minor theorem was proved, we proved
an extension of Kruskal’s theorem called EKT.
We proved EKT using a finitely iterated minimal bad
sequence argument. We showed that these iterations are
unavoidable.
We asked Robertson/Seymour to explicitly derive our EKT
from their GMT. Robertson/Seymour succeeded in doing this.
Therefore the GMT is at least as exotic as the EKT.
Let r ≥ 1 and T1 and T2 be finite r-labeled trees with
labeling functions l1 and l2. We say that h is a gap embedding from T1 to T2 iff
i) h is a one-one map from the vertices of T1 to the
vertices of T2;
ii) for all vertices h(a) < c < h(b) of T2, l2(c) ≥ l2(h(b)).
THEOREM 7.2. Extended Kruskal's Theorem. Let r ≥ 1 and
T1,T2,... be an infinite sequence of finite r-labeled trees.
There exist i < j and an inf and label preserving gap
embedding from T_i into T_j.
We gave finite forms for Theorem 7.2 in terms of sequences
of trees. We also gave a version for a single tall finite
tree, analogously as we did for Kruskal’s theorem. These
have the expected exotic properties.
Let |G| be the sum of the number of vertices and the number
of edges in the finite graph G.
THEOREM 7.3. Let k ≥ 1 and G1,...,Gn be a sufficiently long
finite sequence of finite graphs, where each |G_i| ≤ k+i.
There exist i < j such that G_i is minor included in G_j.
The EKT, GMT, and their finite forms are much more exotic
than KT and its finite forms.
8. Continuous comparison of countable sets of reals.
It is now time to present a hierarchy of formal systems
that are normally used to measure the level of intrinsic
exoticness.
Recall that in the second lecture, the logical issues were
the level of induction needed, and growth rates involving
the Ackermann hierarchy of functions. There it was
sufficient to use fragments of PA and PA(F) as yardsticks.
These tools are also entirely suitable for our discussion
of the block subsequence theorem, which we saw blows up
numerically with only 3 letters.
However, when it comes to KT, EKT, and GMT, as well as
their finite forms, we are way beyond systems like PA and
PA(F).
We will place all theorems in lectures 2,3 in terms of some
standard systems with quantifiers over natural numbers and
sets of natural numbers.
We begin with a particularly strong such system, Z2.
Here is the syntax of Z2.
i) variables over natural numbers (lower case);
ii) variables over sets of natural numbers (upper case);
iii) 0,S,+,•,= (in the natural numbers only);
iv) ∈ (between natural numbers and sets).
Number terms are built up from number variables and
0,S,+,•, in the obvious way.
The atomic formulas of Z2 are s = t, and t ∈ A, where s,t
are number terms and A is a set variable. Formulas are
built up from atomic formulas by ¬,∧,∨,→,↔ and
∀n,∀A,∃n,∃A.
The axioms of Z2 are as follows.
1. ¬S(x) = 0, S(x) = S(y) → x = y.
2. x+0 = x, x+S(y) = S(x+y).
3. x•0 = 0, x•S(y) = (x•y)+x.
4. (0 ∈ A ∧ (∀n)(n ∈ A → S(n) ∈ A)) → (∀n)(n ∈ A).
5. (∃A)(∀n)(n ∈ A ↔ φ), where φ is any formula in the
language in which A does not appear.
The theorems of lectures 2,3 are easily proved in weak
fragments of Z2, and so Z2 is gross overkill for our present
purposes. But just wait until the fourth lecture, where Z2
is grossly inadequate!
What we use here are carefully chosen fragments of Z2. Five
such fragments have emerged most frequently as benchmarks
for the logical analysis of a great number of theorems.
These are our formal systems RCA0, WKL0, ACA0, ATR0, Π¹₁-CA0.
They are all weak fragments of full Z2, listed here from
weakest to strongest.
The “naughts” have historical significance indicating the
explicit way induction is formulated.
1. RCA0. The set comprehension axiom is called “recursive
comprehension”. Details later.
2. WKL0. This is RCA0 together with “every infinite tree of
finite 0,1 sequences has an infinite path”.
3. ACA0. Same as Z2 except in the comprehension axiom, no
set quantifiers are allowed in φ.
4. ATR0. This is ACA0 together with “arithmetic transfinite
recursion”. This means that transfinite recursion, with
arithmetic recipes, can be performed along any well
ordering of numbers.
5. Π¹₁-CA0. Same as Z2 except in the comprehension axiom, at
most one set quantifier is allowed in φ.
RCA0 is the weakest. Call a formula existential numerical if
and only if it begins with zero or more existential number
quantifiers followed by only bounded numerical quantifiers
(i.e., (∀n < m·r), where < is defined in the usual way).
Here are the axioms of RCA0. There are many equivalent
versions.
1. Numerical axioms 1-3 of Z2.
2. (φ[n/0] ∧ (∀n)(φ → φ[n/S(n)])) → φ, where φ is
existential numerical.
3. (∀n)(φ ↔ ¬ψ) → (∃A)(∀n)(n ∈ A ↔ φ), where φ,ψ are
existential numerical and do not mention A.
In reverse mathematics, we analyze mathematical theorems in
the following way. First we formalize a proof in one of the
five systems RCA0, WKL0, ACA0, ATR0, Π¹₁-CA0, the weaker the
better.
E.g., suppose the theorem is proved in ACA0. Then we try to
show that the theorem is equivalent to ACA0 over RCA0 by
proving the axioms of ACA0 using RCA0 together with the theorem.
Failing that, use a host of other logically natural systems
to generate a match, but always use RCA0 as the “base
theory”.
And failing that, set up a new formal system for an exact
match.
I could easily spend 4 lectures on the story of reverse
mathematics. Instead see: Steve Simpson, Subsystems of
Second Order Arithmetic, Springer, 1999.
Here is a status table for the theorems of lectures 2,3,
including Ramsey’s theorem.
PROVABLE IN RCA0
The finite Ramsey theorem.
Minimization in norm of integral polynomials.
Lex descent for fixed dimension. Strengthening with ≤c for
fixed dimension. Finite forms for fixed dimension.
Hilbert basis theorem for fixed dimension. Various
consequences involving algebraic sets for fixed dimension.
Finite forms for fixed dimension.
PROVABLE IN ACA0
The infinite Ramsey theorem for fixed dimension. An exotic
finite form for fixed dimension.
Lex descent. Strengthening with ≤c. Finite forms.
Hilbert basis theorem. Various consequences involving
algebraic sets. Finite forms.
The block subsequence theorem. Finite sequences from any
{1,...,k} are wqo. Higman’s theorem.
PROVABLE IN ATR0
The infinite Ramsey theorem. An exotic finite form (Paris
and Harrington).
For any two sets of rationals, one can be continuously
embeddable into the other. Same with reals.
PROVABLE IN Π¹₁-CA0
Kruskal’s theorem (even with labels and orientation).
Finite forms.
Extended Kruskal’s theorem for any fixed set of finite
labels. Finite forms for any fixed set of finite labels.
The existence of minimal infinite bad sequences.
NOTE: The extended Kruskal theorem and the graph minor
theorem are a bit beyond even Π¹₁-CA0.
To illustrate the reversal idea, we give the following
relevant examples.
THEOREM 8.1. Any of the following are provably equivalent
to “ω^ω is well ordered” over RCA0. Lex descent.
Strengthening with ≤c. Hilbert basis theorem. Various
consequences involving algebraic sets.
THEOREM 8.2. Any of the following are provably equivalent
to “every level of the Ackermann hierarchy exists” over
RCA0. Finite forms of lex descent. Finite forms of
strengthening with ≤c. Finite forms of Hilbert basis
theorem. Finite forms of various consequences involving
algebraic sets.
THEOREM 8.3. The block subsequence theorem is equivalent to
“every multiply recursive function exists” over RCA0.
THEOREM 8.4. Higman’s theorem is equivalent to ACA0 over
RCA0.
THEOREM 8.5. “For any two sets of rationals, one can be
continuously embeddable into the other” is equivalent to
ATR0 over RCA0. The same is true if we replace "rationals"
with "reals".
THEOREM 8.6. Kruskal’s theorem is equivalent to a specific
large proof theoretic ordinal “is well ordered” over RCA0.
Same with extended Kruskal’s theorem and the graph minor
theorem for bounded tree width.
THEOREM 8.7. The existence of minimal infinite bad
sequences is equivalent to Π¹₁-CA0 over RCA0.
LECTURE 4
BOREL DIAGONALIZATION, BOREL SELECTION,
BOOLEAN RELATION THEORY
Harvey M. Friedman
The Ohio State University
http://www.math.ohiostate.edu/~friedman/
September 20, 2002
In this final lecture, the gap between the statement of the
theorems and what must be involved in the proofs is
particularly apparent.
In the first half, we work in the context of Borel
measurable sets and functions on the reals or other
complete separable metric spaces (Polish spaces). We will
generally stay within ZFC.
In the second half, we work in discrete mathematics, with
functions on and sets of natural numbers. After some
preliminary results, we will require something beyond the
accepted axioms for mathematics.
9. Borel Diagonalization.
Consider Cantor’s theorem.
THEOREM 9.1. In any infinite sequence of real numbers, some
real number is not a coordinate of the sequence.
To prove this, one defines a nested sequence of nondegenerate
closed intervals with rational endpoints, the n-th interval
missing the n-th term, shrinking to a point that lies off
of the sequence. By standard descriptive set theoretic
technology:
THEOREM 9.2. There is a Borel measurable function F:ℝ^∞ → ℝ
such that for all x ∈ ℝ^∞, F(x) is not a coordinate of x.
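The interval construction behind Theorems 9.1 and 9.2 is completely explicit on any finite initial segment of the sequence. The following Python sketch is illustrative only: the name avoiding_interval, the starting interval [0,1], and the choice of cutting into thirds are assumptions made here, not part of the lecture.

```python
from fractions import Fraction

def avoiding_interval(terms, lo=Fraction(0), hi=Fraction(1)):
    """Shrink [lo, hi] once per term so that the final closed interval
    misses every term seen so far (hypothetical helper)."""
    for t in terms:
        third = (hi - lo) / 3
        # t cannot lie in both outer thirds of [lo, hi];
        # keep an outer third that does not contain t.
        if lo <= t <= lo + third:
            lo = hi - third      # t is in the left third: keep the right third
        else:
            hi = lo + third      # otherwise the left third is safe
    return lo, hi

terms = [Fraction(1, 2), Fraction(1, 3), Fraction(0), Fraction(1)]
lo, hi = avoiding_interval(terms)
print(lo, hi)
```

Since the intervals are nested, a point in their intersection differs from every term of the sequence. The sketch reads the terms in order, so it depends on the sequence and not just on its range.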
We can ask that F(x) depend only on the (set of) coordinates
of x. This is impossible:
THEOREM 9.3. There is no Borel measurable function F:ℝ^∞ → ℝ
obeying rng(x) = rng(y) → F(x) = F(y), such that for all x
∈ ℝ^∞, F(x) is not a coordinate of x.
Or put positively,
THEOREM 9.4. Let F:ℝ^∞ → ℝ be Borel measurable, where for
all x,y in ℝ^∞, rng(x) = rng(y) → F(x) = F(y). There exists
x ∈ ℝ^∞ such that F(x) is a coordinate of x.
We sketch a proof of a sharp form of Theorem 9.4. There is
something exotic about it.
Let ℝ* be the reals with the discrete topology. Is the
discrete topology one of those worthless meaningless
things from the new new new new math? We shall see.
Granted, ℝ* is silly, but ℝ*^∞ is not. The basic open sets
in ℝ*^∞ are the Vx = {f ∈ ℝ*^∞: f extends x}, where x ∈ FS(ℝ)
= set of all finite sequences from ℝ. Obviously every open
(Borel) subset of ℝ^∞ is an open (Borel) subset of ℝ*^∞ but
not vice versa.
In any topological space, a set is called meager iff it is
contained in a countable union of nowhere dense sets;
comeager iff its complement is meager; Borel iff it is in
the least σ-algebra containing all open sets.
LEMMA 9.5. In any topological space, every Borel set
differs from an open set by a meager set.
Baire category for ℝ^∞:
LEMMA 9.6. ℝ^∞ is not meager. In fact, no Vx is meager.
0,1 laws for ℝ^∞:
LEMMA 9.7. Let A ⊆ ℝ^∞ be Borel and permutation invariant.
Then A is meager or comeager.
We say that F: ℝ^∞ → ℝ is Borel iff the inverse image of
every open subset of ℝ is a Borel subset of ℝ^∞.
LEMMA 9.8. Every permutation invariant Borel f:ℝ^∞ → ℝ is
constant on a comeager set.
THEOREM 9.9. Every permutation invariant Borel f:ℝ^∞ → ℝ
maps some argument to a coordinate of itself.
Obviously the use of the discrete topology to prove a
statement living in standard separable spaces is highly
unusual. Recall the standard well known 0,1 law:
Every permutation invariant Borel function f:ℝ^∞ → ℝ
is constant on a comeager set of full measure.
But this does not allow us to derive Theorem 9.9 because
For all c, {x ∈ ℝ^∞: c is a coordinate of x}
is meager and null.
There are metamathematical results to the effect that
Theorem 9.9 is a separable theorem that has no separable
proof. One formulation that is not wholly satisfactory is
that Theorem 9.9 cannot be proved in countable set theory.
But Theorem 9.9 cannot even be directly stated in countable
set theory since it mentions ℝ.
There are two ways of clarifying this. One is to indicate how
countable set theory can formulate separable statements
like Theorem 9.9, by treating Borel functions as recipes
for producing values at arguments, thereby avoiding the use
of any uncountable sets.
A second way is less standard but perhaps better. This is
to use class theory with countable sets. The complete
separable metric spaces are classes rather than sets. Borel
sets and Borel functions are treated as classes.
Under both approaches, one sees how separable mathematics
is easily formalized, and we can prove that Theorem 9.9 is
not provable in the corresponding formal systems. The class
theory with countable sets includes the set theory with
countable sets.
The usual axioms of ZFC:
1. Extensionality.
2. Pairing.
3. Union.
4. Separation.
5. Infinity.
6. Foundation.
7. Choice.
8. Power set.
9. Replacement.
Countable set theory is obtained by removing power set,
written ZFC\P. It cannot prove the existence of uncountable
sets.
The class theoretic approach is based on NBG = von Neumann,
Bernays, Gödel, which is the standard class theory
associated with ZFC. For our purposes, we use NBG\P + AxC.
If we add the axiom “S(N) exists” to either of these
systems, we can prove Theorem 9.9.
Our original method for proving such Borel theorems uses
the forcing technique of Paul J. Cohen, which was invented
to show the independence of the continuum hypothesis from
ZFC, and of the axiom of choice from ZF.
Sometimes one can straightforwardly eliminate all razzle
dazzle in favor of simple Baire category arguments, such as
for Theorem 9.9. However, most often the elimination is
awkward and unrewarding.
There are a number of further Borel diagonalization results
at this level of logical power.
THEOREM 9.10. Let f:ℝ^∞ → ℝ^∞ be a Borel function such that
rng(x) = rng(y) → rng(f(x)) = rng(f(y)). There exists x
such that rng(f(x)) ⊆ rng(x).
For x,y ∈ ℝ^∞, define x ~ y iff y is a permutation of x.
THEOREM 9.11. Let f: ℝ^∞ → ℝ^∞ be such that x ~ y → f(x) ~
f(y). There exists x ∈ ℝ^∞ such that f(x) is a subsequence
of x.
Let K = 2^N be the usual Cantor space. Let s:K → K be given by
s(x)(n) = x(n+1). f:K → K is called shift invariant iff
f(x) = f(sx) for all x.
THEOREM 9.12. Let f:K → K be a shift invariant Borel
function. There exists x ∈ K such that f(x) = x^(2).
Here x^(2) = (x1,x4,x9,x16,…).
Let T be the circle group. f:T → T is double invariant if
and only if f(2x) = f(x) for all x.
THEOREM 9.13. There is a Borel f:T → T which agrees with
every double invariant g:T → T somewhere.
There are some additional Borel statements of this same
level of logical power.
Let GRP(N) be the groups with domain N.
THEOREM 9.14. Let f:GRP(N) → GRP(N) be an isomorphically
invariant Borel function. There exists G such that f(G) is
embeddable in G.
We now move to a much higher level: Let E ⊆ ℝ×ℝ be a Borel
equivalence relation. E induces an equivalence relation on
ℝ^∞, coordinatewise.
THEOREM 9.15. Let E be a Borel equivalence relation on ℝ
and f:ℝ^∞ → ℝ be Borel, where x E y → f(x) E f(y). There
exists x ∈ ℝ^∞ such that f(x) is E-equivalent to a coordinate
of x.
Let GRA(N) be the graphs with vertex set N (undirected, no
multiple edges). Subgraphs are obtained by deleting edges
and vertices. Induced subgraphs are obtained by deleting
vertices only. GRA(N) forms a Cantor space.
THEOREM 9.16. Let f:GRA(N) → GRA(N) be an isomorphically
invariant Borel function. There exists G such that every
connected component of f(G) is isomorphic to a connected
component of G.
THEOREM 9.17. Let f:GRP(N)^∞ → GRP(N) be an isomorphically
invariant Borel function. There exists x such that f(x) is
isomorphic to a coordinate of x.
Theorems 9.15 – 9.17 are provable in ZFC\P plus “the power
set operation can be iterated along any countable ordinal”.
This is legal; i.e., it is a subsystem of ZFC.
However, ZFC\P plus “the power set operation can be
iterated along α”, where α is any suitably specified
countable ordinal, is not sufficient for any of Theorems
9.15 – 9.17.
10. Borel selection/antiselection in symmetric Borel sets.
We will be at the level of ZFC\P + “the power set operation
can be iterated along any countable ordinal”.
The history starts with work on infinite games. Two players
I,II successively play natural numbers, for infinitely many
years.
At the end of the game, I has played x ∈ N^N and II has
played y ∈ N^N.
Before they start playing, W ⊆ N^N×N^N is given. After they
finish playing, the win is assigned to I or II according to
whether (x,y) ∈ W.
There is a clear notion of winning strategy. This is a
recipe for producing plays according to the earlier plays
of your opponent, so that no matter how your opponent
plays, you will win the game after infinitely many plays.
A series of results appeared saying that if W is a low
level Borel set then one of the two players has a winning
strategy for the game associated with W. The proofs got a
bit more exotic as one crawled up a little ways in the
Borel hierarchy.
In the mid 1960's, D.A. Martin proved the existence of
winning strategies for all Borel W using axioms far beyond
those available in ZFC. This is called Borel determinacy
(BD).
In 1968, we showed that there is no proof of BD in ZFC\P +
"the power set operation can be iterated along α", where α
is any suitably specified countable ordinal.
In 1974, Martin finally gave a legal proof of BD; i.e., in
ZFC, and in fact using just ZFC\P + "the power set
operation can be iterated along any countable ordinal".
According to Martin, knowledge that this logical power is
necessary was crucial.
In the 1980's we brought BD into the realm of classical
analysis as follows. Let E ⊆ ℝ×ℝ and A ⊆ ℝ. A selection
for E on A is a function f:A → ℝ such that for all x ∈ A,
(x,f(x)) ∈ E. We say that E is symmetric if and only if
(x,y) ∈ E ↔ (y,x) ∈ E.
THEOREM 10.1. Let E ⊆ ℝ×ℝ be a symmetric Borel set. Then E
or ℝ²\E has a Borel selection on ℝ.
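Without the word "Borel", the dichotomy in Theorem 10.1 is elementary, and it can be checked mechanically on a finite set. The Python sketch below is only a finite analogue under that assumption (the name selection_or_antiselection is hypothetical); it says nothing about the Borel complexity of the selection, which is where all of the logical strength lies.

```python
def selection_or_antiselection(E, X):
    """E: symmetric set of pairs over the finite set X (finite analogue only).
    Returns ("E", f) with (x, f[x]) in E for all x, or
    ("complement", f) with (x, f[x]) not in E for all x."""
    sel = {}
    for x in sorted(X):
        ys = [y for y in sorted(X) if (x, y) in E]
        if not ys:
            # x is E-related to nothing, so by symmetry nothing is
            # E-related to x: the constant function x selects for
            # the complement of E.
            return "complement", {z: x for z in X}
        sel[x] = ys[0]
    return "E", sel

X = {0, 1, 2}
print(selection_or_antiselection({(0, 1), (1, 0)}, X))
print(selection_or_antiselection({(0, 1), (1, 0), (2, 2)}, X))
```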
We showed that Theorem 10.1 has the same status as BD.
I.e., we need arbitrarily long countable well ordered
iterations of the power set operation. As in BD, the number
of iterations needed corresponds directly to the level in
the Borel hierarchy of the given Borel set (and not on the
level of the Borel selection obtained).
11. Borel selection in Borel sets.
Logical issues in Borel statements remained dormant from
the mid 80’s to this century. We came into contact with G.
Debs of Paris VII and his series of joint papers with Saint
Raymond concerning selection theorems (they use different
terminology).
They mostly worked in more general settings than Borel
functions, and used strong set theoretic methods beyond
ZFC. We asked Debs what happens when their stuff is pared
down to Borel functions. The reply was that high powered
set theoretic methods appear to still be needed. In
particular, they would still need to use BD for some
statements, and principles outside ZFC for others.
We know that high powered set theoretic methods are necessary here.
Many of their theorems take the form of local/global
principles of selection.
THEOREM 11.1. Let S ⊆ ℝ×ℝ be Borel and E ⊆ ℝ be Borel with
empty interior. If there is a continuous selection for S on
every compact subset of E, then there is a continuous
selection for S on E.
The proof uses BD, and so it suffices to use iterations of
the power set operation along any countable ordinal. We
have shown that iterations along any suitably specified
countable ordinal do not suffice.
PROPOSITION 11.2. Let S ⊆ ℝ×ℝ and E ⊆ ℝ be Borel. If
there is a Borel selection of S on every compact subset of
E, then there is a Borel selection for S on E.
Debs/Saint Raymond prove Proposition 11.2 by using the
following technical axiom that goes beyond ZFC:
COUNT. There are at most countably many sets of integers
Gödel constructible in any given set of integers.
COUNT has been studied by set theorists already in the
1960’s. It is immediate from Gödel that COUNT is not
provable in ZFC.
COUNT is somewhat tame because of the following relative
consistency result due to Levy and Solovay:
THEOREM 11.3. ZFC + COUNT is consistent iff ZFC + “there is
a strongly inaccessible cardinal” is consistent.
Some algebraic geometers are familiar with the assumption
“there is a strongly inaccessible cardinal” as the set
theoretic formulation of “there is a Grothendieck
universe”.
We showed that Proposition 11.2 is not provable in ZFC.
Our results concerning Theorem 11.1 and Proposition 11.2
are an elaboration on our original techniques from 1968
showing the logical power of BD. They involve various
manipulations of non well founded models of fragments of
ZFC.
DIGRESSION ON BOREL STATEMENTS OF GREAT LOGICAL STRENGTH.
Let FG(N) be the space of all finitely generated groups
whose domain is N. This is a Borel subspace of the Baire
space B(N) of binary functions from N into N.
The following is provable using all finite numbers of
iterations of the power set operation starting at N, but
not using any fixed finite number.
THEOREM 11.4. Let f:FG(N)^∞ → FG(N) be an isomorphically
invariant Borel function. There exists x ∈ FG(N)^∞ such that
f(x) is embeddable in a coordinate of x.
Now consider the following.
THEOREM 11.5. Let f:FG(N)^∞ → FG(N) be an isomorphically
invariant Borel function. There exists x ∈ FG(N)^∞ such that
for all infinite subsequences y of x, f(y) is embeddable in
a coordinate of y.
This is proved only by using axioms that go far beyond
Grothendieck universes, or arbitrarily large such, etc.
Furthermore, we do not have tameness in the sense that one
cannot derive the consistency of ZFC + Theorem 11.5 from,
say, ZFC + “there are arbitrarily large Grothendieck
universes”. END OF DIGRESSION.
12. 6561 cases of Boolean relation theory.
We have discovered a general class of mathematical problems
that make good sense in a great variety of contexts, but
carry severe logical difficulties even in concrete
contexts.
Boolean Relation Theory (BRT) concerns the Boolean
relations between sets and their images under multivariate
functions.
More specifically, let f be a multivariate function and A
be a set. We define
fA = {f(x1,…,xk): k is the arity of f and x1,…,xk ∈ A}.
We find it very convenient to suppress the arity of f and
use the notation fA.
BRT is done in BRT settings. A BRT setting is a pair (V,K),
where V is a set of multivariate functions and K is a set
of sets.
There is a ridiculously large number of such pairs that are
natural signatures of myriad areas and subareas of
mathematics.
In baby BRT we consider only one function and one set. We
seek to understand all statements of the following form in
a fixed BRT setting (V,K):
For all f ∈ V there exists A ∈ K such that a given Boolean
relation holds between A,fA.
“Boolean relation” comes in flavors; e.g., in equational
BRT, we mean Boolean equations. In inequational BRT, we
mean Boolean inequations.
Here are some examples. Let MF(N) be the set of all
functions whose domain is some N^k and whose range is a
subset of N. Let SD(N) be the set of all f ∈ MF(N) such
that for all x ∈ dom(f), f(x) > |x|, where | | is the sup
norm. Let INF(N) be the set of all infinite subsets of N.
1. For all f ∈ SD(N) there exists infinite A ⊆ N such that
fA = N\A. (Complementation theorem).
2. For all f ∈ MF(N) there exists infinite A ⊆ N such that
fA ≠ N. (Thin set theorem).
3. Let V be the set of all linear operators on Hilbert
spaces and K be the set of all nontrivial subspaces. For
all f ∈ V, there exists A ∈ K such that fA ⊆ A.
1 encapsulates the essence of recursion. 2 lies at the
heart of the original Ramsey theorem. 3 is open (the
invariant subspace problem).
THEOREM 12.1. For all f ∈ SD(N) there exists infinite A ⊆ N
such that fA = N\A. In fact, there is a unique A ⊆ N such
that fA = N\A.
To see existence, suppose we have determined membership of
0,1,…,r-1 in A. Since f(x) > |x|, whether r lies in fA
depends only on these earlier decisions.
Put r in A iff r ∉ fA so far. QED
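The existence argument is a recursion that can be run directly. The following Python sketch computes the initial segment of A below n for a given strictly dominating f; the function name complementation_prefix and the sample function f(x,y) = x+y+1 are illustrative assumptions, not taken from the lecture.

```python
from itertools import product

def complementation_prefix(f, arity, n):
    """Return the elements below n of the set A with fA = N minus A,
    assuming f(x) > max(x) for every argument tuple x."""
    A = []
    for r in range(n):
        # Because f is strictly dominating, only arguments below r
        # (already decided) can produce the value r.
        hit = any(f(*x) == r for x in product(A, repeat=arity))
        if not hit:
            A.append(r)   # r is not in fA so far, so r goes into A
    return A

# f(x, y) = x + y + 1 satisfies f(x, y) > max(x, y) on N.
print(complementation_prefix(lambda x, y: x + y + 1, 2, 10))
```

For this sample f the recursion picks out the even numbers, and fA is the set of odd numbers. The brute-force search over all tuples is chosen for transparency, not efficiency.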
1 and 3 lie in equational BRT. We can put 2 in equational
BRT as follows:
2’. For all f ∈ SD(N) there exists infinite A,B ⊆ N such
that fA = B and A ∩ B = ∅.
We have determined the truth value of all statements in
equational and inequational BRT for (MF(N),INF(N)) and
(SD(N),INF(N)), with 1 function and 1 set. There are 16
statements to analyze in each of these four cases.
In adult BRT we consider k functions and n sets. Fix a BRT
setting (V,K).
We seek to understand all statements of the following form:
For all f1,…,fk ∈ V there exists A1,…,An ∈ K such that a
given Boolean relation holds between these sets and their
images under the functions.
Let f ∈ MF(N). We say that f is of expansive linear growth
iff there exist c,d > 1 such that for all but finitely many
x ∈ dom(f),
c|x| ≤ f(x) ≤ d|x|.
We use X ∪. Y for X ∪ Y together with the commitment that
X,Y are disjoint.
E.g.,
X ∪. Y ⊆ Z ∪. W
means
X ∪ Y ⊆ Z ∪ W ∧
X ∩ Y = ∅ ∧ Z ∩ W = ∅.
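On finite sets, the image fA and the dotted inclusion can be checked mechanically. This Python sketch is for illustration only; the names image and dotted_inclusion are hypothetical, and finite sets merely stand in for the infinite sets of the BRT setting.

```python
from itertools import product

def image(f, arity, A):
    """fA: all values f(x1,...,xk) with x1,...,xk drawn from the finite set A."""
    return {f(*x) for x in product(A, repeat=arity)}

def dotted_inclusion(X, Y, Z, W):
    """X ∪. Y ⊆ Z ∪. W: inclusion of the unions together with
    the two disjointness commitments."""
    return (X | Y) <= (Z | W) and X.isdisjoint(Y) and Z.isdisjoint(W)

A = {0, 2}
fA = image(lambda x, y: x + y + 1, 2, A)   # values 1, 3, 5
print(dotted_inclusion(A, fA, {0, 1, 2, 3}, {5}))
```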
Write ELG(N) for the set of all f ∈ MF(N) of expansive
linear growth. We work in the BRT setting (ELG(N),INF(N)).
PROPOSITION 12.2. For all f,g ∈ ELG(N) there exists A,B,C ∈
INF(N) such that
A ∪. fA ⊆ C ∪. gB
A ∪. fB ⊆ C ∪. gC.
We have given a proof of Proposition 12.2 using certain
large cardinals that go well beyond the usual axioms of ZFC.
We have also shown that ZFC alone does not suffice. In
fact, large cardinals slightly weaker than those used in
the proof do not suffice.
The large cardinals involved are called the Mahlo cardinals
of finite order. These cardinals are far beyond
Grothendieck universes and standard elaborations of them.
It is clear that Proposition 12.2 has a particularly simple
structure compared to a typical statement in BRT. In fact,
the two clauses there have the form
X ∪. fY ⊆ Z ∪. gW
S ∪. fT ⊆ U ∪. gV
where X,Y,Z,W,S,T,U,V are among the three letters A,B,C.
This amounts to a particular set of instances of BRT of
cardinality 3^8 = 6561.
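The count is simply the number of ways to fill eight slots with three letters. A short Python enumeration, given purely as an illustration (the string rendering of each instance is an assumption made here), confirms the figure and locates Proposition 12.2 among the instances.

```python
from itertools import product

LETTERS = "ABC"

def instances():
    """Yield every pair of clauses X ∪. fY ⊆ Z ∪. gW, S ∪. fT ⊆ U ∪. gV
    with all eight slots filled from the letters A, B, C."""
    for X, Y, Z, W, S, T, U, V in product(LETTERS, repeat=8):
        yield (f"{X} ∪. f{Y} ⊆ {Z} ∪. g{W}",
               f"{S} ∪. f{T} ⊆ {U} ∪. g{V}")

all_instances = list(instances())
print(len(all_instances))
prop_12_2 = ("A ∪. fA ⊆ C ∪. gB", "A ∪. fB ⊆ C ∪. gC")
print(prop_12_2 in all_instances)
```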
We have been able to show that all of these 6561 statements
are provable or refutable using the same large cardinal
axioms that we use to prove Proposition 12.2. Obviously, we
need the large cardinal axioms since 12.2 is among the
6561.
In fact, Proposition 12.2 is the only one of the 6561 up to
symmetry that involves any logical difficulties.
Furthermore, the logical difficulties associated with
Proposition 12.2 appear even in concrete forms of these
6561 cases where the functions and sets are drawn from nice
countable classes of very effective functions and sets.
We think BRT is mathematically sufficiently interesting
that mathematicians will develop it for its own sake,
despite the necessary rethinking of the foundations for
mathematics that this entails.
BRT itself is a special case of what we call “intellectual
parameterization”.
We conjecture that “intellectual parameterization” will
lead to a thematic expansion of mathematics where the
necessary use of large cardinal axioms going well beyond
the currently usual axioms for mathematics (ZFC) will
become standard.