A CONTINUOUS MODEL OF COMPUTATION

Joseph F. Traub

Physicists should consider an alternative to the Turing-machine model of computation.

JOSEPH TRAUB is the Edwin Howard Armstrong Professor of Computer Science at Columbia University in New York City. His homepage is www.cs.columbia.edu/~traub.

A central dogma of computer science is that the Turing-machine model is the appropriate abstraction of a digital computer. Physicists who've thought about the matter also seem to favor the Turing-machine model. For example, Roger Penrose devoted some 60 pages of a book1 to a description of this abstract model of computation and its implications. I argue here that physicists should consider the real-number model of computation as more appropriate and useful for scientific computation. (In the interest of full disclosure, I must tell you that I've always used the real-number model in my work as a computer scientist. But I do my best here to present balanced arguments.)

First, I introduce the four "worlds" that play a role here. Above the horizontal line in the accompanying diagram are two real worlds: the world of physical phenomena and the computer world in which simulations are performed. Below them are represented two formal worlds: a mathematical model of some physical phenomenon and a model of computation that is an abstraction of a physical computer. We get to choose both the mathematical model and the model of computation. What type of models should we choose?

The physicist often chooses a continuous mathematical model for the phenomenon under consideration. Continuous models range from the dynamical systems of classical physics to the operator equations and path integrals of quantum mechanics. These models are based on the real numbers (as distinguished from the subset of rational numbers). The real numbers are, of course, an abstraction. It takes an infinite number of bits to represent a single real number. (A rational number, by contrast, requires only a finite number of bits.) But infinitely many bits are not available in the universe. One uses the continuous domain of the real numbers because it is a powerful and useful construct. Let us accept that continuous models are widely used in mathematical physics, and that they will continue to occupy that role for the foreseeable future. But the computer is a finite-state machine. What should we do when the continuous mathematical model meets the finite-state machine?

In the next section I compare and contrast two models of computation: the Turing machine and real-number models.
Then I show how the real-number model is used in the study of the computational complexity of continuous mathematical models. (Computational complexity is a measure of the minimal computational resources required to solve a mathematically posed problem.) This is the province of a branch of complexity theory called information-based complexity, and what follows is intended to demonstrate the power of this theory.

Two models of computation

Although many readers are familiar with the Turing-machine model, I start by describing it briefly. Then, after describing the real-number model, I will discuss the pros and cons of these two models.

Alan Turing, one of the intellectual giants of the twentieth century, defined his machine model to establish the unsolvability of David Hilbert's Entscheidungsproblem,2 the problem of finding an algorithm for deciding (entscheiden, in German) whether any given mathematical proposition is true or false. The Turing machine is a gedankengadget employing a tape of unbounded length, divided into sequential squares, each of which is either blank or contains a single mark. For any particular input, the resulting calculation and output are finite; that is to say, the tape is blank beyond a certain point. The machine's reading head reads one square at a time and, after making or erasing a mark, moves one square to the left or right. The machine has a finite number of internal states. Given its initial state and the input sequence on the tape, the machine changes its state and the head prints a symbol and moves one square. Finally, the machine decides when to halt.
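
To make the abstraction concrete, here is a minimal sketch of such a machine in Python. The tape encoding, the transition-table format, and the example machine (a unary incrementer) are illustrative choices made for this sketch, not anything prescribed by the model itself.

    # A minimal Turing-machine simulator: a finite control plus an
    # unbounded tape of marked (1) or blank (0) squares.
    from collections import defaultdict

    def run_turing_machine(transitions, tape, state="start"):
        """transitions: (state, symbol) -> (new_state, symbol_to_write, move),
        where symbol is 0 (blank) or 1 (mark) and move is -1 (left) or +1 (right)."""
        squares = defaultdict(int, enumerate(tape))  # tape is blank beyond the input
        head = 0
        while state != "halt":
            state, squares[head], move = transitions[(state, squares[head])]
            head += move
        return [squares[i] for i in range(min(squares), max(squares) + 1)]

    # Example machine (unary incrementer): scan right past the marks,
    # then mark the first blank square and halt.
    inc = {
        ("start", 1): ("start", 1, +1),  # keep moving right over marks
        ("start", 0): ("halt", 1, +1),   # first blank: mark it and halt
    }
    print(run_turing_machine(inc, [1, 1, 1]))  # -> [1, 1, 1, 1]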

We turn now to a very brief description of the real-number model; a precise formulation may be found in the literature.3,4 The crux of this model is that one can store and perform arithmetic operations and comparisons on real numbers exactly and at unit cost. (For the moment, I defer discussion of "information operations.")
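
As a rough illustration of this accounting (my own sketch, not a formal construction from the cited literature), one can imagine instrumenting a computation so that every arithmetic operation and comparison on reals is charged one unit. Python floats stand in here for exact reals; the point is the charging scheme, not the representation.

    # Sketch of the real-number model's cost accounting: each arithmetic
    # operation or comparison on (idealized) reals costs one unit.
    class Real:
        ops = 0  # running total of unit-cost operations

        def __init__(self, value):
            self.value = value

        def __add__(self, other):
            Real.ops += 1                    # one unit per arithmetic operation
            return Real(self.value + other.value)

        def __mul__(self, other):
            Real.ops += 1
            return Real(self.value * other.value)

        def __lt__(self, other):
            Real.ops += 1                    # comparisons also cost one unit
            return self.value < other.value

    x, y = Real(2.0), Real(3.5)
    z = x * y + x        # two unit-cost operations
    print(Real.ops)      # -> 2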

The real-number model has a long history. Alexander Ostrowski used it in his work on the computational complexity of polynomial evaluation in 1954. In the 1960s, I used the real-number model for research on optimal iteration theory, and Shmuel Winograd and Volker Strassen employed it in their work on algebraic complexity. Henryk Woźniakowski and I used the real-number model in a 1980 monograph on information-based complexity. The 1989 formalization of the real-number model for continuous combinatorial complexity by Lenore Blum, Michael Shub, and Steven Smale initiated a surge of research on computation over the real numbers.

Both models are abstractions of real digital computers. Of the Turing-machine model, Penrose wrote, "It is the unlimited nature of the input, calculation space, and output which tells us that we are considering only a mathematical idealization."1 Which abstraction to use depends on how useful that abstraction is for a given purpose. What are the pros and cons of these two models of computation?

Turing-machine model: pros and cons

In favor of the Turing-machine model, one can say that it's desirable to use a finite-state abstraction for a finite-state machine. Moreover, the Turing machine's simplicity and economy of description are attractive. Furthermore, it is universal in two senses: First is the Church-Turing thesis, which states that what a Turing machine can compute may be considered a universal definition of computability. (Computability on a Turing machine is equivalent to computability in the lambda calculus, a logical system formulated by Alonzo Church in 1936.) Although one cannot prove the Church-Turing thesis, it appeals to our intuitive notion of computability.

There is also a second sense in which the Turing machine is universal: All "reasonable" machines are polynomially equivalent to Turing machines. Informally, this means that if the minimal time to compute an output on a Turing machine is T(n) for an input of size n, and if the minimal time to compute an output on any other machine is S(n), then T does not grow faster than a power of S. Therefore, one might as well use the Turing machine as the model of computation. I am, however, not convinced of the assertion that all reasonable machines are polynomially equivalent to Turing machines, but I'll defer my critique for the moment.

What are the disadvantages of the Turing-machine model? I believe it is not natural to use such a discrete model in conjunction with continuous mathematical models. Furthermore, estimated running times on a Turing machine are not predictive of scientific computation on digital computers. One reason for this is that scientific computation is usually done with fixed-precision floating-point arithmetic, so that the cost of arithmetic operations is independent of the size of the operands. Turing-machine operations, by contrast, depend on the sizes of numbers.

Finally, there are interesting computational models that are not polynomially equivalent to the Turing-machine model. Consider the example of a random-access machine in which multiplication is a basic operation and memory access, multiplication, and addition can be performed at unit cost. Such machines go by the ungainly acronym UMRAM. This seems like a reasonable abstraction of a digital computer, in which multiplication and addition on fixed-precision floating-point numbers cost about the same. But the UMRAM is not polynomially equivalent to a Turing machine! However, a RAM that does not have multiplication as a fundamental operation is polynomially equivalent to a Turing machine.

Real-number model: pros and cons

What are the advantages of the real-number model? Because physicists generally assume a continuum, their mathematical models are often continuous, employing the domain of the real (and complex) numbers. It seems natural, therefore, to use the real numbers in analyzing the numerical solution of continuous models on a digital computer.

If we leave aside the possibility of numerical instability, computational complexity in the real-number model is the same as it is for fixed-precision, floating-point arithmetic. Therefore the real-number model is predictive of running times for scientific computations. Studying computational complexity in the real-number model has led to new, superior methods for doing a variety of scientific calculations.

Another reason for using the real-number model is that it makes available the full power of continuous mathematics. I give an example below, when I discuss a result on noncomputable numbers and its possible implications for physical theories. With the real-number model and the techniques of analysis (that is to say, the mathematics of continuous functions), this result is established in about a page. With the Turing-machine model, by contrast, the proof requires a substantial part of a monograph.

The argument for using the power of analysis was already made in 1948 by John von Neumann, one of the leading mathematical physicists of the century and a father of the digital computer. In his Hixon Symposium lecture, von Neumann argued for a "more specifically analytical theory of automata and of information." He said:

    There exists today a very elaborate system of formal logic, and specifically, of logic as applied to mathematics. This is a discipline with many good sides, but also serious weaknesses. . . . Everybody who has worked in formal logic will confirm that this is one of the technically most refractory parts of mathematics. The reason for this is that it deals with rigid, all-or-none concepts, and has very little contact with the continuous concept of the real or of the complex number, that is, with mathematical analysis. Yet analysis is the technically most successful and best-elaborated part of mathematics. . . . The theory of automata, of the digital, all-or-none type as discussed up to now, is certainly a chapter in formal logic.5

We may adopt these observations, mutatis mutandis, as an argument for the real-number model. Recently, Blum and coauthors6 have argued for the real-number model, asserting that "the Turing model . . . is fundamentally inadequate for giving . . . a foundation to the theory of modern scientific computation, where most of the algorithms . . . are real-number algorithms."

Against the real-number model, one can point out that digital representations of real numbers do not exist in the real world. Even a single irrational real number is an infinite, nonrepeating decimal that requires infinite resources to represent exactly. We say that the real-number model is not finitistic. But neither is the Turing-machine model, because it utilizes an unbounded tape. It is therefore potentially infinite. Nevertheless, the Turing-machine model, because it is unbounded but discrete, is less infinite than the real-number model. It would be attractive to have a finite model of computation. There are finite models, such as circuit models and linear bounded automata, but they are only for special-purpose computation.

Information-based complexity

To see the real-number model in action, I indicate below how to formalize computational-complexity issues for continuous mathematical problems and then describe a few recent results. To motivate the concepts, I choose the example of d-dimensional integration, because of its importance in fields ranging from physics to finance. I will touch briefly on the case d = ∞, that is to say, path integrals.

Suppose we want to compute the integral of a real-valued function f of d variables over the unit cube in d dimensions. Typically, we have to settle for computing a numerical approximation with some error ε < 1. To guarantee an ε-approximation, we need some global information about the integrand. We assume, for example, that the class of integrands has smoothness r. One way of defining such a class is to let F_r be the class of those functions whose derivatives, up through order r, are uniformly bounded.

A real function of a real variable cannot be entered into a digital computer. Therefore, we evaluate the function at a finite number of points, calling that set of values "the information" about f. An algorithm then combines these function values into a number that approximates the integral.
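
As a concrete illustration (the particular rule is my choice for this sketch; the theory quantifies over all algorithms and all sample points), a product midpoint rule takes the information to be f's values on an m^d grid in the unit cube and combines those values linearly:

    # "Information" about f: its values at finitely many points. Here a
    # product midpoint rule samples f on an m**d grid and combines the
    # values into an approximation of the integral over the unit cube.
    import itertools

    def midpoint_rule(f, d, m):
        h = 1.0 / m
        nodes = [h * (i + 0.5) for i in range(m)]        # midpoints per axis
        total = sum(f(x) for x in itertools.product(nodes, repeat=d))
        return total * h**d                              # n = m**d evaluations

    # Example: integrate f(x) = x_1 + ... + x_d over the unit cube, d = 3;
    # the exact value is d/2 = 1.5.
    print(midpoint_rule(lambda x: sum(x), d=3, m=10))    # ~1.5

Note that the information here consists of n = m^d function values, a count that already foreshadows the exponential growth in d discussed below.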

In the worst case, we guarantee an error of at most ε for every f in F_r. The computational complexity is the least cost of computing the integral to within ε for every such f. We charge one unit for every arithmetic operation and comparison, and c units for every function evaluation. Typically, c >> 1. I want to stress that the complexity depends on the problem and on ε, but not on the algorithm. Every possible algorithm, whether or not it is known, and all possible points at which the integrand is evaluated, are permitted to compete when we consider the least cost.

Nikolai Bakhvalov showed in 1959 that the complexity of our integration problem is of order (1/ε)^(d/r). For r = 0, with no continuous derivatives, the complexity is infinite; that is to say, it is impossible to solve the problem to within ε. But even for any positive r, the complexity increases exponentially with d, and we say that the problem is "computationally intractable."

The curse of dimensionality

That kind of intractability is sometimes called the "curse of dimensionality." Very large numbers of dimensions occur in practice. In mathematical finance, d can be the number of months in a 30-year mortgage.

Let us compare our d-dimensional integration problem with the Traveling Salesman Problem, a well-known example of a discrete combinatorial problem. The input is the location of n cities and the desired output is the minimal route that includes them all. The city locations are usually represented by a finite number of bits. Therefore, the input can be exactly entered into a digital computer. The complexity of this combinatorial problem is unknown, but is conjectured to be exponential in the number of cities, rendering the problem computationally intractable. In fact, many other combinatorial problems are conjectured to be intractable. This is a famous unsolved issue in computer science.

Many problems in scientific computation that involve multivariate functions turn out to be intractable in the worst-case setting; their complexity grows exponentially with the number of variables. Among the intractable problems are partial differential and integral equations,7 nonlinear optimization,8 nonlinear equations,9 and function approximation.10

One can sometimes get around the curse of dimensionality by assuming that the function obeys a more stringent global condition than simply belonging to F_r. If, for example, one assumes that a function and its constraints are convex, then its nonlinear optimization requires only on the order of log(1/ε) evaluations of the function.8

In general, information-based complexity assumes that the information concerning the mathematical model is partial, contaminated, and priced. In our integration example, the mathematical input is the integrand and the information is a finite set of function values. The information is partial because the integral cannot be recovered from the function values. For a partial differential equation, the mathematical input would be the functions specifying the boundary conditions. Usually, the mathematical input is replaced by a finite number of information operations: for example, functionals on the mathematical input or physical measurements that are fed into a mathematical model. Such information operations,4 in the real-number model, are permitted at cost c.

In addition to being partial, the information is often contaminated,11 for example, by measurement or rounding errors. If the information is partial or contaminated, one cannot solve the problem exactly. Finally, the information has a price. For example, the information needed for oil-exploration models is obtained by the explosive triggering of shock waves. With the exception of certain finite-dimensional problems, such as finding roots of systems of polynomial equations and doing matrix-algebra calculations, the problems typically encountered in scientific computation have information that is partial, contaminated, and priced.

Information-based complexity theory is developed over abstract spaces such as Banach and Hilbert spaces, and the applications typically involve multivariate functions. We often seek an optimal algorithm, one whose cost is equal or close to the complexity of the problem. Such endeavors have sometimes led to new methods of solution.

The information level

The reason why we can often obtain the complexity and an optimal algorithm for information-based complexity problems is that partial or contaminated information lets one make arguments at the information level. In combinatorial problems, by contrast, this information level does not exist, and we usually have to settle for conjectures and attempts to establish a hierarchy of complexities.

A powerful tool at the information level, one that's not available in discrete models of computation, is the notion of the radius of information, denoted by R. The radius of information measures the intrinsic uncertainty of solving a problem with a given body of information. The smaller this radius, the better the information. An ε-approximation can be computed if, and only if, R < ε. The radius depends only on the problem being solved and the available information; it is independent of the algorithm. In every information-based complexity setting, one can define an R. (We've already touched on the worst-case setting above, and two additional settings are to come in the next section.) One can use R to define the "value of information," which I believe is preferable, for continuous problems, to Claude Shannon's entropy-based concept of "mutual information."4

I present here a small selection of recent advances in the theory of information-based complexity: high-dimensional integration, path integrals, and the unsolvability of ill-posed problems.

Continuous multivariate problems are, in the worst-case setting, typically intractable with regard to dimension. That is to say, their complexity grows exponentially with increasing number of dimensions. One can try in two ways to break this curse of dimensionality: One can replace the ironclad worst-case ε-guarantee by a weaker stochastic assurance, or one can change the class of mathematical inputs. For high-dimensional integrals, both strategies come into play.

Monte Carlo

Recall that, in the worst-case setting, the complexity of d-dimensional integration is of order (1/ε)^(d/r). But the expected cost of the Monte Carlo method is of order (1/ε)^2, independent of d. This is equivalent to the common knowledge among physicists that the error of a Monte Carlo simulation with n sample points decreases like n^(-1/2). This expression for the cost holds even if r = 0. But there is no free lunch. Monte Carlo computation carries only a stochastic assurance of small error.
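
A minimal sketch of the method (the integrand and dimension below are chosen only for illustration): average f at n uniform random points in the cube; the expected error decays like n^(-1/2) whatever the value of d.

    # Monte Carlo integration over the unit cube: average f at n uniform
    # random points. The cost to reach expected error eps scales like
    # (1/eps)**2 regardless of the dimension d.
    import random

    def monte_carlo(f, d, n, seed=0):
        rng = random.Random(seed)
        return sum(f([rng.random() for _ in range(d)]) for _ in range(n)) / n

    # d = 30 keeps the run fast; the scaling is the same for, say, d = 360.
    # The exact integral of f(x) = (x_1 + ... + x_d)/d is 0.5.
    for n in (100, 10_000, 1_000_000):
        est = monte_carlo(lambda x: sum(x) / len(x), d=30, n=n)
        print(n, est)   # error shrinks roughly like n**-0.5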

Another stochastic situation is the average-case setting. Unlike Monte Carlo randomization, this is a deterministic setting with an a priori . . .