6. LINEAR LEAST SQUARES AND RELATED PROBLEMS
Linear least squares constitutes one of the most important classes of optimization problems in modern society, primarily because of its central role in statistical data analysis. Within optimization itself, linear least squares provides basic tools needed in constrained optimization and serves as a prototype for more complicated problems.
6.1 A Motivating Application: Curve Fitting
Suppose we have a finite set of data pairs (see Figure 6.1)
$$(t_1, s_1), \ldots, (t_m, s_m)$$
meant to represent the values of some function $s = f(t)$, where $f$ is assumed to belong to a specified class (e.g., all polynomials of degree less than $p$, or all sums of trig functions).
Figure 6.1 Scatterplot of data pairs (horizontal axis $t$, vertical axis $s$) and a curve of the form $s = at + bt^2 + c \sin t$
Goal 6.1.1 (Best fit). Choose $f$ from the specified class of functions so as to obtain the “best” fit to the given data points. The notion of “best” is viewed subjectively as meaning that $|s_i - f(t_i)|$ is made as small as possible for all $i$.
A typical context is that the theory in some area of application suggests that the data in Figure 6.1 should be representable in the form
$$s = at + bt^2 + c \sin t,$$
for some choice of the parameters $(a, b, c)$. Our job is to make each of the quantities
$$\bigl| s_i - [a t_i + b t_i^2 + c \sin t_i] \bigr|, \quad \text{for } i = 1, \ldots, m,$$
as small as possible.
The difficulty is that a choice of $(a, b, c)$ which makes $|s_1 - f(t_1)|$ very small may make $|s_3 - f(t_3)|$ very large. Consequently, we need some sort of aggregate measure over all the deviations $|s_i - f(t_i)|$. Here are some popular choices:
(1) $\ell_1$ fit: minimize
$$f_1(a, b, c) = \sum_{i=1}^{m} \bigl| s_i - [a t_i + b t_i^2 + c \sin t_i] \bigr|$$
(2) least squares fit: minimize
$$f_2(a, b, c) = \sum_{i=1}^{m} \bigl| s_i - [a t_i + b t_i^2 + c \sin t_i] \bigr|^2$$
($\infty$) minimax fit: minimize
$$f_\infty(a, b, c) = \max_i \bigl| s_i - [a t_i + b t_i^2 + c \sin t_i] \bigr|$$
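To make the three aggregate measures concrete, here is a minimal sketch that evaluates them for a given choice of $(a, b, c)$. The function names and the tiny data set are made up for illustration and are not part of the notes.

```python
import math

def residuals(a, b, c, ts, ss):
    """Deviations s_i - [a t_i + b t_i^2 + c sin t_i], one per data pair."""
    return [s - (a * t + b * t**2 + c * math.sin(t)) for t, s in zip(ts, ss)]

def f1(a, b, c, ts, ss):
    """l1 objective: sum of absolute deviations."""
    return sum(abs(r) for r in residuals(a, b, c, ts, ss))

def f2(a, b, c, ts, ss):
    """Least squares objective: sum of squared deviations."""
    return sum(r * r for r in residuals(a, b, c, ts, ss))

def f_inf(a, b, c, ts, ss):
    """Minimax objective: largest absolute deviation."""
    return max(abs(r) for r in residuals(a, b, c, ts, ss))

# Hypothetical data, chosen so that s = t + t^2 fits exactly.
ts = [0.0, 1.0, 2.0]
ss = [0.0, 2.0, 6.0]

for params in [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]:
    print(params, f1(*params, ts, ss), f2(*params, ts, ss), f_inf(*params, ts, ss))
```

All three measures vanish exactly when every deviation is zero; they differ in how they weigh large deviations against small ones.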
Each of these objective functions has desirable qualities. In particular, all are convex functions of $(a, b, c)$:
• $|x|$ and $|x|^2$ are convex on $\mathbb{R}$;
• $(a, b, c) \mapsto s_i - [a t_i + b t_i^2 + c \sin t_i]$ is affine (linear plus a constant) on $\mathbb{R}^3$;
• $f_1$ and $f_2$ are sums of convex functions, whereas $f_\infty$ is a maximum of convex functions.
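The step connecting the first two bullets to the third can be written out explicitly. As a sketch, write $x = (a, b, c)$, let $r_i(x)$ denote the affine map in the second bullet, and let $g$ be either $|\cdot|$ or $|\cdot|^2$ (the symbols $r_i$, $g$, $y$, and $\lambda$ are notation introduced here, not taken from the notes). For any $x, y \in \mathbb{R}^3$ and $\lambda \in [0, 1]$:

```latex
\begin{align*}
g\bigl(r_i(\lambda x + (1-\lambda) y)\bigr)
  &= g\bigl(\lambda\, r_i(x) + (1-\lambda)\, r_i(y)\bigr)
     && \text{($r_i$ is affine)} \\
  &\le \lambda\, g\bigl(r_i(x)\bigr) + (1-\lambda)\, g\bigl(r_i(y)\bigr)
     && \text{($g$ is convex on $\mathbb{R}$)}
\end{align*}
```

So each term $g \circ r_i$ is convex on $\mathbb{R}^3$, and sums and pointwise maxima of convex functions are again convex.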
Minimizing $f_1$ or $f_\infty$ can be recast as a “linear programming” problem, a topic discussed later in the course. The present chapter focuses on least squares and its special properties: $f_2$ is twice differentiable in $(a, b, c)$, and the optimality conditions for $f_2$ can be treated directly by linear algebra techniques.
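To see why linear algebra suffices, it may help to write the optimality conditions out for the three-parameter model. In the sketch below, $r_i$ is shorthand introduced here for the $i$-th deviation $s_i - [a t_i + b t_i^2 + c \sin t_i]$. Setting each partial derivative of $f_2$ to zero gives:

```latex
\begin{align*}
\frac{\partial f_2}{\partial a} &= -2 \sum_{i=1}^{m} t_i \, r_i = 0, \\
\frac{\partial f_2}{\partial b} &= -2 \sum_{i=1}^{m} t_i^2 \, r_i = 0, \\
\frac{\partial f_2}{\partial c} &= -2 \sum_{i=1}^{m} (\sin t_i) \, r_i = 0.
\end{align*}
```

Each $r_i$ is affine in $(a, b, c)$, so these are three linear equations in three unknowns. Because $f_2$ is convex, any solution of this system is a global minimizer.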
6.2 Linear Least Squares
As in the preceding section, we assume we’re given data points $(t_1, s_1), \ldots, (t_m, s_m)$ and a prescribed list of functions $q_1, \ldots, q_n$. Our goal is to find the coefficients $x_1, \ldots, x_n$ that minimize the objective
$$f_2(x_1, \ldots, x_n) = \sum_{i=1}^{m} \bigl| s_i - [x_1 q_1(t_i) + \cdots + x_n q_n(t_i)] \bigr|^2 .$$
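As an illustration of this general objective, the following sketch builds the table of values $q_j(t_i)$, sets the gradient of $f_2$ to zero, and solves the resulting $n \times n$ linear system (commonly called the normal equations) by Gaussian elimination. This is only one possible approach under stated assumptions, not necessarily the method these notes go on to develop. All function names and the synthetic, noise-free data are hypothetical.

```python
import math

def design_matrix(basis, ts):
    """Row i holds q_1(t_i), ..., q_n(t_i)."""
    return [[q(t) for q in basis] for t in ts]

def solve_linear_system(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("system is (numerically) singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def least_squares_fit(basis, ts, ss):
    """Coefficients x minimizing f_2, via the system (A^T A) x = A^T s."""
    A = design_matrix(basis, ts)
    n = len(basis)
    AtA = [[sum(row[j] * row[k] for row in A) for k in range(n)] for j in range(n)]
    Ats = [sum(row[j] * s for row, s in zip(A, ss)) for j in range(n)]
    return solve_linear_system(AtA, Ats)

def f2(x, basis, ts, ss):
    """Sum of squared deviations for coefficient list x."""
    return sum(
        (s - sum(xj * q(t) for xj, q in zip(x, basis))) ** 2
        for t, s in zip(ts, ss)
    )

# Synthetic example: data generated exactly from s = t - 0.5 t^2 + 2 sin t.
basis = [lambda t: t, lambda t: t**2, math.sin]
true_coeffs = [1.0, -0.5, 2.0]
ts = [0.5, 1.0, 1.5, 2.0, 2.5]
ss = [sum(c * q(t) for c, q in zip(true_coeffs, basis)) for t in ts]

x = least_squares_fit(basis, ts, ss)
print(x)  # should be close to true_coeffs, up to rounding error
```

Forming $A^T A$ explicitly is the simplest route but can amplify rounding error when the columns are nearly dependent. Library routines based on orthogonal factorizations are usually preferred in practice.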
Observe that the list of values
$$x_1 q_1(t_1) + \cdots + x_n q_n(t_1),$$
Spring '12, Douglas Ward