E7_lecture18_regression_F08


E7: Introduction to Computer Programming for Scientists and Engineers

Lecture Outline
1. Least squares solution when n = # of equations >> m = # of unknowns
2. Regression and curve fitting

Copyright 2007, Horowitz, Packard. This work is licensed under the Creative Commons Attribution-Share Alike License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/2.0/ or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.

Linear equations in matrix form

• Consider n LINEAR equations in m unknowns:

\begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1m} \\ A_{21} & A_{22} & \cdots & A_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ A_{n1} & A_{n2} & \cdots & A_{nm} \end{bmatrix} \begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_m \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}

Here A is the (n x m) coefficient matrix, p is the (m x 1) vector of unknowns, and y is the (n x 1) right-hand-side vector, so the system is Ap = y.

Linear equations in matrix form when n >> m

Ap = y, with A (n-by-m), p (m-by-1), y (n-by-1), where n, the number of equations, is much larger than m, the number of unknowns.

The backslash operator: p = A\y

Given a set of n linear equations in m unknowns, Ap = y, with n >> m,

>> p = A\y

solves the least squares problem, i.e., it computes the p that achieves

\min_p \| A p - y \|

What happens if the least squares (LS) solution is not unique, i.e., rank(A) < m?

Given a set of n linear equations in m unknowns, Ap = y, with m > rank(A) = r (m = # of columns of A, r = # of linearly independent columns of A),

>> p = A\y

computes a basic LS solution, which has at most r nonzero components when r < m. (When n >> m, this is normally not the case…)

Linear Regression: Curve fitting with minimum error

Given n (x,y) data pairs: (x_1,y_1), (x_2,y_2), …, (x_n,y_n), find the linear function

\hat{y}(x) = p_1 x + p_2

that minimizes the sum of squared errors

J = \sum_{i=1}^{n} (y_i - \hat{y}(x_i))^2

where \hat{y}(x_k) - y_k is the error at the k-th data point.

[Figure: scatter plot of the data points with the fitted line \hat{y}(x); the vertical distance from each point to the line is the error \hat{y}(x_k) - y_k.]

Linear Regression: Computing the LS solution

• For a given set of parameters p_1 and p_2, we can compute the errors e_i as follows:

e_1 = p_1 x_1 + p_2 \cdot 1 - y_1
e_2 = p_1 x_2 + p_2 \cdot 1 - y_2
\vdots
e_n = p_1 x_n + p_2 \cdot 1 - y_n

• In vector/matrix form:

\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix} = \begin{bmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix} \begin{bmatrix} p_1 \\ p_2 \end{bmatrix} - \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}

that is, e = Ap - y, where e is (n x 1), A is (n x 2), p is (2 x 1), and y is (n x 1). Then

>> p = A\y

solves the least squares problem \min_p \|Ap - y\|, and therefore the linear regression problem.

Linear Regression Matlab Code

function [p1,p2] = linreg(x,y)
% Fits a linear function
%   y = p1*x + p2
% to the data given by x, y.
% Verify x and y are column
% vectors of same length.
n = length(x);
p = [x ones(n,1)]\y;
p1 = p(1);
p2 = p(2);
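The slides do not show what the backslash operator computes under the hood. One way to sanity-check it (not from the original lecture) is to compare against the normal-equation solution p = (A'A)^{-1}A'y, which gives the same least squares minimizer when A has full column rank. A minimal sketch with made-up data:

% Overdetermined system: 5 equations, 2 unknowns (illustrative data).
A = [1 1; 2 1; 3 1; 4 1; 5 1];
y = [1.1; 1.9; 3.2; 3.9; 5.1];

p_backslash = A\y;             % least squares solution via backslash
p_normal    = (A'*A)\(A'*y);   % same solution via the normal equations

disp(norm(p_backslash - p_normal))   % ~0, up to roundoff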
Linear regression example

Data:
>> years = [1950;1955;1960;1965;1970;1975;1980;1985;1990;1995];
>> pop = [2.55;2.78;3.04;3.35;3.71;4.09;4.45;4.85;5.28;5.69];

Plot the raw data:
>> close all, figure(1)
>> plot(years,pop,'o'), xlabel('years')
>> ylabel('population (m)')

Fit the linear function:
>> [p1,p2]=linreg(years,pop)
p1 =
    0.0709
p2 =
 -135.8653

Plot the data together with the fit:
>> yh = p1*years+p2;
>> figure(2)
>> plot(years,pop,'o',years,yh)
>> xlabel('years')
>> ylabel('population (m)')

[Figure: population (m) versus years, 1950–1995; the data points lie close to the fitted line.]

Quadratic Function Regression

Given n (x,y) data pairs: (x_1,y_1), (x_2,y_2), …, (x_n,y_n), find the quadratic function

\hat{y}(x) = p_1 x^2 + p_2 x + p_3

that minimizes the sum of squared errors

J = \sum_{i=1}^{n} (y_i - \hat{y}(x_i))^2

Quadratic Regression: Computing the LS solution

• For a given set of parameters p_1, p_2, and p_3, we can compute the errors e_i as follows:

e_1 = p_1 x_1^2 + p_2 x_1 + p_3 \cdot 1 - y_1
e_2 = p_1 x_2^2 + p_2 x_2 + p_3 \cdot 1 - y_2
\vdots
e_n = p_1 x_n^2 + p_2 x_n + p_3 \cdot 1 - y_n

• In vector/matrix form:

\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix} = \begin{bmatrix} x_1^2 & x_1 & 1 \\ x_2^2 & x_2 & 1 \\ \vdots & \vdots & \vdots \\ x_n^2 & x_n & 1 \end{bmatrix} \begin{bmatrix} p_1 \\ p_2 \\ p_3 \end{bmatrix} - \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}

where e is (n x 1), A is (n x 3), p is (3 x 1), and y is (n x 1).
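As a bridge between linreg above and the matrix form just derived, here is a minimal sketch of the quadratic analogue (quadreg is a hypothetical helper, not part of the lecture):

function [p1,p2,p3] = quadreg(x,y)
% Fits a quadratic function
%   y = p1*x^2 + p2*x + p3
% to the data given by the column vectors x, y,
% by solving the least squares problem e = A*p - y.
n = length(x);
A = [x.^2 x ones(n,1)];   % the (n x 3) matrix from the slide above
p = A\y;
p1 = p(1);
p2 = p(2);
p3 = p(3);

For example, [p1,p2,p3] = quadreg(years,pop) would fit a parabola to the population data used above.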
Quadratic Regression: Computing the LS solution

Again e = Ap - y, so

>> p = A\y

solves the quadratic regression problem, with

A = \begin{bmatrix} x_1^2 & x_1 & 1 \\ x_2^2 & x_2 & 1 \\ \vdots & \vdots & \vdots \\ x_n^2 & x_n & 1 \end{bmatrix}, \quad y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad p = \begin{bmatrix} p_1 \\ p_2 \\ p_3 \end{bmatrix}

n-th order polynomial regression

General n-th order polynomial regression: find

\hat{y}(x) = p_1 x^n + p_2 x^{n-1} + \cdots + p_{n+1}

for to-be-chosen parameters p_1, p_2, …, p_{n+1}. Given N (x,y) data pairs, the errors in vector/matrix form are

\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_N \end{bmatrix} = \begin{bmatrix} x_1^n & x_1^{n-1} & \cdots & x_1 & 1 \\ x_2^n & x_2^{n-1} & \cdots & x_2 & 1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ x_N^n & x_N^{n-1} & \cdots & x_N & 1 \end{bmatrix} \begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_n \\ p_{n+1} \end{bmatrix} - \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}

n-th order Polynomial Regression: Computing the LS solution

Given N (x,y) data pairs: (x_1,y_1), (x_2,y_2), …, (x_N,y_N),

>> p = A\y

solves the n-th order polynomial regression, where A is the (N x n+1) matrix above.

Polynomial Regression Pseudo-Code

function p = polyreg(x,y,n)
% Fits an n'th order polynomial
% to the data given by x, y.
N = length(x);
A = zeros(N,n+1);
% Generate matrix A, column by column from right to left
A(:,end) = ones(N,1);
for i=1:n
    A(:,end-i) = A(:,end-i+1).*x;
end
p = A\y;
p = p';   % return a row vector of coefficients, like polyfit

Matlab's polynomial regression function

• Syntax:

p = polyfit(x,y,n)

- x is the independent variable vector
- y is the dependent variable vector
- n is the polynomial order
- p is an (n+1)-element row vector of coefficients:

p = [\, p_1 \;\; p_2 \;\; \cdots \;\; p_n \;\; p_{n+1} \,], \quad \hat{y}(x) = p_1 x^n + p_2 x^{n-1} + \cdots + p_{n+1}

See Palm page 315, Table 5.6-1.

Matlab's polynomial evaluation function

• Syntax:

yh = polyval(p,x)

- p is an (n+1)-element row vector of coefficients
- x is the independent variable vector
- yh is the dependent variable vector evaluated at each x_k:

\hat{y}_k = p_1 x_k^n + p_2 x_k^{n-1} + \cdots + p_{n+1}

See Palm page 315, Table 5.6-1.

polyfit and polyval example

Data:
>> years = [1950;1955;1960;1965;1970;1975;1980;1985;1990;1995];
>> pop = [2.55;2.78;3.04;3.35;3.71;4.09;4.45;4.85;5.28;5.69];

>> p1=polyfit(years,pop,1);
>> yh1=polyval(p1,years);
>> p2=polyfit(years,pop,2);
>> yh2=polyval(p2,years);
>> figure(3)
>> plot(years,pop,'o',years,yh1,years,yh2)
>> xlabel('years')
>> ylabel('population (m)')

[Figure: population data with the linear and quadratic fits overlaid.]

General "basis" functions

How does it work if we want to fit data with a relation of the form

\hat{y}(x) = p_1 f_1(x) + p_2 f_2(x) + \cdots + p_n f_n(x)

for fixed functions f_i(x) (called "basis" functions) and to-be-chosen parameters p_1, p_2, …, p_n? For example,

\hat{y}(x) = p_1 \sin(x) + p_2 \sin(2x) + p_3 \sin(3x)

In this case the error at the k-th data point (x_k, y_k) is

e_k = p_1 f_1(x_k) + p_2 f_2(x_k) + \cdots + p_n f_n(x_k) - y_k

Given N (x_k,y_k) data pairs, we obtain

\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_N \end{bmatrix} = \begin{bmatrix} f_1(x_1) & f_2(x_1) & \cdots & f_n(x_1) \\ f_1(x_2) & f_2(x_2) & \cdots & f_n(x_2) \\ \vdots & \vdots & \ddots & \vdots \\ f_1(x_N) & f_2(x_N) & \cdots & f_n(x_N) \end{bmatrix} \begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_n \end{bmatrix} - \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}

where A is the (N x n) matrix of basis functions evaluated at the data points. Then

>> p = A\y

solves the basis function regression, with \hat{y}(x) = p_1 f_1(x) + p_2 f_2(x) + \cdots + p_n f_n(x).
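To make the basis-function setup concrete, here is a minimal sketch using the sine basis from the example above (not from the original slides; the data are synthetic):

% Fit yhat(x) = p1*sin(x) + p2*sin(2x) + p3*sin(3x) to data.
x = linspace(0,2*pi,50)';                         % column vector of sample points
y = 2*sin(x) - 0.5*sin(2*x) + 0.1*randn(50,1);    % synthetic noisy data

A = [sin(x) sin(2*x) sin(3*x)];   % (N x 3) basis matrix
p = A\y;                          % least squares coefficients

yh = A*p;                         % fitted values at the data points
plot(x,y,'o',x,yh)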
More information in Chapter 5 of Palm

Read pages 315-326 in Palm concerning:
• Fitting other functions (exponential, logs, etc.)
• The quality of a curve fit
  – computing the coefficient of determination, r^2 (a small sketch follows the summary below)
• Regression and numerical analysis
• Scaling the data
• Using residuals

Summary

What did we learn today?
• Regression is employed to broadly estimate the functional dependence of one set of variables on another.
• Polynomial regression gives rise to a system of linear algebraic equations, which is often solved using the least squares technique.
• A suitable choice of regression functions (polynomial, exponential, etc.) can yield very good agreement with the data. Use trial and error!
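For the coefficient-of-determination bullet above, a minimal sketch (assuming the years, pop, p1, p2 variables from the linear regression example are still in the workspace):

% Coefficient of determination r^2 for the linear fit to the population data.
yh = p1*years + p2;                   % fitted values from linreg
SSres = sum((pop - yh).^2);           % residual sum of squares
SStot = sum((pop - mean(pop)).^2);    % total sum of squares about the mean
r2 = 1 - SSres/SStot                  % r^2 close to 1 => good fit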