15 - Problem(text 1329 The table below gives the population...


3/8/2010

Problem (text 1329):
The table below gives the population of a small but growing suburb over a twenty-year period.

Year    Population
0       100
5       200
10      450
15      950
20      2000

The growth is assumed to be exponential: $\text{population} = \alpha \exp(\beta t)$, where $t$ is the time in years.
What values of $\alpha$ and $\beta$ best fit the data? What can the population be expected to be after 25 years?

Ideally a nonlinear regression technique would be used to find the $\alpha$ and $\beta$ that absolutely minimize the sum of the squares of the errors between the data points and the fitted curve. A good (but not perfect) answer can be obtained more simply by transforming the data and using linear regression.

The basic procedure:
$y = \alpha \exp(\beta x)$
$\ln(y) = \ln(\alpha) + \beta x$
Let $y' = \ln(y)$
Then $y' = ax + b$, where $a = \beta$ and $b = \ln(\alpha)$
Use linear regression to find the best a and b.
Then find $\alpha$ and $\beta$ by applying $\alpha = \exp(b)$ and $\beta = a$

Matlab part 1:

    x = [0 5 10 15 20];
    y = [100 200 450 950 2000];
    yt = log(y);                    % transform the y values
    p = polyfit (x, yt, 1);         % fit a straight line to the transformed data
    fitt = @(x) p(1) * x + p(2);    % function for the fitted line

Matlab part 2:

    figure (1)
    plot (x, yt, 'x', x, fitt(x), 'MarkerSize', 10);
    grid on;
    xlabel ('Years');
    ylabel ('ln(Population)');
    fprintf ('For transformed data a = %f, b = %f, r = %f\n', ...
             p(1), p(2), correlate (x, yt, fitt));

The first data point doesn't appear in the plot. This seems to happen a lot (a Matlab bug?).
The best fit line is y = 0.1510 * x + 4.5841
The correlation coefficient for this straight line and the transformed data is 0.999789.
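The slides call correlate (x, yt, fitt) without showing it, and it is not a built-in Matlab function. The sketch below is one plausible definition, offered as an assumption rather than the course's actual helper: it computes $r = \sqrt{(S_t - S_r)/S_t}$, where $S_r$ is the sum of squared residuals about the fitted function and $S_t$ is the sum of squared deviations about the mean. This definition appears to reproduce the correlation values reported in these notes.

```matlab
% Hypothetical stand-in for the course's correlate helper (an assumption,
% not the original code): r = sqrt((St - Sr) / St).
%   Sr = sum of squared residuals between y and the fitted function f(x)
%   St = sum of squared deviations of y about its mean
correlate = @(x, y, f) sqrt(1 - sum((y - f(x)).^2) / sum((y - mean(y)).^2));

% Data and straight-line fit from Matlab part 1
x = [0 5 10 15 20];
y = [100 200 450 950 2000];
yt = log(y);                    % transform the y values
p = polyfit (x, yt, 1);         % straight line through the transformed data
fitt = @(x) p(1) * x + p(2);    % function for the fitted line

r = correlate (x, yt, fitt);
fprintf ('r for the transformed data = %f\n', r);
```

Because the helper takes a function handle, the same call works for any fitted curve, as long as the function is written so that it accepts a vector of x values.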
Matlab part 3:

    % calculate alpha and beta
    alpha = exp(p(2));
    beta = p(1);
    fit = @(x) alpha * exp(beta * x);   % function for fitted curve
    % need lots of x values to get a smooth plot of the fitted curve
    xplot = linspace (0, 25, 100);      % plot up to 25 years
    yplot = fit(xplot);
    figure (2)
    plot (x, y, 'x', xplot, yplot, 'MarkerSize', 10);
    grid on;
    xlabel ('Years');
    ylabel ('Population');
    fprintf ('For original data alpha = %f, beta = %f, r = %f\n', ...
             alpha, beta, correlate (x, y, fit));
    fprintf ('Predicted population after 25 years = %f\n', fit(25));

This time the first data point does show up.
The best fit curve is y = 97.9148 * exp (0.1510 * x)
The correlation coefficient for this curve and the original data is 0.999957
The predicted population after 25 years is 4268.

The basic idea can be adapted to power equations:
$y = \alpha x^{\beta}$
$\log(y) = \log(\alpha) + \beta \log(x)$
Let $x' = \log(x)$ and $y' = \log(y)$
Then $y' = ax' + b$, where $a = \beta$ and $b = \log(\alpha)$
Use linear regression to find the best a and b.
Then find $\alpha$ and $\beta$ by applying $\alpha = 10^{b}$ and $\beta = a$
Note: log is used instead of ln only for consistency with the text. ln would work equally well (use $\alpha = \exp(b)$).

And to saturation growth rate equations as well:
$y = \alpha \, \dfrac{x}{\beta + x}$
$\dfrac{1}{y} = \dfrac{\beta}{\alpha} \cdot \dfrac{1}{x} + \dfrac{1}{\alpha}$
Let $x' = 1/x$ and $y' = 1/y$
Then $y' = ax' + b$, where $a = \beta/\alpha$ and $b = 1/\alpha$
Use linear regression to find the best a and b.
Then find $\alpha$ and $\beta$ by applying $\alpha = 1/b$ and $\beta = a/b$

The mathematics of linear regression:
Given: $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots, (x_n, y_n)$
To find: the straight line ($y = ax + b$) that best fits the data

We must minimize
$$E = \sum_{i=1}^{n} \bigl(y_i - (a x_i + b)\bigr)^2 = \sum_{i=1}^{n} \bigl(a^2 x_i^2 + b^2 + y_i^2 + 2ab x_i - 2a x_i y_i - 2b y_i\bigr)$$

At the minimum:
$$\frac{\partial E}{\partial a} = \sum_{i=1}^{n} \bigl(2a x_i^2 + 2b x_i - 2 x_i y_i\bigr) = 0$$
$$\frac{\partial E}{\partial b} = \sum_{i=1}^{n} \bigl(2b + 2a x_i - 2 y_i\bigr) = 0$$

Dividing both equations by 2 and expressing them in matrix form gives:
$$\begin{bmatrix} \sum x_i^2 & \sum x_i \\ \sum x_i & \sum (1) \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} \sum x_i y_i \\ \sum y_i \end{bmatrix} \qquad \text{where } \sum (1) = \sum_{i=1}^{n} (1) = n$$

Solving using Cramer's Rule produces:
$$a = \frac{\begin{vmatrix} b_1 & a_{12} \\ b_2 & a_{22} \end{vmatrix}}{|A|} = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \bigl(\sum x_i\bigr)^2}$$
$$b = \frac{\begin{vmatrix} a_{11} & b_1 \\ a_{21} & b_2 \end{vmatrix}}{|A|} = \frac{\sum x_i^2 \sum y_i - \sum x_i \sum x_i y_i}{n \sum x_i^2 - \bigl(\sum x_i\bigr)^2}$$

Aside: b is more easily calculated using $b = \bar{y} - a\bar{x}$

Calculating a and b involves first passing through the data points and calculating the following summations:
$$\sum x_i \qquad \sum y_i \qquad \sum x_i^2 \qquad \sum x_i y_i$$
Once this is done, the formulas for a and b can be applied.

For linear regression ONLY, the correlation coefficient r can be computed using:
$$r = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{\sqrt{n \sum x_i^2 - \bigl(\sum x_i\bigr)^2} \, \sqrt{n \sum y_i^2 - \bigl(\sum y_i\bigr)^2}}$$
In addition to the summations listed above, this requires $\sum y_i^2$.

Linear Regression and the Casio Calculator:
The formula is y = Ax + B
Mode Mode 2 (REG)        REG stands for regression
1 (LIN)                  LIN stands for linear
SHIFT CLR 1 (Scl) =      clear statistical memory
x1 , y1 DT               the DT key is the M+ key
x2 , y2 DT
.... and so on until all points are entered
To retrieve the value of A: SHIFT SVAR > > 1 (A) =      the SVAR key is the 2 key, > is right arrow
To retrieve the value of B: SHIFT SVAR > > 2 (B) =
To retrieve the correlation coefficient: SHIFT SVAR > > 3 (r) =
Other forms of regression are also supported.

Polynomial regression:
Linear regression involves fitting a first order polynomial (i.e. a polynomial of the form ax + b) to a set of data points. The basic idea is readily extended to higher order polynomials.

Example:
X:  0      3     6     9    12   15    18     21
Y:  189.4  95.1  34.1  1.8  7.3  46.7  131.9  253.2

We want to fit a quadratic (i.e.
a polynomial of the form $y = ax^2 + bx + c$) to the data. This can be done by using polyfit and specifying a second order polynomial.

    >> p = polyfit (x, y, 2)   % 2 for second order

The result is a 3-element vector containing a, b, and c (in that order).

    >> xplot = linspace (1, 22, 100);
    >> yplot = polyval (p, xplot);
    >> plot (x, y, 'o', xplot, yplot, 'MarkerSize', 10);
    >> fprintf ('The best fit curve is %6.4f * x^2 + %6.4f * x + %6.4f\n',...
                p(1), p(2), p(3));
    The best fit curve is 2.0088 * x^2 + -39.5105 * x + 193.4125
    >> f = @(x) p(1) * x .^ 2 + p(2) * x + p(3);
    >> r = correlate (x, y, f);
    >> fprintf ('The correlation coefficient is %6.4f\n', r);
    The correlation coefficient is 0.9991

The mathematics of quadratic regression:
Given: $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots, (x_n, y_n)$
To find: the quadratic ($y = ax^2 + bx + c$) that best fits the data

We must minimize
$$E = \sum_{i=1}^{n} \bigl(y_i - (a x_i^2 + b x_i + c)\bigr)^2$$

At the minimum:
$$\frac{\partial E}{\partial a} = \frac{\partial E}{\partial b} = \frac{\partial E}{\partial c} = 0$$

Filling in the details gives:
$$\begin{bmatrix} \sum x_i^4 & \sum x_i^3 & \sum x_i^2 \\ \sum x_i^3 & \sum x_i^2 & \sum x_i \\ \sum x_i^2 & \sum x_i & \sum (1) \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} \sum x_i^2 y_i \\ \sum x_i y_i \\ \sum y_i \end{bmatrix} \qquad \text{where } \sum (1) = \sum_{i=1}^{n} (1) = n$$

The values of a, b, and c can be found by solving this system of equations.
The lower-right 2 x 2 block of the matrix is the system of first order equations from linear regression, so the quadratic equations are the first order equations plus an extra row and column. This pattern extends to higher order polynomials.
...
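The quadratic normal equations can be checked numerically. The sketch below is an illustration rather than part of the original slides: it builds the 3 x 3 system from the summations for the example data, solves it with the backslash operator, and compares the result with polyfit. The variable names A, rhs and coef are illustrative choices.

```matlab
% Example data from the polynomial regression section
x = [0 3 6 9 12 15 18 21];
y = [189.4 95.1 34.1 1.8 7.3 46.7 131.9 253.2];
n = length (x);

% Normal equations for y = a*x^2 + b*x + c, built from the summations
A = [sum(x.^4) sum(x.^3) sum(x.^2);
     sum(x.^3) sum(x.^2) sum(x);
     sum(x.^2) sum(x)    n];
rhs = [sum(x.^2 .* y); sum(x .* y); sum(y)];

coef = A \ rhs;            % solve the 3 x 3 linear system for [a; b; c]
p = polyfit (x, y, 2);     % reference answer, a row vector [a b c]

fprintf ('normal equations: a = %6.4f, b = %6.4f, c = %6.4f\n', coef);
fprintf ('polyfit:          a = %6.4f, b = %6.4f, c = %6.4f\n', p);
```

Both lines should print the same coefficients to within rounding. For higher order polynomials the matrix of power sums becomes increasingly ill-conditioned, which is one reason to prefer polyfit in practice.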

This document was uploaded on 04/14/2010.
