Chapter 5 Linear Stochastic Models

5.1 Least Squares

Suppose that we observe some dependent variable (e.g. number of red blood cells) as a function of some independent variable (e.g. dosage of a drug), based on $n$ experiments. The empirical data consists of $(x_1, y_1), \ldots, (x_n, y_n)$, where $x_i$ is the value of the independent variable used for the $i$th experiment, and $y_i$ is the corresponding value of the dependent variable. We wish to find a simple mathematical model that summarizes the relationship of the dependent variable to the independent variable. The simplest such model is a linear model of the form

$$y(x) = ax + b.$$

If $n = 2$ with $x_1 \neq x_2$, we can find $a$ and $b$ by solving a linear system of two equations in two unknowns. If $n > 2$, the system is overdetermined, and we can apply the method of least squares. Specifically, let

$$\vec{y} = (y_1, \ldots, y_n)^T, \quad \vec{x} = (x_1, \ldots, x_n)^T, \quad \vec{e} = (1, 1, \ldots, 1)^T.$$

A reasonable means of finding the best values of $a$ and $b$ is to select $a$ and $b$ so as to minimize some measure of distance between $\vec{y}$ and $a\vec{x} + b\vec{e}$. One notion of distance that leads to a particularly nice system of determining equations for $a$ and $b$ is to measure the distance between $\vec{z}_1, \vec{z}_2 \in \mathbb{R}^n$ via $\|\vec{z}_1 - \vec{z}_2\|$, where

$$\|\vec{w}\|^2 = \vec{w}^T \Lambda \vec{w}.$$

Here, $\Lambda$ is a given $n \times n$ symmetric positive definite matrix. The minimizers $a$, $b$ of the sum of squares

$$\min_{a,b} \|\vec{y} - a\vec{x} - b\vec{e}\|^2$$

satisfy the linear system

$$\begin{pmatrix} \vec{x}^T \Lambda \vec{x} & \vec{e}^T \Lambda \vec{x} \\ \vec{x}^T \Lambda \vec{e} & \vec{e}^T \Lambda \vec{e} \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \vec{y}^T \Lambda \vec{x} \\ \vec{y}^T \Lambda \vec{e} \end{pmatrix}.$$

Our choice of a quadratic form as our (squared) notion of distance is what leads to a linear system for the minimizer. Other notions of distance would lead to more complex optimization problems.
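As an illustration of the linear system above, the following is a minimal sketch in plain Python that forms the 2x2 system for the slope and intercept and solves it by Cramer's rule. The names `bilinear`, `fit_line`, and `lam` (the symmetric positive definite weight matrix, given as a list of rows) are hypothetical and chosen only for this example; they do not come from the notes, and the sketch does not check that `lam` is actually symmetric positive definite.

```python
def bilinear(u, lam, v):
    """Return u^T * lam * v for vectors u, v and a square matrix lam (list of rows)."""
    return sum(u[i] * lam[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))


def fit_line(x, y, lam=None):
    """Fit y ~ a*x + b by minimizing (y - a*x - b*e)^T lam (y - a*x - b*e).

    lam defaults to the identity matrix (all observations weighted equally).
    It is assumed, not verified, to be symmetric positive definite.
    """
    n = len(x)
    if lam is None:
        lam = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    e = [1.0] * n  # vector of ones

    # Entries of the 2x2 system; symmetry of lam gives x^T lam e == e^T lam x.
    sxx = bilinear(x, lam, x)
    sxe = bilinear(x, lam, e)
    see = bilinear(e, lam, e)
    sxy = bilinear(x, lam, y)
    sey = bilinear(e, lam, y)

    # Solve [[sxx, sxe], [sxe, see]] @ [a, b] = [sxy, sey] by Cramer's rule.
    det = sxx * see - sxe * sxe
    if det == 0:
        raise ValueError("singular system (are all x_i equal?)")
    a = (sxy * see - sxe * sey) / det
    b = (sxx * sey - sxe * sxy) / det
    return a, b


# Data lying exactly on y = 2x + 1, so the fit recovers slope 2 and intercept 1.
print(fit_line([0, 1, 2, 3], [1, 3, 5, 7]))  # → (2.0, 1.0)
```

Cramer's rule is used here only because the system is 2x2; for larger models a general linear solver would be the more natural choice.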
The case in which the weight matrix is $\Lambda = I$ is called ordinary least squares, whereas $\Lambda \neq I$ is called weighted least squares. This approach to fitting a linear model to observed data leaves open several questions:

1. How should one choose the matrix $\Lambda$?
2. How can one assign error bars to the slope and intercept values $a$ and $b$?
3. Is there any way to objectively test the linear model against the even simpler model in which $a = 0$ (the constant model)?

A statistical formulation of this linear modeling problem will permit us to address these issues.

5.2 Linear Regression Models with Gaussian Residuals

We turn to the statistical view of how to build a linear model. We now view the dependent data values as random variables. In particular, we assume that $Y_i$ is a rv corresponding to the measured response of the dependent variable (e.g. blood pressure) as a function of the independent variable (e.g. drug dosage)....
This note was uploaded on 08/06/2008 for the course CME 308 taught by Professor Peter Glynn during the Spring '08 term at Stanford.