Furthermore, the negative log likelihood is convex with respect to $\beta$ (equivalently, the log likelihood is concave), and this holds for any choice of $\phi(x)$, so all of the fitting methods proceed identically.
COS424/SML 302 Linear Regression February 25, 2019 43 / 46
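One way to check the convexity claim is to write out the Hessian. This is a sketch under assumptions not stated on this slide: Gaussian noise with variance $\sigma^2$, $n$ samples, and a design matrix $\Phi$ (a symbol introduced here) whose $i$-th row is $\phi(x_i)^T$.

```latex
\begin{align*}
-\log p(y \mid \Phi, \beta)
  &= \frac{1}{2\sigma^2} \lVert y - \Phi\beta \rVert_2^2
     + \frac{n}{2} \log(2\pi\sigma^2), \\
\nabla_\beta^2 \left[ -\log p(y \mid \Phi, \beta) \right]
  &= \frac{1}{\sigma^2} \Phi^T \Phi \succeq 0 .
\end{align*}
```

Since $\Phi^T \Phi$ is positive semidefinite for every $\Phi$, the negative log likelihood is convex in $\beta$ regardless of which feature map $\phi$ produced the rows.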

Regression with non-linear relationships

Polynomial functions
We might consider the model:
$y_i = \phi(x_i)^T \beta + \epsilon_i$, where $\phi(x) = [1, x, x^2, \ldots, x^p]^T$
This allows us to find non-linear (polynomial) relationships between $x$ and $y$.

Higher-order interactions between predictors
We can include non-additive interactions, e.g., for $x_i = [x_1, x_2]_i$:
$\phi(x) = [x_1, x_2, x_1 x_2, x_1^2 x_2, \ldots]$
Note that we consider interactions among predictors, not among samples.
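The two feature maps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the course's code: the function names `polynomial_features` and `interaction_features`, the synthetic data, and the choice of `np.linalg.lstsq` for fitting are all assumptions made for the example.

```python
import numpy as np

def polynomial_features(x, p):
    """Map a 1-D array x to rows [1, x, x^2, ..., x^p] (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    return np.vander(x, N=p + 1, increasing=True)

def interaction_features(X):
    """Map rows [x1, x2] to [x1, x2, x1*x2, x1^2*x2] (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([x1, x2, x1 * x2, x1**2 * x2])

# Synthetic data, assumed for illustration: y = 1 - 2x + 0.5x^2 + noise
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=200)
y = 1.0 - 2.0 * x + 0.5 * x**2 + rng.normal(scale=0.1, size=x.shape)

# The model is still linear in beta, so ordinary least squares applies unchanged.
Phi = polynomial_features(x, p=2)
beta_hat, *_ = np.linalg.lstsq(Phi, y, rcond=None)
```

The fit recovers a quadratic relationship in $x$ even though the estimator itself only ever sees a linear model in the columns of `Phi`.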
Summary

A few key points from this discussion:
Regression models, like other models, must be built thoughtfully:
    Which predictors should be included?
    Should the model include an intercept term?
    Should the model include any non-linear terms?
When possible, try the normal equation to fit the $\hat{\beta}$ coefficients.
With large numbers of samples, online methods might be faster than batch methods.
With large numbers of predictors, these methods may be unstable.
Optimization methods are an important tool in the ML toolbox.
We will discuss regression over the next three lectures.
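To make the batch versus online contrast concrete, here is a rough sketch comparing the normal equation with a per-sample stochastic gradient update for squared error. The function names, step size, epoch count, and synthetic data are assumptions chosen for illustration, not values from the lecture.

```python
import numpy as np

def fit_normal_equation(Phi, y):
    """Batch fit: solve (Phi^T Phi) beta = Phi^T y in one shot."""
    return np.linalg.solve(Phi.T @ Phi, Phi.T @ y)

def fit_sgd(Phi, y, step=0.01, epochs=50, seed=0):
    """Online fit: one squared-error gradient step per sample."""
    rng = np.random.default_rng(seed)
    beta = np.zeros(Phi.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            residual = y[i] - Phi[i] @ beta
            beta += step * residual * Phi[i]
    return beta

# Synthetic data, assumed for illustration.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2))
Phi = np.column_stack([np.ones(len(X)), X])  # first column is the intercept
true_beta = np.array([0.5, 2.0, -1.0])
y = Phi @ true_beta + rng.normal(scale=0.1, size=len(X))

beta_ne = fit_normal_equation(Phi, y)
beta_sgd = fit_sgd(Phi, y)
```

On a small, well-conditioned problem like this one the two estimates should agree closely. The online update touches one row at a time, which is what can make it attractive when the number of samples is large; when predictors are numerous or highly correlated, $\Phi^T \Phi$ can become poorly conditioned, which is one way the instability mentioned above can show up.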

Additional resources

Regression is a standard tool in statistics; resources abound:
MLAPA: Chapter 7
Elements of Statistical Learning: Chapter 4
Metacademy: Linear Regression
