Unit 6 - Statistics – V3100018.001-V3100018.006 UNIT 6 – Bivariate analytical tools regression Giuseppe Arbia Catholic University of the Sacred


Statistics – V3100018.001-V3100018.006
UNIT 6 – Bivariate analytical tools: regression
Giuseppe Arbia, Catholic University of the Sacred Heart, Rome, Italy

The correlation coefficient indicates the presence of a certain degree of linear dependence (that is, dependence along a straight line) between two quantitative variables, with no distinction between the variable that is at the origin (the "cause") and the variable that is the response (the "effect").

[Figure: linear relationship]

[Figure: non-linear relationship (quadratic)]

To study the relationships between pairs of quantitative variables we also use the idea of regression.

In regression analysis we distinguish the roles of the two variables involved.
The independent variable is the variable that "ideally" is at the origin of the phenomenon (the "cause").
The dependent variable is the response variable (the "effect").

We do this through a statistical model. What is a model? E.g.: a model of a Ferrari.

What is a statistical model? A simplified representation of reality that reproduces some essential features while neglecting others. It is a stylized version of reality.

Digression: a line

$Y = \beta_0 + \beta_1 x$, with slope $\beta_1$ and intercept $\beta_0$.

Parameters of a line $Y = \beta_0 + \beta_1 x$:
INTERCEPT: $\beta_0$ represents the value of $Y$ when $X = 0$.
SLOPE: $\beta_1$ represents the change in $Y$ when $X$ increases by 1 unit.

[Figure: a line with the intercept $\beta_0$ marked at $X = 0$ and the slope $\beta_1$ indicated]

A numerical example

Case of $\beta_1 = 0$: $Y = \beta_0$
Case of $\beta_1 > 0$: $Y = \beta_0 + \beta_1 x$
Case of $\beta_1 < 0$: $Y = \beta_0 + \beta_1 x$
Case of $\beta_1 = 1$ (45° line): $Y = \beta_0 + x$ (and $Y = x$ when $\beta_0 = 0$)
Case of $\beta_0 < 0$

Given a certain scatter diagram, among the many possible choices, how do we choose the best interpolating line?

Let us define the theoretical value of $Y$:

$\hat{Y}_i = \beta_0 + \beta_1 X_i$

$Y_i$ = the observed value of $Y$

$e_i = Y_i - \hat{Y}_i$ = error (observed minus theoretical)

"The estimation of a certain value on the basis of observations contaminated with small or large errors can be compared to a gamble in which you can only lose, each error corresponding to a loss. […] However, it is not at all clear what loss should be assigned to each error, because its determination depends essentially on a subjective decision. […] Among all the possible loss functions you may choose, the simplest seems also to be the most favourable, and it is the quadratic function. […] Laplace treated the problem in a similar way, but choosing the absolute value as a loss function, and this choice is no less arbitrary than ours." C.F. Gauss

The theoretical values of the regression line can be calculated with the so-called Least Squares Method (LS).
We want to make the minimum possible error on average for the whole cloud of points.
Let us consider as a measure of error for each unit the square of $e_i$, so that positive and negative errors do not compensate each other....
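The preview breaks off before the slides show how the least-squares line is actually computed. As a minimal sketch, assuming the standard closed-form solution (slope = sum of cross-deviations divided by sum of squared deviations of X, intercept = mean of Y minus slope times mean of X), the idea can be illustrated in Python. The function names and the five data points are hypothetical and chosen only for illustration; they do not come from the slides.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1) for the line Y = b0 + b1 * X that minimizes
    the sum of squared errors (standard closed-form solution)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept
    return b0, b1


def errors(xs, ys, b0, b1):
    """e_i = observed Y_i minus theoretical Y_i on the line."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


def sum_of_squared_errors(xs, ys, b0, b1):
    """Quadratic loss: the sum of the squared e_i."""
    return sum(e ** 2 for e in errors(xs, ys, b0, b1))


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = least_squares_line(xs, ys)
raw_error_total = sum(errors(xs, ys, b0, b1))
sse_best = sum_of_squared_errors(xs, ys, b0, b1)
# Any other line, e.g. one with a slightly different slope, does worse.
sse_other = sum_of_squared_errors(xs, ys, b0, b1 + 0.1)

print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
print(f"sum of raw errors     = {raw_error_total:.2f}")
print(f"sum of squared errors = {sse_best:.2f} (other line: {sse_other:.2f})")
```

On the fitted line the raw errors cancel out (they sum to roughly zero), which is why they cannot serve as a measure of fit; the squared errors do not cancel, and the fitted line gives a smaller total than the perturbed line.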

This note was uploaded on 04/05/2012 for the course STATS V3100018.0 taught by Professor Giuseppe Arbia during the Spring '12 term at NYU.
