Regression 5-Jan-20 Great Learning
Agenda
1. Introduction to Linear regression
2. Correlation vs regression
3. Applications of Linear regression
4. Assumptions of Linear regression
5. Two main types of Linear regression
6. Simple Linear regression (SLR) example
7. Multiple Linear regression (MLR) example
1. Introduction to Linear regression
a. Sir Francis Galton, a British statistician, coined the term "regression" based on his research on the hereditary properties of successive generations of sweet peas and humans. In 1886, he published an article titled "Regression towards Mediocrity in Hereditary Stature" in the Journal of the Anthropological Institute of Great Britain and Ireland.
b. Linear regression is a mathematical technique for establishing a relationship between two variables (a predictor variable and a response variable) by finding the straight line that best fits the observed data.
1. Introduction to Linear regression - continued
c. Scatter plots give us an idea of whether two variables are linearly related. Linear regression finds the line that best represents the scatter.
d. The response variable (aka dependent, outcome, or target variable) is the variable of focus in a research study.
e. The predictor variable (aka independent or explanatory variable) is the variable that explains the variation in the response variable and may affect it.
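As a minimal sketch of finding the best-fitting line through a scatter, the snippet below fits a least-squares line with NumPy. The data values and the names `predictor` and `response` are made up for illustration and do not come from the slides.

```python
import numpy as np

# Hypothetical data (illustrative only): one predictor and one response.
predictor = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
response = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Least-squares fit of a straight line: response ≈ intercept + slope * predictor.
# np.polyfit returns coefficients from the highest degree down, so slope comes first.
slope, intercept = np.polyfit(predictor, response, deg=1)

# Points on the fitted line, and residuals (observed minus fitted).
fitted = intercept + slope * predictor
residuals = response - fitted

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

Plotting `predictor` against `response` as a scatter and overlaying `fitted` as a line is the usual visual check that a straight line is a reasonable description of the data.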
1. Introduction to Linear regression - continued
f. For example, the response variable could be the volume of sales (in thousands) on a given day at an online store, and the predictor variable the advertisement expenses.
g. The focus of regression analysis is on the relationship between a response variable and one or more predictor variables. More specifically, it helps one understand how the typical value of the response variable changes when any one of the predictor variables is varied while the other predictor variables are held constant.
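The "holding other predictors constant" interpretation can be sketched as follows. This is an illustrative example only: the second predictor (`discount`), the data, and the coefficients used to generate it are assumptions, chosen to be noise-free so the fitted coefficients are easy to check.

```python
import numpy as np

# Hypothetical, noise-free data generated from known coefficients:
# sales = 5 + 2 * ad_spend + 0.5 * discount
ad_spend = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
discount = np.array([10.0, 0.0, 5.0, 20.0, 15.0, 0.0])
sales = 5.0 + 2.0 * ad_spend + 0.5 * discount

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones_like(ad_spend), ad_spend, discount])

# Ordinary least squares via NumPy's least-squares solver.
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)
intercept, b_ad, b_discount = coef


def predict(ad, disc):
    """Predicted sales for given ad spend and discount (hypothetical model)."""
    return intercept + b_ad * ad + b_discount * disc


# Raise ad_spend by one unit while holding discount fixed at 10:
# the predicted change equals the ad_spend coefficient.
change = predict(4.0, 10.0) - predict(3.0, 10.0)
print(f"b_ad = {b_ad:.3f}, change in prediction = {change:.3f}")
```

Each coefficient is read as the change in the typical response for a one-unit change in its predictor, with the other predictors unchanged.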
1. Introduction to Linear regression - continued
h. Identification of problem
Before doing regression analysis, a data scientist must review the relevant literature to develop a deep understanding of the business domain, the relevant variables, and their relationships. In an experiment, the predictor (independent) variable is the core of the study: it is isolated and manipulated by the researcher. The researcher must determine which variables are reliable and relevant enough to be manipulated to generate quantifiable results. For more details, refer to https://explorable.com/research-variables/
2. Correlation vs regression
✓ Correlation is a measure of the strength and direction of the linear relationship between two variables. Pearson's correlation coefficient (or simply the correlation coefficient), denoted by r, is valid only for linear relationships and ranges between -1 and 1: -1 indicates a perfect negative linear relationship, 1 a perfect positive linear relationship, and 0 no linear relationship.
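A small sketch of how r behaves at its extremes, using made-up data. The helper `pearson_r` is a hypothetical name for a direct implementation of the standard formula; NumPy's `np.corrcoef` computes the same quantity.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

r_pos = pearson_r(x, 2 * x + 1)    # exact increasing line
r_neg = pearson_r(x, -3 * x + 7)   # exact decreasing line

# A symmetric curve: y depends entirely on x, but not linearly.
centered = x - 3
r_curve = pearson_r(centered, centered ** 2)

print(r_pos, r_neg, r_curve)
```

The last case is why r = 0 is best read as "no linear relationship": the two variables are perfectly related through a curve, yet the coefficient is zero.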