It is highly unlikely that her child will have an IQ of exactly 75, because there is always error in the regression procedure. This error may be incorporated into the information given to the woman in the form of an “interval estimate.” For example, it would make a great deal of difference if the doctor were to say that the child had a ninety-five percent chance of having an IQ between 70 and 80, in contrast to a ninety-five percent chance of an IQ between 50 and 100. The concept of error in prediction will become an important part of the discussion of regression models. It is also worth pointing out that regression models do not make decisions for people. Regression models are a source of information about the world. In order to use them wisely, it is important to understand how they work.
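As a rough illustration of how the size of the prediction error drives the width of an interval estimate, the sketch below builds a ninety-five percent interval around the predicted IQ of 75. It assumes the prediction error is normally distributed with a known standard deviation, which is a simplification. The function name interval_estimate and the two standard deviations are hypothetical, chosen only so that the results match the two intervals mentioned above.

```python
from statistics import NormalDist


def interval_estimate(point, error_sd, level=0.95):
    """Return (low, high) around a point prediction.

    Assumes normally distributed prediction error with a known
    standard deviation (an illustrative simplification).
    """
    # Two-sided critical value, about 1.96 for a 95% level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * error_sd
    return point - half_width, point + half_width


# Hypothetical error standard deviations, in IQ points.
for sd in (2.55, 12.76):
    low, high = interval_estimate(75, sd)
    print(f"error sd {sd}: {low:.0f} to {high:.0f}")
```

The point prediction is the same in both cases; only the assumed error changes. A standard deviation of about 2.5 IQ points gives the narrow interval, while one of about 12.8 points gives the wide one.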
NMIMS Global Access – School for Continuing Education

REGRESSION ANALYSIS

After studying this chapter, you should be able to:
Understand the concept of regression analysis
Discuss the applicability of regression
Describe simple linear regression and nonlinear regression models
Learn about the coefficient of regression and linear regression equations

7.1 INTRODUCTION

The word regression was first used as a statistical concept in 1877 by Francis Galton. When more than one variable is used to make the prediction, the term multiple regression is used. In regression analysis we develop an equation, called the estimating equation, that relates the known variables to the unknown variable. Correlation analysis is then used to determine the degree of the relationship between the variables. Using the chi-square test we can find whether there is any relationship between the variables. Together, correlation and regression analysis show how to determine the nature and strength of the relationship between the variables. In this chapter we will learn how to calculate the regression line mathematically.

7.2 REGRESSION ANALYSIS

We need a statistical model that extracts information from the given data to establish the regression relationship between the independent and dependent variables. The model should capture the systematic behaviour of the data. The non-systematic behaviour cannot be captured and is called error. Error arises from a random component that cannot be predicted, as well as from any component not adequately considered in the statistical model. A good statistical model captures the entire systematic component, leaving only random errors, which cannot be captured in any case. Assuming the random errors are normally distributed, we can specify a confidence level and a corresponding interval for the random errors. Thus, we can state how reliable our estimates are.
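The split between systematic behaviour and random error can be written compactly as follows. This is only a sketch in notation the text has not introduced: y for the dependent variable, x for the independent variable, f for the systematic part, ε for the random error and σ for its standard deviation are all assumed symbols.

```latex
y = f(x) + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2)
\quad\Longrightarrow\quad
P\bigl(-1.96\,\sigma \le \varepsilon \le 1.96\,\sigma\bigr) \approx 0.95
```

Under the normality assumption, about ninety-five percent of the random errors fall within 1.96 standard deviations of zero, which is what allows an interval to be attached to a stated confidence level.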
If the variables in a bivariate distribution are correlated, the points in the scatter diagram cluster approximately around some curve. If the curve is a straight line, we call it linear regression; otherwise, it is curvilinear regression. The equation of the curve that lies closest to the observations is called the ‘best fit’.
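One common way to make "closest to the observations" precise is the least-squares criterion, which picks the straight line that minimizes the sum of squared vertical distances to the points. The text has not yet named a criterion, so treating least squares as the measure of closeness is an assumption here. The function name fit_line and the data are hypothetical and serve only as a minimal sketch.

```python
def fit_line(xs, ys):
    """Least-squares straight line y = a + b*x; returns (a, b)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-products of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept: line passes through the means
    return a, b


# Hypothetical observations that cluster around a straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = fit_line(xs, ys)
print(f"y = {a:.2f} + {b:.2f} x")
```

For these points the fitted slope is close to 2 and the intercept close to 0. The points do not lie exactly on the line, and the leftover vertical gaps are the errors discussed in Section 7.2.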