Model Building in Linear Regression

Histogram
• The data are divided into a number of classes, and the frequency of each class is found.
• Let $n$ be the total frequency, $f_i$ the frequency of class $i$, and $w_i$ the width of class $i$. The relative frequency density of class $i$ is defined as $fd_i = f_i / (n w_i)$.
• For each class $i$, a bar of height $fd_i$ is drawn.
• The total area under a relative frequency histogram is 1.
• The choice of the number of classes and the class width is important when constructing a histogram.
• A histogram with a poorly chosen number of class intervals or class width can convey wrong information.
• Sturges' rule: choose the number of classes as $[\log_2 n] + 1$, where $[x]$ denotes the greatest integer less than or equal to $x$.
• Scott's rule: choose the class width to be $3.5\,s / n^{1/3}$, where $s$ is the sample standard deviation.
• Freedman-Diaconis rule: choose the class width to be $2\,\mathrm{IQR} / n^{1/3}$, where IQR is the interquartile range.

Multicollinearity
• Some of the explanatory variables are strongly interrelated among themselves.
• Result: the estimates of the regression coefficients are not precise.
• Some of the interrelated variables need to be removed from the model.

Detecting Multicollinearity
Multicollinearity is suspected if:
• There is high correlation among some of the explanatory variables....
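The histogram rules from the Histogram slides can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the sample values are made up, the function names are hypothetical, the quartiles come from the default method of `statistics.quantiles` (other quartile conventions give slightly different IQR values), and the last class is treated as closed on the right so that the maximum is counted.

```python
import math
import statistics


def sturges_classes(n):
    """Number of classes by Sturges' rule: floor(log2 n) + 1."""
    return math.floor(math.log2(n)) + 1


def scott_width(data):
    """Class width by Scott's rule: 3.5 * s / n^(1/3), s = sample std dev."""
    return 3.5 * statistics.stdev(data) / len(data) ** (1 / 3)


def freedman_diaconis_width(data):
    """Class width by the Freedman-Diaconis rule: 2 * IQR / n^(1/3)."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # three quartile cut points
    return 2 * (q3 - q1) / len(data) ** (1 / 3)


def relative_frequency_density(data, edges):
    """Return fd_i = f_i / (n * w_i) for the classes [edges[i], edges[i+1])."""
    n = len(data)
    last = len(edges) - 2
    densities = []
    for i in range(len(edges) - 1):
        lo, hi = edges[i], edges[i + 1]
        # Assumption: the last class also includes its right endpoint.
        f = sum(1 for x in data if lo <= x < hi or (i == last and x == hi))
        densities.append(f / (n * (hi - lo)))
    return densities


# Made-up sample, for illustration only.
data = [2.1, 3.4, 3.9, 4.4, 5.0, 5.2, 5.8, 6.1,
        6.3, 7.0, 7.2, 7.7, 8.1, 8.8, 9.5, 10.4]

k = sturges_classes(len(data))            # number of equal-width classes
width = (max(data) - min(data)) / k
edges = [min(data) + i * width for i in range(k)] + [max(data)]

fd = relative_frequency_density(data, edges)
# Area of each bar is fd_i * w_i; the areas should sum to 1.
total_area = sum(d * (edges[i + 1] - edges[i]) for i, d in enumerate(fd))

print("classes (Sturges):", k)
print("width (Scott):", scott_width(data))
print("width (Freedman-Diaconis):", freedman_diaconis_width(data))
print("total area:", total_area)
```

Here Sturges' rule fixes the number of classes, while Scott's and the Freedman-Diaconis rules give a class width directly; the bar areas add up to 1 (up to floating-point rounding) whichever rule is used.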
 Spring '09
 scf
 Regression Analysis, Yi, Histogram, multicollinearity, dummy variables
