Guidelines for the Statistical Modeling of Data in Regression

Programmer’s Instructions: I start my modeling of data with a set of standard requests. These are made before I actually see what the data look like. This initial request typically includes:

1. Name the variables to be entered (with abbreviations), remind the programmer that the data should be checked for accurate entry, and request a printout of the entered data.
2. Get descriptive statistics for all variables, including the mean, standard deviation, minimum, and maximum, and a histogram.
3. Run scatterplots of all pairs of variables, with a reminder that the dependent (“y”) variable, when included, should be on the y-axis. Specify the plots when there are not too many of them.
4. Get a correlation matrix.
5. Run a first-order model and get the residual plot and a histogram of the residuals. Specify the model: E(y) = …
6. Repeat the previous instruction for the second-order model.
7. Save all data, since further analyses may be requested.

As you review the results of each of steps 2-4 above, look at and report the dependent (“y”) variable results first.

Descriptive statistics: Review the min and max to make sure that they make sense.

Histograms: Look to see whether they are symmetric, unimodal (one peak) or multimodal (more than one peak), skewed, or flat (uniform), and try to describe each in a few words from which most people could draw a histogram that looks like the one being reviewed.

Scatterplots: As I look at each plot, I draw and label by hand a “y-bar” horizontal line at the average of the variable on the y-axis. I then draw a smooth curve of minimal complexity that seems to model the average y score for each given x score, and label it estimated E(y|x).
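The numerical parts of the standard request (descriptive statistics, a correlation, and first- and second-order fits with their residuals) can be sketched in plain Python as below. This is only an illustration under stated assumptions: the data values are made up, there is a single x variable, and the function names (describe, correlation, solve, fit_polynomial) are hypothetical helpers, not part of any statistical package. Histograms, scatterplots, and residual plots are left out because they depend on whichever plotting tool the programmer uses.

```python
import math

def describe(values):
    """Return the mean, sample standard deviation, min, and max of a list."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return {"mean": mean, "std": math.sqrt(var),
            "min": min(values), "max": max(values)}

def correlation(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_polynomial(x, y, order):
    """Least-squares fit of E(y) = b0 + b1*x + ... + b_order*x**order.

    Returns (coefficients, residuals), using the normal equations.
    """
    X = [[xi ** p for p in range(order + 1)] for xi in x]
    k = order + 1
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)]
           for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return beta, residuals

# Made-up illustrative data (not from the original document).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

y_stats = describe(y)          # report the dependent variable first
x_stats = describe(x)
r_xy = correlation(x, y)
beta1, resid1 = fit_polynomial(x, y, order=1)   # first-order model
beta2, resid2 = fit_polynomial(x, y, order=2)   # second-order model

print("y:", y_stats)
print("x:", x_stats)
print("corr(x, y):", r_xy)
print("first-order coefficients:", beta1)
print("second-order coefficients:", beta2)
```

The residual lists resid1 and resid2 are what would be passed to a plotting routine for the residual plot and residual histogram. Solving the normal equations directly is fine for a small illustration like this one; for real data a statistical package's regression routine would be the more sensible choice.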