Guidelines for the Statistical Modeling of Data in Regression
Programmer’s Instructions:
I start off my modeling of data with a set of standard requests. These are made before I actually see what the data look like. This initial request typically includes:
1. Name the variables to be entered (with abbreviations), remind the programmer that the data should be checked for accurate entry, and request a printout of the entered data.
2. Get descriptive statistics for all variables, including mean, std. dev., min., max., and a histogram.
3. Run scatterplots of all pairs of variables, and remind the programmer that the dependent (“y”) variable, when included, should be on the y-axis. Specify the plots when there are not too many of them.
4. Get a correlation matrix.
5. Run a first-order model and get the residual plot and a histogram of the residuals; specify the model E(y) = …
6. Repeat the previous instruction for the second-order model.
7. Save all data, since further analyses may be requested.
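Steps 5 and 6 can be sketched in code. The following is a minimal illustration in Python, assuming a single predictor x, so that the first-order model is E(y) = b0 + b1*x and the second-order model adds a b2*x^2 term. The function names and the data are made up for illustration and are not part of the original instructions; a statistics package would normally do this fitting.

```python
def fit_polynomial(x, y, order):
    """Least-squares fit of E(y) = b0 + b1*x + ... + b_order*x**order."""
    n_coef = order + 1
    # Normal equations (X'X) b = X'y, built from power sums of x.
    a = [[sum(xi ** (i + j) for xi in x) for j in range(n_coef)]
         for i in range(n_coef)]
    rhs = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(n_coef)]
    # Gaussian elimination with partial pivoting.
    for col in range(n_coef):
        pivot = max(range(col, n_coef), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for row in range(col + 1, n_coef):
            factor = a[row][col] / a[col][col]
            for k in range(col, n_coef):
                a[row][k] -= factor * a[col][k]
            rhs[row] -= factor * rhs[col]
    # Back substitution.
    coef = [0.0] * n_coef
    for i in reversed(range(n_coef)):
        tail = sum(a[i][j] * coef[j] for j in range(i + 1, n_coef))
        coef[i] = (rhs[i] - tail) / a[i][i]
    return coef


def residuals(x, y, coef):
    """Observed y minus fitted y for each data point."""
    return [yi - sum(b * xi ** p for p, b in enumerate(coef))
            for xi, yi in zip(x, y)]


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 4.9, 10.2, 16.8, 26.1, 37.0]

first_order = fit_polynomial(x, y, 1)
second_order = fit_polynomial(x, y, 2)
res1 = residuals(x, y, first_order)
res2 = residuals(x, y, second_order)
sse1 = sum(r * r for r in res1)
sse2 = sum(r * r for r in res2)
```

The residual lists res1 and res2 are what would be plotted against x and shown as a histogram. Because the second-order model contains the first-order model as a special case, sse2 can never exceed sse1.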
As you review the results from each of steps 2–4 above, look at and report the dependent (“y”) variable results first.
Descriptive statistics: Review the min and max
to make sure that they make sense.
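As a rough sketch of this check in Python: the summary below uses the standard library's statistics module, and the plausible limits (0 to 120 for an age variable) are an assumption chosen for the made-up data, not something given in these guidelines.

```python
import statistics


def describe(values):
    """Mean, sample standard deviation, minimum and maximum of one variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def out_of_range(values, low, high):
    """Return (index, value) pairs that fall outside the limits [low, high]."""
    return [(i, v) for i, v in enumerate(values) if not low <= v <= high]


# Hypothetical ages; 230 looks like a data-entry error.
ages = [23, 31, 45, 52, 230, 38]
summary = describe(ages)
suspect = out_of_range(ages, low=0, high=120)
```

Here summary["max"] is 230, which does not make sense for an age, and suspect points to the row that should be checked against the original records.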
Histograms: Look to see whether they are symmetric, unimodal (one peak) or have more than one peak, skewed, or flat (uniform), and try to describe them in a few words from which most individuals would draw a histogram that looks like the one being reviewed.
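A numerical summary can support the visual description, though it does not replace looking at the histogram. The sketch below, in Python, counts the peaks in a set of equal-width bins and computes the moment coefficient of skewness (positive for a longer right tail, near zero for a symmetric shape). The number of bins, the function names, and the data are assumptions for illustration, and the peak count is sensitive to the bin width chosen.

```python
def bin_counts(values, n_bins):
    """Counts in n_bins equal-width bins from min to max (needs max > min)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


def count_peaks(counts):
    """Number of local maxima, treating a run of equal bins as one bin."""
    collapsed = [c for i, c in enumerate(counts)
                 if i == 0 or c != counts[i - 1]]
    padded = [-1] + collapsed + [-1]
    return sum(1 for i in range(1, len(padded) - 1)
               if padded[i] > padded[i - 1] and padded[i] > padded[i + 1])


def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical scores, for illustration only.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
counts = bin_counts(scores, 4)
peaks = count_peaks(counts)
skew = skewness(scores)
```

For these made-up scores the histogram has one peak and a positive skewness, which would be described in words as "unimodal, skewed to the right".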
Scatterplots: As I look at each plot, I draw and label by hand a “y-bar” horizontal line at the average of the variable on the y-axis. I then draw a smooth curve of minimal complexity that seems to model the average y score for each given x score and label it estimated E(y|x)
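The hand-drawn lines have a simple numerical counterpart. The sketch below, in Python, computes y-bar and then averages y within equal-width bins of x. Binned means are a stand-in chosen here for illustration, not the hand-drawn smooth curve itself; they only give a rough idea of where an estimated E(y|x) curve would pass. The function names, the number of bins, and the data are assumptions.

```python
def y_bar(y):
    """Average of the variable on the y-axis (height of the horizontal line)."""
    return sum(y) / len(y)


def binned_means(x, y, n_bins):
    """Mean of y within equal-width bins of x, as (bin midpoint, mean y) pairs.

    Empty bins are skipped. Requires max(x) > min(x).
    """
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for xi, yi in zip(x, y):
        # The largest x value is placed in the last bin.
        idx = min(int((xi - lo) / width), n_bins - 1)
        sums[idx] += yi
        counts[idx] += 1
    return [(lo + (i + 0.5) * width, sums[i] / counts[i])
            for i in range(n_bins) if counts[i]]


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 3, 5, 4, 8, 9, 12, 11]

overall_mean = y_bar(y)
curve_points = binned_means(x, y, 4)
```

Plotting curve_points over the scatterplot, together with a horizontal line at overall_mean, shows how far the average y at each x departs from y-bar.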
 Fall '10
 Szatrowski
