Exercise 1
a) Results for a logistic regression for high cholesterol status follow. The global tests are all significant,
indicating that at least one of the predictors is significant. The type 3 analysis indicates that we would
lose significantly more inf
STAT 440 Homework 1
2. Student scores
Gender language
3. Orion employee reports
a) Print descriptor portion
Alphabetic List of Variables and Attributes
Homework 1
Due: Tuesday June 23 at 11pm
See general homework tips and submit your files via the course website.
For all exercises, use the code in HW1Data.sas and the heart.dat file in the course space to obtain the data set. This
Exercise 1
a) From the tabulation, it looks like 4-cylinder cars may be more fuel efficient than 6-cylinder cars. Sedans
might be more efficient than sports cars, at least among 4-cylinder cars. There does not appear to be a big
difference across origin. We also see
ST448 HW6 solution
Exercise 1
(a) From the following two scatter plots of PC1 vs PC2 for red and white wines, it can be seen that the two
types of wine are clearly separated from each other. Specifically, red wines have generally positive PC1 values,
ST448 Homework 3 Solution
Exercise 1
(a) In terms of bp_status, people in the higher blood pressure groups tend to have higher cholesterol.
We can see this by comparing the mean cholesterol at each level of bp_status for the fixed
STAT 448
Midterm:
Fall 2013
Due 10/18/13 by 7:00pm
Instructions
The midterm exam is essentially an extended homework assignment. The data and SAS
data input code can be downloaded from Compass. Note that:
You need to submit one report file and one SAS code
Presentation1:
In this data analysis, they use data from the Department of Health and Human
Services, with 3141 observations, to study several problems of interest.
In the first part, they use multiple linear regression to create a model for predicting
Homework 6
Due: Friday 4/28 at 5pm
See general homework tips and submit your files via the course website.
For all exercises, use the SIDS data set defined in HW6Data.sas file. The data in sids.dat were collected
Presentation1:
In this data analysis, group 5 works on a community and crime dataset to identify
some interesting characteristics of the data.
In the first part, they use cluster analysis to find out how does gang unit deployed
ST448 Homework 2 Solution
Exercise 1
(a)
Table of alcoholyn by outcome
(cells show Frequency / Expected)

alcoholyn   Abnormal    Normal       Total
N           2 / 4.8     38 / 35.2    40
Y           10 / 7.2    50 / 52.8    60
Total       12          88           100
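As a rough check on the table above, the expected counts can be reproduced from the row and column totals, and the Pearson chi-square statistic follows from the observed and expected counts. The sketch below is in Python rather than SAS. The function names are illustrative and are not part of the original solution.

```python
# Observed counts from the table above: rows are alcoholyn (N, Y),
# columns are outcome (Abnormal, Normal).
observed = [[2, 38], [10, 50]]


def expected_counts(table):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi_square(table):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


expected = expected_counts(observed)
chi_square = pearson_chi_square(observed)
print(expected)
print(round(chi_square, 4))
```

One expected count (4.8) is below 5, so an exact test may be worth reporting alongside the chi-square result.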
From the table, we can infer the association between alcohol consumpti
Chapter 7
Linear Regression
(Multiple Regression Case)
Additional Considerations
Have multiple possible explanatory variables
Assume that explanatory variables are
(roughly) independent
Will need to select best subset of explanatory
variables to use
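One informal way to check the "roughly independent" assumption above is to look at pairwise correlations among the candidate explanatory variables. The sketch below is in Python rather than SAS. The data, the variable names x1 to x3, and the 0.9 cutoff are all hypothetical and are chosen only for illustration.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def flag_correlated_pairs(data, threshold=0.9):
    """Return (name_a, name_b, r) for pairs whose |r| exceeds the threshold."""
    flagged = []
    for a, b in combinations(data, 2):
        r = pearson_r(data[a], data[b])
        if abs(r) > threshold:
            flagged.append((a, b, r))
    return flagged


# Hypothetical explanatory variables; x2 is nearly a multiple of x1.
predictors = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x2": [2.0, 4.1, 5.9, 8.2, 9.8],
    "x3": [3.0, 1.0, 4.0, 1.0, 5.0],
}

flagged = flag_correlated_pairs(predictors)
for a, b, r in flagged:
    print(f"{a} and {b} are strongly correlated (r = {r:.3f})")
```

A flagged pair suggests that only one of the two variables may be needed in the selected subset.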
Presentation1:
In this data analysis, group 7 works on the IMDB 5000 Movie dataset to identify
some interesting characteristics of the data.
In the first part, they use linear regression and variable transformation to find out
Presentation1:
In the first presentation, this group uses logistic regression to determine which
continuous variables are significant in determining the odds of an article being posted
on a weekend. They use Is_Weekend as a dummy binary response
Homework 2
Due: Friday February 17 at noon
See general homework tips and submit your files via the course website.
In the first exercise, we will investigate the relationship between peanut allergy and early consumption
of peanuts. For exercise 1 use the
Homework 1
Due: Friday February 3 at noon.
See general homework tips and submit your files via the course website.
In this homework assignment, we will examine and compare some attributes of fish. For all exercises,
Homework 6
Due: Tuesday August 4 at 11pm
See general homework tips and submit your files via the course website. The data sets are contained in HW6Data.sas
in the compass2g course space. For discriminant analysis, assume proportional priors.
Exercise 1
Exercise 1
a) The following contingency table shows the frequencies for crime rates greater than 100 crimes known
to police per million population. We see the observed percentage of states with crime levels above that
Exercise 1
a) Basic descriptive statistics for resting blood pressure, serum cholesterol, and maximum heart rate follow.
Variable: restingbp
Moments
N               266           Sum Weights        266
Mean            131.293233    Sum Observations
Std Deviation   17.9225933    Variance           321.21935
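The quantities in the Moments table are related in simple ways: the mean is the sum of the observations divided by N, and the variance is the square of the standard deviation. The sketch below is in Python rather than SAS and assumes the usual sample (n - 1) divisor for the variance. It computes the moments for a small hypothetical sample, then checks that the reported restingbp standard deviation and variance agree.

```python
import math


def moments(values):
    """Basic moments, using the sample (n - 1) divisor for the variance."""
    n = len(values)
    total = sum(values)
    mean = total / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return {
        "n": n,
        "sum": total,
        "mean": mean,
        "variance": variance,
        "std_dev": math.sqrt(variance),
    }


# Hypothetical readings, not taken from the heart data set.
sample = [120, 130, 140]
summary = moments(sample)
print(summary)

# Consistency check on the values reported in the table above.
reported_sd = 17.9225933
reported_variance = 321.21935
sd_squared = reported_sd ** 2
print(abs(sd_squared - reported_variance) < 1e-3)
```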
MADRS (Montgomery & Asberg Depression Rating Scale)
Item
Scale
Score
1. Reported Sadness
0 Occasional sadness in keeping with the circumstances.
2 Sad or low but brightens up without difficulty.
4 Pervasive feelings of sadness or gloominess.
Chapter 2
Descriptive Statistics and Simple
Inference
Descriptive Statistics
Start with samples obtained from some
population
Descriptive statistics tell us about features of
Homework 5
Due: Friday April 21 at 5pm
See general homework tips and submit your files via the course website.
For all exercises, use the Auto data set defined in the HW5Data.sas file. The Auto data is based on the
Homework 4
Due: Monday April 10 at noon
See general homework tips and submit your files via the course website.
Note that for logistic regression models we can use the Cbar measure in SAS as an analogue of Cook's
