Homework 4
Due: Tuesday July 19 at 11:59pm
See general homework tips and submit your files via the course website. The data sets are in HW4Data.sas
on the course website.
For exercises 1 and 2, use the heartHW4 data set which is based on the heart data se
Exercise 1
a) Basic descriptive statistics for maximum breadth and nasal height follow.
For maximum breadth, we see that the mean and median values are about 134 mm. The skewness is
-0.0287, indicating there is no noticeable skew to the maximum breadth value
Exercise 1
a) Results for a logistic regression for high cholesterol status follow. The global tests are all significant,
indicating that at least one of the predictors is significant. The type 3 analysis indicates that we would
lose significantly more inf
Exercise 1
a) From the tabulation, it looks like 4 cylinder cars may be more fuel efficient than 6 cylinders. Sedans
might be more efficient than sports cars at least for 4 cylinders. There does not appear to be a big
difference across origin. We also see
STAT 440 Homework 1
2. Student scores
Obs  Gender  language  history  math  science
1    M       80        82       85    88
2    F       94        92       88    96
3    M       96        88       89    92
4    F       95        .        92    92
3. Orion employee reports
a) Print descriptor portion
Alphabetic List of Variables and Attributes
#  Variable  Type  L
Homework 1
Due: Tuesday June 23 at 11pm
See general homework tips and submit your files via the course website.
For all exercises, use the code in HW1Data.sas and the heart.dat file in the course space to obtain the data set. This
data set was obtained fr
ST448 HW6 solution
Exercise 1
(a) From the following two scatter plots of PC1 vs PC2 for red and white wines, it can be seen that the two
types of wine are clearly separated from each other. Specifically, red wines generally have positive PC1 values,
otherwise whi
* read in the data and create a water data set;
data water;
   infile 'c:\Stat 448\water.dat';
   input flag $ 1 Town $ 2-18 Mortal 19-22 Hardness 25-27;
   if flag='*' then location='north';
   else location='south';
run;
proc print data=water;
run;
* get a scat
ST448 Homework 3 Solution
Exercise 1
(a) In terms of bp_status, people in higher blood pressure groups tend to have higher cholesterol.
We can observe this by comparing the mean cholesterol at each bp_status for a fixed
weight_status level. Also
STAT 448
Midterm:
Fall 2013
Due 10/18/13 by 7:00pm
Instructions
The midterm exam is essentially an extended homework assignment. The data and SAS
data input code can be downloaded from Compass. Note that:
You need to submit one report file and one SAS code
Original Research
See commentary by Dunlop and Rapaport p571
* ods html close;
* ods preferences;
* ods html newfile=proc;
* the hypertension data set;
data hyper;
* modify path to point to your files;
infile 'c:\Stat 448\hypertension.dat';
input n1-n12;
if _n_<4 then biofeed='P';
else biofeed='A';
if _n_ in(1,4) t
Midterm Exam 1
Final code and report are due on 3/28 at 11pm via the Midterm 1 assignment in the course space. Late
submissions will not be graded.
Save your files with your name (e.g. <Your-First-Name> <Your-Last-Name> Midterm1.sas and <Your-First-Name> <
MMRM (Mixed Model Repeated Measures)
STAT448, Advanced Data Analysis
Repeated measures refer to multiple measurements taken from the
same experimental unit (e.g., multiple evaluations over time on the
same patient).
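As a rough illustration of how repeated measurements on the same patient might be modeled in SAS, the sketch below fits a mixed model with proc mixed. The data set name trial_long and the variables patient, treatment, visit, and response are hypothetical and do not come from the course materials; the unstructured covariance (type=un) is only one common choice.

```sas
* hypothetical long-format data set: one row per patient per visit;
proc mixed data=trial_long;
   class patient treatment visit;
   * fixed effects for treatment, visit, and their interaction;
   model response = treatment visit treatment*visit / solution ddfm=kr;
   * within-patient correlation across visits, unstructured covariance (an assumption);
   repeated visit / subject=patient type=un;
run;
```

The repeated statement is what ties observations from the same experimental unit together; other covariance structures (for example type=cs or type=ar(1)) could be swapped in depending on the design.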
Mo
2016 FALL ST448 HW3 Solution
Exercise 1
(a) First, we can see that larger diamonds tend to have higher prices. Also, diamonds with
color grades D and E are more expensive than other grades. Based on the counts, we can see that th
Midterm Exam 2
Final code and report are due on Wednesday December 7 at 11pm via the Midterm 2 assignment in the
course space. Late submissions will not be graded.
Save your files with your name (e.g. <Your-First-Name> <Your-Last-Name> Final.sas and <Your
Midterm Exam 1
Final code and report are due on Friday October 14 at 11pm via the Midterm 1 assignment in the
course space. Late submissions will not be graded.
Save your files with your name (e.g. <Your-First-Name> <Your-Last-Name> Midterm1.sas and <Your
2016 FALL ST448 HW5 Solution
Exercise 1
(a) We need to use a gamma model with log link because mpg is a positive continuous variable.
(b) We can see that all predictors except acceleration are significant in the gamma model, with p-values less
than 0.05. Type
Midterm 2 Review
STAT448
Chapter 8: Logistic Regression
Proc logistic
Descending, reference coding when P(event =1), use descend
Stepwise selection
Diagnostic plots
Cbar for influential points
Hosmer-Lemeshow test to check lack-of-fit
-H0: the model
2016 Fall ST448 HW4 Solution
Exercise 1
(a) The following are results from fitting a logistic model for remission. The global test shows p-values less
than 0.05 for all three asymptotic tests (likelihood ratio, score, and Wald), thus we can conclud
2016 FALL ST448 HW6 Solution
Exercise 1
(a) The following are the results from an average linkage cluster analysis. From both the CCC and Pseudo F
plots, we can see a rapid increase at 3 and a slowly increasing trend until 15. The Pseudo T-squared statistic
shows t
Group 7 Project Proposal
1. The names of the group members
2. Title of the project: Predicting the Price of Rice in India.
3. Questions the group intends to answer with its analyses:
1) Fit the model with all the main effects and interactions between them
Stat 448: Advanced Data Analysis - B1
Class Time: MWF 2:00-2:50pm
Course Space: compass2g.illinois.edu
Instructor: Yeon Joo Park (Call me JOO)
Email: [email protected]
Office: 104F Illini Hall
Hours: Wed & Fri 9:00-10:00 am or by appointment
TA: Yihe W
* ods html close;
* ods preferences;
* ods html newfile=proc;
/* read in and view skull data set */
data skulls;
   infile 'c:\Stat 448\tibetan.dat' expandtabs;
   input length width height faceheight facewidth;
   if _n_ < 18 then type='A';
   else type='B';
run;
pr
Chapter 6
Linear Regression
(Simple Linear Case)
Review: ANOVA Models
In Chapters 4 and 5:
A continuous response
One or more categorical explanatory variables
Errors assumed to be iid N(0, σ²)
Interested in differences of expected values
for response
Chapter 3 Notes
Simple Inference for Categorical Data
Categorical data is based on groups or categories (as the name would imply). Looking at a single
categorical variable, we might want to know how many observations there are for each group. We could
loo