Stat120C
Homework 1 (due Thursday April 10, 2008)
Instructor: Zhaoxia Yu TA: Vinh Nyugen
Note: Where appropriate, all relevant software code and output should also be handed in.
Problem 1. (11.10 of Rice) Verify that the two-sample t test at level
Stat120C Intro to Prob. and Stat.
Midterm Exam (Monday May 4, 2015)
Name:

        Q1   Q2   Q3   Total
Full    60   30   40   130
Yours
Instruction:
1. There are 3 questions in total (all with multiple parts).
2. The point total for each question is given at the beginning of the q
STAT120C Final (2015)
Instruction:
1. This exam is closed book and closed notes. You are allowed to use two two-sided note sheets (each is
8.5 x 11).
2. Write all your answers (including tables) in your blue book. Write only one question per page in your
Stat120C
Handout 2: Examples of One-Way ANOVA
Instructor: Zhaoxia Yu
1 Example 1
We use the data in Section 12.1 to illustrate balanced one-way ANOVA. The data
(rearranged from the textbook) are shown below.
Treatment  Lab1  Lab2  Lab3  Lab4  Lab5  Lab6  Lab7
1          4.13  3.86  4.
Handout 4: Simple Linear Regression
By: Brandon Berman
The following problem comes from Kokoska's Introductory Statistics: A Problem-Solving Approach. The data can be read into R using the following code:
msm = read.csv("http://www.ics.uci.edu/~zhaoxia/tea
Stat120C
Handout 3: Example of Two-Way ANOVA
Instructor: Zhaoxia Yu
We use the iron retention data in chapter 11 to show two-way ANOVA. Below are the data
        Fe2+                    Fe3+
high   medium   low     high   medium   low
0.71   2.20     2.25    2.20   4.04     2.71
1.66   2.93     3.93    2.69   4.16
Categorical Analysis
STAT120C
Review of Tests Learned in STAT120C
Which test(s) should be used to answer the
following questions?
Is husbands' BMI larger than wives'?
Is men's BMI different from women's?
Do gender and smoking affect human weight?
Is k
#handout1.R
# example 1: one-sample t-test
#enter data
x = c(19, 26, 32, 24, 49, 42, 23, 53, 26, 39, 38)
#sample size
n=length(x)
#sample mean and variance
mean(x)
var(x)
#compute the t statistic
t.stat= (mean(x)-30)/sqrt(var(x)/n)
# print the statistic
t
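As a cross-check on the hand computation in example 1, the following sketch recomputes the statistic and compares it with R's built-in t.test(). The data and the null mean of 30 come from the script above; the two-sided alternative and the names mu0, p.value and fit are assumptions added for illustration.

```r
# Data and null mean taken from example 1 above
x <- c(19, 26, 32, 24, 49, 42, 23, 53, 26, 39, 38)
mu0 <- 30                      # hypothetical name for the null mean
n <- length(x)

# t statistic computed by hand
t.stat <- (mean(x) - mu0) / sqrt(var(x) / n)

# two-sided p-value (assumed alternative) from t with n - 1 degrees of freedom
p.value <- 2 * pt(-abs(t.stat), df = n - 1)

# the built-in test should reproduce both numbers
fit <- t.test(x, mu = mu0)
print(c(by.hand = t.stat, built.in = unname(fit$statistic)))
print(c(by.hand = p.value, built.in = fit$p.value))
```

If a one-sided alternative is intended, the alternative argument of t.test() and the pt() call would both need to change accordingly.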
# read data #
#the data in the link are log-transformed data
iron = read.table("http://www.ics.uci.edu/~zhaoxia/teaching/stat120c/Data/iron.txt", header = TRUE)
names(iron) # the iron data has three columns
# draw a boxplot #
boxplot(concentration ~ Fe + dosage, d
# Change this line when the data is on website
msm = read.csv("http://www.ics.uci.edu/~zhaoxia/teaching/stat120c/Data/msmdata.csv")
#
# First acquire the relevant n and p
n = dim(msm)[1]
p = dim(msm)[2]
# Set x and y for simplicity
x = msm$Pressure
y = msm
Linear Regression
1 Introduction
It is often interesting to study the effect of a variable on a response. In ANOVA, the response
is a continuous variable and the explanatory variables are discrete (categorical). What if the explanatory variables are
also continuous? That is, how
Some properties of $SS_{Reg}$:
(1) $SS_{Reg} = \hat\beta_1^2 \sum_{i=1}^n (x_i - \bar x)^2$. When $\hat\beta_1 = 0$, i.e., the regression line is horizontal, $SS_{Reg} = 0$.
(2) $SS_{Reg}$ may be considered a measure of that part of the variability of the $y_i$ which is associated
with the regression line. The large
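$$\mathrm{Var}(\hat\beta_0) = \frac{\sigma^2 \sum_{i=1}^n x_i^2}{n \sum_{i=1}^n x_i^2 - \left(\sum_{i=1}^n x_i\right)^2} = \frac{\sigma^2 \sum x_i^2}{n \sum (x_i - \bar x)^2}$$
$$\mathrm{Var}(\hat\beta_1) = \frac{n \sigma^2}{n \sum_{i=1}^n x_i^2 - \left(\sum_{i=1}^n x_i\right)^2} = \frac{\sigma^2}{\sum (x_i - \bar x)^2}$$
$$\mathrm{Cov}(\hat\beta_0, \hat\beta_1) = \frac{-\sigma^2 \sum_{i=1}^n x_i}{n \sum_{i=1}^n x_i^2 - \left(\sum_{i=1}^n x_i\right)^2} = \frac{-\sigma^2 \bar x}{\sum (x_i - \bar x)^2}$$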
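The variance and covariance formulas for the least squares estimates can be checked numerically. The sketch below uses hypothetical simulated data (the sample size, coefficients and error standard deviation are made up) and replaces the unknown error variance with its usual estimate from lm(), so that the hand-computed values can be compared with vcov().

```r
set.seed(1)                          # hypothetical simulated data
n <- 20
x <- runif(n, 0, 10)
y <- 1 + 2 * x + rnorm(n, sd = 1.5)

fit <- lm(y ~ x)
s2 <- summary(fit)$sigma^2           # estimate of the error variance, RSS / (n - 2)
Sxx <- sum((x - mean(x))^2)          # sum of squared deviations of x

# plug the estimate s2 into the three formulas
var.b0 <- s2 * sum(x^2) / (n * Sxx)
var.b1 <- s2 / Sxx
cov.b01 <- -s2 * mean(x) / Sxx

by.hand <- matrix(c(var.b0, cov.b01, cov.b01, var.b1), nrow = 2)
V <- vcov(fit)                       # estimated covariance matrix reported by R
print(by.hand)
print(V)
```

The two matrices should agree up to rounding, since vcov() for a fitted lm is the estimated error variance times the inverse of X'X.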
Proof. Let's calculate the variance of $\hat\beta_1$ first.
Stat120C
Handout 1: One-sample, paired, and two-sample t-tests
Instructor: Zhaoxia Yu
In this handout we show how to use R to conduct t-tests.
1 One-sample t-test
Consider the example we discussed in class: we have a sample from a normal distribution
and
STAT120C Assignment 1
Problem 1. This is a reading assignment and you don't need to hand in the answers. Read your STAT120B
notes or the DeGroot book to find out why the following statements are true:
Let $X_1, \ldots, X_n$ be a random sample from $N(\mu, \sigma^2)$. Then
=
1
Stat120C
Homework 2 (due Thursday April 17, 2008)
Instructor: Zhaoxia Yu TA: Vinh Nyugen
Note: Where appropriate, all relevant software code and output should also be handed in.
Problem 1. (12.5.2 of Rice) Verify that if $I = 2$, the estimate $s^2$ of T
Because the samples are independent, the sample variances $S_i^2$ are independent; therefore, $SSW/\sigma^2 \sim \chi^2_{I(J-1)}$.
Note
(1) Part I implies that $E(SSW) = I(J-1)\sigma^2$.
(2) $S_p^2 = MSW = \frac{SSW}{I(J-1)}$ is called the pooled sample variance.
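To make the pooled sample variance concrete, the following sketch computes it from the group sample variances in a balanced design and compares it with the residual mean square from R's ANOVA table. The data are simulated and the values of I, J, the mean and the standard deviation are hypothetical.

```r
set.seed(4)                               # hypothetical balanced data
I <- 4
J <- 6
g <- factor(rep(seq_len(I), each = J))    # treatment labels
y <- rnorm(I * J, mean = 5, sd = 3)

Si2 <- tapply(y, g, var)                  # sample variance within each treatment
SSW <- (J - 1) * sum(Si2)                 # within-group sum of squares
Sp2 <- SSW / (I * (J - 1))                # pooled sample variance

# residual mean square from the one-way ANOVA table
MSW <- anova(lm(y ~ g))["Residuals", "Mean Sq"]
print(c(Sp2 = Sp2, MSW = MSW, mean.Si2 = mean(Si2)))
```

In a balanced design the pooled variance is simply the average of the I group variances, which is why the three printed numbers coincide.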
Proof of B.2
Consider the samp
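1.3 The F test under unbalanced designs
The test for unbalanced designs is very similar: just replace $J$ with $J_i$. In this case,
$$SSW = \sum_{i=1}^I \sum_{j=1}^{J_i} (Y_{ij} - \bar Y_{i\cdot})^2$$
$$SSB = \sum_{i=1}^I J_i (\bar Y_{i\cdot} - \bar Y_{\cdot\cdot})^2$$
$$F = \frac{MSB}{MSW} = \frac{SSB/(I-1)}{SSW/\left[\sum_i (J_i - 1)\right]} \sim F_{I-1,\, \sum_i (J_i - 1)}$$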
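The unbalanced formulas can be sketched in R as follows. The group sizes (4, 6 and 5), group means and error standard deviation are hypothetical, and the data are simulated; the hand-computed F statistic is compared with the one reported by anova().

```r
set.seed(2)                                  # hypothetical unbalanced data
g <- factor(rep(c("A", "B", "C"), times = c(4, 6, 5)))
y <- rnorm(length(g), mean = c(10, 12, 11)[as.integer(g)], sd = 2)

I <- nlevels(g)
Ji <- tapply(y, g, length)                   # group sizes J_i
ybar.i <- tapply(y, g, mean)                 # group means
ybar <- mean(y)                              # grand mean

SSW <- sum((y - ave(y, g))^2)                # ave() gives each observation its group mean
SSB <- sum(Ji * (ybar.i - ybar)^2)
df.w <- sum(Ji - 1)                          # within-group degrees of freedom

F.stat <- (SSB / (I - 1)) / (SSW / df.w)
p.value <- pf(F.stat, I - 1, df.w, lower.tail = FALSE)

tab <- anova(lm(y ~ g))                      # R's one-way ANOVA table
print(c(by.hand = F.stat, built.in = tab[["F value"]][1]))
```

Note that the grand mean here is the mean of all observations, which in an unbalanced design is a weighted average of the group means.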
Sourc
Lecture Note: a brief review of one-sample t test and introduction to t test for paired
samples
1. the one-sample t test
The one-sample t test is used to conduct hypothesis testing regarding the mean of a
normal distribution when both the mean and the var
2 Two-way ANOVA
In the one-way design there is only one factor. What if there are several factors? Often, we
are interested in the simultaneous effects of multiple factors, e.g., gender and smoking
on hypertension. The statistical approach to analyze
STAT120C: Analysis of Variance (ANOVA)
So far we have considered only one or two samples. For one sample, we were concerned
with the population mean. For two samples, we were concerned with the difference of two
population means. What if there are more than t
Theorem A Under the assumptions of the one-way ANOVA model,
$$E(SSW) = I(J-1)\sigma^2$$
$$E(SSB) = J \sum_{i=1}^I \alpha_i^2 + (I-1)\sigma^2$$
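The two expectations in Theorem A can be illustrated by simulation. This is only a sketch under assumed values: I = 3, J = 5, sigma = 2, grand mean 10 and treatment effects (-1, 0, 1) are all hypothetical choices, with the effects constrained to sum to zero.

```r
set.seed(3)
I <- 3
J <- 5
sigma <- 2
mu <- 10                       # hypothetical grand mean
alpha <- c(-1, 0, 1)           # hypothetical treatment effects, summing to zero

one.rep <- function() {
  # J x I matrix: column i holds the J observations from treatment i
  Y <- matrix(rnorm(I * J, mean = rep(mu + alpha, each = J), sd = sigma), nrow = J)
  ybar.i <- colMeans(Y)
  SSW <- sum(sweep(Y, 2, ybar.i)^2)        # subtract each column mean, then square
  SSB <- J * sum((ybar.i - mean(Y))^2)
  c(SSW = SSW, SSB = SSB)
}

sims <- replicate(20000, one.rep())        # 2 x 20000 matrix of simulated values
est <- rowMeans(sims)                      # Monte Carlo averages
theory <- c(SSW = I * (J - 1) * sigma^2,
            SSB = J * sum(alpha^2) + (I - 1) * sigma^2)
print(rbind(simulated = est, theory = theory))
```

The simulated averages should be close to, but not exactly equal to, the theoretical values, because they are Monte Carlo estimates.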
Proof:
$$SSW = \sum_{i=1}^I \sum_{j=1}^J (Y_{ij} - \bar Y_{i\cdot})^2 = \sum_{i=1}^I (J-1) S_i^2 = (J-1) \sum_{i=1}^I S_i^2$$
Because the sample variance $S_i^2$ is an unbiased estimator for $\sigma^2$,
STAT120C: Two-Sample T-test
1 The same variance
Assume that $X_1, \ldots, X_m$ is a sample drawn from $N(\mu_X, \sigma^2)$, and $Y_1, \ldots, Y_n$ is a sample drawn from
$N(\mu_Y, \sigma^2)$. Also assume that the two samples are independent. We can summarize these assumptions
using the foll
Stat120C
Homework 5 ()
Instructor: Zhaoxia Yu
Problem 1 Consider the linear regression model with independent and normally distributed random errors:
$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i$$
where $\epsilon_i \overset{iid}{\sim} N(0, \sigma^2)$, $i = 1, 2, \ldots, n$. Let $\hat\beta_0$ and $\hat\beta_1$ denote the least squares estimate
STAT120C Homework 3
1. Prove the Bonferroni inequality:
$$P\left(\bigcup_{i=1}^n A_i\right) \le \sum_{i=1}^n P(A_i),$$
where $A_1, \ldots, A_n$ denote $n$ events. (Hint: induction)
2. The concentrations (in nanograms per milliliter) of plasma epinephrine were measured for 30 dogs,
with 10 dogs under
STAT120C Homework 2
1. Consider the one-way layout. We use $Y_{ij}$ to denote the measurement of the $j$th observation from the
$i$th treatment, where $i = 1, \ldots, I$ and $j = 1, \ldots, J$. Define the following summary statistics
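$\bar Y_{i\cdot} = \frac{1}{J} \sum_{j=1}^J Y_{ij}$, $i = 1, \ldots, I$, and $\bar Y_{\cdot\cdot} = \frac{1}{IJ} \sum_{i=1}^I \sum_{j=1}^J Y_{ij}$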
Stat120C
Homework 4 ()
Instructor: Zhaoxia Yu
Note: Where appropriate, all relevant software code and output should be also handed in.
Problem 1. Consider the two-way analysis of variance (ANOVA) model
$$Y_{ijk} = \mu + \alpha_i + \beta_j + \delta_{ij} + \epsilon_{ijk}$$
where $\epsilon_{ijk} \overset{iid}{\sim} N(0, \sigma^2)$, i