Claudia de Lavalle
2/1/14
Cd2685
STAT W2025
HW#2
1.
a)
b)
c)
d) |0.004|+|-0.002|+|0.002|+|0.000|= 0.008
e) My answer to (d) implies that there is only a very small difference between computing the
probabilities with the binomial distribution formula and computing them with R's dbinom() function.

Binomial Inference
(reading: Agresti, sects. 1.3 and 1.4)
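Suppose Y is an observation assumed to be from a Binomial Distribution with n
trials and probability $\pi$ of success at each trial.
That is, $Y \sim \mathrm{Bin}(n, \pi)$.
Equivalently,
$$P(Y = y \mid n, \pi) = \frac{n!}{y!\,(n-y)!}\, \pi^{y} (1-\pi)^{n-y}, \quad \text{for } y = 0, 1, \ldots, n.$$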
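
As a minimal sketch, the binomial probability formula can be evaluated directly and compared with R's dbinom(). The values of n and pi0 below are illustrative assumptions, not values taken from the notes.

```r
# Illustrative values (assumptions, not from the notes)
n <- 5
pi0 <- 0.3
y <- 0:n

# Binomial probabilities computed directly from the formula
by_formula <- factorial(n) / (factorial(y) * factorial(n - y)) *
  pi0^y * (1 - pi0)^(n - y)

# The same probabilities from R's built-in function
by_dbinom <- dbinom(y, size = n, prob = pi0)

# Largest absolute difference between the two calculations
max_abs_diff <- max(abs(by_formula - by_dbinom))

# The probabilities over y = 0, ..., n should sum to 1
total <- sum(by_formula)

print(cbind(y, by_formula, by_dbinom))
print(max_abs_diff)
```

When both calculations run in full floating-point precision, any difference should be at the level of rounding error; larger discrepancies typically come from rounding intermediate results by hand.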

Generalized Linear Models
Reading: Agresti, Chapter 3. Sects 3.1, 3.2, 3.3 (through 3.3.3); 3.4, 3.5.
Initial Motivating Example. Background of the study, from Wikipedia:
The Framingham Heart Study is a long-term, ongoing cardiovascular study on
residents

Contingency Tables
(Reading: Agresti, 2.1, 2.2, 2.3)
Example Problem: Consider the situation where we have two groups with a
binary outcome for each group, and we want to analyze the data and compare
the two groups. For example, one of such data sets in t

Introduction
My big picture of doing statistics:
1. Starts with a substantive problem of interest
Goals can be things such as:
Improve, Decide, Understand, Evaluate, Explore,
Confirm, Test, Generalize, Discover,
Be able to say in English without statisti

Testing Independence, 3-Way Association
(Reading: Agresti, Sects. 2.4 and 2.7)
We have been studying association and relationships in 2 x 2 tables.
Measuring and representing association leads to the more general notion of a
statistical model for the data

Logistic Regression
Reading in Agresti, Chapter 4: Sect. 4.1 through 4.1.5; skip 4.1.6. Sect. 4.2
through 4.2.5; skip 4.2.6 and 4.2.7. Sect. 4.3 through 4.3.3; skip 4.3.4 and 4.3.5.
Sections 4.4 and 4.5.
Summary of Chapter 4. While the statistical ideas e

Poisson Regression
Poisson Regression is an important type of GLM that is often used for analyzing
and modeling count data (i.e., data giving counts or frequency of occurrence of
some event).
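
A minimal sketch of fitting a Poisson regression in R follows. The data are simulated purely for illustration (they are not the data from the notes or from Agresti), and the names x, y, and true_beta are hypothetical.

```r
# Simulated count data for illustration only (an assumption, not course data)
set.seed(1)
n <- 500
x <- runif(n, 0, 2)
true_beta <- c(0.3, 0.5)  # hypothetical intercept and slope on the log scale

# Counts whose mean depends on x through a log link
y <- rpois(n, lambda = exp(true_beta[1] + true_beta[2] * x))

# Poisson regression is a GLM with Poisson family and (by default) log link
fit <- glm(y ~ x, family = poisson(link = "log"))
est <- coef(fit)

print(summary(fit)$coefficients)
```

With the log link, exp(est[["x"]]) is the estimated multiplicative change in the mean count for a one-unit increase in x.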
Initial Motivating Example. (context and data from Agresti, pro

Loglinear Models for Multi-way Contingency Tables
Reading. Chapter 7: Sects. 7.1, 7.2, 7.3.1.
Recommended for history and for review of major topics: Chapter 11 (6 pages
plus pictures).
Several further topics build off the GLM framework we have developed

R Code and Output for Assessing Actual Confidence Interval Properties
We have several methods to calculate what is claimed to be a 95% confidence
interval for the binomial probability $\pi$. They are all based on the MLE,
Y/n, but there are somewhat
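
One way to assess the actual properties of a nominal 95% interval is to compute its exact coverage probability by summing binomial probabilities over the outcomes whose interval contains the true value. The sketch below does this for the Wald interval only; the function name wald_coverage and the values n = 12 and pi0 = 0.5 are illustrative assumptions.

```r
# Exact coverage probability of the nominal Wald interval.
# wald_coverage is a hypothetical helper name; n and pi0 are illustrative.
wald_coverage <- function(n, pi0, conf = 0.95) {
  y <- 0:n                               # every possible outcome
  p_hat <- y / n                         # MLE for each outcome
  z <- qnorm(1 - (1 - conf) / 2)         # normal critical value
  se <- sqrt(p_hat * (1 - p_hat) / n)    # estimated standard error
  covers <- (p_hat - z * se <= pi0) & (pi0 <= p_hat + z * se)
  # Add up the probabilities of the outcomes whose interval covers pi0
  sum(dbinom(y, size = n, prob = pi0)[covers])
}

cov_12 <- wald_coverage(n = 12, pi0 = 0.5)
print(cov_12)
```

For n = 12 and pi0 = 0.5 only the outcomes y = 4, ..., 8 produce an interval containing 0.5, so the actual coverage is about 0.854, noticeably below the claimed 95%.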

R Intro Code and Output
Using the RStudio interface to R, there are four panes for (roughly) the following purposes:
Console to enter commands and get output;
History provides history of commands given and is useful for editing and resending the
command f

Stat W2025 Review Material
This narrative attempts to provide a big-picture overview of the topics from
throughout the semester and how they fit together. The goal is that you should
be familiar and comfortable with these topics. If you come across a top

Take-Home Quiz
STATW2025
This data set, resulting from an employee satisfaction survey of AT&T employees, concerns a
binomial response variable of satisfaction (the two responses being Satisfied or Unsatisfied). This
response variable is categorized acros

1.
a) Done
b)
i)
ii)
iii)
iv)
c)
i)
Fitted coefficient for the SW district: 1.2182+0.1989= 1.4171
Only four fitted coefficients for District are provided. R's default is to treat the first level of
the variable as the baseline, which is absorbed into the intercept.
Coefficient for the NC district: 1.2182
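
The baseline coding behind this answer can be sketched as follows. Only the NC and SW districts and the two coefficient values quoted above are used; the short district vector is a toy example built for illustration, not the survey data.

```r
# Toy factor using only the two districts named above (illustration only)
district <- factor(c("NC", "SW", "NC", "SW"))

# By default R uses the first level as the baseline category
baseline <- levels(district)[1]

# The design matrix has an intercept column and a column for SW, but none for NC
X <- model.matrix(~ district)
print(colnames(X))

# Values quoted in the answer above
intercept <- 1.2182   # fitted value for the baseline (NC) district
sw_offset <- 0.1989   # reported coefficient for SW relative to the baseline

# The fitted value for SW adds its offset to the intercept
sw_fitted <- intercept + sw_offset
print(sw_fitted)
```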
ii

STAT 2025
HW9
1) Done
2) a)
This comparative box plot shows the weights of crabs with a satellite (weights.A) and the weights
of crabs that do not have a satellite (weights.B). While this shows us that the interquartile ranges of the
two groups are prett

4)
a)
b)
c)
d) logit(estimated probability of malformation) = -5.9605 + 0.3166(alcohol consumption score)
e)
(R was not letting me create a plot, hence the lack of picture!)
f) Proportion of malformation at Highest level of alcohol consumption: 0.023100302
Pr
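
The fitted equation in (d) can be converted into an estimated probability by applying the inverse logit. In the sketch below the coefficients are the ones reported in (d); the alcohol consumption score of 7 is an assumed value chosen for illustration.

```r
# Coefficients reported in part (d)
alpha_hat <- -5.9605
beta_hat <- 0.3166

# Assumed alcohol consumption score (illustrative, not stated in the answer)
score <- 7

# Linear predictor on the logit scale
eta <- alpha_hat + beta_hat * score

# Inverse logit: plogis(eta) equals exp(eta) / (1 + exp(eta))
p_hat <- plogis(eta)
print(p_hat)
```

Under that assumed score the estimated probability is roughly 0.023.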

2/20/15
HW#4
>I will select n=12
1. Wald Confidence Interval:
$p \pm z_{\alpha/2} \times SE(p)$
where:
$p = y/n$
$SE(p) = \sqrt{p(1-p)/n}$
Analysis in R:
> mypi=0.5
> myn=12
> y <- rbinom(n=1,size=myn,prob=mypi); y
[1] 6
> p <- y/myn
> p <- y/myn; p
[1] 0.5
> alpha <- 0.05
> p
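
As a minimal sketch, the Wald interval described above can be computed in a few lines. The values y = 6 and n = 12 are taken from the transcript; the variable names se, z, and wald_ci are choices made here for illustration.

```r
# Values from the transcript above
y <- 6
n <- 12
alpha <- 0.05

p <- y / n                       # sample proportion
se <- sqrt(p * (1 - p) / n)      # estimated standard error of p
z <- qnorm(1 - alpha / 2)        # critical value z_{alpha/2}

# Wald interval: p plus or minus z times the standard error
wald_ci <- c(lower = p - z * se, upper = p + z * se)
print(wald_ci)
```

With these values the interval is roughly (0.217, 0.783). Because the observed y comes from rbinom(), a different random draw would give a different interval.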

Building Logistic Regression Models
Reading in Chapter 5: Section 5.1. Section 5.2 through 5.2.5; skip 5.2.6 and 5.2.7.
Section 5.3 through 5.3.3; skip 5.3.4. Skip Sections 5.4 and 5.5.
The general topic is Model Building and Model Selection, and we provi