# Lecture 7: Frequentist Statistical Modeling

Statistical modeling So far, we have been focusing mainly on re-randomization techniques to assess significance in study designs that employ some kind of explicit randomization In each case, we defined a test statistic that represented some aspect of the subjects we were interested in studying and then created a sampling distribution for this statistic under the null hypothesis that interventions (treatment and control) had no effect on the subjects in our study The sampling distribution captures the variability present in our experiment under the null hypothesis -- We used this distribution to judge the size of our observed effect , deciding whether it was big enough (“extreme enough”) to be considered something other than noise
The fact that we employed randomization in making our intervention assignments, combined with the null hypothesis of homogeneity between treatment and control, provides a framework for conducting inference. Random assignment and homogeneity tell us enough about how the data were generated (under the null hypothesis) to simulate draws from the sampling distribution and conduct a formal test. These assumptions are relatively weak as statistical assumptions go; over the next few lectures we will add more, fleshing out a framework for connecting inference (learning from data) to the stochastic (probabilistic) mechanism that created the data.

For the next few lectures, we are going to fit probability models to data. In short, probability distributions will serve as a kind of "origin story" for how the data were generated. There are many reasons for fitting probability laws to data:

- We might relate aspects of the distribution (features or parameters) to a scientific theory that reveals something about the state of Nature.
- Probability models can be used for purely descriptive purposes, acting as a kind of data summary or "compression."
- Finally, we are often interested in simulation: using these models to make predictions or to generate new data that can be fed into a larger modeling exercise.

Last time, we looked at two commonly used "families" of probability distributions. In each case, the probability distributions were all of a particular functional form and depended on one or more "parameters."
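As a minimal sketch of the simulation idea, assuming a normal "origin story" and made-up data, we can estimate the model's parameters from the data and then generate new draws from the fitted model:

```python
import random
import statistics

# Hypothetical data; in practice this would come from the study.
data = [4.9, 5.1, 5.0, 4.8, 5.2, 5.3, 4.7, 5.0]

# Fit the normal model: estimate its two parameters from the sample.
mu_hat = statistics.mean(data)
sigma_hat = statistics.stdev(data)

# Simulate: draw new observations from the fitted distribution.
rng = random.Random(42)
simulated = [rng.gauss(mu_hat, sigma_hat) for _ in range(1000)]
```

The simulated values could then be fed into a larger modeling exercise, exactly as the slide suggests.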
## The normal family

The normal or Gaussian distribution involves two parameters, $\mu$ and $\sigma$, where $\mu$ is the mean (the center) and $\sigma$ is the standard deviation (the spread):

$$
F = \left\{ f(x \mid \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2 / 2\sigma^2} \;:\; \mu \in \mathbb{R},\ \sigma > 0 \right\}
$$
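The density above can be evaluated directly; this helper function is our own illustration, not part of the lecture:

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of the N(mu, sigma^2) distribution at x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)
```

The density peaks at $x = \mu$ and is symmetric about it, which is easy to confirm numerically.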

## The binomial family

Let $X$ denote the number of successes in $m$ independent trials, each with success probability $p$. Then the probability function of $X$ belongs to the family

$$
F = \left\{ f(k \mid p) = \binom{m}{k} p^k (1-p)^{m-k} \;:\; p \in [0, 1] \right\}
$$

Unlike the normal, the binomial is indexed by a single parameter, $p$, the success probability (we consider $m$, the number of trials, to be fixed and given). Recall that if $X$ has a binomial $(m, p)$ distribution, then $X$ has expected value $mp$ and variance $mp(1-p)$.
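The mean and variance formulas can be checked numerically from the pmf; the function name and the choice of $m$ and $p$ below are illustrative:

```python
import math

def binomial_pmf(k, m, p):
    """P(X = k) for X ~ Binomial(m, p)."""
    return math.comb(m, k) * p ** k * (1 - p) ** (m - k)

# Check E[X] = m*p and Var(X) = m*p*(1-p) by summing over the pmf.
m, p = 10, 0.3
mean = sum(k * binomial_pmf(k, m, p) for k in range(m + 1))
var = sum((k - mean) ** 2 * binomial_pmf(k, m, p) for k in range(m + 1))
```

With $m = 10$ and $p = 0.3$, the sums recover $mp = 3$ and $mp(1-p) = 2.1$ up to floating-point error.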

This note was uploaded on 02/20/2012 for the course STAT 105 taught by Professor Hansen during the Winter '12 term at UCLA.
