Lecture 20
Nancy Pfenning, Stats 1000

Standardized Statistics

Recall: If the underlying population variable $X$ is normal with mean $\mu$, standard deviation $\sigma$, then for a random sample of size $n$, the random variable $\bar{X}$ is normal with mean $\mu$, standard deviation $\sigma/\sqrt{n}$. We used this fact to transform $\bar{X}$ to a standard normal random variable $Z$, and solved for probabilities with normal tables:

$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is normal with mean 0, standard deviation 1.

[Note that the spread of $Z$ is always 1, regardless of sample size $n$.]

In situations involving a large sample size $n$, sample standard deviation $s$ is approximately equal to $\sigma$, and we can treat $\frac{\bar{X} - \mu}{s/\sqrt{n}}$ as (approximately) a standard normal variable $Z$. If sample size $n$ is small, $s$ may be quite different from $\sigma$, and the random variable which we call

$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$

does not follow a standard normal distribution.

Because of subtracting the expected value of $\bar{X}$ (that is, $\mu$) from $\bar{X}$ in the numerator, the distribution of $t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$ is (like $Z$) centered at zero and symmetric. Because of dividing by $s/\sqrt{n}$ (which is not the standard deviation of $\bar{X}$), the standard deviation of $t$ is not fixed at 1 as it is for $Z$. Sample standard deviation $s$ contains less information than $\sigma$, so the spread of $t$ is greater than that of $Z$, especially for small sample sizes $n$. Since $s$ approaches $\sigma$ as sample size $n$ increases, the $t$ distribution approaches the standard normal $Z$ distribution as $n$ increases.

Thus, the spread of sample mean standardized using $s$ instead of $\sigma$ depends on the sample size $n$. We say the distribution has $n - 1$ degrees of freedom, abbreviated df. Since there are many different $t$ distributions, one for each df, it would take too much space to provide tables for each of them in as much detail as was provided for the standard normal $z$ in Table A.1. Instead, $t$ tables are condensed to provide minimal adequate information needed to state useful results.
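The claim that t spreads out more than Z for small samples can be checked with a short simulation. The sketch below is only an illustration, not part of the lecture: the function names, the seed, the number of repetitions, and the choice of a population with mean 0 and standard deviation 1 are all assumptions. It draws many samples of size n from a normal population, standardizes each sample mean once with the population standard deviation (giving Z) and once with the sample standard deviation s (giving t), and compares the two spreads. It also uses a standard fact that the notes do not state: a t distribution with df > 2 has standard deviation sqrt(df / (df - 2)).

```python
import math
import random
import statistics


def simulated_spreads(n, mu=0.0, sigma=1.0, reps=20000, seed=1):
    """Return (sd of Z, sd of t) over `reps` normal samples of size n.

    mu, sigma, reps and seed are illustrative defaults, not values
    taken from the lecture.
    """
    rng = random.Random(seed)
    z_values, t_values = [], []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        # Sample standard deviation s, with the n - 1 divisor.
        s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        z_values.append((xbar - mu) / (sigma / math.sqrt(n)))
        t_values.append((xbar - mu) / (s / math.sqrt(n)))
    return statistics.stdev(z_values), statistics.stdev(t_values)


def t_theoretical_sd(df):
    """Standard deviation of a t distribution (only defined for df > 2)."""
    return math.sqrt(df / (df - 2))


# Compare a small sample size with a moderately large one.
results = {}
for n in (5, 30):
    results[n] = simulated_spreads(n)
    z_sd, t_sd = results[n]
    print(f"n={n:2d}  sd(Z)={z_sd:.2f}  sd(t)={t_sd:.2f}  "
          f"theoretical sd(t)={t_theoretical_sd(n - 1):.2f}")
```

Under these assumptions the simulated spread of Z should stay close to 1 for both sample sizes, while the spread of t should be clearly above 1 for n = 5 and much closer to 1 for n = 30. Because the t distribution with 4 degrees of freedom has heavy tails, its simulated standard deviation varies noticeably from one seed to another.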
Statistical Inference

Statistical inference is the process of inferring something about a larger group (the population) by analyzing data for a part of that group (the sample). There are two general forms of statements we make using statistical inference: (1) confidence intervals; and (2) significance tests. We use these forms of inference in order to answer questions about (a) population proportion $p$ [for categorical data] or (b) the population mean $\mu$ [for quantitative data].

[In addition, we can use significance tests to answer questions about relationships between two variables, such as the chi-square test of a relationship between two categorical variables. The chi-square statistic

$\text{chi-square} = \text{sum of all } \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$

is another standardized statistic that follows a known pattern with values and probabilities that can be summarized in a table.]

1. Confidence Interval Questions

(a) (for $p$) In May, 2000, .56 of 1,012 respondents to an Associated Press survey supported gay rights ....
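The chi-square statistic mentioned under Statistical Inference, the sum over all cells of (observed - expected)^2 / expected, can be sketched in a few lines. This is a hedged illustration only: the function name and the 2-by-2 table of counts are hypothetical. The rule used for the expected counts (row total times column total, divided by the grand total) is the usual convention for a test of a relationship between two categorical variables, and it is an assumption here because the notes do not spell it out.

```python
def chi_square_statistic(observed):
    """Chi-square statistic for a two-way table of observed counts.

    Expected counts are assumed to be
    (row total * column total) / grand total.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    total = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            total += (obs - expected) ** 2 / expected
    return total


# Hypothetical counts, invented for illustration (not data from the lecture).
hypothetical_counts = [[30, 20],
                       [20, 30]]
chi_square = chi_square_statistic(hypothetical_counts)
print(chi_square)
```

For this made-up table every expected count is 25 and every cell contributes (5)^2 / 25 = 1, so the statistic is 4.0. The value would then be compared against a chi-square table, just as Z and t are compared against their tables.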