Course: ENVIRONMET 340, Winter 2011
School: Portland State
Summarize your data!

Population versus sample

Population: The entire group of individuals in which we are interested but usually can't assess directly. Examples: all urban ecosystems, all Douglas-fir trees in Oregon, all fish.

Sample: The part of the population we actually examine and for which we do have data. How well the sample represents the population depends on the sample design.

[Figure: population and sample]

A parameter is a number describing a characteristic of the population. A statistic is a number describing a characteristic of a sample.

Toward statistical inference

The techniques of inferential statistics allow us to draw inferences or conclusions about a population from a sample. Your estimate of the population is only as good as your sampling design. Your sample is only an estimate, and if you randomly sampled again you would probably get a somewhat different result.

[Figure: population and sample]

Population vs. sample

Population: timber-harvested watersheds. Sample sizes: 1, 2, 4, 8. Sampling unit, randomization, independence. N=160

Measure of central tendency: mean and median

The mean and the median are the same only if the distribution is symmetrical. The median is a measure of center that is resistant to skew and outliers. The mean is not.

[Figure: mean and median for a symmetric distribution, and for left-skewed and right-skewed distributions]

Measure of dispersion and variability

Measure of spread: variance and standard deviation

1. First calculate the variance $s^2$:

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

2. Then take the square root to get the standard deviation $s$:

$$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

[Figure: distribution with the mean $\bar{x}$ and 1 s.d. marked]

Measure of spread: range

Uncertainty and confidence

Although the sample mean, $\bar{x}$, is a unique number for any particular sample, if you pick a different sample you will probably get a different sample mean.
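The summary statistics defined so far, and the point that a different sample gives a different mean, can be sketched in Python. This is only an illustration under stated assumptions: the population is simulated, right-skewed, made-up data (not the course data), and the helper name `summarize` is hypothetical.

```python
import random
import statistics

# Hypothetical, simulated population of 160 right-skewed values (NOT course data).
rng = random.Random(42)
population = [rng.expovariate(1 / 13.0) for _ in range(160)]


def summarize(values):
    """Return mean, median, sample variance (n - 1 denominator), s.d., and range."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "variance": statistics.variance(values),  # divides by n - 1
        "sd": statistics.stdev(values),           # square root of the variance
        "range": max(values) - min(values),
    }


# Two simple random samples of the same size drawn from the same population.
sample_a = rng.sample(population, 8)
sample_b = rng.sample(population, 8)

print(summarize(sample_a))
print(summarize(sample_b))
```

The two printed summaries will generally differ even though both samples come from the same population, which is the sampling variability that the rest of this section quantifies.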
In fact, you could get many different values for the sample mean, and virtually none of them would actually equal the true population mean, $\mu$.

Standard deviation of sample means ($\sigma/\sqrt{n}$) = standard error (SE) of the mean.

95% of all sample means will be within roughly 2 standard deviations ($2\sigma/\sqrt{n}$, or 2 SE) of the population parameter $\mu$.

[Figure: sampling distribution of sample means. Red dot: mean value of an individual sample]

Implications

We don't need to take a lot of random samples to rebuild the sampling distribution and find $\mu$ at its center. All we need is one simple random sample (SRS) of size n, relying on the properties of the distribution of sample means to infer the population mean $\mu$.

[Figure: population and one sample of size n]

Reworded

With 95% confidence, we can say that $\mu$ should be within roughly 2 standard deviations ($2\sigma/\sqrt{n}$) of our sample mean $\bar{x}$. In 95% of all possible samples of this size n, $\mu$ will indeed fall in our confidence interval. In only 5% of samples would $\bar{x}$ be farther from $\mu$.

A confidence interval can be expressed as:
Mean ± m, where m is called the margin of error
$\mu$ within $\bar{x} \pm m$
Example: 120 ± 6

A confidence level C (in %) indicates the probability that $\mu$ falls within the interval. It represents the area under the normal curve within ±m of the center of the curve.

How to calculate a confidence interval (CI)?

m = $t_{\alpha/2,\,df} \times$ standard error of the mean, so CI = $\bar{x} \pm m$

t: Student's t distribution
$\alpha$: significance level (5%) for a 95% CI
df: degrees of freedom (sample size − 1)

Using our own data (Trees):
Mean = 13.16
n = 160
t = 1.97
SE = standard deviation/$\sqrt{n}$ = 11.34/12.65 = 0.90
m = 1.97 × 0.90 = 1.77
95% CI: 13.16 ± 1.77 (11.39, 14.93)

(1) The estimate of the population is only as good as your sampling design. How many samples do you need?
(2) The sample is only an estimate, and if you randomly sample again you would probably get a somewhat different result. How well is your mean estimated?
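The confidence interval arithmetic for the Trees example can be checked with a short Python sketch. The function name `confidence_interval` is hypothetical, and the critical value t = 1.97 is taken from the slide rather than computed, since the standard library has no t-distribution quantile function.

```python
import math


def confidence_interval(mean, sd, n, t_crit):
    """Return (se, margin, lower, upper) for mean ± t_crit * sd / sqrt(n)."""
    se = sd / math.sqrt(n)   # standard error of the mean
    margin = t_crit * se     # margin of error m
    return se, margin, mean - margin, mean + margin


# Summary values from the Trees example; t_crit = 1.97 is the slide's value for df = 159.
se, margin, lower, upper = confidence_interval(mean=13.16, sd=11.34, n=160, t_crit=1.97)
print(f"SE = {se:.2f}, margin = {margin:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
# prints: SE = 0.90, margin = 1.77, 95% CI = (11.39, 14.93)
```

Because the margin scales with $1/\sqrt{n}$, quadrupling the sample size roughly halves the width of the interval, which bears on the question of how many samples are needed.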
