# Parameters and Statistics

## 1.2 Parameters and Statistics

A population parameter is a (typically unknown) numerical constant that describes the population of interest. A sample statistic is a numerical summary of data that comes from the sample. Typically, we use Greek symbols to denote population parameters and the "hat" symbol to denote the sample statistic. For example, the population mean is denoted by $\mu$ and the sample mean is denoted by $\hat{\mu} = \bar{x}$.

## 1.3 Summary Statistics

Let $x_1, x_2, \ldots, x_n$ denote $n$ observations sampled from a population.

### 1.3.1 Measures of Location

The mode of the data is the most frequently encountered value. Note that data can have multiple modes. Data with two modes are called bimodal and data with three are called trimodal.

The mean of the data is calculated by taking the arithmetic average of the data. We use the equation

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

The median of the data is found by sorting the data in increasing order and then choosing the value that divides the data into two equal parts. If $n$ is odd, the median is the observation in position $(n + 1)/2$ of the sorted data. If $n$ is even, the median is found by taking the average of the observations in positions $n/2$ and $n/2 + 1$.

The $p$-th percentile is the value that divides the sorted data such that $p\%$ of the data are less than that value and $(100 - p)\%$ of the data are greater than it. To find this value, first sort the data. Then, compute the quantity $(p/100)(n + 1)$. If this value is an integer, then the data point in this position is the $p$-th percentile. Otherwise, take the average of the two data points in the positions immediately to the left and to the right of this number.

Note that the 50th percentile is the median. Other important percentiles are the 25th and 75th. Together, the 25th, 50th, and 75th percentiles make up the first, second, and third quartiles, respectively.

Note: Some statistical packages use different methods of calculating percentiles, such as using a weighted average instead of the simple average used above. Thus, results from computers may differ from the results you obtain by hand, but they should be close.
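The location measures above can be sketched in a few lines of Python. This is a minimal illustration of the rules as stated in these notes, not a reference implementation. The function names (`modes`, `mean`, `median`, `percentile`) are hypothetical, and returning the smallest or largest observation when the position $(p/100)(n + 1)$ falls outside $1, \ldots, n$ is an assumption, since the notes do not cover that case.

```python
from collections import Counter


def modes(data):
    """Return every most frequent value, sorted (data may have several modes)."""
    counts = Counter(data)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


def mean(data):
    """Arithmetic average: (1/n) * sum of the observations."""
    return sum(data) / len(data)


def median(data):
    """Middle value of the sorted data, using the odd/even rule from the notes."""
    xs = sorted(data)
    n = len(xs)
    if n % 2 == 1:
        return xs[(n + 1) // 2 - 1]            # position (n + 1)/2, 1-based
    return (xs[n // 2 - 1] + xs[n // 2]) / 2   # average of positions n/2 and n/2 + 1


def percentile(data, p):
    """p-th percentile using the (p/100)(n + 1) position rule from the notes."""
    xs = sorted(data)
    n = len(xs)
    # Position = k + rem/100, kept as quotient and remainder to avoid rounding issues.
    k, rem = divmod(p * (n + 1), 100)
    k = int(k)
    if k < 1:        # assumption: position before the first observation
        return xs[0]
    if k >= n:       # assumption: position at or beyond the last observation
        return xs[-1]
    if rem == 0:     # integer position: take that observation
        return xs[k - 1]
    return (xs[k - 1] + xs[k]) / 2  # simple average of the two neighbours
```

With this rule, `percentile(data, 50)` agrees with `median(data)` for both odd and even $n$, which matches the remark that the 50th percentile is the median.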
Example 1.3.1. The following values of fracture stress (in megapascals) were measured for a sample of 24 mixtures of hot mixed asphalt (HMA).

30 75 79 80 80 105 126 138 149 179 179 191 223 232 232 236 240 242 245 247 254 274 384 470

1. Find the mode of the data.
2. Find the mean of the data.
3. Find the 25th percentile (first quartile) of the data.
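One way to check hand calculations for this example is the short Python sketch below. It applies the $(p/100)(n + 1)$ position rule with a simple average of the two neighbouring observations, and uses the standard-library `statistics` module for the mode and mean. The variable names are illustrative only. The last line calls `statistics.quantiles`, which interpolates with a weighted average, so its first quartile is expected to differ slightly from the hand rule.

```python
import statistics

# Fracture stress (MPa) for the 24 HMA mixtures in Example 1.3.1.
stress = [30, 75, 79, 80, 80, 105, 126, 138, 149, 179, 179, 191,
          223, 232, 232, 236, 240, 242, 245, 247, 254, 274, 384, 470]

xs = sorted(stress)
n = len(xs)

# 1. Mode(s): every value that occurs most often.
mode_values = statistics.multimode(xs)

# 2. Mean: sum of the observations divided by n.
mean_value = statistics.mean(xs)

# 3. 25th percentile: position (25/100)(n + 1) = 6.25 is not an integer,
#    so average the observations in positions 6 and 7.
k, rem = divmod(25 * (n + 1), 100)
q1_hand = xs[k - 1] if rem == 0 else (xs[k - 1] + xs[k]) / 2

# Library value for comparison (weighted-average interpolation).
q1_library = statistics.quantiles(xs, n=4, method="exclusive")[0]

print(mode_values)            # three modes, so the data are trimodal
print(round(mean_value, 2))
print(q1_hand, q1_library)
```

The two first-quartile values illustrate the note that statistical packages may not reproduce the simple-average rule exactly.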
### 1.3.2 Measures of Spread

The sample variance is a measure of spread based on the sum of squared deviations from the center (as measured by the mean), divided by $n - 1$.

$$s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

An equivalent formula, which is often easier to use, is

$$s^2 = \frac{1}{n - 1} \left( \sum_{i=1}^{n} x_i^2 - n \bar{x}^2 \right)$$

The sample standard deviation is simply the square root of the sample variance. That is,

$$s = \sqrt{s^2} = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

Question: Why should we take the square root?
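As a quick numerical check that the two variance formulas agree, here is a minimal Python sketch. The function names and the small data set are hypothetical and chosen only for illustration; the code assumes at least two observations so that $n - 1$ is nonzero.

```python
import math


def sample_variance(data):
    """Definition: sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    xbar = sum(data) / n
    return sum((x - xbar) ** 2 for x in data) / (n - 1)


def sample_variance_shortcut(data):
    """Equivalent form: (sum of x_i^2 - n * xbar^2) / (n - 1)."""
    n = len(data)
    xbar = sum(data) / n
    return (sum(x * x for x in data) - n * xbar ** 2) / (n - 1)


def sample_std(data):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(data))


# Illustrative data (not from the notes): mean is 5, squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_variance(data))           # 32/7
print(sample_variance_shortcut(data))  # same value up to rounding
print(sample_std(data))
```

The shortcut form needs only the running totals of $x_i$ and $x_i^2$, which is why it is often easier by hand. In floating-point arithmetic it can lose accuracy when the mean is large relative to the spread, so the definitional form is usually the safer choice in code.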