Terms  Definitions 

Name three measures of the central tendency of data. 
Three measures of the central tendency of data are: mean, median, and mode.

How do you calculate the arithmetic mean (mean) of a variable? 
The arithmetic mean (mean) of a variable is computed by finding the sum of all the values of the variable in the data set and dividing the sum by the number of observations.

What symbol is used for the mean of population data and what is the formula? 
The symbol used is µ, and the formula is: µ = (Σxi)/N, where N is the size of the population.

What symbol is used for the mean of sample data and what is the formula? 
The symbol used is xbar, and the formula is: xbar = (Σxi)/n, where n is the size of the sample.
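As a minimal sketch of the formula above, the following Python snippet sums the values and divides by the number of observations. The function name `arithmetic_mean` and the data values are illustrative assumptions, not part of the original card. The same arithmetic applies to µ (dividing by N) and to xbar (dividing by n).

```python
from statistics import mean


def arithmetic_mean(values):
    """Sum all observations and divide by how many there are."""
    if not values:
        raise ValueError("need at least one observation")
    return sum(values) / len(values)


# Hypothetical data set used only for illustration.
scores = [82, 77, 90, 71, 62]

print(arithmetic_mean(scores))  # 382 / 5 = 76.4
print(mean(scores))             # standard-library equivalent
```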

What is the median of a variable?
The median of a variable is the value that lies in the middle of the data when the values are arranged in ascending order. We use M to represent the median.

How is the median of variable data determined? 
First arrange the values in ascending order.
If there is an odd number of data values, take the value in the middle. If there is an even number of values, take the average of the two middle values.

What is a parameter?
A descriptive measure of a population.

What is a statistic? 
A descriptive measure of a sample.

What is the mode of variable data? 
The mode is the value of the variable that occurs most frequently in the data. A data set may have two modes (bimodal), more than two modes (multimodal), or no mode at all if no value occurs more than once.
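The median procedure and the mode definition from the cards above can be sketched in Python as follows. The helper `median_by_hand` and the data values are hypothetical, chosen only to illustrate the odd and even cases; `statistics.multimode` is the standard-library function that returns every most-frequent value.

```python
from statistics import median, multimode


def median_by_hand(values):
    """Sort the data, then take the middle value (odd count)
    or the average of the two middle values (even count)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one observation")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_data = [7, 3, 9, 1, 5]   # sorted: 1 3 5 7 9
even_data = [7, 3, 9, 1]     # sorted: 1 3 7 9

print(median_by_hand(odd_data))   # middle value: 5
print(median_by_hand(even_data))  # (3 + 7) / 2 = 5.0

# multimode returns every value tied for the highest frequency.
print(multimode([2, 4, 4, 6, 6, 8]))  # bimodal: [4, 6]
```

Note that `multimode` returns every value when nothing repeats; under the convention used on the card, that case is described as having no mode.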

Which of the three measures of central tendency is/are resistant to extreme values? 
The median and mode are resistant to extreme values. The mean can be significantly influenced by extreme values.

What does it mean for a numerical summary, such as the median, to be resistant to extreme values?
A numerical summary of data is said to be resistant if extreme values (very large or small) relative to the data do not affect its value substantially.
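To see resistance in action, the short sketch below adds a single extreme value to a small data set and compares how far the mean and the median move. The salary figures are made up for illustration.

```python
from statistics import mean, median

# Hypothetical salaries (in thousands); the second list adds one extreme value.
typical = [40, 42, 45, 47, 50]
with_extreme = typical + [400]

print(mean(typical), median(typical))            # mean 44.8, median 45
print(mean(with_extreme), median(with_extreme))  # mean 104, median 46
```

The mean more than doubles, while the median shifts by only one unit. The mean also ends up far above the median, which is the pattern the cards below associate with a right-skewed distribution.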

If we collect data and the mean is substantially smaller than the median, what is the likely shape of the distribution? 
Skewed left

If we collect data and the mean and the median are close in value, what is the likely shape of the distribution? 
Symmetric

If we collect data and the mean is substantially larger than the median, what is the likely shape of the distribution? 
Skewed right

Name four measures that describe the dispersion or spread of variable data. 
Four measures that describe the dispersion or spread of variable data are: 1) Range, 2) Variance, 3) Standard Deviation, and 4) Interquartile Range.

How is the Range of variable data defined? 
The range, R, of a variable is the difference between the largest data value and the smallest data value.
R = Largest Data Value – Smallest Data Value

How is variance computed for population data?
The population variance of a variable is the sum of squared deviations about the population mean divided by the number of observations in the population, N.

What symbol is used for population variance? 
The population variance is symbolically represented by σ^2 (lower case Greek sigma squared).

What is the formula for calculating population variance? 
σ^2 = Σ (xi – µ)^2/N
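A minimal sketch of this formula in Python, assuming the list holds every value in the population. The function name `population_variance` and the data are illustrative; `statistics.pvariance` is the standard-library equivalent.

```python
from statistics import pvariance


def population_variance(values):
    """Average squared deviation from the population mean (divide by N)."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one observation")
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


# Hypothetical population of five values.
population = [2, 4, 4, 6, 9]  # mean = 25 / 5 = 5

# Squared deviations: 9, 1, 1, 1, 16 -> sum 28 -> 28 / 5 = 5.6
print(population_variance(population))
print(pvariance(population))  # standard-library equivalent
```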

How is SAMPLE variance determined? 
The sample variance is computed by finding the sum of squared deviations about the SAMPLE mean and then dividing this result by n – 1.

What symbol is used for sample variance? 
The sample variance is symbolically represented by s^2.

Give the formula for calculating sample variance. 
The formula for calculating sample variance is:
s^2 = Σ (xi – xbar)^2/(n – 1)

How does the population variance compare to the population standard deviation?
The population standard deviation is obtained by taking the square root of the population variance.

What symbol is used for population standard deviation? 
The population standard deviation is denoted by the Greek letter σ.

How does the sample standard deviation compare to the sample variance?
The sample standard deviation is obtained by taking the square root of the sample variance.

What symbol is used to represent the sample standard deviation? 
The sample standard deviation is denoted by the symbol "s".

Give the formula used to compute sample standard deviation. 
The formula used to compute sample standard deviation is:
s = SQRT[Σ (xi – xbar)^2/(n – 1)]

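The sample formulas can be sketched the same way. The helper names `sample_variance` and `sample_std` and the data are assumptions made for illustration; `statistics.variance` and `statistics.stdev` are the standard-library functions that also divide by n – 1.

```python
import math
from statistics import stdev, variance


def sample_variance(values):
    """Sum of squared deviations about the sample mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    xbar = sum(values) / n
    return sum((x - xbar) ** 2 for x in values) / (n - 1)


def sample_std(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))


# Hypothetical sample of five values.
sample = [2, 4, 4, 6, 9]  # xbar = 5, squared deviations sum to 28

print(sample_variance(sample))  # 28 / 4 = 7.0
print(sample_std(sample))       # sqrt(7), about 2.646
print(variance(sample), stdev(sample))  # standard-library equivalents
```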
According to the Empirical Rule, if a distribution is roughly bell-shaped, what percentage of the data values should fall within ± 1 standard deviation of the mean?
According to the Empirical Rule, if a distribution is roughly bell-shaped, approximately 68% of the data will lie within ± 1 standard deviation of the mean.
That is, 68% of the data should fall between µ – 1σ and µ + 1σ.

According to the Empirical Rule, if a distribution is roughly bell-shaped, what percentage of the data values should fall within ± 2 standard deviations of the mean?
According to the Empirical Rule, if a distribution is roughly bell-shaped, approximately 95% of the data will lie within ± 2 standard deviations of the mean.
That is, 95% of the data should fall between µ – 2σ and µ + 2σ.

According to the Empirical Rule, if a distribution is roughly bell-shaped, what percentage of the data values should fall within ± 3 standard deviations of the mean?
According to the Empirical Rule, if a distribution is roughly bell-shaped, approximately 99.7% of the data will lie within ± 3 standard deviations of the mean.
That is, 99.7% of the data should fall between µ – 3σ and µ + 3σ.

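The three Empirical Rule percentages can be checked with a short sketch. It assumes that "roughly bell-shaped" is modeled as an exactly normal distribution, which is an idealization; real data will only match approximately. `NormalDist` comes from Python's standard `statistics` module, and `share_within` is a hypothetical helper name.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
bell = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations."""
    return bell.cdf(k) - bell.cdf(-k)


for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
# Prints roughly 68.3, 95.4 and 99.7 percent for k = 1, 2, 3.
```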
What is a z-score and what is it used for?
A z-score is the distance that a value is from the mean, measured in number of standard deviations.
Z-scores are used to standardize data and to compare relative positions.

How is a z-score calculated?
For population data, to convert a value, X, into a z-score, the formula is: Z = (X – µ)/σ.
For sample data, to convert a value, X, into a z-score, the formula is: Z = (X – xbar)/s.

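A minimal sketch of the z-score formula, used here to compare relative positions across two data sets. The function name `z_score`, its parameter names, and the exam figures are all hypothetical.

```python
def z_score(x, center, spread):
    """Number of standard deviations that x lies from the mean."""
    if spread <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - center) / spread


# Hypothetical exams: which score is better relative to its own class?
math_z = z_score(85, center=75, spread=5)     # (85 - 75) / 5 = 2.0
history_z = z_score(90, center=84, spread=6)  # (90 - 84) / 6 = 1.0

print(math_z, history_z)
```

Although 90 is the higher raw score, the math score of 85 sits two standard deviations above its class mean, so it is the stronger result in relative terms.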
How is the kth percentile defined? 
The kth percentile, denoted Pk, of a set of data is a value such that k percent of the observations are less than or equal to the value.
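The definition can be checked directly with a small sketch that counts the observations at or below a candidate value. It does not compute percentiles itself, since textbooks and software use differing conventions for that; the function name `percent_at_or_below` and the data are hypothetical.

```python
def percent_at_or_below(data, value):
    """Percentage of observations less than or equal to value."""
    if not data:
        raise ValueError("need at least one observation")
    count = sum(1 for x in data if x <= value)
    return 100 * count / len(data)


# Hypothetical data set of ten observations.
data = [12, 15, 17, 20, 22, 25, 28, 30, 35, 40]

print(percent_at_or_below(data, 30))  # 8 of 10 values -> 80.0
```

Here 80 percent of the observations are less than or equal to 30, so 30 satisfies the definition of the 80th percentile for this data set.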

What are Quartiles? 
Quartiles divide data sets into fourths, or four equal parts.
The 1st quartile, denoted Q1, divides the bottom 25% of the data from the top 75%. Therefore, the 1st quartile is equivalent to the 25th percentile. The 2nd quartile divides the bottom 50% of the data from the top 50% of the data, so that the 2nd quartile is equivalent to the 50th percentile, which is equivalent to the median. The 3rd quartile divides the bottom 75% of the data from the top 25% of the data, so that the 3rd quartile is equivalent to the 75th percentile.

What is the Interquartile Range?
The interquartile range, denoted IQR, is the range of the middle 50% of the observations in a data set. That is, the IQR is the difference between the third and first quartiles and is found using the formula:
IQR = Q3 – Q1

How is the interquartile range used in identifying outliers?
Fences serve as cutoff points for determining outliers. The IQR is used to determine the upper and lower fences as follows:
Lower fence = Q1 – 1.5(IQR)
Upper fence = Q3 + 1.5(IQR)
A data value less than the lower fence or greater than the upper fence is considered an outlier.

What is the "5-Number Summary"?
1. Minimum Value
2. Q1
3. Median
4. Q3
5. Maximum Value

How is a boxplot used?
The boxplot is primarily used to identify possible outliers.
It can also be used to determine whether a distribution is roughly symmetric, skewed left or skewed right. 
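The 5-number summary, IQR, and fences from the cards above are the quantities a boxplot displays, and they can be sketched in Python as follows. The quartile convention here (median of the lower and upper halves, leaving out the overall median when n is odd) is an assumption; other textbooks and software packages use different conventions and may give slightly different quartiles. The function names and data are hypothetical.

```python
from statistics import median


def quartiles(values):
    """Q1, median, Q3 using the median-of-halves convention (an assumption):
    Q1 is the median of the lower half, Q3 the median of the upper half,
    and the overall median is left out of both halves when n is odd."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


def summarize(values):
    """Five-number summary, IQR, fences, and flagged outliers."""
    q1, med, q3 = quartiles(values)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [x for x in values if x < lower_fence or x > upper_fence]
    return {
        "five_number": (min(values), q1, med, q3, max(values)),
        "iqr": iqr,
        "fences": (lower_fence, upper_fence),
        "outliers": outliers,
    }


# Hypothetical data with one unusually large value.
data = [3, 5, 7, 8, 9, 11, 13, 15, 40]
print(summarize(data))
```

For this data, Q1 = 6 and Q3 = 14, so the IQR is 8 and the fences are –6 and 26; the value 40 lies above the upper fence and is flagged as a possible outlier.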