Stat Outline

1.1
Data- the information we gather with experiments and surveys.
Statistics- the art and science of designing studies and analyzing the data that those studies produce.
Design- planning how to obtain data to answer the question of interest.
Description- summarizing the data that are obtained.
Inference- making decisions and predictions based on the data.

1.2
Subjects- the entities that we measure in a study.
Population- the set of all subjects of interest.
Sample- the subset of the population for whom we have data.
Descriptive statistics- methods for summarizing the data. The summaries usually consist of graphs and numbers such as averages and percentages.
Inferential statistics- methods of making decisions or predictions about a population based on data obtained from a sample of that population.
Parameter- a numerical summary of the population.
Sample statistic- a numerical summary of a sample taken from the population.
Random sampling- each subject in the population has the same chance of being included in the sample.

2.1
Variable- any characteristic that is recorded for the subjects in a study.
Observations- the data values we observe for a variable.
Categorical (variable)- each observation belongs to one of a set of categories. Key feature: the relative number of observations in each category.
Quantitative (variable)- observations take on numerical values that represent different magnitudes of the variable. Key features: center and spread.
Discrete quantitative variables- the possible values form a set of separate numbers, such as 0, 1, 2, 3, …
Continuous quantitative variables- the possible values form an interval.
Mode- the category with the highest frequency.

2.2
Pareto chart- a bar graph with categories ordered by their frequencies, from highest to lowest.
Pareto principle- states that a small subset of categories often contains most of the observations.
Dot plot- shows a dot for each observation, placed just above the value on the number line for that observation.
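As a rough illustration of the mode and the Pareto ordering defined above, the sketch below counts categories and sorts them by frequency. The commute data are made up for the example, and the text bars only stand in for a real bar graph.

```python
from collections import Counter

# Hypothetical categorical observations (invented for illustration only).
observations = ["bus", "car", "car", "bike", "car", "bus", "walk", "car"]

# Frequency of each category.
counts = Counter(observations)

# Pareto ordering: categories sorted from highest to lowest frequency.
pareto_order = counts.most_common()

# Mode: the category with the highest frequency.
mode_category = pareto_order[0][0]

# Relative frequencies (proportions), listed in Pareto order.
n = len(observations)
proportions = [(category, count / n) for category, count in pareto_order]

# Crude text version of a Pareto chart: one '#' per observation.
for category, count in pareto_order:
    print(f"{category:<5} {'#' * count} ({count})")

print("Mode:", mode_category)
```

Here one category ("car") accounts for half of the observations, which is the kind of pattern the Pareto principle describes.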
Stem-and-leaf plot- each observation is split into a stem, which consists of all digits except the last, and a leaf, which is the last digit.
Histogram- a graph that uses bars to portray the frequencies or the relative frequencies of the possible outcomes for a quantitative variable.
Unimodal- a distribution with a single mound; its highest point is at the mode.
Bimodal- a distribution with two distinct mounds.
Skewed to the left (right)- the tail of the distribution extends to the left (right) and the mound is to the right (left).
Time series- a data set collected over time.
Time plot- a graph of time series data.

2.3
Mean- the sum of the observations divided by the number of observations (the average).
Median- the midpoint of the observations when they are ordered from smallest to largest.
Outlier- an observation that falls well above or below the overall bulk of the data.
Resistant- describes a numerical summary that is barely influenced by extreme observations.

2.4
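To make the 2.3 idea of a resistant summary concrete, the sketch below compares the mean and the median of a small data set before and after adding one outlier. The numbers are made up for the example; `mean` and `median` come from Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical quantitative observations (invented for illustration only).
values = [2, 3, 4, 5, 6]

# The same data with one extreme observation (an outlier) added.
with_outlier = values + [100]

mean_before = mean(values)          # 20 / 5 = 4
median_before = median(values)      # middle value of 2, 3, 4, 5, 6 is 4

mean_after = mean(with_outlier)     # 120 / 6 = 20
median_after = median(with_outlier) # average of the two middle values: (4 + 5) / 2 = 4.5

print("Mean:  ", mean_before, "->", mean_after)
print("Median:", median_before, "->", median_after)
```

The outlier pulls the mean from 4 to 20 while the median only moves from 4 to 4.5, so the median is resistant and the mean is not.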

