MAT 221
Elementary Probability and Statistics I

January 20th, 2016

Overview
- Statistics is the science of learning from data.
- Statistics includes collecting data, organizing data, analyzing data, drawing conclusions from data, and presenting study results.
- We often want to know a specific piece of information about a large group of people or things. If we could get this information from every person or thing in the group, then we would not need statistics.

The two main activities of statistics
- Estimating a characteristic of the population
- Testing a hypothesis or claim about the population

Chapter 1: Looking at Data – Distribution

1.1 Data
- In a study, we collect information (data) from cases. Cases can be individuals, companies, animals, plants, or any objects of interest.
- A variable is any characteristic of a case. A variable varies among cases.
- A label is a special variable used in some data sets to distinguish the different cases. Each case has a unique label. A label is not a variable that we are interested in studying; it is only used to tell the cases apart.
- The distribution of a variable tells us what values the variable takes and how often it takes these values.
- Data consist of numbers (or categories) recorded for the cases, along with the context.
- A quantitative variable is a variable that is given by numerical values for which arithmetic operations, such as adding and averaging, make sense.
- A categorical variable is a variable that is given by one of several categories.
  For a categorical variable, what we can report is the count or proportion of cases in each category.

January 22nd, 2016

1.2 Displaying Distributions with Graphs
- To present categorical data, use bar graphs and pie charts.
  o The percentages of all categories must add up to 100.
- To correctly interpret a graph, you must analyze the numerical information given in the graph, so as not to be misled by the graph’s shape.
  o Read labels and units on the axes.
- Numerical scales such as timelines should progress from left to right or bottom to top.
- The slices in a pie chart should represent non-overlapping parts of a whole.
- To present quantitative data, use stem plots and histograms.
- Each vertical bar in a histogram is called a bin; the width of each bin is called the bin size.
  o Rule of thumb: start with 5 to 10 bins, look at the distribution, and refine your bins.
  o There isn’t a unique or “perfect” solution.
- Outliers are observations (numbers) that lie outside the overall pattern of a distribution.
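
As a rough illustration of the ideas in 1.1 and 1.2, the Python sketch below tabulates the distribution of a categorical variable (counts and percentages) and groups a quantitative variable into equal-width bins, the way a histogram would. The function names, the data, and the choice of 6 bins are assumptions made up for this example; they are not part of the course material, and other binning choices are equally valid.

```python
from collections import Counter


def categorical_distribution(values):
    """Return {category: (count, percent)} for a categorical variable."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: (n, 100 * n / total) for cat, n in counts.items()}


def histogram_counts(values, num_bins=5):
    """Split [min, max] into num_bins equal-width bins and count values per bin."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; equal-width bins are undefined")
    bin_size = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The maximum sits on the right edge, so it is placed in the last bin.
        idx = min(int((v - lo) / bin_size), num_bins - 1)
        counts[idx] += 1
    return bin_size, counts


# Hypothetical data, invented for illustration only.
majors = ["Math", "Biology", "Math", "History", "Biology", "Math", "Math", "History"]
scores = [52, 61, 64, 68, 70, 71, 73, 75, 78, 80, 82, 85, 88, 91, 100]

# Distribution of a categorical variable: count and percent per category.
dist = categorical_distribution(majors)
for category, (count, percent) in dist.items():
    print(f"{category}: count={count}, percent={percent:.1f}")

# Distribution of a quantitative variable: a text histogram with 6 bins.
bin_size, counts = histogram_counts(scores, num_bins=6)
start = min(scores)
for i, n in enumerate(counts):
    left = start + i * bin_size
    print(f"[{left:5.1f}, {left + bin_size:5.1f}) " + "*" * n)
```

Rerunning `histogram_counts` with a different `num_bins` (for example 5 or 10) is one way to follow the rule of thumb above: start with a handful of bins, look at the shape, and refine.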
