Stats Workbook - Author Brenda Gunderson Ph.D 2014 License Unless otherwise noted this material is made available under the terms of the Creative



Author: Brenda Gunderson, Ph.D., 2014

License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution-NonCommercial-Share Alike 3.0 Unported License:

The University of Michigan Open.Michigan initiative has reviewed this material in accordance with U.S. Copyright Law and has tried to maximize your ability to use, share, and adapt it. The attribution key provides information about how you may share and adapt this material.

Copyright holders of content included in this material should contact [email protected] with any questions, corrections, or clarification regarding the use of content.

For more information about how to attribute these materials visit: .

Some materials are used with permission from the copyright holders. You may need to obtain new permission to use those materials for other uses. This includes all content from:

Mind on Statistics
Utts/Heckard, 4th Edition, Cengage L, 2012
Text Only: ISBN 9781285135984
Bundled version: ISBN 9780538733489

SPSS and its associated programs are trademarks of SPSS Inc. for its proprietary computer software. Other product names mentioned in this resource are used for identification purposes only and may be trademarks of their respective companies.

Attribution Key

For more information see: http://open.umich.edu/wiki/AttributionPolicy

Content the copyright holder, author, or law permits you to use, share and adapt:
  Creative Commons Attribution-NonCommercial-Share Alike License
  Public Domain – Self Dedicated: Works that a copyright holder has dedicated to the public domain.

Make Your Own Assessment

Content Open.Michigan believes can be used, shared, and adapted because it is ineligible for copyright:
  Public Domain – Ineligible: Works that are ineligible for copyright protection in the U.S. (17 USC §102(b)) *laws in your jurisdiction may differ.
Content Open.Michigan has used under a Fair Use determination:
  Fair Use: Use of works that is determined to be Fair consistent with the U.S. Copyright Act (17 USC § 107) *laws in your jurisdiction may differ.

Our determination DOES NOT mean that all uses of this third-party content are Fair Uses and we DO NOT guarantee that your use of the content is Fair. To use this content you should conduct your own independent analysis to determine whether or not your use will be Fair.

Statistics 250 Lab Workbook
Fall 2014 - Winter 2015
Weekly Labs, In-Lab Projects, Supplements, and Old Exams for Review
Used in all lab sections of Stat 250

Dr. Brenda Gunderson
Department of Statistics
University of Michigan

Table of Contents

Note to Students and Supplements
  Supplement 1: SPSS Commands Summary
  Supplement 2: Notation Sheet
  Supplement 3: Name That Scenario
  Supplement 4: Editing Charts in SPSS
  Supplement 5: Notes about SPSS t Procedures
  Supplement 6: Interpretation Examples
  Supplement 7: Summary of the Main t-Tests
  Supplement 8: Regression Output in SPSS

Lab 1: Describing Data with Graphs and Numbers: Histograms, Mean, Median, and Range
Lab 2: Describing Data with Graphs and Numbers: Boxplots, Standard Deviation, QQ Plots, Time Plots
Lab 3: Probability and Random Variables
Lab 4: Confidence Intervals for a Population Proportion
Lab 5: Hypothesis Testing for a Population Proportion
Lab 6: Sampling Distributions and the CLT
Lab 7: One-Sample t-Test Procedures
Lab 8: Paired t-Test Procedures
Lab 9: Two Independent Samples t-Test Procedures
Lab 10: One-Way Analysis of Variance (ANOVA)
Lab 11: Simple Linear Regression
Lab 12: Chi-Square Tests

Old Exams for Review
  Exam 1 Questions
  Exam 2 Questions
  Final Exam Questions

Note to Students

Welcome to Statistics 250 at the University of Michigan! This lab workbook is designed for you to use in lab and as extra preparation for exams.
In the workbook, you will find the following materials:

Supplemental Material – great summaries for reference throughout the term:
  1. SPSS Commands Reference
  2. Notation Sheet
  3. Name That Scenario
  4. Editing Charts in SPSS
  5. Important Notes for Hypothesis Testing
  6. Interpretation Examples
  7. Summary of T-tests and Name That Scenario Practice for Means
  8. Regression Output in SPSS

Weekly Labs (numbered 1 to 13) – each lab contains the following parts:
  o Lab Background – objective and brief overview material, which is good to take a couple of minutes to read before you come to lab each week.
  o Warm-Up Activity – quick questions for you to do before the In-Lab Project, usually a quick review of concepts you have seen in lecture.
  o ILP (In-Lab Project) – one or more activities you will work on in lab, in groups.
  o Cool-Down Activity – questions for you to do after the ILP for further reflection and application of the concepts covered in the ILP.
  o Example Exam Questions – old exam questions on the lab topic for practice.
  Note: each week some part of the weekly lab will be your Ticket out the Door, collected by your GSI and part of your lab grade.

Old Exams – complete sets of actual old exams for studying. Be sure to refer to CTools to see if any problems on these old exams are not relevant for your particular upcoming exam (due to differences in the semester schedule). This information, in addition to solutions, will be posted on CTools in the “Review Info” folder under the “Resources” tab closer to each exam date.

The Labs are designed to be interactive and to provide you with a complete example for each concept. Completing the corresponding PreLab assignment (a link to video instructions for PreLabs will be on CTools and the Stat 250 YouTube channel) and reading the upcoming lab background overview before lab each week is a good way to prepare for the various lab activities.

Good luck in Statistics 250!
-- The Stat 250 Instructors and GSIs

Supplement 1: SPSS Commands Summary By Lab – For Quick Reference

Lab 1 – Bar Charts, Histograms, Numerical Summaries
  Open a data file after having SPSS already open: File> Open> Data
  To produce a Histogram: Graphs> Legacy Dialogs> Histogram
  To generate Descriptive Statistics I: Analyze> Descriptive Statistics> Descriptives
  To generate Descriptive Statistics II: Analyze> Descriptive Statistics> Frequencies
  To produce a Bar Chart: Graphs> Legacy Dialogs> Bar> Simple> Summaries of separate variables

Lab 2 – Boxplots, Time Plots, Q-Q Plots
  To produce a Boxplot for a single variable with no groups: Graphs> Legacy Dialogs> Boxplot> Simple> Summaries for separate variables
  To use Data Label Mode: From inside the Chart Editor, Elements> Data Label Mode
  To Split (or unsplit) the data (get charts and statistics by group): Data> Split File
  To produce Side-by-Side Boxplots: Graphs> Legacy Dialogs> Boxplot> Simple> Summaries for groups of cases
  To produce a Sequence (Time) Plot: Analyze> Forecasting> Sequence Charts
  To produce a Q-Q Plot: Analyze> Descriptive Statistics> Q-Q Plots

Lab 7 – One-Sample t Procedures for a Population Mean
  To produce a Confidence Interval for a population mean (method I): Analyze> Descriptive Statistics> Explore> Statistics option
  To produce a Confidence Interval for a population mean (method II): Analyze> Compare Means> One-Sample T Test
  To perform a One-Sample T Test for a population mean: Analyze> Compare Means> One-Sample T Test

Lab 8 – Paired t Procedures
  To calculate a confidence interval for µD: Analyze> Compare Means> Paired-Samples T Test
  To perform a Paired T Test: Analyze> Compare Means> Paired-Samples T Test
  To compute Differences: Transform> Compute

Lab 9 – Independent Samples t Procedures
  To construct a confidence interval for µ1 - µ2: Analyze> Compare Means> Independent-Samples T Test
  To perform a Two-Samples T Test: Analyze> Compare Means> Independent-Samples T Test

Lab 10 – One-way Analysis of Variance (ANOVA)
  To perform an ANOVA: Analyze> Compare Means> One-Way ANOVA

Lab 11 – Simple Linear Regression
  To produce a Scatterplot: Graphs> Legacy Dialogs> Scatter/Dot
  To perform a Linear Regression: Analyze> Regression> Linear
  To produce a Residual plot: Graphs> Legacy Dialogs> Scatter/Dot

Lab 12 – Chi-Square Tests
  To weight cases by Counts: Data> Weight Cases
  To perform a Goodness of Fit Test: Analyze> Nonparametric Tests> Chi-Square
  To perform a Test of Independence: Analyze> Descriptive Statistics> Crosstabs
  To perform a Test of Homogeneity: Analyze> Descriptive Statistics> Crosstabs

Supplement 2: Notation Sheet

The table below defines important notations, including that used by SPSS, which you will come across in the course. This is not an exhaustive list, but it is a fairly comprehensive overview of the “strange letters” used in the course. Note: Where an entry is omitted, there is no corresponding notation.

Summary Measures
  Mean: population μ (read as “mu”); sample x̄ (x-bar); SPSS: Mean
  Proportion: population p; sample p̂ (p-hat)
  Standard deviation: population σ (sigma); sample s; SPSS: Std. Deviation
  Variance: population σ²; sample s²; SPSS: Variance
  Sample size: n; SPSS: N

Confidence Intervals
  Multipliers: z* (z-star), t* (t-star)
  Margin of error: m, m.e.

Hypothesis Testing
  Test statistics: z, t, F, χ² (chi-square); SPSS: t, F, Chi-square
  Note: t, F, and χ² statistics have degrees of freedom (abbreviated df) associated with them. Look for these on your Formula Card.
  Significance level: α (alpha)
  p-value: p-value; SPSS: Sig.

Analysis of Variance (abbreviated ANOVA)
  Sum of squares for groups: SSG; SPSS: Between groups (look in the column labeled Sum of Squares)
  Sum of squares for error: SSE; SPSS: Within groups (look in the column labeled Sum of Squares)
  Mean square for groups: MSG; SPSS: Between groups (look in the column labeled Mean Square)
  Mean square error: MSE; SPSS: Within groups (look in the column labeled Mean Square)

Regression
  Response (dependent) variable: population y; sample y; SPSS: (given by name of y-variable)
  Predicted (estimated) response: population E(y) (expected value of y); sample ŷ (y-hat)
  Explanatory (independent) variable: population x; sample x; SPSS: (given by name of x-variable)
  y-intercept: population β0 (beta-naught); sample b0; SPSS: B (look in the row labeled (Constant))
  Slope: population β1 (beta-one); sample b1; SPSS: B (look in the row labeled with the name of the x-variable)
  Coefficient of correlation: sample r; SPSS: R, Beta
  Coefficient of determination: sample r²; SPSS: R Square
  Error terms vs residuals: population ε (error terms); sample e (residuals); SPSS: Unstandardized Residuals

Supplement 3: Name That Scenario

The first thing to do in any research inference problem is determine what type of inference problem it is. This will help in deciding what procedure/formulas are appropriate to use. The following questions can help you determine the data scenario you are working with. Please note, when answering “How many variables are there?”, do not count the variable which defines the populations (if there is more than one population).

  - How many populations are there? One / Two / More than two
  - How many variables are there? One / Two
  - What type of variable(s)? Categorical / Quantitative
  - Then use the following table to determine which type of inference would be appropriate for this scenario. Note the corresponding parameter is in parentheses, where appropriate.

One variable, Categorical
  One population:
    - 1-sample inference for population proportion (p) (Labs 5 and 6)
    - Chi-square: Goodness of Fit (Lab 13)
  Two populations:
    - 2 indep. samples inference for the difference between 2 population proportions (p1 – p2)
    - Chi-square: Homogeneity (Lab 13)
  More than two populations:
    - Chi-square: Homogeneity (Lab 13)

One variable, Quantitative
  One population:
    - 1-sample inference for population mean (µ) (Lab 8)
    - Paired samples inference for a population mean difference (µD) (Lab 9)
  Two populations:
    - 2 indep. samples inference for the difference between 2 population means (µ1 - µ2) (Lab 10)
  More than two populations:
    - ANOVA (µi – where there is one µi for each population) (Lab 11)

Two variables, Categorical
  One population:
    - Chi-square: Independence (relationship) (Lab 13)

Two variables, Quantitative
  One population:
    - Regression (β1) (relationship) (Lab 12)

Supplement 4: Editing Charts in SPSS

Once we have a histogram (or any chart) made, we may wish to edit the chart (perhaps to change the color of the bars or change the number of class intervals). To do this, double click on the chart displayed in the output viewer window. This will open the chart in the SPSS Chart Editor window.

Suppose we want to change the color of the histogram bars from tan to a light green. To do this, double click on one of the tan bars and click on the Fill & Border tab in the properties window. Change the fill color to light green (click on the light green box color) and click on Apply and then Close.

To change some aspect of the x-axis, such as the scaling, double click on the x-axis and its corresponding properties box will appear with many tab options. The Scale tab can be used to change the endpoints and major increments of the x-axis value labels. You could adjust the minimum to 10,000 and the maximum to 150,000. Leaving the Auto box checked for the Major Increment option will let SPSS create the increment size.

The Histogram Options tab would allow you to add a normal curve to a histogram, as well as change the starting position and size of the bins (classes or intervals) in the histogram.
In general, SPSS uses algorithms to produce a nice display of the data. These options are helpful if you have multiple plots that you would like to display using the same x-axis values so comparisons can be more easily made. You can also change the gray background color to white under the Properties window with the background highlighted. The help button in the lower right corner of the Properties box can be selected to provide more details about any of the various options for that tab. Once you have finished customizing your chart, you can close out the chart editor.

There are alternate ways to get to the Properties box in order to customize your chart. Once you have double-clicked on your chart to open the Chart Editor, click once on the part of the chart that you wish to customize (so that it is highlighted). Then, click on the Show Properties Window tab (it looks like a paint palette) in your menu. This will open the Properties box. Another alternative is to simply select Properties under the Edit menu in the Chart Editor. Also, note that if you do not close the Properties box, and you continue highlighting different parts of your chart, the Properties box updates so that you can customize those parts as you go.

For boxplots, if there are any points denoted as outliers, you can identify them by looking at their case label number in the default output. The Chart Editor provides a special mode for identifying individual cases whose data labels you want to display. This is the data label mode, and when you are in data label mode, you can't change anything else in the chart. From the menus, choose Elements> Data Label Mode. The cursor changes shape to indicate that you are in data label mode. Click the data element for which you want to display the case label. If there are overlapping data elements in the spot that you click, the Chart Editor displays the Select Data Element to Label dialog box.
This dialog allows you to select the specific data element or elements for which you want to display data labels. The Chart Editor displays the data label in a default position related to the data element. When you are finished choosing data elements, from the menus choose Elements> Data Label Mode again, and the cursor changes back to the arrow to indicate that you are no longer in Data Label Mode.

The Options menu lets you customize your chart further. You may add a title or text box from this menu. Text boxes can appear anywhere in a chart. From the Chart Editor menus, select Options> Text Box or Options> Title depending on which you want. For titles, the Chart Editor creates the title box and automatically positions it in the top center of the chart. Type the text and press Enter when you are finished typing. To enter line breaks, press Shift+Enter. If necessary, use the Text tab to format the text. For text boxes, you can drag and drop to reposition them. You may need to resize the graph so the text box will not cover up part of the graph. You can also copy the plots into MS Word or another text editor and then type in your name and title within the document.

Saving Output Boxes and Graphs

Images and other output from SPSS can often be copied and then pasted into a document by selecting the desired output, right-clicking, and choosing Copy.

1. To save an output box, such as a table of descriptive statistics, first have the location where you would like to store the output open. Then right-click on the output table and select Copy. The table can then be pasted into a document or text-field (such as those in your PreLab assignments or homework). If you are pasting into a Word document and the output does not appear to format correctly, it may be a good idea to choose Paste Special and paste as an image. Your Stat 250 GSIs prefer not to receive Word documents.

2. To save a graph, such as a boxplot or histogram, you will want to Export the graph as an image. Select the graph you wish to export, then select Export from the File menu. At the top click the Selection button, select None (graphics only) under Document, and then choose the file type at the bottom. For uploading to , the extension “.jpg” or “.png” is required. Before completing the export command, be sure to give your file an informative name and note the location where the file will be saved.

3. To save an entire SPSS session, you can export output (all output in the viewer, charts only, or text only) in many possible formats (html, jpeg, bitmap, etc.). First make the Viewer the active window, then select the Export command under the File menu. You can also print the contents of your output viewer window (all output, text as well as charts) or any selected portion. Click on File from the menu bar and then choose Print. You could also just save your output file within SPSS by selecting the Save as option, giving a name for your file, and clicking OK.

Supplement 5: Notes about SPSS t Procedures

1. The reported p-value under the column heading of Sig. (2-tailed) is for a 2-sided test. For a one-sided test, you first divide the reported 2-tailed p-value in half (p/2). If the t-statistic is positive and the alternative hypothesis was upper-tailed (>), then p/2 is the p-value. If the t-statistic is negative and the alternative hypothesis was lower-tailed (<), then p/2 is the p-value. However, if the t-statistic is positive and the alternative hypothesis was lower-tailed (<), or if the t-statistic is negative and the alternative hypothesis was upper-tailed (>), then the p-value is 1 - p/2.
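As a rough illustration of the halving rule in note 1, the conversion can be sketched in a few lines of Python. This is not part of SPSS or of the workbook: the function name `one_sided_p_value` and the labels "greater" and "less" are hypothetical names chosen for this sketch, and the numbers in the usage lines are the workbook's own example values (Sig. (2-tailed) of 0.364 with t = 0.948).

```python
def one_sided_p_value(sig_two_tailed, t_stat, alternative):
    """Convert a reported Sig. (2-tailed) value into a one-sided p-value.

    sig_two_tailed: the two-tailed p-value as reported in the output.
    t_stat: the observed t-statistic.
    alternative: "greater" for an upper-tailed (>) alternative,
                 "less" for a lower-tailed (<) alternative.
    (Hypothetical helper for illustration; these names are assumptions.)
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")

    half = sig_two_tailed / 2

    # The sign of t "agrees" with the alternative when the observed statistic
    # falls on the side that the alternative hypothesis points to.
    agrees = (t_stat > 0 and alternative == "greater") or (
        t_stat < 0 and alternative == "less"
    )

    # Agreement: p-value is sig/2. Otherwise: p-value is 1 - sig/2.
    # (If t is exactly 0, sig is 1 and both branches give 0.5.)
    return half if agrees else 1 - half


# Usage with the workbook's example values: t = 0.948, Sig. (2-tailed) = 0.364
print(round(one_sided_p_value(0.364, 0.948, "greater"), 3))  # 0.182
print(round(one_sided_p_value(0.364, 0.948, "less"), 3))     # 0.818
```

The sketch only rearranges the value SPSS already reports; it does not recompute anything from the t-distribution, so it is only as accurate as the displayed Sig. value.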
  t-statistic positive:
    Alternative is >, then p-value is sig/2
    Alternative is <, then p-value is 1 – (sig/2)
  t-statistic negative:
    Alternative is >, then p-value is 1 – (sig/2)
    Alternative is <, then p-value is sig/2

For example, consider a 2-sided test with an observed t statistic value of 0.948 and a p-value of 0.364. This 0.364 is actually the sum of two equal areas: one being the area to the right of 0.948 and the other being the area to the left of -0.948 under the t-distribution curve with df = 11 (see Figure 1 below). If the alternative hypothesis had been upper-tailed (>), then the p-value would be only the area to the right (in the direction of extreme) of the observed t-statistic of 0.948, which is half of the two-sided p-value, or 0.182. But if the alternative hypothesis had been lower-tailed (<), the p-value would be the area to the left (in the direction of extreme) of the observed t-statistic of 0.948. So the p-value would be 1 – (0.364/2) = 0.818.

2. Sometimes SPSS will display a p-value of 0.000. Clearly, the probability is not exactly zero. Rather, it is zero when rounded to three decimal places. Thus, it is correct to say that the p-value is less than 0.0005, since anything greater when round...