Chapters 2 and 10

Course: STAT 301, Fall 2011
School: Purdue
Chapters 2 and 10: Least Squares Regression

Learning goals for this chapter:
- Describe the form, direction, and strength of a scatterplot.
- Use SPSS output to find the following: least-squares regression line, correlation, $r^2$, and estimate for $\sigma$.
- Interpret a scatterplot, residual plot, and Normal probability plot.
- Calculate the predicted response and residual for a particular x-value.
- Understand that least-squares regression is only appropriate if there is a linear relationship between x and y.
- Determine explanatory and response variables from a story.
- Use SPSS to calculate a prediction interval for a future observation.
- Perform a hypothesis test for the regression slope and for zero population correlation/independence, including: stating the null and alternative hypotheses, obtaining the test statistic and P-value from SPSS, and stating the conclusions in terms of the story.
- Understand that correlation and causation are not the same thing.
- Estimate correlation for a scatterplot display of data.
- Distinguish between prediction and extrapolation.
- Check for differences between outliers and influential outliers by rerunning the regression.
- Know that scatterplots and regression lines are based on sample data, but hypothesis tests and confidence intervals give you information about the population parameter.

When you have 2 quantitative variables and you want to look at the relationship between them, use a scatterplot. If the scatterplot looks linear, then you can do least-squares regression to get an equation of a line that uses x to explain what happens with y.

The general procedure:
1. Make a scatterplot of the data from the x and y variables. Describe the form, direction, and strength. Look for outliers.
2. Look at the correlation to get a numerical value for the direction and strength.
3. If the data are reasonably linear, get an equation of the line using least-squares regression.
4. Look at the residual plot to see if there are any outliers or the possibility of lurking variables. (Patterns bad, randomness good.)
5. Look at the normal probability plot to determine whether the residuals are normally distributed. (Dots that stick close to the 45-degree line are good.)
6. Look at hypothesis tests for the correlation, slope, and intercept. Look at confidence intervals for the slope, intercept, and mean response, and at the prediction intervals.
7. If you had an outlier, you should re-work the data without the outlier and comment on the differences in your results.

Association
- Positive, negative, or no association.
- Remember: ASSOCIATION or CORRELATION is NOT the same thing as CAUSATION. (See chapter 3/2.5 notes.)

Response variable: Y
- Dependent variable
- Measures an outcome of a study

Explanatory variable: X
- Independent variable
- Explains or is related to changes in the response variables (p. 105)

Scatterplots:
- Show the relationship between 2 quantitative variables measured on the same individuals.
- Dots only; don't connect them with a line or a curve.
- Form: Linear? Non-linear? No obvious pattern?
- Direction: Positive or negative association? No association?
- Strength: How closely do the points follow a clear form? Strong, weak, or moderate?
- Look for OUTLIERS!

Correlation: r measures the direction and strength of the linear relationship between 2 quantitative variables. It is built from the standardized value of each observation with respect to the mean and standard deviation of its variable:

$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right) \left(\frac{y_i - \bar{y}}{s_y}\right)$$

where we have data on variables x and y for n individuals. You won't need to use this formula, but SPSS will.

Using SPSS to get correlation: Use the Pearson Correlation output. Analyze --> Correlate --> Bivariate (see page 55 in the SPSS manual). The SPSS manual tells you where to find r using the least-squares regression output, but this r is actually the ABSOLUTE VALUE OF r, so you need to pay attention to the direction yourself.
The Pearson Correlation gives you the actual r with the correct sign.

Properties of correlation:
- X and Y both have to be quantitative.
- It makes no difference which you call X and which you call Y.
- r does not change when you change the units of measurement.
- If r is positive, there is a positive association between X and Y: as X increases, Y increases.
- If r is negative, there is a negative association between X and Y: as X increases, Y decreases.
- $-1 \le r \le 1$
- The closer r is to 1 or to -1, the stronger the linear relationship.
- The closer r is to 0, the weaker the linear relationship.
- Outliers strongly affect r. Use r with caution if outliers are present.

Example: We want to examine whether the amount of rainfall per year increases or decreases corn bushel output. A sample of 10 observations was taken, and the amount of rainfall (in inches) was measured, as was the subsequent growth of corn.

Amount of Rain    3.03  3.47  4.21  4.44  4.95  5.11  5.63  6.34  6.56  6.82
Bushels of Corn   80    84    90    95    97    102   105   112   115   115

[Scatterplot: corn yield (bushels) vs. amount of rain (in)]

a) What does the scatterplot tell us? What is the form? Direction? Strength? What do we expect the correlation to be?

Correlations
                                             amount of rain (in)   corn yield (bushels)
amount of rain (in)    Pearson Correlation   1                     .995(**)
                       Sig. (2-tailed)       .                     .000
                       N                     10                    10
corn yield (bushels)   Pearson Correlation   .995(**)              1
                       Sig. (2-tailed)       .000                  .
                       N                     10                    10
** Correlation is significant at the 0.01 level (2-tailed).

Inference for Correlation:
- R = correlation
- R2 = % of variation in Y explained by the regression line (the closer to 100%, the better)
- $\rho$ (Greek letter rho) = correlation for the population
- When $\rho = 0$, there is no linear association in the population, so X and Y are independent (if X and Y are both normally distributed).

Hypothesis test for correlation: To test the null hypothesis $H_0: \rho = 0$, SPSS will compute the t statistic

$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$

with degrees of freedom = n - 2 for simple linear regression.
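The correlation formula and the t statistic above can be checked by hand on the rain/corn data. The following is a minimal Python sketch, not the SPSS procedure itself; the function names `sample_correlation` and `correlation_t_statistic` are illustrative and do not come from the notes. The results should agree with the SPSS values (r = .995, t = 28.984) up to rounding.

```python
import math

# Rain/corn data from the example above.
rain = [3.03, 3.47, 4.21, 4.44, 4.95, 5.11, 5.63, 6.34, 6.56, 6.82]
corn = [80, 84, 90, 95, 97, 102, 105, 112, 115, 115]


def sample_correlation(x, y):
    """Pearson r: sum of products of standardized values, divided by n - 1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
    s_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))
    products = (((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
                for xi, yi in zip(x, y))
    return sum(products) / (n - 1)


def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


n = len(rain)
r = sample_correlation(rain, corn)
t = correlation_t_statistic(r, n)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}, t = {t:.2f}, df = {n - 2}")
```

The P-value still has to come from SPSS or a t table, since this sketch only computes the test statistic.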
b) Are corn yield and rain independent in the population? Perform a test of significance to determine this.

c) Do corn yield and rain have a positive correlation in the population? Perform a test of significance to determine this.

This test statistic for the correlation is numerically identical to the t statistic used to test $H_0: \beta_1 = 0$.

Can we do better than just a scatterplot and the correlation in describing how x and y are related? What if we want to predict y for other values of x?

Least-Squares Regression fits a straight line through the data points that minimizes the sum of the squared vertical distances of the data points from the line.

Minimizes: $\sum_{i=1}^{n} (e_i)^2$

Equation of the line is: $\hat{y} = b_0 + b_1 x$, with $\hat{y}$ = the predicted y (yline).

Slope of the line is: $b_1 = r \dfrac{s_y}{s_x}$, where the slope measures the amount of change in the predicted response variable when the explanatory variable is increased by one unit.

Intercept of the line is: $b_0 = \bar{y} - b_1 \bar{x}$, where the intercept is the value of the predicted response variable when the explanatory variable = 0.

Type of line                                 Equation of line                                  Slope       y-intercept
Least Squares Regression / Ch. 10 Sample     $\hat{y} = b_0 + b_1 x$                           $b_1$       $b_0$
Ch. 10 Population (model)                    $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$     $\beta_1$   $\beta_0$

Using the corn example, find the least-squares regression line. Tell SPSS to do Analyze --> Regression --> Linear. Put rain into the independent box and corn into the dependent box. Click OK.

Model Summary(b)
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .995(a)   .991       .989                1.290
a. Predictors: (Constant), amount of rain (in)
b. Dependent Variable: corn yield (bushels)

ANOVA(b)
Model 1       Sum of Squares   df   Mean Square   F         Sig.
Regression    1397.195         1    1397.195      840.070   .000(a)
Residual      13.305           8    1.663
Total         1410.500         9
a. Predictors: (Constant), amount of rain (in)
b. Dependent Variable: corn yield (bushels)

Coefficients(a)
                      Unstandardized Coefficients   Standardized Coefficients                     95% Confidence Interval for B
Model 1               B         Std. Error          Beta                        t        Sig.    Lower Bound   Upper Bound
(Constant)            50.835    1.728                                           29.421   .000    46.851        54.819
amount of rain (in)   9.625     .332                .995                        28.984   .000    8.859         10.391
a. Dependent Variable: corn yield (bushels)

d) What is the least-squares regression line equation?

The scatterplot with the least-squares regression line looks like:

[Scatterplot: corn yield (bushels) vs. amount of rain (in), with the least-squares regression line; Rsq = 0.9906]

R2 is the percent of variation in corn yield explained by the regression line with rain = 99.06%.

Hypothesis testing for $H_0: \beta_1 = 0$

Test statistic: $t = \dfrac{b_1}{SE_{b_1}}$ with df = n - 2

SPSS will give you the test statistic (under t), and the 2-sided P-value (under Sig.).

e) Is the slope positive in the population? Perform a test of significance.

f) What % of the variability in corn yield is explained by the least-squares regression line?

g) What is the estimate of the standard error of the model?

What do we mean by prediction or extrapolation?
Use your least-squares regression line to find y for other x-values.
- Prediction: using the line to find y-values corresponding to x-values that are within the range of your data x-values.
- Extrapolation: using the line to find y-values corresponding to x-values that are outside the range of your data x-values.
- Be careful about extrapolating y-values for x-values that are far away from the x data that you currently have. The line may not be valid for wide ranges of x!

Example: On the rain/corn data above, predict the corn yield for
a) 5 inches of rain
b) 7.2 inches of rain
c) 0 inches of rain
d) 100 inches of rain
e) For which amounts of rainfall above do you think the line does a good job of predicting actual corn yield? Why?

[Cartoon by J.B. Landers on www.causeweb.org (used with permission)]

Prediction Intervals
- Predicting a future observation under conditions similar to those used in the study.
- Since there is variability involved in using a model created from sample data, a prediction interval is better than a single prediction.
- They're related to confidence intervals.
- Use SPSS.
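The least-squares line, the prediction/extrapolation distinction, and a prediction interval can all be cross-checked outside SPSS. The sketch below is hedged: it assumes the standard textbook formula for a prediction interval (these notes leave that calculation to SPSS) and the t-table value t* = 2.306 for 95% confidence with 8 degrees of freedom. The function names are illustrative, not SPSS features. For rain = 5.11 the interval should land close to the (96.90, 103.14) reported in these notes.

```python
import math

# Rain/corn data from the example above.
rain = [3.03, 3.47, 4.21, 4.44, 4.95, 5.11, 5.63, 6.34, 6.56, 6.82]
corn = [80, 84, 90, 95, 97, 102, 105, 112, 115, 115]

n = len(rain)
x_bar = sum(rain) / n
y_bar = sum(corn) / n
s_xx = sum((x - x_bar) ** 2 for x in rain)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(rain, corn))

# Least-squares slope and intercept.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Estimate of sigma: square root of SSE / (n - 2).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(rain, corn))
s = math.sqrt(sse / (n - 2))

T_STAR_95_DF8 = 2.306  # assumption: t-table value, 95% confidence, 8 df


def predict(x_new):
    """Return (predicted y, True if x_new lies outside the observed x range)."""
    is_extrapolation = not (min(rain) <= x_new <= max(rain))
    return b0 + b1 * x_new, is_extrapolation


def prediction_interval(x_new, t_star=T_STAR_95_DF8):
    """Interval for one future observation at x_new (standard formula)."""
    y_hat = b0 + b1 * x_new
    se_pred = s * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / s_xx)
    return y_hat - t_star * se_pred, y_hat + t_star * se_pred


print(f"y-hat = {b0:.3f} + {b1:.3f} x")
for x_new in (5, 7.2, 0, 100):
    y_hat, outside = predict(x_new)
    label = "extrapolation" if outside else "prediction"
    print(f"rain = {x_new}: y-hat = {y_hat:.2f} ({label})")

low, high = prediction_interval(5.11)
print(f"95% prediction interval at rain = 5.11: ({low:.2f}, {high:.2f})")
```

Flagging x-values outside the observed range is a design choice for this sketch; the line itself will happily return a number for 100 inches of rain, which is exactly why extrapolation needs care.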
The 95% prediction interval for future corn yield measurements when rain = 5.11 is (96.90, 103.14).

Assumptions for Regression:
1. Repeated responses y are independent of each other.
2. For any fixed value of x, the response y varies according to a Normal distribution.
3. The mean response $\mu_y$ has a straight-line relationship with x.
4. The standard deviation of y ($\sigma$) is the same for all values of x. The value of $\sigma$ is unknown.

How do you check these assumptions?

Scatterplot and R2: Do you have a straight-line relationship between X and Y? How strong is it? How close to 100% is R2? Hopefully no outliers! (#3)

Normal probability plot: Are the residuals approximately normally distributed? Do the dots fall fairly close to the diagonal line (which is always there in the same spot)? (#2)

[Normal P-P Plot of Regression Standardized Residual. Dependent Variable: corn yield (bushels). Axes: Observed Cum Prob vs. Expected Cum Prob]

Residual plot: Do you have constant variability? Do the dots on your residual plot look random and fairly evenly distributed above and below the 0 line? Hopefully no outliers! (#1 and 4)

Residual is the vertical difference between the observed y-value and the regression line y-value:

$$\text{residual} = e_i = y_i - \hat{y}_i = y_i - (b_0 + b_1 x_i) = y_{\text{data}} - y_{\text{line}}$$

Residual plot:
- Scatterplot of the regression residuals against the explanatory variable (e vs. x).
- The e-axis has both negative and positive values but is centered about e = 0.
- The mean of the least-squares residuals is always zero: $\bar{e} = 0$.
- Good: total randomness, no pattern, approximately the same number of points above and below the e = 0 line.
- Bad: obvious pattern, funnel shape, parabola, more points above 0 than below (or vice versa).
- If you have a pattern, your data does not necessarily fit the model (line) well.

Example: Show a residual plot for the corn/rain data using SPSS.

[Residual plot: Unstandardized Residual vs. amount of rain (in)]

Outliers:
- Outliers are observations that lie outside the overall pattern of the other observations.
- Outliers in the y direction of a scatterplot have large regression residuals ($e_i$).
- Outliers in the x direction of a scatterplot are often influential for the regression line.
- An observation is influential if removing it would markedly change the result of the calculation.
- Outliers can drastically affect the regression line, correlation, means, and standard deviations.
- You can draw a second regression line that doesn't include the outliers. If the second line moves more than a small amount when the point is deleted, or if R2 changes much, the point is influential.

Which hypothesis test do you use when? If you're not sure whether to use $\beta_1$ or $\rho$, here are some guidelines. The test statistics and P-values are identical for either symbol.

Use                             If the words are:
$\beta_1$                       Slope, regression coefficient
$\rho$                          Correlation, independence
Either $\beta_1$ or $\rho$      Linear relationship

Review of SPSS instructions for Regression:

When you set up your regression, you click on: Analyze --> Regression --> Linear. Put in your y variable for "dependent" and your x variable for "independent" on the gray screen. Don't hit "ok" yet though.

Back on the regression gray screen, click on "Plots", and then click on "normal probability plot." Click "continue" on the Plots gray screen.

Back on the regression gray screen, click on "Save", and then click on "unstandardized residuals." Click "Individual" under the Prediction Interval section, and adjust the confidence level, if needed. Click "continue" on the Save gray screen and then "ok" to the big Regression gray screen.

The prediction interval and the residuals will show up back on the data input screen. The LICI_1 and UICI_1 give you the prediction interval lower and upper bounds. You still won't have a residual plot yet. If you click back to your data input screen, you now have a new column called "Res_1".
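The saved residual column is simply observed y minus fitted y, so it can be reproduced by hand. The sketch below is a minimal Python check on the rain/corn data, assuming nothing beyond the residual definition given in these notes; variable names such as `residuals` are illustrative. It also confirms that the residuals average to zero and that the square root of SSE / (n - 2) matches the "Std. Error of the Estimate" (1.290) in the Model Summary.

```python
import math

# Rain/corn data from the example above.
rain = [3.03, 3.47, 4.21, 4.44, 4.95, 5.11, 5.63, 6.34, 6.56, 6.82]
corn = [80, 84, 90, 95, 97, 102, 105, 112, 115, 115]

n = len(rain)
x_bar = sum(rain) / n
y_bar = sum(corn) / n

# Least-squares slope and intercept.
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(rain, corn))
      / sum((x - x_bar) ** 2 for x in rain))
b0 = y_bar - b1 * x_bar

# Residual = observed y minus the y-value on the regression line.
residuals = [y - (b0 + b1 * x) for x, y in zip(rain, corn)]

mean_residual = sum(residuals) / n
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
above = sum(1 for e in residuals if e > 0)
below = sum(1 for e in residuals if e < 0)

for x, e in zip(rain, residuals):
    print(f"rain = {x:.2f}  residual = {e:+.3f}")
print(f"mean residual = {mean_residual:.6f}")
print(f"estimate of sigma = {s:.3f}")
print(f"points above zero: {above}, below zero: {below}")
```

Plotting `residuals` against `rain` gives the same picture as the SPSS residual plot; the counts above and below zero are only a rough stand-in for looking at that plot.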
To make the residual plot, you follow the same steps for making a scatterplot: go to Graphs --> Scatter --> Simple, then put "Res_1" in for y and your x variable in for x. Click "ok."

Once you see your residual plot, you'll need to double click on it to go to Chart Editor. On the Chart Editor tool bar, you can see a button that shows a graph with a horizontal line. Click on that button. Make sure that the y-axis is set to 0.

How will I ever use this stuff again in my future career?

Testimonial from a Former Student (E-mails received June 9, 2005)

Ellen,
I hope that all is well. It is your favorite student here, Eric from your Fall 04 stat 301 class. I need some help. Believe it or not, you were right and I am using stat everyday all day long, but I am drawing a blank. I am trying to determine a linear regression line and I can't remember the equation. Y=mx+b of course, but on the regression analysis output what do I use as m & b? I am very disappointed in myself because I can't remember, but alas I am asking for help. I tried looking for the notes on your home page but I couldn't find it any longer and I think it was taken down for the summer. If you could help me that would be great. Hope your summer months are spent by the pool!!!
Regards,
Eric

************

Ellen,
As for the project. It essentially was a regression to determine the amount of out-of-state cotton seed a crushing plant would need when their own state's current production increased or decreased. It is not very statistically sound since we are only using 5 years' worth of data...but it is just a tool in price analysis that we are using to determine a spread between plants that we can buy the cotton seed at. As for using me...that would be great. Let those impressionable young students see that we are using what they learn on an everyday basis...some more than others...especially me since I deal with prices and trading.
Eric

Example: The scatterplot below shows the calories and sodium content for each of 17 brands of meat hot dogs.

a) Describe the main features of the relationship.

[Scatterplot: Sodium content vs. Calories]

b) What is the correlation between calories and sodium?

Correlations
                                       Calories    Sodium content
Calories         Pearson Correlation   1           .863(**)
                 Sig. (2-tailed)       .           .000
                 N                     17          17
Sodium content   Pearson Correlation   .863(**)    1
                 Sig. (2-tailed)       .000        .
                 N                     17          17
**. Correlation is significant at the 0.01 level (2-tailed).

c) Report the least-squares regression line.

Model Summary(b)
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .863(a)   .745       .728                48.913
a. Predictors: (Constant), Calories
b. Dependent Variable: Sodium content

Coefficients(a)
              Unstandardized Coefficients   Standardized Coefficients                     95% Confidence Interval for B
Model 1       B          Std. Error         Beta                        t        Sig.    Lower Bound   Upper Bound
(Constant)    -91.185    77.812                                         -1.172   .260    -257.038      74.668
Calories      3.212      .485               .863                        6.628    .000    2.179         4.245
a. Dependent Variable: Sodium content

d) Show a residual plot and comment on its features.

[Residual plot: residual vs. Calories]

e) Is there an outlier? If so, where is it?

f) Show a normal probability plot and comment on its features.

[Normal P-P Plot of Regression Standardized Residual. Dependent Variable: Sodium content. Axes: Observed Cum Prob vs. Expected Cum Prob]

g) Leave off the outlier, and recalculate the correlation and another least-squares regression line. Is your outlier influential? Explain your answer.

Correlations
                             cal2        sod2
cal2   Pearson Correlation   1           .834(**)
       Sig. (2-tailed)       .           .000
       N                     16          16
sod2   Pearson Correlation   .834(**)    1
       Sig. (2-tailed)       .000        .
       N                     16          16
**. Correlation is significant at the 0.01 level (2-tailed).

Model Summary(b)
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .834(a)   .695       .674                36.406
a. Predictors: (Constant), cal2
b. Dependent Variable: sod2

Coefficients(a)
              Unstandardized Coefficients   Standardized Coefficients                     95% Confidence Interval for B
Model 1       B          Std. Error         Beta                        t        Sig.    Lower Bound   Upper Bound
(Constant)    46.900     69.371                                         .676     .510    -101.886      195.686
cal2          2.401      .425               .834                        5.653    .000    1.490         3.312
a. Dependent Variable: sod2

h) If there is a new brand of meat hot dog with 150 calories per frank, how many milligrams of sodium do you estimate that one of these hot dogs contains?
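One way to approach parts g) and h) is to plug 150 calories into both fitted lines and compare. The sketch below is a minimal Python illustration that copies the coefficients from the two SPSS Coefficients tables above; the dictionary and function names are hypothetical, and the code does not decide for you whether the outlier is influential.

```python
# Coefficients copied from the two SPSS Coefficients tables above.
ALL_BRANDS = {"b0": -91.185, "b1": 3.212}       # all 17 brands
WITHOUT_OUTLIER = {"b0": 46.900, "b1": 2.401}   # 16 brands, outlier left off


def predict_sodium(line, calories):
    """Predicted sodium content from a fitted line y-hat = b0 + b1 * x."""
    return line["b0"] + line["b1"] * calories


calories = 150
sodium_all = predict_sodium(ALL_BRANDS, calories)
sodium_trimmed = predict_sodium(WITHOUT_OUTLIER, calories)

# Relative change in the slope when the outlier is left off.
slope_change_pct = (100 * (WITHOUT_OUTLIER["b1"] - ALL_BRANDS["b1"])
                    / ALL_BRANDS["b1"])

print(f"All 17 brands:   {sodium_all:.1f}")
print(f"Without outlier: {sodium_trimmed:.1f}")
print(f"Slope change:    {slope_change_pct:.1f}%")
```

Since 150 calories lies inside the range of calories shown on the scatterplot, this is a prediction rather than an extrapolation. How much the slope, intercept, and R2 move between the two fits is the evidence to weigh in part g).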