ASSIGNMENT:

Question 1 5 Marks

Please limit your answer to each of the following to 100 words or less.

(a) Non-probability samples introduce selection bias into results, whereas probability samples enable the inference of unbiased generalisations about the population. Explain with an example. 2 marks

(b) What care must you exercise to avoid bias, unethical practice or distortion of information when framing questions for a questionnaire survey and for personal interviews of people in a sample? 2 marks

(c) Suppose that you are watching television at home. In this context give examples of data which are (i) nominal, (ii) ordinal, (iii) interval, and (iv) ratio. In other words, give four examples of data – one in each category – with the word ‘television’ in each. 1 mark

Question 2 5 Marks

The number of employees absent from work at a large tyre manufacturing plant over a period of 96 days is given in the table below.

105 101 99 100 105 101 102 91 102 100 104 100

98 99 107 99 101 97 101 92 100 100 101 103

94 106 94 102 93 109 100 103 103 109 96 101

103 103 101 100 98 96 98 104 96 105 103 97

102 106 100 108 100 100 99 99 104 98 106 107

108 102 93 100 101 105 108 99 96 101 100 99

106 95 92 108 102 105 105 81 89 103 108 98

109 106 101 102 104 97 103 108 104 98 109 108

(a) Construct a stem-and-leaf diagram. Do not round the data but be innovative in your choice of a stem so that the diagram has between 5 and 10 leaves. 1 mark

(b) Construct three frequency distributions by using each of the following as the first class (and equal class interval for each distribution): “77 to less than 92”, “79 to less than 86”, and “80 to less than 83”. Which frequency distribution provides the best summary? Explain your answer briefly. (Use a maximum of 100 words). 1 mark

(c) Draw a histogram for the frequency distribution in Part (b) with the first class “79 to less than 86”. On the same graph draw the frequency or percentage polygon. 1 mark

(d) Draw an ogive or cumulative percentage polygon for the frequency distribution in Part (b) with the first class “79 to less than 86”. The company wants to keep the plant operating each day and so is interested in the maximum number of staff absent on 95% of days. Use the ogive to determine the value of X where 95% of days have X (or fewer) staff absent. Describe briefly how you determined X or show your working on the ogive. (The value does not have to be very accurate.) 1 mark

(e) When delivering the presentation at the board meeting, a board member asks why you didn’t present the data in a Pareto diagram. How would you answer the question? (Use a maximum of 100 words). 1 mark
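
For the frequency distributions in Part (b) and the cumulative percentages behind the ogive in Part (d), a minimal Python sketch might look like the following. The names `absences`, `frequency_table` and `value_at_percent` are illustrative and not part of the assignment. The sketch assumes classes of the form "lower to less than upper" with equal width, and it reads the ogive by straight-line interpolation inside a class, which only approximates a value read off a drawn graph.

```python
from collections import Counter

# Absence counts copied from the table in Question 2 (96 days).
absences = [
    105, 101, 99, 100, 105, 101, 102, 91, 102, 100, 104, 100,
    98, 99, 107, 99, 101, 97, 101, 92, 100, 100, 101, 103,
    94, 106, 94, 102, 93, 109, 100, 103, 103, 109, 96, 101,
    103, 103, 101, 100, 98, 96, 98, 104, 96, 105, 103, 97,
    102, 106, 100, 108, 100, 100, 99, 99, 104, 98, 106, 107,
    108, 102, 93, 100, 101, 105, 108, 99, 96, 101, 100, 99,
    106, 95, 92, 108, 102, 105, 105, 81, 89, 103, 108, 98,
    109, 106, 101, 102, 104, 97, 103, 108, 104, 98, 109, 108,
]


def frequency_table(data, first_lower, width):
    """Return (lower, upper, count, cumulative %) for each class.

    Classes are 'lower to less than upper', all of the same width,
    starting at first_lower (hypothetical helper, not from the assignment).
    """
    if min(data) < first_lower:
        raise ValueError("first class must start at or below the minimum")
    # Class index 0 is the first class, 1 the second, and so on.
    counts = Counter((x - first_lower) // width for x in data)
    rows, running = [], 0
    for k in range(max(counts) + 1):
        lower = first_lower + k * width
        running += counts.get(k, 0)
        rows.append((lower, lower + width, counts.get(k, 0),
                     100 * running / len(data)))
    return rows


def value_at_percent(rows, target):
    """Interpolate linearly along the ogive; target must be in (0, 100]."""
    previous = 0.0
    for lower, upper, _, cumulative in rows:
        if cumulative >= target:
            share = (target - previous) / (cumulative - previous)
            return lower + share * (upper - lower)
        previous = cumulative
    raise ValueError("target must be in (0, 100]")


rows = frequency_table(absences, 79, 7)
for lower, upper, count, cumulative in rows:
    print(f"{lower} to less than {upper}: {count:3d}  ({cumulative:5.1f}%)")
print("Approximate X for 95% of days:", round(value_at_percent(rows, 95), 1))
```

Calling `frequency_table(absences, 77, 15)` or `frequency_table(absences, 80, 3)` produces the other two distributions, so the three summaries can be compared side by side.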

Question 3 5 Marks

Each of 16 students measured the circumference of a soccer ball by four different methods, which were:

Method A: Estimate the circumference by eye and using fingers.

Method B: Wrap a piece of cardboard around the ball to form a cylinder, and then measure the internal circumference of the cylinder by flattening the cardboard and using a ruler.

Method C: Measure the circumference with a ruler and string.

Method D: Measure the circumference by rolling the ball along a ruler.

The results (in cm) were as follows:

Method A: 66 67 68 69 69 70 70 71 72 73 73 74 74 75 76 77

Method B: 68 68 69 69 69 71 71 69 69 69 69 69 68 68 68 67

Method C: 70 70 71 71 67 67 68 71 72 72 68 71 71 67 67 68

Method D: 68 69 69 70 70 71 71 67 69 71 71 71 71 72 72 72

(a) Calculate the mean and the median for each Method. 1 mark

(b) Calculate the 1st and 3rd quartiles, and the interquartile range for each Method. 1 mark

(c) Calculate the standard deviation for each Method. Which Method has the largest standard deviation? Would you expect this result? Explain briefly. (Use a maximum of 100 words.) 1 mark

(d) Construct a box-and-whisker plot for each Method. Put all the box-and-whisker plots on the one graph. Compare and comment briefly on the box-and-whisker plot of each Method. (Use a maximum of 100 words.) 1 mark

(e) Using the Chebyshev rule, at least how many values should be within two standard deviations of the mean? Check if this is the case for each Method. 1 mark
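
The calculations in Parts (a) to (c) and the Chebyshev check in Part (e) can be sketched in Python as below. This is only one possible approach under stated assumptions: the standard deviation is taken as the sample standard deviation (divisor n - 1), and the quartiles use the default "exclusive" method of `statistics.quantiles`, which may differ slightly from the convention in a particular textbook. The names `methods` and `summarise` are illustrative.

```python
import math
import statistics

# Measurements (cm) copied from the results listed in Question 3.
methods = {
    "A": [66, 67, 68, 69, 69, 70, 70, 71, 72, 73, 73, 74, 74, 75, 76, 77],
    "B": [68, 68, 69, 69, 69, 71, 71, 69, 69, 69, 69, 69, 68, 68, 68, 67],
    "C": [70, 70, 71, 71, 67, 67, 68, 71, 72, 72, 68, 71, 71, 67, 67, 68],
    "D": [68, 69, 69, 70, 70, 71, 71, 67, 69, 71, 71, 71, 71, 72, 72, 72],
}


def summarise(values, k=2):
    """Summary statistics plus a Chebyshev check for k standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample SD (divisor n - 1)
    # Assumption: default 'exclusive' quartile method; textbooks may differ.
    q1, _, q3 = statistics.quantiles(values, n=4)
    # Chebyshev: at least (1 - 1/k^2) of the values lie within k SDs of the mean.
    chebyshev_minimum = math.ceil((1 - 1 / k**2) * len(values))
    within = sum(1 for v in values if abs(v - mean) <= k * sd)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
        "sd": sd,
        "chebyshev_minimum": chebyshev_minimum,
        "within_k_sd": within,
    }


for name, values in methods.items():
    print(name, summarise(values))
```

For k = 2 and 16 values the Chebyshev bound is 75% of 16, so `chebyshev_minimum` is 12 for every Method; `within_k_sd` is the count to compare against it.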

Question 4 5 Marks

A Cancer Society collected the following information when investigating the occurrences of skin cancer in a certain population of beach goers:

• 7% of beach goers who do not use any sun-screen lotion develop skin cancer at some stage in their life.

• 1% of beach goers who use sun-screen lotion develop skin cancer at some stage in their life.

• 90% of beach goers use sun-screen lotion.

Use this information to answer the following questions. (Hint: construct a contingency table.)

(a) If a beach goer is randomly selected, what is the probability that the person uses sun-screen lotion and yet develops skin cancer at some stage in life? 1 mark

(b) If a beach goer is randomly selected, what is the probability that the person does not use sun-screen lotion and develops skin cancer at some stage in life? 1 mark

(c) If a beach goer is randomly selected, what is the probability that the person develops skin cancer at some stage in life? 1 mark

(d) If a beach goer is randomly selected who has already developed skin cancer, what is the probability that the person does not use sun-screen lotion? 1 mark

(e) If a beach goer is randomly selected, what is the probability that the person either does not develop skin cancer at any stage in life or uses sun-screen lotion? 1 mark
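
The contingency table suggested in the hint can be built from the three given percentages, as in the rough Python sketch below. Exact fractions are used to avoid rounding noise; the variable names and the layout of `table` are illustrative assumptions, and "or" in Part (e) is read as the inclusive "or".

```python
from fractions import Fraction

# Given information from Question 4.
p_screen = Fraction(90, 100)                 # P(uses sun-screen lotion)
p_cancer_given_screen = Fraction(1, 100)     # P(cancer | uses lotion)
p_cancer_given_no_screen = Fraction(7, 100)  # P(cancer | no lotion)

p_no_screen = 1 - p_screen

# Joint probabilities: each cell is P(row and column).
table = {
    ("screen", "cancer"): p_screen * p_cancer_given_screen,
    ("screen", "no cancer"): p_screen * (1 - p_cancer_given_screen),
    ("no screen", "cancer"): p_no_screen * p_cancer_given_no_screen,
    ("no screen", "no cancer"): p_no_screen * (1 - p_cancer_given_no_screen),
}

# Marginal probability of cancer: sum over the 'cancer' column.
p_cancer = table[("screen", "cancer")] + table[("no screen", "cancer")]

# Conditional probability: P(no lotion | cancer) = P(no lotion and cancer) / P(cancer).
p_no_screen_given_cancer = table[("no screen", "cancer")] / p_cancer

# Inclusive 'or': every cell except (no lotion and cancer).
p_no_cancer_or_screen = 1 - table[("no screen", "cancer")]

for cell, probability in table.items():
    print(cell, float(probability))
print("P(cancer) =", float(p_cancer))
print("P(no lotion | cancer) =", float(p_no_screen_given_cancer))
print("P(no cancer or lotion) =", float(p_no_cancer_or_screen))
```

The four cells of `table` sum to 1, which is a quick check that the table has been filled in consistently.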