STAT 5101: Foundations of Data Science
Assignment 1
Academic year 13/14, First term
Due Date: In Class, Sep 25th (Wednesday), 2013.
1. Suppose the following information is obtained from Robert Keeler on his application for a home
mortgage loan at the Metr
Example 1: Suppose that a friend tells you that he will meet you for lunch
at a restaurant at 12:00pm ± 10 minutes. What is the probability that your
friend will arrive between 12:05pm and 12:08pm?
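A minimal sketch of how Example 1 could be computed, assuming (the example does not say so explicitly) that the arrival time is uniformly distributed over the 20-minute window from 11:50am to 12:10pm. The helper name `uniform_interval_prob` and the choice to measure time in minutes relative to 12:00pm are illustrative only.

```python
def uniform_interval_prob(a, b, lo, hi):
    """P(lo <= X <= hi) for X ~ Uniform(a, b); [lo, hi] is clipped to [a, b]."""
    if b <= a:
        raise ValueError("need a < b")
    lo_clipped = max(lo, a)
    hi_clipped = min(hi, b)
    # Probability is the overlap length divided by the total length.
    return max(hi_clipped - lo_clipped, 0.0) / (b - a)


# Assumption: arrival time in minutes relative to 12:00pm, uniform on [-10, 10].
p_arrival = uniform_interval_prob(-10, 10, 5, 8)
print(p_arrival)
```

Under this assumption the answer is the length of the target interval (3 minutes) divided by the length of the whole window (20 minutes).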
Example 2: If the random variable X has an exponential dist
Statistics for Managers
Using Microsoft Excel
5th Edition
Chapter 5 (Textbook Ch7)
Sampling and Sampling
Distributions
Learning Objectives
In this chapter, you will learn:
To distinguish between different survey
sampling methods
The concept of the sam
Statistics for Managers
Using Microsoft Excel
5th Edition
Chapter 2 (Textbook Ch3)
Numerical Descriptive
Measures
Chapter Goals
After completing this chapter, you should be able to:
Compute and interpret the mean, median, and mode for a
set of data
Find
Statistics for Managers
Using Microsoft Excel
5th Edition
Chapter 3 (Textbook Ch5)
Important Discrete Probability
Distribution
Chapter Goals
After completing this chapter, you should be able
to:
Interpret the mean and standard deviation for a
discrete p
Find quartiles Q1, Q2, Q3 and mode:
$x_1, \ldots, x_n$: raw data
$n$: sample size
$q_i$: the rank (position) of $Q_i$
$[q_i]$: the integer part of $q_i$, $i = 1, 2, 3$; e.g. $[3.3] = 3$, $[4.8] = 4$
Find Q2 (the second quartile or median):
Step 1: Take an ordered array: $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$.
Statistics for Managers
Using Microsoft Excel
5th Edition
Chapter 1 (Textbook Ch1-Ch2)
Data Collection and Data
Presentation
Chapter Goals
After completing this chapter, you should be
able to:
Explain key definitions:
Population vs. Sample
Primary vs.
TABLE E.3
Critical Values of t
For a particular number of degrees of freedom, entry represents the
critical value of t corresponding to a specified upper-tail area ($\alpha$).
UPPER-TAIL AREAS
Degrees of
Freedom 0.25 0.10 0.05 0.025 0.01 0.005
1
TABLE E.4
Critical Values of $\chi^2$
For a particular number of degrees of freedom, entry represents the critical value of $\chi^2$ corresponding to a specified upper-tail area ($\alpha$).
Degrees of
Freedom 0.995
1
2 0.010
3 0.072
4 0
TABLE E.7
Table of Poisson
Probabilities
For a given value of $\lambda$, entry
indicates the probability of a
specified value of X.
X 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
0 0.9048 0.8187 0.7408 0.6703 0.6065 0.5488 0.4966 0.
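The entries of such a table can be recomputed from the Poisson probability mass function, $P(X = x) = e^{-\lambda} \lambda^x / x!$. The sketch below is illustrative only (the function name `poisson_pmf` is not from the textbook); it recomputes the X = 0 row for the first seven column values of $\lambda$ and rounds to four decimals, as the table does.

```python
import math


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam**x / math.factorial(x)


# Column headings of the table are values of lambda.
lambdas = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]

# Row X = 0, rounded to four decimals like the printed table.
row_x0 = [round(poisson_pmf(0, lam), 4) for lam in lambdas]
print(row_x0)
```

For X = 0 the formula reduces to $e^{-\lambda}$, so the first row of the table is simply the exponential function evaluated at each column heading.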
Point Estimation
Definition
A point estimate of some population parameter $\theta$ is a single numerical value $\hat{\theta}$ of a
statistic $\hat{\Theta}$. The statistic $\hat{\Theta}$ is called the point estimator. Estimation problems occur frequently in engineering. We often need to estimate
The
3-3.2 The Use of P-Values for Hypothesis Testing
Definition
The P-value is the smallest level of significance that would lead to rejection of
the null hypothesis $H_0$.
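The definition implies a simple decision rule: reject the null hypothesis at significance level $\alpha$ exactly when $\alpha$ is at least as large as the P-value. The following is a minimal sketch for an upper-tailed z-test; the observed statistic `z0 = 2.0` is a hypothetical value chosen for illustration, and the function names are not from the text.

```python
import math


def upper_tail_p_value(z):
    """P(Z >= z) for a standard normal Z, computed via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def reject_h0(p_value, alpha):
    """Reject H0 when the significance level is at least the P-value."""
    return p_value <= alpha


z0 = 2.0  # hypothetical observed test statistic
p_value = upper_tail_p_value(z0)

print(round(p_value, 4))
print(reject_h0(p_value, 0.05), reject_h0(p_value, 0.01))
```

With this hypothetical statistic, the same data lead to rejection at the 0.05 level but not at the 0.01 level, which is why reporting the P-value is more informative than reporting a single reject/do-not-reject decision.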
[Handwritten annotation; not legible in the scan.]
Example 1:
A manufacturer of semiconductor devices takes a random
sample of size n of chips and tests them, classifying each chip as defective or
non-defective. Let Xi = 0 if the chip is non-defective and Xi = 1 if the chip
is defective. The sample fracti
Summary of Chapter 3
1 Concepts
Probability mass function: For a discrete random variable X with possible values $X_1, X_2, \ldots, X_n$, a probability mass function is a function such
that
(1) $P(X_i) \ge 0$
(2) $\sum_{i=1}^{n} P(X_i) = 1$
(3) $P(X_i) = P(X = X_i)$
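Conditions (1) and (2) can be checked mechanically for any finite list of candidate probabilities. The sketch below is illustrative; the function name `is_valid_pmf` and the tolerance are assumptions, the latter needed because floating-point sums are rarely exactly 1.

```python
import math


def is_valid_pmf(probs, tol=1e-9):
    """Check condition (1), non-negativity, and condition (2), the probabilities sum to 1."""
    non_negative = all(p >= 0 for p in probs)
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return non_negative and sums_to_one


fair_die = [1 / 6] * 6          # valid: six equal non-negative probabilities
too_large = [0.5, 0.6]          # violates (2): sums to 1.1
has_negative = [1.2, -0.2]      # violates (1): contains a negative value

print(is_valid_pmf(fair_die), is_valid_pmf(too_large), is_valid_pmf(has_negative))
```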
Cumulat
STAT5101: Foundations of Data Science
Assignment 1
Academic year 15/16, First term
Deadline: During Class, Sep 29 (TUE), 2015.
1. For each of the following variables, determine whether the variable is categorical or numerical. If
the variable is numerical
STAT5101: Foundations of Data Science
Assignment 4
Academic year 15/16, First term
Deadline: During Class, Dec 1 (TUE), 2015.
1. The population mean waiting time to check out of a supermarket has been 10.73 minutes. Recently,
in an effort to reduce the wa
STAT5101: Foundations of Data Science
Assignment 2
Academic year 15/16, First term
Deadline: During Class, Oct 13 (TUE), 2015.
1. The time between arrivals of customers at a bank during the noon to 1 P.M. hour has a uniform
distribution over an interval f
Summary of Chapter 9
1 Concepts
Chi-Square Tests:
Test statistic $\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$, where $f_e = \frac{n_{i+}\, n_{+j}}{n}$
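The test statistic can be sketched in a few lines: the expected frequency of each cell is its row total times its column total divided by the overall total, and the statistic sums $(f_o - f_e)^2 / f_e$ over all cells. The 2x2 table below is hypothetical data chosen so the arithmetic is easy to follow, and the function name is illustrative.

```python
def chi_square_statistic(observed):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(observed):
        for j, f_o in enumerate(row):
            # Expected frequency: (row total * column total) / overall total.
            f_e = row_totals[i] * col_totals[j] / n
            stat += (f_o - f_e) ** 2 / f_e
    return stat


# Hypothetical 2x2 table of observed frequencies.
table = [[30, 20],
         [20, 30]]
chi2 = chi_square_statistic(table)
print(chi2)
```

Here every expected frequency is 25, each cell contributes $5^2 / 25 = 1$, and the statistic is 4.0. For a 2x2 table the test has 1 degree of freedom, and 4.0 exceeds the 0.05 upper-tail critical value of 3.841. The sketch assumes no row or column total is zero.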
1. $\chi^2$ test for the difference between two proportions:
$H_0: p_1 = p_2$, $H_1: p_1 \neq p_2$
If $\chi^2 > \chi^2_1$ (the critical value with 1 degree of freedom), reject $H_0$.
Ti
STAT 5101: Foundations of Data Science (2014 - 2015)
Mid-Term Examination
Oct. 21, 7:00pm - 9:00pm
Answer all questions.
1. (10%) Multiple-choice test questions.
(1) The width of each bar in a histogram corresponds to the
a. differences between the boundar
STAT5101: Foundations of Data Science
Assignment 3
Academic year 15/16, First term
Deadline: During Class, Nov 10 (TUE), 2015.
1. The following data represent the number of days absent per year in a population of six employees
of a small company:
1
5
6
8
STAT5101: Foundations of Data Science
Assignment 1 Solution
Academic year 15/16, First term
Deadline: During Class, Sep 29 (TUE), 2015.
1. For each of the following variables, determine whether the variable is categorical or numerical. If
the variable is
STAT5101: Foundations of Data Science
Assignment 4
Academic year 12/13, First term
1. The manager of a paint supply store wants to determine whether the mean amount of paint contained in 1-gallon cans purchased from a nationally known manufacturer is actu
STAT5101: Foundations of Data Science
Assignment 4
Academic year 14/15, First term
Deadline: During Class, Nov 25 (TUE), 2014.
1. Children in the United States account directly for $36 billion in sales annually. When their indirect
influence over product
STAT 5101 Assignment 4
Suggested Solution
1.
a.
, where
Under
b.
the population mean amount of paint
and
. The
test statistic is between the two critical values. Hence, at the 0.05 level of
significance, there is not enough evidence that the population mean am
Solution for STAT5101 Assignment 2
1.
(a) P(10<X<30) = (30 - 10)/120 = 0.1667
(b)
2.
The normal probability plot confirms that the data appear to be approximately normally distributed.
3.
4.
5.
(a)
(b)
(c)
6.
(a)
(b)
7.
(a)
(b)
Solution for STAT5101 Midterm Exam
1. (10%)
A, B, D
D, D, D
2. (20%)
(c)
(d)
3. (20%)
You may also calculate P(X = 0). Its being close to 0 also supports the conclusion.
4. (15%)
5. (15%)
(a)
(b)
(c)
6. (20%)