QM203
Engel coefficient
My project discusses the regression of Engel Coefficient (Y) on Disposable
Income (X1), New Residential Area (X2), and Year-end Public Saving Deposit (X3) in
China during the years 1978, 1980, and 1985-2012. The entire statistic com
Regression with Categorical Variables
1. To express concepts that are not generally quantitative we
use categorical (qualitative) variables.
2. In regression, categorical variables are built using dummy
0,1 variables.
3. Always use one less dummy variable
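The dummy-variable idea in the list above can be sketched in a few lines of Python. The `regions` data and the `make_dummies` helper are hypothetical and exist only for illustration. The sketch assumes the usual convention that a variable with k categories gets k - 1 dummy columns, with the omitted category serving as the baseline.

```python
# Hypothetical categorical variable with k = 3 levels (illustration only).
regions = ["North", "South", "West", "South", "North"]

def make_dummies(values, baseline):
    """Return (levels, rows): one 0/1 column per non-baseline level."""
    levels = sorted(set(values) - {baseline})
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows

levels, dummy_rows = make_dummies(regions, baseline="North")
print(levels)  # names of the k - 1 dummy columns
for region, row in zip(regions, dummy_rows):
    print(region, row)
```

A row of all zeros marks the baseline category, which is why a separate column for it is not needed.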
Multiple Comparison Tests
Fisher's Least Significant Difference Procedure
The result of an ANOVA test will only tell us if there is a difference among
the groups, i.e. treatments, that were tested. The results do not tell us
which, if any group or treatm
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.814905707
R Square             0.664071312
Adjusted R Square    0.622080226
Standard Error       1.001791873
Observations         10

Y = Travel Time
X1 = Miles Travelled

ANOVA
              df    SS    MS    F
Regression    1     15
Residual      8
Total         9

Intercept
X1
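As a quick consistency check on the Regression Statistics, the following sketch recomputes Multiple R and Adjusted R Square from the reported R Square and number of observations. It assumes a single independent variable (k = 1), which matches the one predictor X1 and the Regression df of 1. The variable names are illustrative.

```python
import math

r_square = 0.664071312  # R Square from the summary output
n = 10                  # Observations
k = 1                   # assumed: one independent variable (X1)

# Multiple R is the square root of R Square.
multiple_r = math.sqrt(r_square)

# Adjusted R Square penalizes R Square for the number of predictors.
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)

print(round(multiple_r, 6))
print(round(adjusted_r_square, 6))
```

Both values agree with the figures in the summary output to the precision shown there.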
Coefficient of Determination
We have determined the best regression line for our Consumer Durables
data set using the Least Squares Method. The question now becomes: how
well does this regression line fit (represent) the data?
Calculating SSE
We recall th
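A minimal sketch of how the fit of a least squares line can be measured follows. The data are hypothetical and are not the Consumer Durables data set. The formula R² = 1 - SSE/SST is the standard definition of the coefficient of determination, assumed here because the original text is cut off before stating it.

```python
# Hypothetical (x, y) pairs; NOT the Consumer Durables data set.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# SSE: squared deviations of the data from the fitted line.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
# SST: squared deviations of the data from the mean of y.
sst = sum((yi - y_bar) ** 2 for yi in y)

r_square = 1 - sse / sst
print(b0, b1, sse, sst, r_square)
```

An R² near 1 means the line accounts for most of the variation in y; a value near 0 means it accounts for very little.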
Analysis of Variance
I. We have just studied the case where we wish to make
inferences about the means of two populations. Now we
will look at the case where we have more than two
populations to consider.
II. An example
Consider a problem in which it is d
Chi Square Test of Independence
Bank Location Problem
Observed Frequencies
Account Size    Prefer Location A    Prefer Location B    Total
Small           120                  280                  400
Large           240                  360                  600
Total           360                  640
Calculating chi square
Account Size    Prefer Location A    Prefer Location B    Total
Small           144                  256                  400
Large           216
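The chi square calculation for the Bank Location Problem can be sketched as follows. The cell layout (120 and 280 for small accounts, 240 and 360 for large accounts) is read from the observed counts and their marginal totals of 400, 600, 360, and 640. The variable names are illustrative, and the formulas are the standard ones for a test of independence.

```python
# Observed frequencies from the Bank Location Problem.
# Rows: account size (Small, Large); columns: prefer Location A, Location B.
observed = [[120, 280],
            [240, 360]]

row_totals = [sum(row) for row in observed]        # Small, Large
col_totals = [sum(col) for col in zip(*observed)]  # Location A, Location B
grand_total = sum(row_totals)

# Expected frequency under independence:
# row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

print(expected)
print(round(chi_square, 4), degrees_of_freedom)
```

The first row of `expected` reproduces the 144 and 256 shown for small accounts, and the statistic would be compared against a chi square critical value with 1 degree of freedom.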
Confidence Intervals
I. Confidence Intervals for the Mean, Population Standard
Deviation, σ, Known
We would like to estimate the value of μ from sample data.
We begin with the value of X̄, which we can calculate from sample data.
We know that the value X̄, hav
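For orientation, here is a small sketch of the standard interval for a mean when the population standard deviation is known: sample mean plus or minus z times σ/√n. All numbers are hypothetical, and the formula is the usual textbook one, assumed here because the original text is cut off before it is stated.

```python
import math
from statistics import NormalDist

# Hypothetical inputs for illustration only.
x_bar = 50.0       # sample mean
sigma = 10.0       # known population standard deviation
n = 25             # sample size
confidence = 0.95  # confidence level

# z value that leaves (1 - confidence) / 2 in each tail.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error and interval endpoints.
margin = z * sigma / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(round(lower, 2), round(upper, 2))
```

A larger sample size or a smaller σ narrows the interval; a higher confidence level widens it.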
Multiple Regression
Consumer Durables Expenditures Problem
Recall from our earlier lecture that we wish to fit the following
relationship
y = β0 + β1x1 + β2x2
where
y = consumer durables expenditures (CDX) ($100)
x1 = net income (NI) ($1000)
x2 = family size.
Two Sample Tests on the Mean
I. Previously we have studied single-population statistical inference on the
mean. That is, we have used the sample mean X̄ to infer something about the
population mean μ.
II. Two Sample Tests
We can also make inferences about m
Some Regression Model Assumptions
Recall the following from our first lecture on Simple Linear
Regression
First Order Linear Model
y = β0 + β1x + ε    (1)
where
y = E(y) = Dependent variable
x = Independent Variable
β0 = y intercept of line
β1 = Slope of line
ε =
Simple Linear Regression
1. Regression Analysis Provides a Methodology to:
Relate the mean value of a single dependent variable to the values of
one or more independent variables.
2. Exact Relationships vs. Statistical Relationships
Exact: c² = a² + b²;
Statistical Inference and Sampling Distributions
I. What is Statistical Inference?
We want to estimate or make a statement about a population parameter - say the
population mean or the population standard deviation.
The way to do this without experienci
Test for Independence
Testing for the Independence of Two Variables
In this test we use the chi square distribution to test if two categorical
variables are Independent of each other.
Let's look at the following problem:
Suppose a savings bank in a metrop
ASSIGNMENT 9
15. a.
S = {ace of clubs, ace of diamonds, ace of hearts, ace of spades}
b.
S = {2 of clubs, 3 of clubs, . . . , 10 of clubs, J of clubs, Q of clubs, K of clubs, A of clubs}
c.
There are 12; jack, queen, or king in each of the four suits.
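The three counts in this answer can be checked by enumerating a standard 52-card deck. This is a sketch; the rank and suit labels are simply one convenient encoding.

```python
from itertools import product

ranks = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
suits = ["clubs", "diamonds", "hearts", "spades"]

# Every (rank, suit) pair is one card in the sample space.
deck = list(product(ranks, suits))

aces = [card for card in deck if card[0] == "A"]
clubs = [card for card in deck if card[1] == "clubs"]
face_cards = [card for card in deck if card[0] in {"J", "Q", "K"}]

print(len(aces), len(clubs), len(face_cards))
```

The counts match parts a, b, and c: four aces, thirteen clubs, and twelve face cards.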
ASSIGNMENT 5
5.
15, 20, 25, 25, 27, 28, 30, 34
2nd position = 20
6th position = 28
9.
a.
b.
Order the data from low 6.7 to high 36.6
Median
Use 5th and 6th positions.
c.
Mode = 7.2 (occurs 2 times)
d.
Use 3rd position. Q1 = 7.2
Use 8th position. Q3 = 17.2
ASSIGNMENT 1
3. a. 360 x 58/120 = 174
b. 360 x 42/120 = 126
c.
d.
7. a.
Rating       Frequency    Percent Frequency
Excellent    20           40
Very Good    23           46
Good         4            8
Fair         1            2
Poor         2            4
Total        50           100
Management should be very pleased with the survey results. 40% + 46% = 8
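The percent frequency column in problem 7 follows directly from the frequencies, as this short sketch shows. The variable names are illustrative.

```python
ratings = ["Excellent", "Very Good", "Good", "Fair", "Poor"]
frequency = [20, 23, 4, 1, 2]

total = sum(frequency)

# Percent frequency = 100 * class frequency / total number of responses.
percent = [100 * f / total for f in frequency]

# Combined share of the two most favorable ratings.
top_two = percent[0] + percent[1]

for rating, f, pct in zip(ratings, frequency, percent):
    print(rating, f, pct)
print(total, sum(percent), top_two)
```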
ASSIGNMENT 4
27. a.
b.
c.
d.
Category A values for x are always associated with category 1 values for y. Category B values for x are usually
associated with category 1 values for y. Category C values for x are usually associated with category 2 values for
ASSIGNMENT 2
12.
Class                       Cumulative Frequency    Cumulative Relative Frequency
less than or equal to 19    10                      .20
less than or equal to 29    24                      .48
less than or equal to 39    41                      .82
less than or equal to 49    48                      .96
less than or equal to 59    50                      1.00
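The cumulative relative frequencies in problem 12 can be reproduced from the cumulative frequencies, and the individual class frequencies can be recovered by differencing. This is a sketch with illustrative variable names.

```python
upper_limits = [19, 29, 39, 49, 59]
cumulative = [10, 24, 41, 48, 50]

# The last cumulative frequency is the total number of observations.
total = cumulative[-1]

# Cumulative relative frequency = cumulative frequency / total.
cumulative_relative = [round(c / total, 2) for c in cumulative]

# Class frequencies are the differences between successive cumulative counts.
class_frequency = [cumulative[0]] + [
    later - earlier for earlier, later in zip(cumulative, cumulative[1:])
]

for limit, c, cr in zip(upper_limits, cumulative, cumulative_relative):
    print("less than or equal to", limit, c, cr)
print(class_frequency)
```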
17. a/b.
Waiting Tim
ASSIGNMENT 13
7.
a.
f (x) ≥ 0 for all values of x.
Σ f (x) = 1. Therefore, it is a proper probability distribution.
b.
c.
Probability x ≤ 25 is f (20) + f (25) = .20 + .15 = .35
d.
8.
x
1
2
3
4
Probability x = 30 is f (30) = .25
Probability x > 30 is f (35) = .
ASSIGNMENT 15
32. a.
f (0) = .3487
b.
f (2) = .1937
c.
P(x ≤ 2) = f (0) + f (1) + f (2) = .3487 + .3874 + .1937 = .9298
d.
P(x ≥ 1) = 1 - f (0) = 1 - .3487 = .6513
e.
E(x) = n p = 10 (.1) = 1
f.
Var(x) = n p (1 - p) = 10 (.1) (.9) = .9
σ = √.9 = .95
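The values in problem 32 are consistent with a binomial distribution with n = 10 and p = .1, as read from part e. The sketch below recomputes them from the binomial probability function. The function name `f` mirrors the notation in the answers.

```python
import math

n, p = 10, 0.1  # taken from E(x) = n p = 10 (.1)

def f(x):
    """Binomial probability of exactly x successes in n trials."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

p_at_most_2 = f(0) + f(1) + f(2)
p_at_least_1 = 1 - f(0)
mean = n * p
variance = n * p * (1 - p)
std_dev = math.sqrt(variance)

print(round(f(0), 4), round(f(2), 4))
print(round(p_at_most_2, 4), round(p_at_least_1, 4))
print(mean, round(variance, 2), round(std_dev, 2))
```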
33. a.
f (12)
ASSIGNMENT 11
30. a.
b.
c.
31. a.
No, because P(A | B) ≠ P(A)
P(A ∩ B) = 0
b.
c.
No. P(A | B) ≠ P(A); the events, although mutually exclusive, are not independent.
d.
Mutually exclusive events are dependent.
33. a.
Business
Undergraduate Major
Engineering
Other
ASSIGNMENT 22
10. a.
b.
Upper tail p-value is the area to the right of the test statistic
Using normal table with z = 1.48: p-value = 1.0000 - .9306 = .0694
c.
p-value > .01, do not reject H0
d.
Reject H0 if z ≥ 2.33
1.48 < 2.33, do not reject H0
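Both decision rules in problem 10 can be checked numerically. The sketch below uses z = 1.48 and a significance level of .01, as in the answers, with Python's standard normal distribution in place of the printed normal table.

```python
from statistics import NormalDist

z = 1.48      # test statistic from part b
alpha = 0.01  # significance level used in parts c and d

std_normal = NormalDist()

# p-value approach: area in the upper tail to the right of z.
p_value = 1 - std_normal.cdf(z)
reject_by_p_value = p_value <= alpha

# Critical value approach: reject H0 if z is at least the upper-tail cutoff.
critical_value = std_normal.inv_cdf(1 - alpha)
reject_by_critical_value = z >= critical_value

print(round(p_value, 4), reject_by_p_value)
print(round(critical_value, 2), reject_by_critical_value)
```

The two approaches always lead to the same conclusion; here neither rejects H0.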
11. a.
b.