Viva La Correlación!
• Say X and Y are arbitrary random variables
   Correlation of X and Y, denoted $\rho(X, Y)$:
      $$\rho(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X)\,\operatorname{Var}(Y)}}$$
   Note: $-1 \le \rho(X, Y) \le 1$
   Correlation measures linearity between X and Y
   $\rho(X, Y) = 1 \;\Rightarrow\; Y = aX + b$, where $a = \sigma_y / \sigma_x$
   $\rho(X, Y) = -1 \;\Rightarrow\; Y = aX + b$, where $a = -\sigma_y / \sigma_x$
   $\rho(X, Y) = 0 \;\Rightarrow\;$ absence of linear relationship
      o But, X and Y can still be related in some other way!
   If $\rho(X, Y) = 0$, we say X and Y are “uncorrelated”
      o Note: Independence implies uncorrelated, but not vice versa!
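A minimal sketch of the three cases, assuming a toy distribution where X is uniform on {-2, -1, 0, 1, 2} (the values, the helper name `correlation`, and the slopes are illustrative choices, not from the slide). It computes the correlation directly from Cov and Var, and shows that Y = X² is uncorrelated with X even though it is fully determined by X.

```python
import math

def correlation(xs, ys):
    """Correlation of X and Y when the (x, y) pairs are equally likely."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / math.sqrt(var_x * var_y)

# Assumed toy distribution: X uniform on five symmetric values.
xs = [-2, -1, 0, 1, 2]

rho_up = correlation(xs, [3 * x + 1 for x in xs])     # Y = aX + b with a > 0
rho_down = correlation(xs, [-3 * x + 1 for x in xs])  # Y = aX + b with a < 0
rho_square = correlation(xs, [x * x for x in xs])     # Y = X^2: dependent on X

print(rho_up, rho_down, rho_square)
```

The last case is the "not vice versa" direction: the correlation is zero, yet knowing X tells you Y exactly.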
Fun with Indicator Variables
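• Let $I_A$ and $I_B$ be indicators for events A and B
      $$I_A = \begin{cases} 1 & \text{if } A \text{ occurs} \\ 0 & \text{otherwise} \end{cases} \qquad I_B = \begin{cases} 1 & \text{if } B \text{ occurs} \\ 0 & \text{otherwise} \end{cases}$$
   $E[I_A] = P(A)$, $E[I_B] = P(B)$, $E[I_A I_B] = P(AB)$
      $$\begin{aligned} \operatorname{Cov}(I_A, I_B) &= E[I_A I_B] - E[I_A]\,E[I_B] \\ &= P(AB) - P(A)P(B) \\ &= P(A \mid B)P(B) - P(A)P(B) \\ &= P(B)\,[P(A \mid B) - P(A)] \end{aligned}$$
   The sign of $\operatorname{Cov}(I_A, I_B)$ is determined by $P(A \mid B) - P(A)$
   $P(A \mid B) > P(A) \;\Rightarrow\; \rho(I_A, I_B) > 0$
   $P(A \mid B) = P(A) \;\Rightarrow\; \rho(I_A, I_B) = 0$ (and $\operatorname{Cov}(I_A, I_B) = 0$)
   $P(A \mid B) < P(A) \;\Rightarrow\; \rho(I_A, I_B) < 0$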
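As an illustration of the three sign cases, here is a small exact computation on an assumed sample space: one roll of a fair six-sided die. The events (`even`, `high`, `low`, `at_most_two`) and helper names are hypothetical choices for this sketch; exact fractions avoid rounding.

```python
from fractions import Fraction

# Assumed toy sample space: one roll of a fair six-sided die.
OUTCOMES = range(1, 7)

def prob(event):
    """P(event) when all outcomes are equally likely."""
    favorable = sum(1 for w in OUTCOMES if event(w))
    return Fraction(favorable, len(OUTCOMES))

def indicator_cov(a, b):
    """Cov(I_A, I_B) = P(AB) - P(A) P(B)."""
    return prob(lambda w: a(w) and b(w)) - prob(a) * prob(b)

def conditional_form(a, b):
    """P(B) [P(A | B) - P(A)], assuming P(B) > 0."""
    p_b = prob(b)
    p_a_given_b = prob(lambda w: a(w) and b(w)) / p_b
    return p_b * (p_a_given_b - prob(a))

even = lambda w: w % 2 == 0
high = lambda w: w >= 4          # P(even | high) = 2/3 > 1/2
low = lambda w: w <= 3           # P(even | low) = 1/3 < 1/2
at_most_two = lambda w: w <= 2   # P(even | at_most_two) = 1/2

cov_pos = indicator_cov(even, high)
cov_neg = indicator_cov(even, low)
cov_zero = indicator_cov(even, at_most_two)

print(cov_pos, cov_neg, cov_zero)
```

Both forms of the covariance agree for each pair of events, and the sign follows whether conditioning on B raises, lowers, or leaves unchanged the probability of A.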
Can’t Get Enough of that Multinomial
• Multinomial distribution
   n independent trials of an experiment are performed
   Each trial results in one of m outcomes, with respective probabilities $p_1, p_2, \ldots, p_m$, where $\sum_{i=1}^{m} p_i = 1$
      $$P(X_1 = c_1, X_2 = c_2, \ldots, X_m = c_m) = \binom{n}{c_1, c_2, \ldots, c_m}\, p_1^{c_1} p_2^{c_2} \cdots p_m^{c_m}$$
   where $X_i$ = number of trials with outcome i
   E.g., rolling a 6-sided die multiple times and counting how many of each value {1, 2, 3, 4, 5, 6} we get
   Would expect that the $X_i$ are negatively correlated
   Let’s see... when $i \ne j$, what is $\operatorname{Cov}(X_i, X_j)$?
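One way to explore the question numerically is to enumerate a small multinomial exactly. The sketch below assumes n = 4 trials and probabilities (1/2, 1/3, 1/6); these numbers and the helper names are illustrative only. It evaluates the PMF above with exact fractions, checks that it sums to 1, and computes Cov(X_1, X_2) from the definition. The result is negative, and it matches the standard textbook closed form -n p_i p_j, which is not derived in this section.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def multinomial_pmf(counts, probs):
    """P(X_1 = c_1, ..., X_m = c_m) for n = sum(counts) trials."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # stays an integer at every step
    p = Fraction(coef)
    for c, q in zip(counts, probs):
        p *= q ** c
    return p

def count_vectors(n, m):
    """All (c_1, ..., c_m) of non-negative integers summing to n."""
    return [c for c in product(range(n + 1), repeat=m) if sum(c) == n]

# Assumed toy parameters (not from the slide).
n = 4
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

pmf = {c: multinomial_pmf(c, probs) for c in count_vectors(n, len(probs))}
total = sum(pmf.values())

def expect(f):
    """E[f(X_1, ..., X_m)] by exact summation over the support."""
    return sum(f(c) * p for c, p in pmf.items())

cov_12 = expect(lambda c: c[0] * c[1]) - expect(lambda c: c[0]) * expect(lambda c: c[1])

print(total, cov_12, -n * probs[0] * probs[1])
```

Brute-force enumeration is only practical for small n and m, but it keeps the check exact and independent of any closed-form derivation.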