Sample mean and sample variance:
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$

Rule 1: If X is a random variable with mean $\mu_X$ and a and b are fixed constants, then the mean of a + bX is $E(a + bX) = a + b\mu_X$.
Rule 2: If X and Y are random variables with means $\mu_X$ and $\mu_Y$ respectively, then the mean of X + Y is $E(X + Y) = \mu_X + \mu_Y$.
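As a rough illustration of the sample mean, the sample variance, and Rule 1, the sketch below checks them on a small made-up data set. The data values, the constants a and b, and the helper name `linear_rule_holds` are assumptions chosen only for this example.

```python
from statistics import mean, variance

def linear_rule_holds(xs, a, b, tol=1e-9):
    """Check on sample data that the mean of a + b*x equals a + b*mean(x)."""
    transformed = [a + b * x for x in xs]
    return abs(mean(transformed) - (a + b * mean(xs))) < tol

xs = [2.0, 4.0, 4.0, 5.0, 7.0]   # made-up observations (assumption)
x_bar = mean(xs)                  # (1/n) * sum of x_i
s2 = variance(xs)                 # sum of (x_i - x_bar)^2, divided by n - 1

print(x_bar, s2, linear_rule_holds(xs, a=3.0, b=2.0))
```

Note that `statistics.variance` divides by n - 1, which matches the sample variance formula; `statistics.pvariance` would divide by n instead.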
Correlation and least-squares regression:
$r = \frac{1}{n - 1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right) \left(\frac{y_i - \bar{y}}{s_y}\right)$ (the sample estimate for the correlation)
$b = r \frac{s_y}{s_x}$, $a = \bar{y} - b\bar{x}$
The least-squares line $\hat{y} = a + bx$ minimizes the quantity
$\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} [y_i - (a + bx_i)]^2$
Error variance: $(1 - r^2) s_y^2$

Probability of events:
P(A & B) = $P(A \cap B)$: the probability of A AND B both occurring is the "intersection" of the sets.
P(A or B) = $P(A \cup B)$: the probability of either A or B or BOTH occurring is the union of the sets.
$0 \le P(A \,\&\, B) \le P(A), P(B) \le P(A \text{ or } B) \le P(A) + P(B)$
P(A or B) = P(A) + P(B) - P(A & B)
$P(A \mid B) = \frac{P(A \,\&\, B)}{P(B)}$: definition of conditional probability.
$P(A \,\&\, B) = P(A \mid B) \times P(B)$: just a use of that definition of conditional probability.
$P(A \text{ and } B) = P(A) \times P(B \mid A)$

A continuous random variable is a variable taking all possible values in an interval of numbers.
If X is a continuous random variable, its probability distribution is described by a density curve.
The probability of an event is the area under the curve above the values of X that make up the event.

For a discrete random variable X taking values $x_1, x_2, \ldots, x_k$ with probabilities $p_1, p_2, \ldots, p_k$, the mean is
$\mu_X = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k$
The variance of X, denoted $E(X - \mu_X)^2 = \sigma_X^2$, is given by
$\sigma_X^2 = (x_1 - \mu_X)^2 p_1 + (x_2 - \mu_X)^2 p_2 + \cdots + (x_k - \mu_X)^2 p_k$
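The mean and variance of a discrete random variable, as weighted sums over its values, can be sketched as follows. The fair six-sided die is an assumed example, and the function names are hypothetical; exact fractions are used so the results are not blurred by rounding.

```python
from fractions import Fraction

def discrete_mean(values, probs):
    """mu_X = x_1*p_1 + x_2*p_2 + ... + x_k*p_k"""
    return sum(x * p for x, p in zip(values, probs))

def discrete_variance(values, probs):
    """sigma_X^2 = (x_1 - mu_X)^2 * p_1 + ... + (x_k - mu_X)^2 * p_k"""
    mu = discrete_mean(values, probs)
    return sum((x - mu) ** 2 * p for x, p in zip(values, probs))

# Assumed example: a fair die, each face with probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

mu = discrete_mean(faces, probs)
var = discrete_variance(faces, probs)
print(mu, var)
```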
[Figure: diagrams of (a) causation, (b) common response, (c) confounding]

r^2 is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x.
Thus, r^2 = variance of $\hat{y}$ / variance of y, where $\hat{y}$ are the predicted values ($\hat{y} = a + bx$) and y are the observed values.
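The sketch below fits a least-squares line using the sheet's formulas for r, b, and a, then compares r^2 with the ratio of the variance of the predicted values to the variance of the observed values. The five data points and the function name `least_squares` are assumptions made for illustration.

```python
from statistics import mean, stdev, variance

def least_squares(xs, ys):
    """Return (a, b, r) for the line y_hat = a + b*x."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)   # sample standard deviations (n - 1)
    # r = 1/(n-1) * sum of products of standardized x and y values
    r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
            for x, y in zip(xs, ys)) / (n - 1)
    b = r * s_y / s_x                 # slope
    a = y_bar - b * x_bar             # intercept
    return a, b, r

# Made-up data (assumption).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

a, b, r = least_squares(xs, ys)
y_hat = [a + b * x for x in xs]
r2_from_variances = variance(y_hat) / variance(ys)
print(a, b, r ** 2, r2_from_variances)
```

For this data the two ways of computing r^2 agree up to floating-point rounding.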
Log(A x B) = Log A + Log B
Log(A^b) = b x Log(A)
If log(A) = c, then A = 10^c
If ln(A) = c, then A = e^c
Questions to ask when designing a study:
1) Who or what is the object of study (individuals)?
2) Will the study be observational or experimental? (If experimental, how will treatments be assigned?)
3) How will the individuals be selected?
4) How many individuals will be studied?
5) What variables will be measured?

Stratified random sample:
1) Divide the population into groups of similar individuals, called strata
2) Choose a separate simple random sample within each stratum
3) Combine these simple random samples together to form the full sample (in the correct proportion)

Rule 3. For any event A, P(not A) = 1 - P(A)
Rule 4. Addition rule: If A and B are disjoint events (no elements in common) then P(A or B) = P(A) + P(B)
Rule 5. Multiplication rule: If A and B are independent events then P(A and B) = P(A) x P(B)
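The three stratified-sampling steps can be sketched roughly as below. The function name `stratified_sample`, the proportional allocation with at least one unit per stratum, the fixed seed, and the urban/rural population are all assumptions for this example, not details taken from the sheet.

```python
import random

def stratified_sample(population, stratum_of, fraction, seed=0):
    """Draw a simple random sample of the given fraction from each stratum."""
    rng = random.Random(seed)
    # Step 1: divide the population into strata.
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    # Step 2: take a separate simple random sample within each stratum.
    # Step 3: combine the per-stratum samples into the full sample.
    sample = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))  # assumed allocation rule
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 urban and 40 rural individuals.
population = [("urban", i) for i in range(60)] + [("rural", i) for i in range(40)]
sample = stratified_sample(population, stratum_of=lambda u: u[0], fraction=0.1)
print(len(sample))
```

Sampling the same fraction from every stratum keeps the strata in the same proportion in the sample as in the population.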
The binomial setting:
1. Fixed number (n) of observations, or "trials"
2. The n trials are all independent of each other
3. Each trial has exactly two possible outcomes: "success" (S) or "failure" (F)
4. The probability of success at each trial (denoted by p) is constant (i.e. P(S) = p)

A random variable is a variable whose value is a numerical outcome of a random phenomenon.
A discrete random variable is a variable having a finite number of possible values.
A continuous random variable is a variable taking all possible values in an interval of numbers.

Definition: Events A and B are independent if the fact that one of them happened does not imply anything about the chance of the other one happening.
Definition: Events A and B are mutually exclusive if they can't happen at the same time: $P(A \cap B) = 0$.

Bayes's rule for an event A and its complement $A^c$:
$P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B \mid A) P(A) + P(B \mid A^c) P(A^c)}$
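As a check on the two-event form of Bayes's rule, P(A | B) = P(B | A)P(A) / [P(B | A)P(A) + P(B | A^c)P(A^c)], the sketch below plugs in the figures from the HIV testing example in this sheet: a 95% chance of a positive test with antibodies, a 95% chance of a negative test without them, and 0.6% prevalence. The function and parameter names are hypothetical.

```python
def bayes_posterior(prior, p_pos_given_a, p_pos_given_not_a):
    """P(A | positive) from P(A), P(positive | A) and P(positive | not A)."""
    numerator = p_pos_given_a * prior
    evidence = numerator + p_pos_given_not_a * (1 - prior)
    return numerator / evidence

# A = person has the antibodies, B = test is positive.
# P(positive | not A) = 1 - P(negative | not A) = 1 - 0.95 = 0.05
posterior = bayes_posterior(prior=0.006,
                            p_pos_given_a=0.95,
                            p_pos_given_not_a=0.05)
print(round(posterior, 3))
```

The result is roughly 0.10, in line with the "only 10%" figure quoted in the example.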
BASIC RULES OF PROBABILITY
1. $0 \le P(A) \le 1$. The probability of any event A is between 0 and 1.
2. P(not A) = 1 - P(A). The probability that event A does not happen is one minus the probability that it does.
3a. If events $A_1, A_2, A_3, \ldots$ are mutually exclusive, then
$P(A_1 \text{ or } A_2 \text{ or } A_3 \text{ or } \ldots) = P(A_1 \cup A_2 \cup A_3 \cup \cdots) = P(A_1) + P(A_2) + P(A_3) + \cdots$
The probability that one of several mutually exclusive events happens is the sum of their probabilities.
Example: The probability of rolling a die and getting a 1 or a 6 is 1/6 + 1/6 = 1/3.
3b. $P(A \cup B) = P(A) + P(B) - P(A \cap B)$. This is the probability that A or B or both occur.
4. If events $A_1, A_2, A_3, \ldots$ are independent, then
$P(A_1 \text{ and } A_2 \text{ and } A_3 \text{ and } \ldots) = P(A_1 \cap A_2 \cap A_3 \cap \cdots) = P(A_1) P(A_2) P(A_3) \cdots$
Example: The probability of rolling a 1 and then a 6 is 1/6 x 1/6 = 1/36.

CONDITIONAL PROBABILITY
Definition: The probability that event B will occur if you know that event A has already occurred is denoted $P(B \mid A)$ and called the conditional probability of B given A.
Multiplication rule: $P(A \cap B) = P(B \mid A) P(A)$
The probability of A and B is the probability of B given A times the probability of A.
If A and B are independent events, then $P(B \mid A) = P(B)$.

BAYES'S RULE
Bayes's rule is a statement about the relationship between the marginal probability P(A), the joint probability $P(A \cap B)$ [also written P(A, B)], and the conditional probability $P(A \mid B)$.
$P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{P(A \mid B) P(B)}{\sum_i P(A \mid B_i) P(B_i)}$
AIDS testing example: Suppose that if a person has the HIV antibodies, then the probability of a positive test is 95%, and that if the person does not have the antibodies, then the probability of a negative test is also 95%. Assume 0.6% of the population has the HIV antibodies. Bayes's rule can be used to show that if a particular person from the population tests positive, the probability that he or she actually has the antibodies is only 10%!

Definition: A Bernoulli trial is an experiment consisting of repeated trials with only two possible outcomes, typically success with probability p or failure with probability q (q = 1 - p).
1. Binomial distribution: A discrete probability distribution with probability density function
$f(x) = \binom{n}{x} p^x q^{n - x}$
where x = 0, 1, 2, ..., n. This is the distribution of the number of successes in n independent Bernoulli trials. This distribution has mean $\mu = np$ and variance $\sigma^2 = npq$.

Selection bias: some groups in the population are over or underrepresented in the sample.

The correlation of random variables X and Y, often denoted $\rho(X, Y)$, is the covariance of X and Y scaled so that it is independent of the units of measurement. It can be between -1 and 1.
$\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$

1. Coefficient of determination, $R^2$
The coefficient of determination is the ratio of explained variation (about the mean) over the total variation (about the mean) and is interpreted as the proportion of the observed variation in Y that can be explained by the regression model.
$R^2 = \frac{\sum (\hat{y}_i - \bar{y})^2}{\sum (y_i - \bar{y})^2}$
2. Standard error of regression
The standard error is an estimate of the amount of variability inherent in the regression model.
$\hat{\sigma}^2 = s^2 = \frac{SSE}{n - 2} = \frac{\sum (y_i - \hat{y}_i)^2}{n - 2}$
where SSE is the sum of the squared errors. Mean squared error: MSE = SSE / (n - 2).

Nonresponse bias: nonrespondents may differ in important ways from respondents.
Response bias: e.g., wording of questions, telescoping in the recall of events.

Response variable, denoted as Y, measures the outcome of a study. Y is the variable we want to predict/explain (often called the dependent variable).
Explanatory variable, denoted as X, is a variable that may predict/explain (but not necessarily cause) the response variable (often called the predictor variable). Frequently there are many possible explanatory variables.

If X has a binomial distribution with n independent observations and a constant probability of success, p, for each observation, then we say X ~ B(n, p), and
$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$
where
$\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$ and $n! = n \times (n - 1) \times (n - 2) \times \cdots \times 1$
The mean, variance and standard deviation of X are
$\mu_X = np$, $\sigma_X^2 = np(1 - p)$, $\sigma_X = \sqrt{np(1 - p)}$
Use the correction for continuity if n < 10 000.
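To illustrate the binomial probability formula together with the mean np and variance np(1 - p), here is a minimal sketch that builds the whole distribution for one case. The values n = 10 and p = 0.3 and the name `binomial_pmf` are assumptions for the example.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """P(X = k) = C(n, k) * p^k * (1 - p)^(n - k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.3   # illustrative values (assumption)
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean and variance computed directly from the distribution.
mean_x = sum(k * pk for k, pk in enumerate(pmf))
var_x = sum((k - mean_x) ** 2 * pk for k, pk in enumerate(pmf))
sd_x = sqrt(var_x)

print(mean_x, n * p)            # both close to 3.0
print(var_x, n * p * (1 - p))   # both close to 2.1
```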
There are two ways to view binomial variables:
X = the count of the number of successes in a sample of size n
X / n = the proportion of successes in a sample of size n (called the sample proportion)
For a population of individuals with a true proportion p of "success", p is the population parameter.
The sample proportion X / n is the statistic from a sample that estimates the population parameter p.
We will denote X / n = $\hat{p}$.

If X has a binomial distribution with n independent observations and a constant probability of success, p, then for the sample proportion X / n = $\hat{p}$:
$\mu_{\hat{p}} = p$, $\sigma_{\hat{p}}^2 = \frac{p(1 - p)}{n}$, $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$
Recall that if X ~ B(n, p):
$\mu_X = np$, $\sigma_X^2 = np(1 - p)$, $\sigma_X = \sqrt{np(1 - p)}$
If X ~ B(n, p), $np \ge 10$ and $n(1 - p) \ge 10$, then approximately
$X \sim N\left(np, \sqrt{np(1 - p)}\right)$

Normal quantile plot:
1. Sort the observed data from smallest to largest.
2. Record the observed percentiles. For example, the smallest observation in a set of 50 is at the 2% point, the second smallest is at 4%, etc.
3. Do normal quantile calculations to determine the theoretical percentiles. For example,
   1. z = -2.05 is the 2% point,
   2. z = -1.75 is the 4% point, etc.
Plot each data point y (vertical axis) against the corresponding z (horizontal axis).
If the data distribution is close to the normal distribution then the plotted points will lie close to a straight line.

Ecological fallacy: concluding (perhaps incorrectly) that relationships holding for groups necessarily hold for individuals in those groups ...
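The normal quantile steps in this sheet can be sketched with Python's standard library. The function name `normal_quantile_points` is hypothetical, and dropping the largest observation is an assumption: with the i/n percentile convention used in the sheet's example it sits at the 100% point, which has no finite z.

```python
from statistics import NormalDist

def normal_quantile_points(data):
    """Return (z, y) pairs for a normal quantile plot, using percentile i/n."""
    ordered = sorted(data)                 # step 1: sort smallest to largest
    n = len(ordered)
    std_normal = NormalDist()              # mean 0, standard deviation 1
    points = []
    for i, y in enumerate(ordered, start=1):
        percentile = i / n                 # step 2: observed percentile
        if percentile < 1:                 # skip the 100% point (assumption)
            z = std_normal.inv_cdf(percentile)   # step 3: theoretical z
            points.append((z, y))
    return points

# Made-up data set of 50 observations (assumption).
data = [float(v) for v in range(50)]
points = normal_quantile_points(data)
print(round(points[0][0], 2), round(points[1][0], 2))
```

For 50 observations the first two z values round to -2.05 and -1.75, matching the 2% and 4% points quoted in step 3.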
 Spring '07
 Kirnan
