notes_for_5mar - The sample space(the set of possible...

The sample space (the set of possible values) for a discrete random variable is either finite or countably infinite. The probability distribution of a discrete random variable gives the probability that the random variable equals any single value. Here the single value $x$ of $X$ is an outcome of the random experiment of observing $X$. This means we think of an event as the occurrence of the single number $x$. $P(X = x)$ is the probability that the value of $X$ will be observed as $x$, and, of course, the sum of all these probabilities, whether a finite series or an infinite series, equals one.

What changes when the sample space is infinite, and not countably infinite? A random variable with an infinite and uncountable sample space is called a continuous random variable. We cannot assign probabilities to single values of $X$ and expect the probabilities to sum to one, because there are uncountably many possible values of $X$. In fact, because of this last observation, $P(X = x) = 0$. Now we think of events as occurrences of intervals, and instead of $P(X = x)$ defining the probability distribution of $X$, $P(a \le X \le b)$ is the probability that the random variable $X$ will be in the interval $[a, b]$ on a run of the random experiment. The probability for every possible interval is defined by a probability density function (pdf), denoted $f(x)$, where, for all possible $a$ and $b$,

$$P(a \le X \le b) = \int_a^b f(x)\,dx.$$

The pdf $f(x)$ is a smooth curve over the sample space of $X$, and $P(a \le X \le b)$ is interpreted geometrically as the area under the curve over the interval $[a, b]$.

Note 1: $P(a \le X \le b) = P(a \le X < b) = P(a < X \le b) = P(a < X < b)$ for a continuous random variable. (These all may be different for a discrete random variable.)

Note 2: If $S_X$ denotes the interval of possible values, we may write the pdf of $X$ as

$$\begin{cases} f(x) & \text{for } x \in S_X \subset \mathbb{R} \text{, the real line} \\ 0 & \text{elsewhere} \end{cases}$$

Note 3: $\int_{S_X} f(x)\,dx = \int_{-\infty}^{\infty} f(x)\,dx = 1$

Note 4: The cdf of $X$ is $P(X \le x) = F(x) = \int_{-\infty}^{x} f(u)\,du$.
Note 5: In practice the pdf for a continuous random variable may be obtained by a) a model for intervals, b) idealizing a spike diagram or histogram (curve fitting), c) assuming a member of a family of probability density functions. At this point we will digress to the Commuter's Tale.

The Commuter's Tale

Case 1

Boss: "We need to assess the probability that you arrive late for work. If you arrive after 9:55 in the morning you are late, and you will be penalized."

Commuter: "OK, that's reasonable. I always arrive at work between 9:00 and 10:00."

Boss: "I agree, and of course there are uncountably infinitely many possible arrival times between 9:00 and 10:00. I propose the following model to represent your arrival times."

Boss: The probability you arrive in any interval of length $L$ is proportional to the length of the interval. Therefore, if $P(L)$ denotes the probability you arrive in an interval of length $L$, $P(L) = \alpha L$ for some $\alpha > 0$. If $T$ denotes the random variable "arrival time" (in minutes after 9:00), axiom 2 fixes $\alpha$, since it requires $P(S_T) = 1$, and

$$P(S_T) = P(0 \le T \le 60) = 1 \;\Rightarrow\; 60\alpha = 1 \;\Rightarrow\; \alpha = 1/60.$$

Axiom 1 is, of course, satisfied, and so is axiom 3: if $A$ and $B$ are two nonoverlapping (mutually exclusive) intervals of lengths $L_A$ and $L_B$, respectively, then

$$P(A \cup B) = (L_A + L_B)/60 = P(A) + P(B).$$

The graph of the probability density function (pdf) of $T$ is

[Figure: graph of $f(t)$, a horizontal line at height 1/60 = 0.0167]

That is, the pdf of $T$ is

$$f(t) = \begin{cases} 1/60 & 0 \le t \le 60 \\ 0 & \text{elsewhere.} \end{cases}$$

The probability for any interval is the area under $f(t)$ over the interval. The cdf of $T$ is

$$F(t) = P(T \le t) = \begin{cases} 0 & t < 0 \\ \int_0^t (1/60)\,du = t/60 & 0 \le t \le 60 \\ 1 & t \ge 60. \end{cases}$$

The commuter is amazed as the boss finishes her discussion.

Boss: The probability that you are late is the area of the rectangle with the four vertices, written as $(t, f(t))$, at $(55, 0)$, $(60, 0)$, $(55, 1/60)$, $(60, 1/60)$. Therefore the probability you arrive late is 5/60 = 1/12 = 0.0833.

The boss also computes the probability as

$$\int_{55}^{60} (1/60)\,du = 0.0833 = 1 - F(55).$$

Case 2

Commuter: "Your assessment of the probability that I arrive late is unfairly high. Your model is wrong, with all due respect."

Boss: "OK. Convince me that my model is wrong. Certainly the pdf that I have proposed for your time of arrival is a pdf. It satisfies the axioms of probability theory. What's your evidence that it is the wrong model for your arrival time?"

Commuter: "I've kept records of my arrival time. I have a data set of my arrival times and I've constructed a histogram of my arrival times. My histogram of arrival times looks like this."

[Figure: histogram of arrival times, horizontal axis marked 0, 30, 60]

Boss: "Hmm, that histogram certainly does not suggest that your arrival time probability model is a rectangular probability density function, like the one that I modeled for your arrival times. Indeed, I would say it looks like we could idealize your arrivals by a triangle instead of a rectangle."

The boss and the commuter work together, and fit the following triangle to the commuter's histogram.

[Figure: triangle fitted to the histogram over $0 \le t \le 60$, with peak height $h$ at $t = 30$]

The triangular pdf of $T$ is

$$f(t) = \begin{cases} \dfrac{t}{900} & 0 \le t \le 30 \\[4pt] \dfrac{1}{15} - \dfrac{t}{900} & 30 \le t \le 60 \\[4pt] 0 & \text{elsewhere.} \end{cases}$$

The cdf of $T$ is

$$P(T \le t) = F(t) = \begin{cases} 0 & t < 0 \\[2pt] \dfrac{t^2}{1800} & 0 \le t \le 30 \\[4pt] \dfrac{t(120 - t)}{1800} - 1 & 30 \le t \le 60 \\[4pt] 1 & t > 60. \end{cases}$$

Boss: Using the triangular pdf as the model for your arrival times, and using triangle geometry, the probability that you are late is 1/72 = 0.0139. Of course, I should compute your late probability as the area under $f(t)$ over the interval (55, 60). That is, the probability you are late is

$$P(T > 55) = \int_{55}^{60} \left(\frac{1}{15} - \frac{t}{900}\right) dt = 1/72 = 0.0139.$$

Commuter (getting into the spirit of things): There is a third way to do the computation. It follows that

$$P(T > 55) = 1 - P(T \le 55) = 1 - F(55) = 1 - \left(\frac{55(120 - 55)}{1800} - 1\right) = 0.0139.$$

Boss: That's right.

Commuter: "That probability is much better, since 0.0139 is much less than 0.0833."

Boss: That's good. By the way, I've attached a graph of your triangular cdf.

Commuter: Thanks, very much.
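The computations in Cases 1 and 2 can be checked numerically. The following Python code is a minimal sketch and not part of the original notes: the function names and the midpoint-rule integrator `integrate` are assumptions chosen for illustration, and `T` is taken to be measured in minutes after 9:00. It checks that each pdf integrates to one over [0, 60] and computes the late probability both as an area under the pdf and as 1 - F(55).

```python
def uniform_pdf(t):
    """Case 1 pdf: constant 1/60 on [0, 60], 0 elsewhere."""
    return 1 / 60 if 0 <= t <= 60 else 0.0


def uniform_cdf(t):
    """Case 1 cdf: F(t) = t/60 on [0, 60]."""
    if t < 0:
        return 0.0
    if t <= 60:
        return t / 60
    return 1.0


def triangular_pdf(t):
    """Case 2 pdf: rises as t/900 up to t = 30, then falls as 1/15 - t/900."""
    if 0 <= t <= 30:
        return t / 900
    if 30 < t <= 60:
        return 1 / 15 - t / 900
    return 0.0


def triangular_cdf(t):
    """Case 2 cdf, piecewise quadratic on [0, 60]."""
    if t < 0:
        return 0.0
    if t <= 30:
        return t * t / 1800
    if t <= 60:
        return t * (120 - t) / 1800 - 1
    return 1.0


def integrate(f, a, b, n=60_000):
    """Midpoint-rule approximation of the integral of f over [a, b].

    Illustrative helper (an assumption, not from the notes).
    """
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


LATE = 55  # 9:55, in minutes after 9:00

if __name__ == "__main__":
    for name, pdf, cdf in [
        ("uniform", uniform_pdf, uniform_cdf),
        ("triangular", triangular_pdf, triangular_cdf),
    ]:
        total = integrate(pdf, 0, 60)        # should be close to 1
        by_area = integrate(pdf, LATE, 60)   # P(T > 55) as an area
        by_cdf = 1 - cdf(LATE)               # P(T > 55) as 1 - F(55)
        print(f"{name}: total={total:.4f} area={by_area:.4f} 1-F(55)={by_cdf:.4f}")
```

Both routes should agree with the values in the dialogue, 1/12 = 0.0833 for the rectangular model and 1/72 = 0.0139 for the triangular one. The midpoint rule is used only to keep the sketch dependency-free; any numerical integrator would serve.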
Case 3

Commuter: (speaking sheepishly) "I admit that I arrived after 10:00 on very rare occasions, (recovering) but also on very rare occasions I arrived before 9:00."

Boss: "OK, that's no problem. We can assume your arrival time pdf is

$$f(t) = \frac{1}{10\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{t - 30}{10}\right)^2}, \qquad -\infty < t < \infty,$$

and this pdf will allow you to arrive any time. Then the event "you are late" means that you arrive after 9:55, and using this pdf the probability that you are late is $P(T > 55) = \int_{55}^{\infty} f(t)\,dt = 0.0062$."

Commuter: Wow, that's really good. What does the graph of this pdf look like?

Boss: It is the so-called bell shaped curve. Here is the graph.

[Figure: bell-shaped pdf of $T$; vertical axis from 0.000 to 0.040, horizontal axis from -10 to 70]

...
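The Case 3 probability can be reproduced with the standard library alone. This is a minimal sketch under a stated assumption: the pdf is taken to be normal with mean 30 and standard deviation 10 (minutes after 9:00), the values consistent with the quoted result 0.0062 and with a peak near 0.040 in the graph. The names `MEAN`, `SD`, `normal_pdf`, and `upper_tail` are illustrative, not from the notes.

```python
import math

MEAN = 30.0  # assumed mean arrival time, minutes after 9:00
SD = 10.0    # assumed standard deviation, minutes


def normal_pdf(t, mean=MEAN, sd=SD):
    """Normal density with the given mean and standard deviation."""
    z = (t - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def upper_tail(t, mean=MEAN, sd=SD):
    """P(T > t) for a normal T, via the complementary error function."""
    return 0.5 * math.erfc((t - mean) / (sd * math.sqrt(2)))


if __name__ == "__main__":
    print(f"peak height f(30) = {normal_pdf(MEAN):.4f}")
    print(f"P(T > 55)         = {upper_tail(55):.4f}")
    # Probability of arriving before 9:00 or after 10:00 under this model.
    outside = (1 - upper_tail(0)) + upper_tail(60)
    print(f"P(T < 0 or T > 60) = {outside:.4f}")
```

Under these assumptions the late probability rounds to 0.0062, and the chance of arriving outside the 9:00 to 10:00 window is about 0.0027, which fits the commuter's "very rare occasions".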