midterm2005-solution

Q1 Probability and MLE [20 pts]

1. (a) Suppose we wish to calculate P(H|E1, E2) and we have no conditional independence information. Which of the following sets of numbers are sufficient for the calculation?

i. P(E1, E2), P(H), P(E1|H), P(E2|H)
ii. P(E1, E2), P(H), P(E1, E2|H)
iii. P(H), P(E1|H), P(E2|H)

Answer: set ii only. By Bayes' rule, P(H|E1, E2) = P(E1, E2|H) P(H) / P(E1, E2). Without independence information, P(E1|H) and P(E2|H) do not determine P(E1, E2|H), so sets i and iii are insufficient.

(b) Suppose we know that P(E1|H, E2) = P(E1|H) for all values of H, E1, E2. Now which of the above three sets are sufficient?

Answer: all three. The assumption gives

P(E1, E2|H) = P(E1|H, E2) P(E2|H) = P(E1|H) P(E2|H),

so sets i and iii now determine the numerator of Bayes' rule, and the normalizer can be recovered as P(E1, E2) = sum over h of P(E1|h) P(E2|h) P(h).

2. Which of the following statements are true? If none of them are true, write NONE.

(a) If X and Y are independent then E[2XY] = 2E[X]E[Y] and Var[X + 2Y] = Var[X] + Var[Y].

(b) If X and Y are independent and X > 1 then Var[X + 2Y²] = Var[X] + 4Var[Y²] and E[X² − X] ≥ Var[X].

(c) If X and Y are not independent then Var[X + Y] = Var[X] + Var[Y].

(d) If X and Y are independent then E[XY²] = E[X]E[Y]² and Var[X + Y] = Var[X] + Var[Y].

(e) If X and Y are not independent and f(X) = X² then E[f(X)Y] = E[f(X)]E[Y] and Var[X + 2Y] = Var[X] + 4Var[Y].

Relevant properties (from the handwritten notes): E[aX] = aE[X] and Var[aX] = a²Var[X] for a constant a; Var[X] = E[X²] − (E[X])²; if X and Y are independent then E[XY] = E[X]E[Y], Var[X + Y] = Var[X] + Var[Y], and E[f(X)g(Y)] = E[f(X)]E[g(Y)]; if X and Y are not independent these identities need not hold. These facts rule out (a) (the variance identity should be Var[X] + 4Var[Y]), (c), (d) (E[XY²] = E[X]E[Y²], which is not E[X]E[Y]² in general), and (e) (both identities require independence).

Answer: (b). Since X and Y are independent, so are X and Y², giving Var[X + 2Y²] = Var[X] + 4Var[Y²]. For the second part:

Var[X] = E[(X − E[X])²] = E[X² − 2XE[X] + E[X]²] = E[X²] − 2E[X]E[X] + E[X]² = E[X²] − E[X]².

Since X > 1, we have E[X] ≥ 1, so E[X]² ≥ E[X], and therefore E[X² − X] = E[X²] − E[X] ≥ E[X²] − E[X]² = Var[X].

3. You are playing a game with two coins. Coin 1 has a θ probability of heads. Coin 2 has a 2θ probability of heads. You flip these coins several times and record your results. [The table of recorded flips is not recoverable from the scan.]

(a) What is the log-likelihood of the data given θ?

Answer: P(Coin 1 = Head) = θ, P(Coin 1 = Tail) = 1 − θ, P(Coin 2 = Head) = 2θ, P(Coin 2 = Tail) = 1 − 2θ. If coin 1 shows h1 heads and t1 tails and coin 2 shows h2 heads and t2 tails, then

L(θ) = h1 log θ + t1 log(1 − θ) + h2 log 2θ + t2 log(1 − 2θ).

(The handwritten solution substitutes the observed counts, ending in a term of the form "... log(1 − 2θ)"; the exact counts are illegible.)

(b) What is the maximum likelihood estimate for θ?

Answer: set the derivative to zero,

dL/dθ = h1/θ − t1/(1 − θ) + h2/θ − 2 t2/(1 − 2θ) = 0,

and solve for θ on the interval (0, 1/2), where all four probabilities are valid. Clearing denominators gives a quadratic in θ; the handwritten value of θ̂ is illegible in the scan.
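Because the exam's flip counts are lost, here is a minimal numeric sketch of parts (a)-(b) under made-up counts (h1, t1, h2, t2 below are placeholders, not the exam's data): it evaluates the log-likelihood above and finds θ̂ by grid search over (0, 1/2), then checks the stationary-point condition.

```python
import numpy as np

# Hypothetical counts -- the exam's actual table is illegible in the scan.
h1, t1 = 5, 3   # coin 1: heads with probability theta
h2, t2 = 4, 4   # coin 2: heads with probability 2*theta

def log_likelihood(theta):
    # L(theta) = h1 log theta + t1 log(1-theta) + h2 log(2 theta) + t2 log(1-2 theta)
    return (h1 * np.log(theta) + t1 * np.log(1 - theta)
            + h2 * np.log(2 * theta) + t2 * np.log(1 - 2 * theta))

# theta must lie in (0, 1/2) so that 1 - 2*theta stays positive.
thetas = np.linspace(1e-6, 0.5 - 1e-6, 200_000)
mle = thetas[np.argmax(log_likelihood(thetas))]
print(f"MLE of theta ~= {mle:.4f}")

# Sanity check: dL/dtheta = (h1+h2)/theta - t1/(1-theta) - 2*t2/(1-2*theta)
# should be approximately 0 at the maximizer.
grad = (h1 + h2) / mle - t1 / (1 - mle) - 2 * t2 / (1 - 2 * mle)
print(f"dL/dtheta at the MLE ~= {grad:.3f}")
```

With these placeholder counts the quadratic 32θ² − 38θ + 9 = 0 has one root in (0, 1/2), about 0.327, which the grid search reproduces.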
Q2 Decision Trees [20 pts]

1. The figure below shows a dataset with two inputs X1 and X2 and one output Y, which can take on the values positive (+) or negative (−). There are 16 datapoints: 12 are positive and 4 are negative. [The scatterplot of the 16 points is not recoverable from the scan.]

Assume we are testing two extreme decision tree learning algorithms. Algorithm OVERFIT builds a decision tree in the standard fashion, but never prunes. Algorithm UNDERFIT refuses to risk splitting at all, and so the entire decision tree is just one leaf node.

(a) Exactly how many leaf nodes will be in the decision tree learned by OVERFIT on this data?

Answer: read off the figure (the handwritten count itself is illegible in the scan): OVERFIT splits until every leaf is pure, so the number of leaves equals the number of pure regions the figure forces.

(b) What is the leave-one-out classification error of using OVERFIT on our dataset? Report the total number of misclassifications.

Answer (per the handwritten reasoning): each held-out point sits in a singleton region, so the tree grown on the other 15 points labels that region with the opposite class and the held-out point is misclassified. The total written in the scan is ambiguous; it depends on the lost figure.

(c) What is the leave-one-out classification error of using UNDERFIT on our dataset? Report the total number of misclassifications.

Answer: 4. UNDERFIT always predicts the majority class, which stays positive no matter which single point is held out (11+/4− or 12+/3−), so exactly the 4 negative points are misclassified.

(d) Now suppose we are learning a decision tree from a dataset with M binary-valued inputs and R training points. What is the maximum possible number of leaves in the decision tree? Circle one of the following answers:

R, log2(R), R², 2^R, M, log2(M), M², 2^M, together with min(f, g) and max(f, g) for every pairing of f in {R, log2(R), R², 2^R} with g in {M, log2(M), M², 2^M}.

Answer: min(R, 2^M). If R < 2^M, the tree must stop once it has at least one training point at each leaf, so it has at most R leaves. If R ≥ 2^M, the tree can at most split on all M attributes along every path, giving at most 2^M leaves.

Q3 Linear Regression

Consider fitting the linear regression model for these data:

X: −1, 0, 2
Y: 1, −1, 1

(a) Fit Y_i = β0 + ε_i (degenerate linear regression); find β̂0.

β̂0 = argmin over β0 of Σ_i (Y_i − β0)², which is the sample mean: β̂0 = (1 − 1 + 1)/3 = 1/3.

(b) Fit Y_i = β1 X_i + ε_i (linear regression without the constant term); find β̂1.

β̂1 = argmin over β1 of Σ_i (Y_i − β1 X_i)² = Σ_i X_i Y_i / Σ_i X_i² = (−1 + 0 + 2)/(1 + 0 + 4) = 1/5.

Q4 Conditional Independence [5 pts]

1. Consider the following joint distribution over the random variables A, B, and C. [The table lists all eight assignments of A, B, C ∈ {0, 1} with a probability P(A, B, C) for each row; the probability values are illegible in the scan.]

(a) True or False: A is conditionally independent of B given C.

Answer: True, because in the table P(A = 1 | B = b, C = c) = P(A = 1 | C = c) for every assignment of b and c.

(b) If you answered part (a) with TRUE, make a change to the top two rows of this table to create a joint distribution in which the answer to (a) is FALSE. If you answered part (a) with FALSE, make a change to the top two rows of this table to create a joint distribution in which the answer to (a) is TRUE.

Answer: one possible change is to shift probability mass between the top two rows (the specific replacement values in the scan are illegible). As the handwritten note says, any change suffices provided the altered rows still result in a table representing a joint distribution (entries nonnegative and summing to 1) and the equality P(A = 1 | B, C) = P(A = 1 | C) is broken for some assignment.
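A quick numeric check of the Q3 fits; a minimal sketch using numpy, where only the three data points come from the exam.

```python
import numpy as np

X = np.array([-1.0, 0.0, 2.0])
Y = np.array([1.0, -1.0, 1.0])

# (a) Intercept-only model Y = b0 + eps: the least-squares fit is the sample mean.
b0 = Y.mean()
print(b0)  # 0.3333... = 1/3

# (b) No-intercept model Y = b1*X + eps: b1 = sum(X*Y) / sum(X^2).
b1 = (X @ Y) / (X @ X)
print(b1)  # 0.2 = 1/5

# Same answer via the generic least-squares solver on the one-column design matrix.
b1_lstsq, *_ = np.linalg.lstsq(X.reshape(-1, 1), Y, rcond=None)
print(b1_lstsq[0])  # 0.2
```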
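Because the Q4 probability column is illegible, the following sketch demonstrates the conditional-independence test on a stand-in table: the probabilities are invented (built as P(C)·P(A|C)·P(B|C), so that A ⊥ B | C holds by construction) and are not the exam's values.

```python
from itertools import product

# Placeholder conditionals -- not the exam's numbers.
pC = {0: 0.5, 1: 0.5}
pA_given_C = {0: 0.2, 1: 0.6}   # P(A=1 | C=c)
pB_given_C = {0: 0.5, 1: 0.25}  # P(B=1 | C=c)

def pa(a, c): return pA_given_C[c] if a == 1 else 1 - pA_given_C[c]
def pb(b, c): return pB_given_C[c] if b == 1 else 1 - pB_given_C[c]

# Joint P(A,B,C) = P(C) * P(A|C) * P(B|C) over all eight assignments.
joint = {(a, b, c): pC[c] * pa(a, c) * pb(b, c)
         for a, b, c in product((0, 1), repeat=3)}

# Test: A is independent of B given C iff
# P(A=1 | B=b, C=c) == P(A=1 | C=c) for all b, c.
for b, c in product((0, 1), repeat=2):
    p_bc = sum(joint[(a, b, c)] for a in (0, 1))
    p_a1_bc = joint[(1, b, c)] / p_bc
    p_c = sum(joint[(a, bb, c)] for a in (0, 1) for bb in (0, 1))
    p_a1_c = sum(joint[(1, bb, c)] for bb in (0, 1)) / p_c
    print(f"b={b} c={c}: P(A=1|B=b,C=c)={p_a1_bc:.3f}  P(A=1|C=c)={p_a1_c:.3f}")
```

Editing any single row's mass (as part (b) asks) breaks the product structure and makes the two printed columns disagree.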
Q5 Generative vs Discriminative Classifiers [15 pts]

1. You wish to train a classifier to predict the gender (a boolean variable, G) of a person based on that person's weight (a continuous variable, W) and whether or not they are a graduate student (a boolean variable, S). Assume that W and S are conditionally independent given G. Also assume that the variance of the probability distribution P(Weight | Gender = female) equals the variance for P(Weight | Gender = male).

(a) Is it reasonable to train a naive Bayes classifier for this task?

Answer: Yes: W and S are conditionally independent given G, which is exactly the naive Bayes assumption.

(b) If not, explain why not, and describe how you might reformulate this problem to allow training a naive Bayes classifier. If so, list every probability distribution your classifier must learn, what form of distribution you would use for each, and give the total number of parameters your classifier must estimate from the training data.

Answer: We must estimate 6 parameters.

P(G): Bernoulli -> one parameter, P(G = 1). (P(G = 0) = 1 − P(G = 1) need not be estimated separately.)
P(S|G): Bernoulli -> two parameters, P(S = 1 | G = 0) and P(S = 1 | G = 1).
P(W|G): Normal -> three parameters: the mean of P(W | G = 0), the mean of P(W | G = 1), and the variance of the normal distributions governing P(W|G), which is shared because the two class variances are assumed equal.

(c) Note one difference between the above P(Gender|Weight, Student) problem and the problems we discussed in class: the above problem involves training a classifier over a combination of boolean and continuous inputs.

Now suppose you would like to train a discriminative classifier for this problem, to directly fit the parameters of P(G|W, S), under the conditional independence assumption. Assuming that W and S are conditionally independent given G, is it correct to assume that P(G = 1|W, S) can be expressed as a conventional logistic function:

P(G = 1 | W, S) = 1 / (1 + exp(w0 + w1 W + w2 S))

If not, explain why not. If so, prove this.

Answer: Yes. This can be shown by following the derivation from homework (which covers Boolean variables) that the naive Bayes assumptions imply the logistic form. By Bayes' rule, dividing numerator and denominator by P(G = 1) P(W | G = 1) P(S | G = 1):

P(G = 1 | W, S) = 1 / (1 + exp( ln [P(G = 0) / P(G = 1)] + ln [P(W | G = 0) / P(W | G = 1)] + ln [P(S | G = 0) / P(S | G = 1)] ))

The Gaussian log-ratio is linear in W because the two class variances are equal (the W² terms cancel), and the Bernoulli log-ratio is linear in S. Collecting terms yields exactly w0 + w1 W + w2 S; in particular, writing θg = P(S = 1 | G = g), the coefficient on S is w2 = ln [θ0 (1 − θ1) / (θ1 (1 − θ0))].

Q6 Neural Networks [20 pts]

1. For this question, suppose we have a neural network (shown below) with linear activation units. In other words, the output of each unit is a constant C multiplied by the weighted sum of its inputs. [The network diagram is not recoverable from the scan.]

(a) Can any function that is represented by the above network also be represented by a single-unit ANN (or perceptron)? If so, draw the equivalent perceptron, detailing the weights and the activation function. Otherwise, explain why not.

Answer: Yes. A composition of linear units is still linear: each hidden unit outputs C times a weighted sum of the inputs, and the output unit outputs C times a weighted sum of those, so the whole network computes a single weighted sum of the original inputs. The equivalent perceptron is one unit with linear activation g(z) = Cz and the collapsed weights (products of C with the layer weights, summed over hidden units).

(b) Can the space of functions that is represented by the above ANN also be represented by linear regression? (Yes/No)

Answer: Yes. Expanding the composition as in (a) gives Y = β1 X1 + β2 X2 for suitable constants β1 and β2, which is exactly the linear regression function class (without a constant term, unless the network has a bias input).

2. Consider the XOR function: Y = (X1 ∧ ¬X2) ∨ (¬X1 ∧ X2). We can also express this as:

Y ≥ 1/2 if X1 ≠ X2
Y < 1/2 otherwise

It is well known that XOR cannot be implemented by a single perceptron. Draw a fully connected three-unit ANN that has binary inputs X1, X2, 1 and output Y. Select weights that implement Y = (X1 XOR X2). For this question, assume the sigmoid activation function:

y = 1 / (1 + exp(−(w0 + w1 x1 + w2 x2)))

Answer: We are implementing Y = (X1 ∧ ¬X2) ∨ (¬X1 ∧ X2), so use hidden unit A to implement (X1 ∧ ¬X2), hidden unit B to implement (¬X1 ∧ X2), and the output unit to implement A ∨ B. With weights of large magnitude the sigmoids saturate near 0 and 1, so the output is ≥ 1/2 exactly when X1 ≠ X2. (The handwritten weight values are illegible in the scan; any sufficiently large weights following this pattern work.)
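A runnable sketch of this construction with one workable set of weights (my own choice, since the exam's handwritten values are illegible): A approximates X1 ∧ ¬X2, B approximates ¬X1 ∧ X2, and the output unit approximates A ∨ B. Printing the truth table confirms Y ≥ 1/2 exactly when X1 ≠ X2.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def unit(w0, w1, w2, x1, x2):
    # Sigmoid unit with the bias input fixed at 1.
    return sigmoid(w0 + w1 * x1 + w2 * x2)

def xor_net(x1, x2):
    a = unit(-10, 20, -20, x1, x2)   # ~ X1 AND NOT X2
    b = unit(-10, -20, 20, x1, x2)   # ~ NOT X1 AND X2
    return unit(-10, 20, 20, a, b)   # ~ A OR B

for x1 in (0, 1):
    for x2 in (0, 1):
        y = xor_net(x1, x2)
        print(f"X1={x1} X2={x2} -> Y={y:.3f} ({'>= 1/2' if y >= 0.5 else '< 1/2'})")
```

The bias of −10 against weights of ±20 places each unit's decision threshold between the saturated input patterns, which is why the outputs land so close to 0 and 1.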