Supervised_and_Bayes

Course: CSCI 5525, Spring 2012
School: Minnesota
CSCI 5525: Machine Learning (Spring 2012)
Supervised Learning
Rui Kuang
Department of Computer Science and Engineering, University of Minnesota

Lecture Notes for E. Alpaydın 2010, Introduction to Machine Learning 2e, The MIT Press (V1.0)

Noise and Model Complexity
Given similar training error, use the simpler model:
- Simpler to use (lower computational complexity)
- Easier to train (lower space complexity)
- Easier to explain (more interpretable)
- Generalizes better (lower variance; Occam's razor)

Model Selection & Generalization
- Learning is an ill-posed problem; data is not sufficient to find a unique solution.
- Given $d$ binary inputs, there are at most $2^d$ samples and $2^{2^d}$ binary functions.
- Each sample eliminates half of the functions; thus $N$ samples leave $2^{2^d - N}$ viable functions.
- It is not possible to check all functions.
- Hence the need for inductive bias: assumptions about $H$.

Generalization and Overfitting
- Generalization: how well a model performs on new data
- Overfitting: $H$ more complex than $C$ or $f$
- Underfitting: $H$ less complex than $C$ or $f$

Cross-Validation
- To better estimate generalization error, we need data unseen during training.
- We split the data as:
  - Training set (50%)
  - Validation set (25%)
  - Test set (25%)
- Use resampling when there is little data.

Triple Trade-Off
There is a trade-off between three factors (Dietterich, 2003):
1. Complexity of $H$, $c(H)$
2. Training set size, $N$
3. Generalization error, $E$, on new data
As $N$ increases, $E$ decreases. As $c(H)$ increases, $E$ first decreases and then increases.

[Figure 2: Generalization accuracy as a function of the complexity of the classifier, for various amounts of training data (curves for 100, 200, and 400 examples).]

Summary of Supervised Learning
1. Model selection: $g(x \mid \theta) \in H$, for example $g(x) = w_1 x + w_0$
2. Loss function:
$$E(\theta \mid X) = \sum_t L\big(r^t, g(x^t \mid \theta)\big)$$
   for example
$$E(h \mid X) = \sum_{t=1}^{N} 1\big(h(x^t) \neq r^t\big), \qquad E(g \mid X) = \frac{1}{N} \sum_{t=1}^{N} \big[r^t - g(x^t)\big]^2$$
3. Optimization procedure:
$$\theta^* = \arg\min_{\theta} E(\theta \mid X)$$
Algorithms: KNN, perceptron, linear regression

CSCI 5525: Machine Learning (Spring 2012)
Bayes Decision Theory and Parametric Models
Rui Kuang
Department of Computer Science and Engineering, University of Minnesota

Probabilistic Perspective
- We have seen classification models in which the classification decision is deterministic:
$$h(x) = \begin{cases} 1 & \text{if } h \text{ says } x \text{ is positive} \\ 0 & \text{if } h \text{ says } x \text{ is negative} \end{cases}$$
- What if there are cases we are not so sure about, such as data whose outputs come from a stochastic process?
- Estimate $P(C = 0 \mid x)$ and $P(C = 1 \mid x)$.

Probability and Inference
- The result of tossing a coin is in {Heads, Tails}.
- Random variable $X \in \{1, 0\}$
- Bernoulli: $P\{X = 1\} = p_o$, that is, $P(X) = p_o^{X} (1 - p_o)^{1 - X}$
- Sample: $X = \{x^t\}_{t=1}^{N}$
- Estimation: $p_o = \#\{\text{Heads}\} / \#\{\text{Tosses}\} = \sum_t x^t / N$
- Prediction of the next toss (no input): Heads if $p_o > 1/2$, Tails otherwise

Classification
- Credit scoring: inputs are income and savings; output is low-risk vs. high-risk.
- Input: $x = [x_1, x_2]^T$; output: $C \in \{0, 1\}$
- Prediction:
$$\text{choose } \begin{cases} C = 1 & \text{if } P(C = 1 \mid x_1, x_2) > 0.5 \\ C = 0 & \text{otherwise} \end{cases}$$
  or equivalently
$$\text{choose } \begin{cases} C = 1 & \text{if } P(C = 1 \mid x_1, x_2) > P(C = 0 \mid x_1, x_2) \\ C = 0 & \text{otherwise} \end{cases}$$

Bayes Rule
How do we get $P(C \mid x)$?
$$P(C \mid x) = \frac{P(C)\, p(x \mid C)}{p(x)} \qquad \left(\text{posterior} = \frac{\text{prior} \times \text{likelihood}}{\text{evidence}}\right)$$
$$P(C = 0) + P(C = 1) = 1$$
$$p(x) = p(x \mid C = 1) P(C = 1) + p(x \mid C = 0) P(C = 0)$$
$$P(C = 0 \mid x) + P(C = 1 \mid x) = 1$$

Bayes Rule Example
$$P(C \mid x) = \frac{P(C)\, p(x \mid C)}{p(x)}$$
$P(C = \text{'acc'}) = 0.6$, $P(C = \text{'unacc'}) = 0.4$
Safety (x): 'high', 'low', 'med', 'high', 'low', 'med', 'high', 'low', 'med', 'high', 'low', 'med', 'high', 'low', 'med', 'high'
Rating (C): 'acc', 'unacc', 'acc', 'acc', 'unacc', 'acc', 'acc', 'unacc', 'unacc', 'acc', 'unacc', 'acc', 'acc', 'unacc', 'acc', 'acc'

Bayes Rule: K>2 Classes
$$P(C_i \mid x) = \frac{p(x \mid C_i) P(C_i)}{p(x)} = \frac{p(x \mid C_i) P(C_i)}{\sum_{k=1}^{K} p(x \mid C_k) P(C_k)}$$
$$P(C_i) \geq 0 \quad \text{and} \quad \sum_{i=1}^{K} P(C_i) = 1$$
Choose $C_i$ if $P(C_i \mid x) = \max_k P(C_k \mid x)$.

Losses and Risks
- Actions: $\alpha_i$
- Loss of $\alpha_i$ when the state is $C_k$: $\lambda_{ik}$
- Expected risk (Duda and Hart, 1973):
$$R(\alpha_i \mid x) = \sum_{k=1}^{K} \lambda_{ik} P(C_k \mid x)$$
- Choose $\alpha_i$ if $R(\alpha_i \mid x) = \min_k R(\alpha_k \mid x)$.

Losses and Risks: 0/1 Loss
$$\lambda_{ik} = \begin{cases} 0 & \text{if } i = k \\ 1 & \text{if } i \neq k \end{cases}$$
$$R(\alpha_i \mid x) = \sum_{k=1}^{K} \lambda_{ik} P(C_k \mid x) = \sum_{k \neq i} P(C_k \mid x) = 1 - P(C_i \mid x)$$
The risk measures how likely the prediction $C_i$ is wrong: a soft cost based on the confidence of the prediction.
For minimum risk, choose the most probable class.

Losses and Risks: Reject
$$\lambda_{ik} = \begin{cases} 0 & \text{if } i = k \\ \lambda & \text{if } i = K + 1, \; 0 < \lambda < 1 \\ 1 & \text{otherwise} \end{cases}$$
$$R(\alpha_{K+1} \mid x) = \lambda$$
$$R(\alpha_i \mid x) = \sum_{k \neq i} P(C_k \mid x) = 1 - P(C_i \mid x)$$
Choose $C_i$ if $P(C_i \mid x) > P(C_k \mid x) \;\; \forall k \neq i$ and $P(C_i \mid x) > 1 - \lambda$; reject otherwise.

Discriminant Functions
$g_i(x), \; i = 1, \ldots, K$
Choose $C_i$ if $g_i(x) = \max_k g_k(x)$, where $g_i(x)$ can be any of:
$$g_i(x) = \begin{cases} -R(\alpha_i \mid x) \\ P(C_i \mid x) \\ p(x \mid C_i) P(C_i) \end{cases}$$
$K$ decision regions $R_1, \ldots, R_K$:
$$R_i = \{x \mid g_i(x) = \max_k g_k(x)\}$$

K=2 Classes
- Dichotomizer (K=2) vs. polychotomizer (K>2)
- $g(x) = g_1(x) - g_2(x)$
$$\text{choose } \begin{cases} C_1 & \text{if } g(x) > 0 \\ C_2 & \text{otherwise} \end{cases}$$
- Log odds:
$$\log \frac{P(C_1 \mid x)}{P(C_2 \mid x)}$$
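The decision rules above (Bayes rule, minimum expected risk, and the reject option) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are made up for this example, and the likelihood values p(x | C) are hypothetical numbers, not estimates taken from the safety/rating data. Only the priors 0.6 and 0.4 come from the example.

```python
from typing import List, Optional, Sequence


def posteriors(priors: Sequence[float], likelihoods: Sequence[float]) -> List[float]:
    """Bayes rule: P(C_i | x) = p(x | C_i) P(C_i) / sum_k p(x | C_k) P(C_k)."""
    joint = [prior * lik for prior, lik in zip(priors, likelihoods)]
    evidence = sum(joint)  # p(x)
    return [j / evidence for j in joint]


def expected_risks(post: Sequence[float], loss: Sequence[Sequence[float]]) -> List[float]:
    """R(alpha_i | x) = sum_k loss[i][k] * P(C_k | x), one value per action i."""
    return [sum(l_ik * p_k for l_ik, p_k in zip(row, post)) for row in loss]


def decide_with_reject(post: Sequence[float], lam: float) -> Optional[int]:
    """0/1 loss with reject cost lam (0 < lam < 1).

    Returns the index of the most probable class if its posterior
    exceeds 1 - lam, otherwise None (reject).
    """
    best = max(range(len(post)), key=lambda i: post[i])
    return best if post[best] > 1.0 - lam else None


priors = [0.6, 0.4]        # P('acc'), P('unacc') from the example
likelihoods = [0.5, 0.25]  # hypothetical p(x | 'acc'), p(x | 'unacc')

post = posteriors(priors, likelihoods)
zero_one_loss = [[0, 1], [1, 0]]
risks = expected_risks(post, zero_one_loss)
print(post, risks, decide_with_reject(post, lam=0.5), decide_with_reject(post, lam=0.2))
```

With these assumed likelihoods the posterior for 'acc' is about 0.75, so the rule accepts class 0 when the reject cost is 0.5 (threshold 0.5) and rejects when the reject cost is 0.2 (threshold 0.8). Under 0/1 loss, minimizing the expected risk and maximizing the posterior pick the same class.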
Parametric vs Nonparametric
Parametric methods:
- A model (usually a type of simple distribution) with a few parameters (sufficient statistics) is assumed.
- The learning problem is to fit the model with the best parameters to the data.
- The model is used for prediction.
Nonparametric methods:
- No model/distribution is assumed.
- Predictions are made based on training instances.
Semiparametric methods:
- A mix of model-based and instance-based learning
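To make the contrast concrete, here is a rough sketch that classifies the same one-dimensional points in both ways. Everything in it is an assumption chosen for illustration: the toy data set, the choice of a Gaussian as the class-conditional model, the choice of k-nearest neighbors as the instance-based method, and all function names.

```python
import math
from collections import Counter


def fit_gaussian(xs):
    """Parametric: summarize a class by two parameters (mean, variance)."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return mean, var


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def predict_parametric(x, params, priors):
    """Choose the class that maximizes p(x | C_i) P(C_i)."""
    scores = {c: gaussian_pdf(x, *params[c]) * priors[c] for c in params}
    return max(scores, key=scores.get)


def predict_knn(x, data, k=3):
    """Nonparametric: majority vote among the k nearest training instances."""
    nearest = sorted(data, key=lambda pair: abs(pair[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical 1-D training set of (value, class label) pairs.
data = [(1.0, 0), (1.5, 0), (2.0, 0), (4.0, 1), (4.5, 1), (5.0, 1)]
by_class = {c: [x for x, y in data if y == c] for c in (0, 1)}
params = {c: fit_gaussian(xs) for c, xs in by_class.items()}
priors = {c: len(xs) / len(data) for c, xs in by_class.items()}

print(predict_parametric(1.8, params, priors), predict_knn(1.8, data))
print(predict_parametric(4.2, params, priors), predict_knn(4.2, data))
```

After fitting, the parametric classifier needs only the stored means, variances, and priors, while the nearest-neighbor classifier has to keep the whole training set around at prediction time.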