Summarizing LDA and QDA

We can summarize what we have learned so far into the following theorem.

Theorem: Suppose that P(Y = k) = \pi_k and that X \mid Y = k is Gaussian, X \mid Y = k \sim \mathcal{N}(\mu_k, \Sigma_k). Then the Bayes classifier rule is

h(x) = \arg\max_k \delta_k(x), where

\delta_k(x) = -\tfrac{1}{2}\log|\Sigma_k| - \tfrac{1}{2}(x - \mu_k)^\top \Sigma_k^{-1}(x - \mu_k) + \log \pi_k    (quadratic).

Note: The decision boundary between classes k and l is quadratic in x.

If the covariances of the Gaussians are the same, \Sigma_k = \Sigma for all k, this becomes

\delta_k(x) = x^\top \Sigma^{-1}\mu_k - \tfrac{1}{2}\mu_k^\top \Sigma^{-1}\mu_k + \log \pi_k    (linear).

Note: \arg\max_k \delta_k(x) returns the set of k for which \delta_k(x) attains its largest value.

In practice

We need to estimate the prior, the class means, and the covariances, so we use the sample estimates of \pi_k, \mu_k, \Sigma_k in place of the true values, i.e.

\hat{\pi}_k = \frac{n_k}{n}, \quad \hat{\mu}_k = \frac{1}{n_k}\sum_{i : y_i = k} x_i, \quad \hat{\Sigma}_k = \frac{1}{n_k}\sum_{i : y_i = k} (x_i - \hat{\mu}_k)(x_i - \hat{\mu}_k)^\top,

where n_k is the number of samples in class k and n is the total number of samples. In the case where we have a common covariance matrix, we get the ML estimate

\hat{\Sigma} = \frac{1}{n}\sum_{k} \sum_{i : y_i = k} (x_i - \hat{\mu}_k)(x_i - \hat{\mu}_k)^\top.

Plugging these estimates into \delta_k(x) gives the estimated probability of belonging to either class k or l. This is a Maximum Likelihood estimate.

Computation

Case 1: \Sigma_k = I (Example)

This means that the data is distributed symmetrically around its center \mu_k, i.e. the isocontours are all circles. We have

\delta_k(x) = -\tfrac{1}{2}\log|I| - \tfrac{1}{2}(x - \mu_k)^\top (x - \mu_k) + \log \pi_k.

The first term, -\tfrac{1}{2}\log|I|, is zero since the determinant of I is 1. The second term contains (x - \mu_k)^\top (x - \mu_k) = \|x - \mu_k\|^2, which is the squared Euclidean distance (http://www.improvedoutcomes.com/docs/WebSiteDocs/Clustering/Clustering_Parameters/Euclidean_and_Euclidean_Squared_Distance_Metrics.htm) between x and \mu_k. Therefore we can find the distance between a point and each class center and adjust it with the log of the prior, \log \pi_k. The class with the minimum adjusted distance maximises \delta_k(x), and according to the theorem we classify the point to that class. In addition, \Sigma_k = I implies that our data is spherical.

Case 2: \Sigma_k \neq I (General Case)

In general, when \Sigma is symmetric we can decompose it as \Sigma = U S U^\top, where the columns of U are the eigenvectors of \Sigma and S is the diagonal matrix of its eigenvalues. Since U is orthonormal, the inverse is \Sigma^{-1} = U S^{-1} U^\top. So, from the formula for \delta_k(x), the second term is

(x - \mu_k)^\top \Sigma^{-1}(x - \mu_k) = (x - \mu_k)^\top U S^{-1} U^\top (x - \mu_k) = \|S^{-1/2}U^\top x - S^{-1/2}U^\top \mu_k\|^2,

where we now have the Euclidean distance between S^{-1/2}U^\top x and S^{-1/2}U^\top \mu_k. A transformation of all the data points can therefore be done from x to x^*, where x^* = S^{-1/2}U^\top x. It is now possible to do classification with x^*, treating it as in Case 1 above. Note that when we have multiple classes, they must all have the same transformation, else the transformed distances would not be comparable across classes.
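
To make the estimation and classification steps above concrete, here is a minimal NumPy sketch (not part of the original notes): it computes the sample estimates \hat{\pi}_k, \hat{\mu}_k, \hat{\Sigma}_k, evaluates the quadratic discriminant \delta_k(x), classifies by the argmax rule, and applies the Case 2 transformation x^* = S^{-1/2}U^\top x. The function names (fit_gaussian_params, qda_delta, classify, whiten) and the toy data at the end are illustrative choices, not anything prescribed by the course.

    import numpy as np

    def fit_gaussian_params(X, y):
        """ML sample estimates (pi_hat, mu_hat, Sigma_hat) for each class."""
        n = len(y)
        params = {}
        for k in np.unique(y):
            Xk = X[y == k]
            n_k = len(Xk)
            mu_k = Xk.mean(axis=0)
            centered = Xk - mu_k
            Sigma_k = centered.T @ centered / n_k   # divide by n_k: ML estimate, not the unbiased one
            params[k] = (n_k / n, mu_k, Sigma_k)
        return params

    def qda_delta(x, pi_k, mu_k, Sigma_k):
        """delta_k(x) = -1/2 log|Sigma_k| - 1/2 (x - mu_k)^T Sigma_k^{-1} (x - mu_k) + log pi_k."""
        diff = x - mu_k
        _, logdet = np.linalg.slogdet(Sigma_k)
        return -0.5 * logdet - 0.5 * diff @ np.linalg.solve(Sigma_k, diff) + np.log(pi_k)

    def classify(x, params):
        """Bayes classifier rule: h(x) = argmax_k delta_k(x)."""
        return max(params, key=lambda k: qda_delta(x, *params[k]))

    def whiten(X, Sigma):
        """Case 2 transformation: rows of the result are x* = S^{-1/2} U^T x for Sigma = U S U^T."""
        S, U = np.linalg.eigh(Sigma)        # eigenvalues S, orthonormal eigenvectors (columns of U)
        return (X @ U) / np.sqrt(S)

    # Toy usage on synthetic data (two Gaussian classes, purely illustrative):
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal([0.0, 0.0], 1.0, size=(50, 2)),
                   rng.normal([3.0, 3.0], 1.5, size=(50, 2))])
    y = np.array([0] * 50 + [1] * 50)
    params = fit_gaussian_params(X, y)
    print(classify(np.array([2.5, 2.5]), params))   # typically prints 1

After whitening with a common covariance estimate, classification reduces to picking the class whose transformed centre S^{-1/2}U^\top \hat{\mu}_k is closest in Euclidean distance, adjusted by \log \hat{\pi}_k, exactly as in Case 1.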