Speech Recognition: Pattern Classification
Veton Këpuska, 2/13/12

Pattern Classification (outline)
- Introduction
- Parametric classifiers
- Semiparametric classifiers
- Dimensionality reduction
- Significance testing

Pattern Classification
- Goal: classify objects (or patterns) into categories (or classes).
- Types of problems:
  1. Supervised: classes are known beforehand, and data samples of each class are available.
  2. Unsupervised: classes (and/or the number of classes) are not known beforehand and must be inferred from the data.
- Pipeline: observation s -> feature extraction -> feature vectors x -> classifier -> class \omega_i.

Probability Basics
- Discrete probability mass function (PMF): P(\omega_i), with
    \sum_i P(\omega_i) = 1
- Continuous probability density function (PDF): p(x), with
    \int_{-\infty}^{\infty} p(x)\,dx = 1
- Expected value:
    E(x) = \int_{-\infty}^{\infty} x\,p(x)\,dx

Kullback-Leibler Distance
- Can be used to compute a distance between two probability mass distributions, P(z_i) and Q(z_i):
    D(P \| Q) = \sum_i P(z_i) \log \frac{P(z_i)}{Q(z_i)} \ge 0
- Known as relative entropy in information theory.
- Non-negativity makes use of the inequality \log x \le x - 1:
    \sum_i P(z_i) \log \frac{Q(z_i)}{P(z_i)} \le \sum_i P(z_i)\left(\frac{Q(z_i)}{P(z_i)} - 1\right) = \sum_i Q(z_i) - \sum_i P(z_i) = 0
- The divergence of P(z_i) and Q(z_i) is the symmetric sum
    D(P \| Q) + D(Q \| P)

Bayes Theorem
- Define:
    \{\omega_i\}: a set of M mutually exclusive classes
    P(\omega_i): a priori probability of class \omega_i
    p(x|\omega_i): PDF of feature vector x in class \omega_i
    P(\omega_i|x): a posteriori probability of \omega_i given x
- Bayes rule:
    P(\omega_i|x) = \frac{p(x|\omega_i)\,P(\omega_i)}{p(x)},  where  p(x) = \sum_{i=1}^{M} p(x|\omega_i)\,P(\omega_i)
- From Bayes rule:
    p(x|\omega_i)\,P(\omega_i) = P(\omega_i|x)\,p(x)

Bayesian Decision Theory
Reference: Pattern Classification, R. Duda, P. Hart & D. Stork, Wiley & Sons, 2001.
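The Kullback-Leibler distance and its symmetric divergence above can be sketched in a few lines of Python. This is an illustrative implementation (the helper names `kl_distance` and `divergence` are not from the lecture), assuming the two PMFs are given as equal-length lists of probabilities and that Q is nonzero wherever P is.

```python
import math

def kl_distance(p, q):
    """D(P||Q) = sum_i P(z_i) * log(P(z_i)/Q(z_i)); >= 0, with equality iff P == Q.

    Terms with P(z_i) == 0 contribute nothing, by the convention 0*log(0) = 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def divergence(p, q):
    """The symmetric sum D(P||Q) + D(Q||P) from the slides."""
    return kl_distance(p, q) + kl_distance(q, p)

P = [0.5, 0.3, 0.2]
Q = [0.4, 0.4, 0.2]
print(kl_distance(P, P))        # 0.0 for identical distributions
print(kl_distance(P, Q) >= 0)   # True: relative entropy is non-negative
```

Note that D(P||Q) is not symmetric in its arguments, which is exactly why the slides define the divergence as the symmetric sum.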
Bayes Decision Theory
- The probability of making an error given x is
    P(error|x) = 1 - P(\omega_i|x)   if we decide class \omega_i
- To minimize P(error|x) (and hence P(error)):
    choose \omega_i if P(\omega_i|x) > P(\omega_j|x) for all j \ne i

Bayes Decision Theory
- For a two-class problem this decision rule means: choose \omega_1 if
    p(x|\omega_1)\,P(\omega_1) \ge p(x|\omega_2)\,P(\omega_2),
  else choose \omega_2.
- This rule can be expressed as a likelihood ratio: choose \omega_1 if
    \frac{p(x|\omega_1)}{p(x|\omega_2)} \ge \frac{P(\omega_2)}{P(\omega_1)}

Bayes Risk
- Define a cost function \lambda_{ij} and the conditional risk R(\omega_i|x):
    \lambda_{ij} is the cost of classifying x as \omega_i when it is really ...
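The two-class decision rule above can be sketched as a small Python example. The Gaussian class-conditional densities and the function names here are assumptions for illustration only; the rule itself (compare the likelihood ratio against the ratio of priors) is the one from the slides.

```python
import math

def gaussian_pdf(x, mean, std):
    """PDF of a univariate normal distribution N(mean, std^2)."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def decide(x, prior1, prior2, pdf1, pdf2):
    """Choose class 1 if p(x|w1)/p(x|w2) >= P(w2)/P(w1), else class 2."""
    likelihood_ratio = pdf1(x) / pdf2(x)
    return 1 if likelihood_ratio >= prior2 / prior1 else 2

# Hypothetical example: class 1 ~ N(0, 1), class 2 ~ N(3, 1), equal priors,
# so the decision boundary sits halfway between the means, at x = 1.5.
pdf1 = lambda x: gaussian_pdf(x, 0.0, 1.0)
pdf2 = lambda x: gaussian_pdf(x, 3.0, 1.0)
print(decide(0.5, 0.5, 0.5, pdf1, pdf2))  # 1: x is closer to the class-1 mean
print(decide(2.8, 0.5, 0.5, pdf1, pdf2))  # 2: x is closer to the class-2 mean
```

With unequal priors the boundary shifts toward the less probable class, since the likelihood ratio must then exceed a threshold different from 1.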