Speech Recognition: Pattern Classification
Veton Këpuska, February 13, 2012

Pattern Classification
- Introduction
- Parametric classifiers
- Semi-parametric classifiers
- Dimensionality reduction
- Significance testing

Pattern Classification
Goal: to classify objects (or patterns) into categories (or classes).
Types of problems:
1. Supervised: the classes are known beforehand, and data samples of each class are available.
2. Unsupervised: the classes (and/or the number of classes) are not known beforehand, and must be inferred from the data.
Processing chain: observation s -> Feature Extraction -> feature vectors x -> Classifier -> class \omega_i.

Probability Basics
Discrete probability mass function (PMF): P(\omega_i), with \sum_i P(\omega_i) = 1.
Continuous probability density function (PDF): p(x), with \int p(x)\,dx = 1.
Expected value: E(x) = \int x\,p(x)\,dx.

Kullback-Leibler Distance
Can be used to compute a distance between two probability mass distributions, P(z_i) and Q(z_i):
D(P \| Q) = \sum_i P(z_i) \log \frac{P(z_i)}{Q(z_i)} \ge 0
Nonnegativity makes use of the inequality \log x \le x - 1:
-D(P \| Q) = \sum_i P(z_i) \log \frac{Q(z_i)}{P(z_i)} \le \sum_i P(z_i) \left( \frac{Q(z_i)}{P(z_i)} - 1 \right) = \sum_i Q(z_i) - \sum_i P(z_i) = 0
Known as relative entropy in information theory. The divergence of P(z_i) and Q(z_i) is the symmetric sum D(P \| Q) + D(Q \| P).

Bayes Theorem
Define:
- \{\omega_i\}: a set of M mutually exclusive classes
- P(\omega_i): the a priori probability of class \omega_i
- p(x|\omega_i): the PDF for feature vector x in class \omega_i
- P(\omega_i|x): the a posteriori probability of \omega_i given x
From Bayes' rule:
P(\omega_i|x) = \frac{p(x|\omega_i) P(\omega_i)}{p(x)}, where p(x) = \sum_{i=1}^{M} p(x|\omega_i) P(\omega_i).

Bayesian Decision Theory
Reference: Pattern Classification, R. Duda, P. Hart & D.
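The Kullback-Leibler distance and symmetric divergence above can be sketched directly from the definitions. This is a minimal illustrative implementation (not from the slides); the function names and the example PMFs P and Q are arbitrary choices.

```python
import math

def kl_distance(p, q):
    """Relative entropy D(P||Q) = sum_i P(z_i) * log(P(z_i)/Q(z_i)).

    Terms with P(z_i) = 0 contribute 0 by the usual convention 0*log(0) = 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def symmetric_divergence(p, q):
    """The divergence of the slides: the symmetric sum D(P||Q) + D(Q||P)."""
    return kl_distance(p, q) + kl_distance(q, p)

P = [0.5, 0.3, 0.2]
Q = [0.4, 0.4, 0.2]
print(kl_distance(P, Q))          # ~0.0253 nats; >= 0, and 0 iff P == Q
print(symmetric_divergence(P, Q))
```

Note that D(P‖Q) is not symmetric in its arguments, which is why the slides define the divergence as the symmetric sum.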
Stork, Wiley & Sons, 2001.

Bayes Decision Theory
The probability of making an error given x is:
P(error|x) = 1 - P(\omega_i|x) if we decide class \omega_i.
To minimize P(error|x) (and hence P(error)): choose \omega_i if P(\omega_i|x) > P(\omega_j|x) for all j \ne i.

Bayes Decision Theory
For a two-class problem this decision rule means: choose \omega_1 if
p(x|\omega_1) P(\omega_1) > p(x|\omega_2) P(\omega_2),
else choose \omega_2. This rule can be expressed as a likelihood ratio: choose \omega_1 if
\frac{p(x|\omega_1)}{p(x|\omega_2)} > \frac{P(\omega_2)}{P(\omega_1)}.

Bayes Risk
Define a cost function \lambda_{ij} and the conditional risk R(\omega_i|x): ...
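The two-class decision rule above can be sketched in a few lines. This is an illustrative assumption-laden example: the slides do not specify a class-conditional density, so 1-D Gaussians are used here purely for concreteness, and the helper names, priors, and parameters are all hypothetical.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Class-conditional likelihood p(x|w): a 1-D Gaussian, chosen only for illustration."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def decide(x, priors, params):
    """Choose the class maximizing p(x|w_i) * P(w_i).

    For two classes this is exactly the likelihood-ratio test:
    choose w_1 iff p(x|w_1)/p(x|w_2) > P(w_2)/P(w_1).
    """
    scores = [gaussian_pdf(x, mu, s) * P for P, (mu, s) in zip(priors, params)]
    return max(range(len(scores)), key=lambda i: scores[i])

priors = [0.6, 0.4]                  # P(w_1), P(w_2)
params = [(0.0, 1.0), (2.0, 1.0)]    # (mean, std) for each class
print(decide(-0.5, priors, params))  # 0 -> class w_1
print(decide(2.5, priors, params))   # 1 -> class w_2
```

Because maximizing p(x|ω_i)P(ω_i) over i is equivalent (up to the common factor p(x)) to maximizing the posterior P(ω_i|x), this classifier minimizes P(error|x) at every x.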
This note was uploaded on 02/11/2012 for the course ECE 5526 taught by Professor Staff during the Summer '09 term at FIT.