Ch3-Pattern_Classification1: Speech Recognition - Pattern Classification

Speech Recognition: Pattern Classification
Veton Këpuska
February 13, 2012

Pattern Classification: Topics
- Introduction
- Parametric classifiers
- Semi-parametric classifiers
- Dimensionality reduction
- Significance testing

Pattern Classification
Goal: to classify objects (or patterns) into categories (or classes).

Types of problems:
1. Supervised: classes are known beforehand, and data samples of each class are available.
2. Unsupervised: classes (and/or the number of classes) are not known beforehand and must be inferred from the data.

Processing pipeline: observations -> Feature Extraction -> feature vectors x -> Classifier -> class \omega_i

Probability Basics
- Discrete probability mass function (PMF): P(\omega_i), with

      \sum_i P(\omega_i) = 1

- Continuous probability density function (PDF): p(x), with

      \int_{-\infty}^{\infty} p(x)\,dx = 1

- Expected value:

      E(x) = \int_{-\infty}^{\infty} x\,p(x)\,dx

Kullback-Leibler Distance
Can be used to compute a distance between two probability mass distributions, P(z_i) and Q(z_i); known as relative entropy in information theory:

      D(P \| Q) = \sum_i P(z_i) \log \frac{P(z_i)}{Q(z_i)}

Non-negativity makes use of the inequality \log x \le x - 1:

      \sum_i P(z_i) \log \frac{Q(z_i)}{P(z_i)}
          \le \sum_i P(z_i) \left( \frac{Q(z_i)}{P(z_i)} - 1 \right)
          = \sum_i \left( Q(z_i) - P(z_i) \right) = 0,

so D(P \| Q) \ge 0. Because D(P \| Q) is not symmetric, the divergence of P(z_i) and Q(z_i) is taken as the symmetric sum

      D(P \| Q) + D(Q \| P)

Bayes Theorem
Define:
- \{\omega_i\}: a set of M mutually exclusive classes
- P(\omega_i): a priori probability for class \omega_i
- p(x | \omega_i): PDF for feature vector x in class \omega_i
- P(\omega_i | x): a posteriori probability of \omega_i given x

From Bayes' rule:

      P(\omega_i | x) = \frac{p(x | \omega_i) P(\omega_i)}{p(x)},
      where  p(x) = \sum_{i=1}^{M} p(x | \omega_i) P(\omega_i)

Bayesian Decision Theory
Reference: R. Duda, P. Hart & D. Stork, Pattern Classification, Wiley & Sons, 2001.

The probability of making an error given x is

      P(\mathrm{error} | x) = 1 - P(\omega_i | x)   if we decide class \omega_i.

To minimize P(error | x) (and hence P(error)): choose \omega_i if

      P(\omega_i | x) > P(\omega_j | x) \quad \forall j \ne i

For a two-class problem this decision rule means: choose \omega_1 if

      \frac{p(x | \omega_1) P(\omega_1)}{p(x)} > \frac{p(x | \omega_2) P(\omega_2)}{p(x)},

else choose \omega_2. The rule can be expressed as a likelihood ratio test: choose \omega_1 if

      \frac{p(x | \omega_1)}{p(x | \omega_2)} > \frac{P(\omega_2)}{P(\omega_1)}

Bayes Risk
Define the cost function \lambda_{ij} (the cost of deciding class \omega_i when the true class is \omega_j) and the conditional risk R(\omega_i | x): ...
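A minimal Python sketch of the Kullback-Leibler distance and the symmetric sum defined on the slide above. The function names and example PMFs are illustrative, not from the course material:

```python
import numpy as np

def kl_distance(p, q):
    """D(P||Q) = sum_i P(z_i) log(P(z_i)/Q(z_i)) for discrete PMFs."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Terms with P(z_i) = 0 contribute nothing, by the convention 0 log 0 = 0.
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

def symmetric_divergence(p, q):
    """The symmetric sum D(P||Q) + D(Q||P) from the slide."""
    return kl_distance(p, q) + kl_distance(q, p)

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])
print(kl_distance(p, q))           # always >= 0, zero iff P == Q
print(symmetric_divergence(p, q))  # symmetric in P and Q
```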
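A sketch of Bayes' rule from the slides above, computing a posteriori probabilities P(w_i | x). The Gaussian class-conditional densities and prior values are assumptions for illustration only; the slides do not specify a density family at this point:

```python
import numpy as np
from scipy.stats import norm

# Two classes with a priori probabilities P(w_i) and illustrative
# Gaussian class-conditional densities p(x | w_i).
priors = np.array([0.7, 0.3])
likelihoods = [norm(loc=0.0, scale=1.0), norm(loc=2.0, scale=1.0)]

def posteriors(x):
    """P(w_i | x) = p(x | w_i) P(w_i) / p(x), with p(x) = sum_i p(x | w_i) P(w_i)."""
    joint = np.array([pdf.pdf(x) * pw for pdf, pw in zip(likelihoods, priors)])
    evidence = joint.sum()  # p(x), the normalizer from the slide
    return joint / evidence

print(posteriors(1.0))        # a posteriori probabilities
print(posteriors(1.0).sum())  # sums to 1.0
```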

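The two-class minimum-error decision rule above, sketched as a likelihood ratio test. It reuses the same illustrative Gaussian densities; `decide` is a hypothetical helper, not course code:

```python
import numpy as np
from scipy.stats import norm

priors = np.array([0.7, 0.3])
likelihoods = [norm(loc=0.0, scale=1.0), norm(loc=2.0, scale=1.0)]

def decide(x):
    """Choose w_1 if p(x|w_1)/p(x|w_2) > P(w_2)/P(w_1), else w_2.
    Equivalent to choosing the class with the larger posterior P(w_i | x)."""
    ratio = likelihoods[0].pdf(x) / likelihoods[1].pdf(x)
    threshold = priors[1] / priors[0]
    return 1 if ratio > threshold else 2

for x in (-1.0, 1.0, 3.0):
    print(x, "-> class", decide(x))
```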

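The Bayes Risk slide is cut off in this preview. Assuming the standard definition from the cited Duda, Hart & Stork text, R(w_i | x) = sum_j lambda_ij P(w_j | x), a sketch of the minimum-risk decision with made-up cost values:

```python
import numpy as np

# Cost lambda_ij of deciding class w_i when the true class is w_j
# (standard form from the cited reference; the values are illustrative).
costs = np.array([[0.0, 10.0],
                  [1.0,  0.0]])

def conditional_risk(post):
    """R(w_i | x) = sum_j lambda_ij P(w_j | x) for each possible decision w_i."""
    return costs @ post

post = np.array([0.6, 0.4])  # P(w_j | x), e.g. from Bayes' rule
risks = conditional_risk(post)
print(risks)                                 # [4.0, 0.6]
print("decide class", np.argmin(risks) + 1)  # pick the decision minimizing risk
```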
