# Ch2 - Insight by Mathematics and Intuition for Understanding Pattern Recognition


Waleed A. Yousef, Faculty of Computers and Information, Helwan University. March 13, 2010

## Ch2. Introduction and Statistical Decision Theory

### Types of Variables and Important Notation

- Quantitative, where some measure is given as a value; e.g., $X = 1, 3, -2.5$.
- Qualitative (or Categorical), where no measures or metrics are associated; e.g., $X = \text{Diseased}, \text{Nondiseased}$.
- Ordered Categorical; e.g., $X = \text{small}, \text{medium}, \ldots$

The variable $X \in \mathcal{G}$, a set of possible values.

- $X$: Random variable (or vector). In general, $X = (X_1, \ldots, X_p)'$.
- $x_i$: The $i$th observation from $X$. Therefore, $x_i = (x_{i1}, \ldots, x_{ip})'$.
- $Y$: A quantitative response.
- $G$: A qualitative (for group) response; $G \in \mathcal{G}$.
- $\mathbf{X}$: A data matrix:

$$
\mathbf{X}_{N \times p} =
\begin{pmatrix} x_1' \\ \vdots \\ x_N' \end{pmatrix} =
\begin{pmatrix}
x_{11} & \cdots & x_{1p} \\
\vdots &        & \vdots \\
x_{N1} & \cdots & x_{Np}
\end{pmatrix}
$$

- $\mathbf{x}_j$: All observations of $X_j$; i.e., the $j$th vector (column) of $\mathbf{X}$:

$$
\mathbf{x}_j = \begin{pmatrix} x_{1j} \\ \vdots \\ x_{Nj} \end{pmatrix}
$$

- $\hat{Y}$: The predicted value of $Y$.
- $\hat{G}$: The predicted category (or class); $\hat{G} \in \mathcal{G}$.
- $\mathbf{tr}$: Training set (dataset); $\mathbf{tr} = \{ t_i \mid t_i = (x_i, y_i),\ i = 1, \ldots, n \}$.

### Regression

The predictor $X \in \mathbb{R}^p$ and the response $Y \in \mathbb{R}$. What is the "best" prediction $\hat{Y} = f(X)$? This should be defined in terms of some loss, for which we can find the best. This best will not be the best for another loss!
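The remark that the best predictor changes with the loss can be checked numerically. The sketch below is only an illustration under stated assumptions: the exponential response, the absolute-error loss, and the helper names (`empirical_risk`, `squared_loss`, `absolute_loss`) are not from the notes. It compares two constant predictions, the sample mean and the sample median, under two losses.

```python
import random
import statistics

def empirical_risk(samples, c, loss):
    """Average loss incurred by predicting the constant c for every sample."""
    return sum(loss(y, c) for y in samples) / len(samples)

def squared_loss(y, c):
    return (y - c) ** 2

def absolute_loss(y, c):
    # Assumption: absolute error is used here only as a contrasting loss.
    return abs(y - c)

# Hypothetical skewed response: exponential with rate 1, so mean and median differ.
rng = random.Random(0)
ys = [rng.expovariate(1.0) for _ in range(20000)]

mean_y = statistics.fmean(ys)
median_y = statistics.median(ys)

# Evaluate both candidate predictions under each loss.
sq_at_mean = empirical_risk(ys, mean_y, squared_loss)
sq_at_median = empirical_risk(ys, median_y, squared_loss)
abs_at_mean = empirical_risk(ys, mean_y, absolute_loss)
abs_at_median = empirical_risk(ys, median_y, absolute_loss)

print(f"squared loss : at mean {sq_at_mean:.4f}, at median {sq_at_median:.4f}")
print(f"absolute loss: at mean {abs_at_mean:.4f}, at median {abs_at_median:.4f}")
```

On this sample the mean gives the smaller average squared loss and the median gives the smaller average absolute loss, so neither prediction is best under both criteria.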
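In terms of squared-error loss,

$$
\begin{aligned}
\text{Risk} = \text{EPE} &= \mathrm{E}\,(Y - f(X))^2 \\
&= \iint (y - f(x))^2 \, f_{XY}(x, y) \, dx \, dy \\
&= \iint (y - f(x))^2 \, f_{Y|X} \, f_X \, dx \, dy \\
&= \int \left[ \int (y - f(x))^2 \, f_{Y|X} \, dy \right] f_X \, dx \\
&= \mathrm{E}_X \, \mathrm{E}_{Y|X} (Y - f(X))^2 ,
\end{aligned}
$$

then we can minimize EPE by minimizing it pointwise w.r.t. $X$:

$$
f^*(x) = \arg\min_{f(x)} \mathrm{E}_{Y|X} (Y - f(x))^2 .
$$

Expanding the conditional risk around $\mathrm{E}[Y \mid X = x]$:

$$
\begin{aligned}
\underbrace{\mathrm{E}_{Y|X} (Y - f(x))^2}_{\text{Conditional Risk}}
&= \mathrm{E}_{Y|X} \big[ (Y - \mathrm{E}[Y \mid X = x]) + (\mathrm{E}[Y \mid X = x] - f(x)) \big]^2 \\
&= \mathrm{E}_{Y|X} \big\{ (Y - \mathrm{E}[Y \mid X = x])^2 + 2\,(Y - \mathrm{E}[Y \mid X = x])(\mathrm{E}[Y \mid X = x] - f(x)) + (\mathrm{E}[Y \mid X = x] - f(x))^2 \big\} \\
&= \mathrm{E}_{Y|X} (Y - \mathrm{E}[Y \mid X = x])^2 + (\mathrm{E}[Y \mid X = x] - f(x))^2 \\
&= \sigma^2_{Y|X} + (\mathrm{E}[Y \mid X = x] - f(x))^2 .
\end{aligned}
$$

The cross term vanishes because, given $X = x$, the factor $\mathrm{E}[Y \mid X = x] - f(x)$ is a constant and $\mathrm{E}_{Y|X}(Y - \mathrm{E}[Y \mid X = x]) = 0$. The first term does not depend on $f$, and the second is minimized (at zero) by

$$
f^*(x) = \mathrm{E}[Y \mid X = x] \quad ....
$$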
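The decomposition of the conditional risk into $\sigma^2_{Y|X}$ plus the squared gap $(\mathrm{E}[Y \mid X = x] - f(x))^2$ can be verified on simulated data. This is a minimal sketch under assumptions that are not in the notes: a hypothetical model $Y \mid X = x \sim \text{Normal}(2x + 1, 0.5^2)$, a single fixed $x$, and a small grid of candidate values for $f(x)$.

```python
import math
import random
import statistics

def conditional_risk(ys, c):
    """Empirical E[(Y - c)^2 | X = x], computed from draws ys of Y given X = x."""
    return sum((y - c) ** 2 for y in ys) / len(ys)

# Hypothetical model (an assumption for illustration):
# Y | X = x ~ Normal(2x + 1, 0.5^2), evaluated at the fixed point x = 1.5.
rng = random.Random(1)
x = 1.5
ys = [rng.gauss(2 * x + 1, 0.5) for _ in range(20000)]

cond_mean = statistics.fmean(ys)                 # estimate of E[Y | X = x]
cond_var = statistics.pvariance(ys, cond_mean)   # estimate of sigma^2_{Y|X}

candidates = (3.0, 3.5, 4.0, 4.5, 5.0)           # candidate values of f(x)
rows = []
for c in candidates:
    risk = conditional_risk(ys, c)
    decomposed = cond_var + (cond_mean - c) ** 2
    rows.append((c, risk, decomposed))
    print(f"f(x) = {c:.1f}: risk {risk:.4f}, variance + squared gap {decomposed:.4f}")

# The candidate closest to the conditional mean has the smallest risk.
best_c = min(candidates, key=lambda c: conditional_risk(ys, c))
print("best candidate:", best_c)
```

For every candidate the two columns agree up to floating-point error, and the smallest risk is reached at the candidate nearest the estimated conditional mean, which matches $f^*(x) = \mathrm{E}[Y \mid X = x]$.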

This note was uploaded on 01/06/2011 for the course IT 342, taught by Professor Waleed A. Yousef during the Spring '10 term at Helwan University, Helwan.
