In an RBF network, as we can see on the right-hand side


[…], it can be taken out of the summation. This yields […]. We multiply this by […]. Then it becomes […]. Next, note that […] and […]. Rearranging the terms, we finally have the posterior:

$$\Pr(y_k \mid x) = \sum_{j} \Pr(y_k \mid j)\,\Pr(j \mid x),$$

where $\Pr(j \mid x)$ is the probability of feature $j$ given the data and $\Pr(y_k \mid j)$ is the probability of class membership given a feature. Interestingly, each term of the sum is just the product of these two posteriors.

Interpretation of RBF Network classification

We want to relate the result derived above to our RBF network. In an RBF network, as we can see on the right-hand side, we have a set of inputs, $x_1$ to $x_d$, the hidden basis functions, $\phi_1$ to $\phi_M$, and some outputs, $y_1$ to $y_K$. We also have weights $w_{kj}$ from the hidden layer to the output layer. Each output is just a linear sum of the $\phi_j$'s. Now take the probability of feature $j$ given $x$ to be $\phi_j(x)$ and the probability of class $k$ given feature $j$ to be the weight $w_{kj}$; then the posterior can be written as

$$\Pr(y_k \mid x) = \sum_{j} w_{kj}\,\phi_j(x).$$

Figure 26.1.2(2): RBF Network

Now let us look at an example in the one-dimensional case. Suppose […], and $j$ runs from 1 to 2. We know that $\phi_j$ is a radial basis function. It is as if we put a Gaussian over the data, and for each Gaussian we consider its center $\mu_j$. What $\phi_j$ computes, then, is the similarity of any data point to that center.

Figure 26.1.2(1): Gaussian mixture

The graph on the left plots the densities of $\phi_1$ and $\phi_2$. For instance, as a point $x$ moves far from the center $\mu_1$, $\phi_1(x)$ decays to nearly zero. Remember that we can usually obtain a non-linear regression or classification in the input space by doing a linear one in some extended space, or feature space (more details in the Aside). Here, the $\phi_j$'s actually produce that feature space. So one way to look at $\phi_j(x)$ is that it tells us, given an input, how likely a particular feature is to be present. Say, for example, we define the features as the centers of these Gaussian distributions.
Then this function in effect computes, for a given data point, how likely this kind of feature is to appear. If the data point is right at the center, then the value of...
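
To make the one-dimensional picture concrete, here is a minimal Python sketch of a Gaussian basis function and of the posterior written as a weighted sum of basis-function outputs. Several things in it are assumptions for illustration and do not come from the notes: the function names, the shared `width` parameter, the centers at 1.0 and 2.0, the weight values, and the step that normalizes the activations so they can be read as probabilities of each feature given the input.

```python
import math


def gaussian_rbf(x, center, width=1.0):
    """Similarity of x to a center: 1.0 at the center, near 0 far away."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def feature_posteriors(x, centers, width=1.0):
    """Normalized activations, read here as Pr(j | x).

    The normalization is an assumption of this sketch. It fails with a
    division by zero if x is so far from every center that all
    activations underflow to 0.0.
    """
    activations = [gaussian_rbf(x, c, width) for c in centers]
    total = sum(activations)
    return [a / total for a in activations]


def class_posteriors(x, centers, weights, width=1.0):
    """Pr(y_k | x) as the sum over j of weights[k][j] * Pr(j | x)."""
    phi = feature_posteriors(x, centers, width)
    return [sum(w_kj * p_j for w_kj, p_j in zip(row, phi)) for row in weights]


# Hypothetical setup: two Gaussians in one dimension.
centers = [1.0, 2.0]
# Hypothetical weights, read as Pr(y_k | j); each column sums to 1.
weights = [
    [0.9, 0.2],
    [0.1, 0.8],
]

at_center = gaussian_rbf(1.0, centers[0])  # point sitting on the first center
far_away = gaussian_rbf(9.0, centers[0])   # point far from the first center
posterior = class_posteriors(1.0, centers, weights)

print(at_center, far_away, posterior)
```

Because each column of the hypothetical weight matrix sums to 1 and the feature probabilities sum to 1, the class posteriors also sum to 1. With unnormalized activations or arbitrary trained weights that property is not guaranteed.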

This document was uploaded on 03/07/2014.
