3.5 Kernels and Gaussian Processes

Consider the output f(x) of a function f at a fixed input x ∈ X, where f is chosen according to some distribution D defined over a class F of real-valued functions. The output value can then be viewed as a random variable, and hence the set {f(x) : x ∈ X} as a collection of potentially correlated random variables. Such a collection is known as a stochastic process. The distribution over the function class F can be regarded as our prior belief in the likelihood that the different functions will provide the solution to our learning problem. Such a prior is characteristic of a Bayesian perspective on learning. We return to this approach in the next chapter, and in Chapter 6 we discuss how to make predictions using Gaussian processes. At this point we wish to highlight the connection between a particular form of prior commonly used in Bayesian learning and the kernel functions we have introduced for Support Vector Machines. Many of the
