Journal of Machine Learning Research 8 (2007) 1769-1797    Submitted 7/06; Revised 1/07; Published 8/07

Characterizing the Function Space for Bayesian Kernel Models

Natesh S. Pillai    NSP2@STAT.DUKE.EDU
Qiang Wu    QIANG@STAT.DUKE.EDU
Department of Statistical Science, Duke University, Durham, NC 27708, USA

Feng Liang    FENG@STAT.UIUC.EDU
Department of Statistics, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL 61820, USA

Sayan Mukherjee    SAYAN@STAT.DUKE.EDU
Department of Statistical Science, Institute for Genome Sciences & Policy, Duke University, Durham, NC 27708, USA

Robert L. Wolpert    WOLPERT@STAT.DUKE.EDU
Department of Statistical Science, Professor of the Environment and Earth Sciences, Duke University, Durham, NC 27708, USA

Editor: Zoubin Ghahramani

Abstract

Kernel methods have been very popular in the machine learning literature in the last ten years, mainly in the context of Tikhonov regularization algorithms. In this paper we study a coherent Bayesian kernel model based on an integral operator defined as the convolution of a kernel with a signed measure. Priors on the random signed measures correspond to prior distributions on the functions mapped by the integral operator. We study several classes of signed measures and their image mapped by the integral operator. In particular, we identify a general class of measures whose image is dense in the reproducing kernel Hilbert space (RKHS) induced by the kernel. A consequence of this result is a function theoretic foundation for using non-parametric prior specifications in Bayesian modeling, such as Gaussian process and Dirichlet process prior distributions. We discuss the construction of priors on spaces of signed measures using Gaussian and Lévy processes, with the Dirichlet processes being a special case of the latter. Computational issues involved with sampling from the posterior distribution are outlined for a univariate regression and a high dimensional classification problem.
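As a notational sketch of the model described in the abstract (the symbols here are illustrative and not drawn verbatim from the paper), the integral operator sends a signed measure to a function by convolving it with the kernel:

```latex
% Kernel-convolution map from signed measures to functions.
% K : X x X -> R is the kernel; gamma is a signed measure on X.
f(x) \;=\; \int_{\mathcal{X}} K(x, u)\, \gamma(\mathrm{d}u)
```

A prior on the random signed measure \(\gamma\) then induces, through this map, a prior on the function \(f\); the paper's density result concerns which classes of measures \(\gamma\) yield images that are dense in the RKHS induced by \(K\).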
Keywords: reproducing kernel Hilbert space, non-parametric Bayesian methods, Lévy processes, Dirichlet processes, integral operator, Gaussian processes

© 2007 Natesh S. Pillai, Qiang Wu, Feng Liang, Sayan Mukherjee and Robert L. Wolpert.

1. Introduction

Kernel methods have a long history in statistics and applied mathematics (Schoenberg, 1942; Aronszajn, 1950; Parzen, 1963; de Boor and Lynch, 1966; Micchelli and Wahba, 1981; Wahba, 1990) and have had a tremendous resurgence in the machine learning literature in the last ten years (Poggio and Girosi, 1990; Vapnik, 1998; Schölkopf and Smola, 2001; Shawe-Taylor and Cristianini, 2004). Much of this resurgence was due to the popularization of classification algorithms such as support vector machines (SVMs) (Cortes and Vapnik, 1995), which are particular instantiations of the method of regularization of Tikhonov (1963). Many machine learning algorithms and statistical estimators can be summarized by the following penalized loss functional (Evgeniou et al., ...
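The preview cuts off before the functional itself is displayed. As a hedged reconstruction, the standard Tikhonov-regularized empirical loss that this literature refers to has the form (notation assumed: \(V\) a loss function, \(\mathcal{H}_K\) the RKHS of kernel \(K\), \(\lambda > 0\) a regularization parameter):

```latex
% Penalized (Tikhonov-regularized) loss over an RKHS:
% data (x_i, y_i), i = 1..n; V measures data fit; the norm term penalizes complexity.
\hat{f} \;=\; \operatorname*{arg\,min}_{f \in \mathcal{H}_K}
\left\{ \frac{1}{n} \sum_{i=1}^{n} V\big(y_i, f(x_i)\big)
\;+\; \lambda \, \| f \|_{\mathcal{H}_K}^{2} \right\}
```

For example, the hinge loss \(V(y, f(x)) = (1 - y f(x))_{+}\) recovers the SVM, and the squared loss recovers regularized least squares; this is the standard form in the cited regularization literature, not a quotation from this paper.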