Machine Learning Srihari

**Bayesian Neural Networks**

Sargur Srihari
[email protected]

**Topics discussed here**

1. Why Bayesian?
2. Difficulty of exact Bayesian treatment and need for approximation
3. Two approximate approaches
   - Variational
   - Laplace (the one discussed here)
4. Bayesian neural network for regression
   - Posterior parameter distribution
   - Hyper-parameter optimization
5. Bayesian neural network for classification

**Why Bayesian?**

- More complex models fit the data better but generalize poorly
  - Linear with two free parameters, quadratic with three, cubic with four?
- Occam's razor says that unnecessarily complex models should not be preferred to simpler ones
- Neural networks are popular but notoriously lack objective grounding
- The Bayesian approach allows different models to be compared (e.g., different numbers of hidden units)

**Classical and Bayesian neural networks**

- Classical neural networks use maximum likelihood to determine the network parameters (weights and biases)
- Regularized maximum likelihood is MAP (maximum a posteriori)
  - The regularizer is the logarithm of the prior parameter distribution
- The Bayesian treatment marginalizes over the distribution of parameters in order to make predictions

**Need for approximation in the Bayesian treatment**

- In the simple linear regression problem, under the assumption of Gaussian noise:
  - The posterior is Gaussian and can be evaluated exactly
  - The predictive distribution is found in closed form...
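The closed-form case mentioned above can be made concrete. As a sketch (not from the slides), assuming a zero-mean isotropic Gaussian prior with precision `alpha` and Gaussian noise with precision `beta`, the posterior over weights is N(m_N, S_N) with S_N⁻¹ = αI + βΦᵀΦ and m_N = βS_NΦᵀt; the names `posterior` and `predictive` are illustrative choices:

```python
import numpy as np

def posterior(Phi, t, alpha, beta):
    """Exact Gaussian posterior over weights for linear regression.

    Prior:      p(w)     = N(w | 0, alpha^{-1} I)
    Likelihood: p(t | w) = N(t | Phi w, beta^{-1} I)
    Posterior:  N(w | m_N, S_N) with
        S_N^{-1} = alpha * I + beta * Phi^T Phi
        m_N      = beta * S_N Phi^T t
    """
    M = Phi.shape[1]
    S_N_inv = alpha * np.eye(M) + beta * Phi.T @ Phi
    S_N = np.linalg.inv(S_N_inv)
    m_N = beta * S_N @ Phi.T @ t
    return m_N, S_N

def predictive(phi_x, m_N, S_N, beta):
    """Closed-form predictive mean and variance at one basis-expanded input."""
    mean = phi_x @ m_N
    var = 1.0 / beta + phi_x @ S_N @ phi_x  # noise variance + parameter uncertainty
    return mean, var
```

For a neural network the likelihood is no longer Gaussian in the weights, so no such closed form exists, which is what motivates the approximate approaches in the topic list.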

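The Laplace approach named in the topic list can be sketched in one dimension (an illustrative toy, not the slides' derivation): find the MAP mode of the posterior, then fit a Gaussian whose precision is the second derivative of the negative log posterior at that mode. Derivatives here are numeric central differences; the function name `laplace_1d` and all step sizes are assumptions:

```python
import numpy as np

def laplace_1d(neg_log_post, w_init, lr=0.05, steps=2000, h=1e-4):
    """Laplace approximation of a 1-D posterior.

    1. Locate the mode w_MAP by gradient descent on the negative log posterior
       (equivalent to regularized maximum likelihood / MAP estimation).
    2. Approximate the posterior by N(w_MAP, A^{-1}), where A is the second
       derivative of the negative log posterior evaluated at w_MAP.
    """
    w = w_init
    for _ in range(steps):
        grad = (neg_log_post(w + h) - neg_log_post(w - h)) / (2 * h)
        w -= lr * grad
    # Curvature at the mode -> precision of the Gaussian approximation
    A = (neg_log_post(w + h) - 2 * neg_log_post(w) + neg_log_post(w - h)) / h**2
    return w, 1.0 / A  # mean and variance of the approximating Gaussian
```

When the true posterior is itself Gaussian (as in the linear regression case above) the approximation is exact; for a neural network posterior it is a local Gaussian fit around one mode.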