Deep Learning

CS 678 – Deep Learning

Deep Learning
- Early Work
- Why Deep Learning
- Stacked Auto-Encoders
- Deep Belief Networks
Deep Learning Overview
- Train networks with many layers (vs. shallow nets with just a couple of layers).
- Multiple layers work to build an improved feature space:
  - The first layer learns first-order features in an unsupervised way, directly from the input (e.g., edges in images).
  - The second layer learns higher-order features (combinations of first-layer features, e.g., combinations of edges).
  - Early layers usually learn unsupervised and discover general features of the input space, which can serve multiple tasks over the same kind of data (image recognition, etc.).
  - The final layer's features are then fed into a supervised layer.
- The entire network is then often fine-tuned with supervised training of the whole net, starting from the weights learned in the unsupervised phase.
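This two-phase pipeline can be sketched minimally (assuming NumPy; PCA via SVD stands in here for the unsupervised feature layer, and a least-squares readout for the supervised layer — a toy stand-in, not the specific architectures covered later):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data with latent structure: 3 hidden factors drive 20 raw inputs.
z = rng.normal(size=(200, 3))                  # latent factors
A = rng.normal(size=(3, 20))                   # mixing into raw inputs
X = z @ A + 0.1 * rng.normal(size=(200, 20))   # observed data
y = (z[:, 0] > 0).astype(float)                # supervised target

# --- Unsupervised phase: learn a feature space from X alone (no labels) ---
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
W1 = Vt[:5].T          # "first-layer weights": top 5 principal directions
H = Xc @ W1            # learned features

# --- Supervised phase: train an output layer on the learned features ---
Hb = np.hstack([H, np.ones((len(H), 1))])      # add a bias column
W2, *_ = np.linalg.lstsq(Hb, y, rcond=None)
pred = (Hb @ W2 > 0.5).astype(float)
acc = (pred == y).mean()
print(f"training accuracy: {acc:.2f}")
```

Because the unsupervised phase never sees `y`, the same features could feed several different supervised tasks, which is the point the slide makes about general features of the input space.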
Deep Learning Tasks
- Usually best when the input space is locally structured, spatially or temporally (images, language, etc.), rather than arbitrary input features.
- Images example: features learned by an early vision layer.
Why Deep Learning
- Biological plausibility, e.g. the visual cortex.
- Håstad's proof: problems that can be represented with a polynomial number of nodes in k layers may require an exponential number of nodes with k − 1 layers (e.g., parity).
- Highly varying functions can be represented efficiently with deep architectures: fewer weights/parameters to update than in a less efficient shallow representation.
- Sub-features created in a deep architecture can be shared between tasks.
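The parity example can be made concrete with a small sketch (Python assumed, not from the slides): a deep circuit computes n-bit parity with a chain of n − 1 two-input XOR gates, while a depth-2 (DNF) representation needs one AND term per odd-weight input pattern, i.e. 2^(n−1) terms:

```python
from functools import reduce
from itertools import product

def parity_deep(bits):
    # Deep/compact view: a chain (or balanced tree) of 2-input XORs,
    # using only len(bits) - 1 gates in total.
    return reduce(lambda a, b: a ^ b, bits)

def parity_shallow_terms(n):
    # Shallow (depth-2, DNF) view: one AND term per odd-weight input
    # pattern, so 2**(n - 1) terms -- exponential in n.
    return [p for p in product((0, 1), repeat=n) if sum(p) % 2 == 1]

n = 10
terms = parity_shallow_terms(n)
print(f"XOR gates needed (deep):    {n - 1}")    # 9
print(f"DNF terms needed (shallow): {len(terms)}")  # 2**9 = 512
```

Doubling n adds a handful of gates to the deep circuit but squares the number of shallow terms, which is the depth/size trade-off the Håstad result formalizes.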
Early Work
- Fukushima (1980): the Neocognitron.
- LeCun (1998): Convolutional Neural Networks, with similarities to the Neocognitron.
- Many-layered backpropagation was tried early on, but without much success:
  - Very slow.
  - Diffusion of the gradient.
- Very recent work has shown significant accuracy improvements by "patiently" training deeper MLPs with BP using fast machines (GPUs).
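The "diffusion of gradient" can be illustrated with a NumPy sketch (an assumption-laden toy, not from the slides): each sigmoid layer multiplies the backpropagated error by σ'(a) = σ(a)(1 − σ(a)) ≤ 0.25, so, taking the weights as 1 for simplicity, the gradient reaching early layers shrinks geometrically with depth:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
activations = rng.normal(size=30)    # one pre-activation per layer

# sigma'(a) = sigma(a) * (1 - sigma(a)) <= 0.25 everywhere
derivs = sigmoid(activations) * (1 - sigmoid(activations))

# The error signal reaching layer k from the output is scaled by the
# product of the derivatives of all later layers (weights taken as 1).
scale = np.cumprod(derivs[::-1])
print(f"gradient scale after  5 layers: {scale[4]:.2e}")
print(f"gradient scale after 30 layers: {scale[-1]:.2e}")
```

After 30 layers the scale is below 0.25^30 ≈ 1e-18, so the early layers barely move — which is why plain deep backpropagation was so slow, and why faster machines and patient training eventually helped.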
Convolutional Neural Networks
- Convolution: each layer combines (merges, smooths) patches from previous layers.
  - Typically we want to compress large data (images) into a smaller set of robust features.
  - Basic convolution can still create many features.
- Pooling: this step compresses and smooths the data, usually taking the average or max value across disjoint patches.
- Typically the convolution filters and pooling are hand-crafted rather than learned, though tuning can occur.
- After this unsupervised convolving, the final set of features is used to train a supervised model.
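A minimal NumPy sketch of the two operations (the edge filter here is a hypothetical hand-crafted example, matching the slide's point that filters are often crafted rather than learned):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2-D convolution (really cross-correlation, as in CNNs)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.empty((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Merge one patch of the input into a single feature value.
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Max over disjoint size x size patches: compresses and smooths."""
    h, w = fmap.shape
    h, w = h - h % size, w - w % size          # trim to a multiple of size
    blocks = fmap[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)
edge_filter = np.array([[1.0, -1.0]])      # hypothetical horizontal-edge filter
fmap = conv2d_valid(image, edge_filter)    # shape (6, 5)
pooled = max_pool(fmap)                    # shape (3, 2): 4x fewer values
```

Pooling here reduces the 6×5 feature map to 3×2, showing how convolution plus pooling compresses an image into a smaller, smoother feature set before any supervised model sees it.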
Convolutional Neural Network Examples
- In the example diagrams, C layers are convolutions and S layers pool/sample.
Training Deep Networks
- Build a feature space.
  - Note that this is what we do with SVM kernels, or with trained hidden layers in BP, etc., but now we build the feature space using deep architectures.
  - Unsupervised training between layers can decompose the problem.
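One common instance of this idea is greedy layer-wise pretraining, sketched below under simplifying assumptions (NumPy, plain gradient descent, squared reconstruction error, sigmoid encoders with linear decoders — a sketch, not the course's exact procedure). Each layer is trained as a small autoencoder on the output of the layer below, using no labels at all:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_autoencoder_layer(X, n_hidden, lr=0.1, epochs=200, seed=0):
    """Train one layer to reconstruct its own input; return the encoder."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W_enc = rng.normal(scale=0.1, size=(n_in, n_hidden))
    W_dec = rng.normal(scale=0.1, size=(n_hidden, n_in))
    for _ in range(epochs):
        H = sigmoid(X @ W_enc)         # encode
        R = H @ W_dec                  # linear reconstruction
        err = R - X                    # gradient of squared error
        grad_dec = H.T @ err / len(X)
        grad_enc = X.T @ ((err @ W_dec.T) * H * (1 - H)) / len(X)
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    loss = np.mean((sigmoid(X @ W_enc) @ W_dec - X) ** 2)
    return W_enc, loss

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 8))          # unlabeled data

# Greedy stacking: each new layer trains on the previous layer's features,
# decomposing the overall problem into one small sub-problem per layer.
layer_sizes = [6, 4]
H, encoders = X, []
for n_hidden in layer_sizes:
    W, loss = train_autoencoder_layer(H, n_hidden)
    encoders.append(W)
    H = sigmoid(H @ W)
print(f"final feature shape: {H.shape}")
```

The final features `H` would then feed a supervised layer, with the pretrained weights as the starting point for whole-network fine-tuning, as in the overview slide.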