Transfer Learning

Figure 2-1. (a) Lower level activations, followed by (b) mid level activations and (c) upper layer activations. Source: Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations, Lee et al., ICML 2009 (ford.edu/~ang/papers/icml09-ConvolutionalDeepBeliefNetworks.pdf)

If you want to transfer knowledge from one model to another, you want to reuse more of the “general” layers and fewer of the “specific” layers. In other words, you remove the last few layers so that you can reuse the more generic ones, and add layers geared toward your specific classification task. This is how transfer learning is achieved.

While transfer learning is the concept, fine-tuning is the implementation process. Fine-tuning, as the name suggests, typically involves tweaking the weights of the last few layers in the model. You will often hear data scientists say, “I fine-tuned the model,” which means they took a pretrained model, froze the lower layers, and trained the upper part of the network on their new dataset (thereby modifying the weights of these upper layers).

How many layers of a CNN should we fine-tune? This can be guided by the following two factors:

1. How much data do we have? If you have a couple hundred labeled images, it would be hard to train and test a network from scratch. Hence, you should fine-tune only the last few layers. But if you had a million labeled images, it would be feasible to fine-tune all layers of the network and, if necessary, train from scratch (i.e., build the model architecture with random weights). So the amount of task-specific data dictates whether, and how much, you can fine-tune.

2. How similar is the data? If the task-specific data is similar to the data used for the pretrained network, then you can fine-tune the last few layers. But if your task is identifying different bones in an X-ray image, and you want to start out from an ImageNet-trained

Chapter 2: Cats vs Dogs - Transfer Learning in 30 lines with Keras
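The "freeze the lower layers, train the upper ones" step described above can be sketched in a few lines of Python. This is a minimal illustration, not the book's own code: the helper name `freeze_all_but_last` and the stand-in layer objects are assumptions made here so the example runs without any deep learning framework. It relies only on each layer exposing a boolean `trainable` attribute, which is how Keras layers mark themselves as frozen or trainable.

```python
from types import SimpleNamespace


def freeze_all_but_last(layers, n_trainable):
    """Freeze every layer except the last `n_trainable` ones.

    `layers` is any sequence of objects with a boolean `trainable`
    attribute (hypothetical helper, written for illustration).
    """
    if n_trainable < 0:
        raise ValueError("n_trainable must be non-negative")
    # Index of the first layer that stays trainable.
    cutoff = max(len(layers) - n_trainable, 0)
    for index, layer in enumerate(layers):
        # Lower (general) layers are frozen; upper (specific) layers stay trainable.
        layer.trainable = index >= cutoff
    return layers


# Stand-in "layers" so the sketch runs without a framework installed.
toy_layers = [SimpleNamespace(name=f"layer_{i}", trainable=True) for i in range(6)]
freeze_all_but_last(toy_layers, n_trainable=2)
print([layer.trainable for layer in toy_layers])
# prints [False, False, False, False, True, True]
```

With a Keras model, the same helper could be applied to `model.layers` before compiling the model. A small `n_trainable` matches the "few hundred images" case above, while setting it to the full layer count corresponds to fine-tuning every layer. The right value depends on the dataset and is not prescribed by the text.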