Caffe Deep Learning Library
Contents:
- Installation of Caffe
- Generating graph of Model
- Parsing log
- Resume Training
- Transfer Learning
- Testing Model
Let’s get brewing !!
I want to dedicate this post to illustrate some cool stuff that one can do with Caffe
framework. You absolutely don’t need to write a single line of code to train the model ! How cool is that ?
Albeit it’s difficult to get started with caffe (if you are developing new layers) due to lack of documentation, it has some cool features that one can use to debug the training phase of Neural Networks. Let CAFFE_ROOT
represent the root of the caffe installation directory.
1. Installation (Ubuntu)
It can be bit of a pain installing Caffe. But with right changes, you can get this step done !
- Clone the repository from here.
- Instead of generating the
Makefile
yourself and messing up the dependencies, use the one provided by the community - There are some changes that you would need to make in the
Makefile
depending on whether you have opencv 2.X or 3.X version installed. - For further instructions, follow this link. It has detailed instructions.
Assuming you got Caffe installed on your workstation, let’s get brewing further along …
2. Generating the graph of the model
Once we have written the model definition file in *.prototxt
file, we can visualize the network in a diagram. One can confirm that the layer connections are correct. Caffe provides us with a script to do this and here is how to generate the image.
$ python $CAFFE_ROOT/python/draw_net.py <prototxt-file> <output-img-file>
Ex: The following is the generated graph of the SqueezeDet which I trained on PASCAL Dataset.
Fig 11: Visualalization of layers of a Network
3. Parsing the log
Caffe uses Google Logging Library to log the debug data. Caffe provides script to parse that data to produce more structured debug information (like training loss, test loss, accuracy, learning rate and so on). Before parsing the log, one needs to create the log file. This can be done by intercepting the output of training process and dumping the output into a file.
$ ./caffe train -solver <solver-file> 2>&1 | tee -a <logger-file>
The above command starts the training process as specified in the <solver-file>
and and in addition, intercepts the output and dumps into a file. This creates the log file. Once this is done, we can parse this file to get useful information in a more structured way.
$python $CAFFE_ROOT/tools/extra/parse_log.py <log-file> <output-dir>
The above command produces <log-file>.train
and <log-file>.test
in the <output-dir>
. We can use these two files to generate beautiful graphs of the loss/accuracy value as a function of the iteration.
4. Resuming the training
What if you wanted to stop the training and continue the training later ? Caffe provides this feature of resuming the training from a snapshot. This is how it’s done.
./caffe train -solver <solver-file> -snapshot <snapshot-file>
Here <snapshot-file>
is the *.snapshot
file which is generated from your initial training of the model.
5. Transfer Learning
Say you want to train a model on a dataset, but sadly the dataset is small. You might then consider to train your model on a bigger dataset (say ImageNet) prior to training on your custom dataset. Finally add some additional layers (initializing the weights of these layers randomly) and train the model ! You can accomplish this using Caffe by:
./caffe train -solver <solver-file> -weights <weight-file>
Here <weight-file>
is the .caffemodel
file which is generated from your initial training of the model on bigger dataset.
6. Testing your model
After making certain changes to your model definition file *.prototxt
, fire the following command to test your model. Take a look at this for the changes to be made.
./caffe test -model <deploy-prototxt> -weights <weights-file>