Some basic modifications to the Transfer Learning Tutorial

Regarding the transfer learning tutorial, I have the following questions:

  1. How can I modify the code so that it also reports the test accuracy in addition to the train and validation accuracy?
  2. How can I report the per-class accuracy?
  3. For academic papers, is it required to report train, validation, and test accuracy, or is reporting only train and validation accuracy enough?
  4. When I use 25 epochs I get better train/test accuracy than with 100 epochs. I have been using 100-200 epochs for the cases where I had lots of data points. What number of epochs is reasonable when we don’t have that many data points?
  5. Which ResNet should I use, ResNet50 or ResNet101? I see different papers use one of the two, and sometimes one performs better than the other.

https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html

Thanks a lot for any suggestions. I have beginner-level knowledge of PyTorch and appreciate your patience.

I’ll try to answer all of your questions, but take some answers with a grain of salt, as they are partly my personal opinion.

  1. You would have to create a test dataset in the same manner as the train and val datasets were created.
    You could either create a new transformation for the test Dataset or just reuse the val transform.
    In train_model you should change the phase loop to for phase in ['train', 'val', 'test']:. The code will work fine on the test data, as the optimizer steps are only performed in the train phase, so no learning takes place on the val and test data (see the first sketch after this list).

  2. You could create a confusion matrix and fill it with the preds and targets during val and test. Once all samples have been added to the confusion matrix, you can calculate the per-class accuracy by dividing its diagonal by the column (or row) sums, depending on your confusion matrix layout (see the second sketch after this list). If you don’t feel like writing the code yourself, sklearn provides an implementation.

  3. Usually you would have to report the test accuracy, as the validation data is “dirty”, i.e. it was most likely used for hyperparameter tuning. For example, if you observe the validation loss for early stopping or to alter your model architecture, you create a data leak, which marks the val data as “seen” by the model. That’s also the reason why your model shouldn’t see the test data until you’re finished with your experiments. Whether everyone sticks to this is a different matter.

  4. Use the number of epochs that works best for your dataset. It’s hard to tell in advance how many epochs your model will need. Especially if you’re using something like a cyclic learning rate, you might need far fewer epochs to reach the same accuracy as with other training strategies.

  5. That’s also a bit hard to tell, as it depends on your data and use case. While ResNet101 has more layers and therefore potentially more capacity, it might also overfit your data more easily, so that your validation/test accuracy ends up worse. Try both architectures and choose the one that performs better on your validation data (see the third sketch below).
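
To make 1. concrete, here is a minimal sketch of the data setup, assuming your data_dir gains a test folder with the same one-subfolder-per-class layout as train and val (data_transforms, data_dir and train_model are the tutorial’s own names):

```python
import os
import torch
from torchvision import datasets

# data_dir and data_transforms come from the tutorial; the only new assumption
# is that data_dir now also contains a 'test' folder with one subfolder per class.
data_transforms['test'] = data_transforms['val']  # reuse the deterministic val transform

image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x])
                  for x in ['train', 'val', 'test']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=4,
                                              shuffle=(x == 'train'), num_workers=4)
               for x in ['train', 'val', 'test']}
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'val', 'test']}

# In train_model, change the phase loop to:
#     for phase in ['train', 'val', 'test']:
# The backward pass is already guarded by `if phase == 'train':`,
# so no learning takes place on the val and test data.
```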
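
For 2., a sketch of the confusion matrix approach; per_class_accuracy is a name I made up, and this layout uses rows for targets and columns for predictions:

```python
import torch

@torch.no_grad()
def per_class_accuracy(model, loader, num_classes, device):
    model.eval()
    # conf[t, p] counts samples of target class t that were predicted as class p.
    conf = torch.zeros(num_classes, num_classes, dtype=torch.long)
    for inputs, targets in loader:
        preds = model(inputs.to(device)).argmax(dim=1).cpu()
        for t, p in zip(targets, preds):
            conf[t, p] += 1
    # Per-class accuracy = diagonal / row sums; the clamp avoids a division
    # by zero for classes that never appear in the loader.
    return conf.diag().float() / conf.sum(dim=1).clamp(min=1).float()
```

If you prefer sklearn, collect all targets and predictions into two lists and call sklearn.metrics.confusion_matrix on them to get the same matrix.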
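
For 5., trying both architectures is cheap, since torchvision’s ResNets share the same classifier interface. A sketch (build_resnet is a made-up helper; pretrained=True follows the torchvision API used in the tutorial):

```python
import torch.nn as nn
from torchvision import models

def build_resnet(arch, num_classes):
    # arch is 'resnet50' or 'resnet101'; both expose the same final fc layer.
    model = getattr(models, arch)(pretrained=True)
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model

# Train both with the same schedule and keep whichever validates better.
candidates = {arch: build_resnet(arch, num_classes=2)
              for arch in ['resnet50', 'resnet101']}
```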

Thanks a lot. These are very helpful notes. I will report back here if I get stuck on any of them.
