What is the best way to make changes to a torchvision model?

Deeply · March 5, 2021, 5:26pm

I would like to try some new regularization methods and thought of using some torchvision models.

I only want to work on one model, say googlenet. What is the best way to do this?
1- Download torchvision from source, or
2- Download the whole vision from source, or
3- Only download googlenet, which I noticed won’t work as it has some dependencies (I have already tried this option).

Before spending time on this, I would appreciate any thoughts on the best way to do it.

ptrblck · March 6, 2021, 10:20am

Downloading the source and building it after your manipulation might work, but the easier way would probably be to create a custom nn.Module and use the torchvision model as the base class.
Inside your custom model you could then redefine the internal parameters and methods.

Deeply · March 6, 2021, 11:50am

I'm not sure this would work. For example, I would like to remove dropout from googlenet and use another regularization method.

ptrblck · March 6, 2021, 9:41pm

If the dropout is used as a module, you could replace it with nn.Identity. Alternatively, if it’s used via the functional API, you could override the forward method and reimplement it.
I still think this might be a faster and cleaner approach than manipulating the original model and rebuilding torchvision, but you might prefer the latter.

Deeply · March 8, 2021, 2:25pm

Thank you! I agree that overriding the forward function would be the best way to do it. To do this, do you mean building a new googlenet_like class that inherits from the googlenet class and then overriding the forward function?

NB. Currently, googlenet in torchvision/googlenet.py works out of the box without any extra dependencies, apart from needing utils.py alongside it (from utils import load_state_dict_from_url); this has little impact in my case, though, as the pretrained model has dropout in its structure. I got an acc1 of 20% after 9 epochs on CIFAR-100, trained from scratch.

It sounds like I had the problem with the logits thing, and I have already resolved it.

ptrblck · March 9, 2021, 5:39am

Yes, that's what I was thinking. However, if simply copy-pasting the model code (without the unused utils) works fine, it could be easier to just do this. In any case, I would try to avoid changing the code directly in torchvision and rebuilding it, as these changes would live in your branch of torchvision and (if I’m understanding your use case correctly) you don’t want to commit these changes, as they are a standalone experiment.

Deeply · March 9, 2021, 1:16pm

Absolutely, it is a separate project.