pretrained=True and torch.no_grad()

In the neural_style_tutorial, I often see the VGG network used like this:

cnn = models.vgg19(pretrained=True).features.to(device).eval()

This extracts VGG features from the pre-trained weights. If I train my own network on a custom dataset and then want to extract features from that trained model, can I use torch.no_grad() instead of pretrained=True?

cnn = my_model.eval()
with torch.no_grad():
    feature = cnn(input_image)  # forward pass on an input image to get the features

The pretrained argument is passed to the initialization of the vgg19 model, which will load the pretrained state_dict. If you have a custom trained model, you should load this state_dict manually via model.load_state_dict.
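For example, a minimal sketch of loading a custom checkpoint (MyCustomNet and the path 'my_model.pth' are placeholders for your own architecture and a state_dict saved with torch.save):

import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = MyCustomNet()  # placeholder for your custom architecture
state_dict = torch.load('my_model.pth', map_location=device)  # hypothetical checkpoint path
model.load_state_dict(state_dict)
cnn = model.to(device).eval()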

torch.no_grad() makes sure that all operations in the block won’t be tracked by Autograd.
E.g. intermediate variables created in your forward pass won’t be stored, so you can save some memory. This is independent of the model and of whether you are using a pretrained model.
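A quick illustration (the input tensor here is just random data):

import torch
import torchvision.models as models

cnn = models.vgg19(pretrained=True).features.eval()
x = torch.randn(1, 3, 224, 224)

out = cnn(x)
print(out.requires_grad)  # True: a graph is built because the weights require grad

with torch.no_grad():
    out = cnn(x)
print(out.requires_grad)  # False: no graph is built, so intermediates are not kept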

Could you explain your issue a bit more and where you are stuck?

@ptrblck: Thanks for the valuable comments. In my case, I want to use a pre-trained model to compute the style loss, which is done by extracting features from a pre-trained model such as VGG. However, instead of the VGG network, I will use my own custom network, which was also trained on ImageNet.

As you commented, I understand that we just replace pretrained=True with load_state_dict. Because the style loss will be added to the total loss and used with the backward function, I think I should not use torch.no_grad(). Am I right? If I used torch.no_grad(), then styleloss.backward() would be useless.

@ptrblck: Please confirm my understanding

Sorry for the late follow-up.
Yes, you are right. If you want to call backward on the loss, you shouldn’t wrap the forward pass in a torch.no_grad() block.
This is useful for validation, testing, and inference, where backward will not be called, so the intermediate variables can be freed to save memory.
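For reference, a rough sketch of that setup in the spirit of the neural style tutorial (the random input/target tensors and the plain MSE feature loss are placeholders for the real images and the Gram-matrix style loss): the network weights are frozen with requires_grad_(False), but the forward pass is not wrapped in torch.no_grad(), so the loss can still backpropagate to the input image.

import torch
import torch.nn.functional as F
import torchvision.models as models

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

cnn = models.vgg19(pretrained=True).features.to(device).eval()
for param in cnn.parameters():
    param.requires_grad_(False)  # freeze the feature extractor's weights

# placeholders for the optimized image and the target features
input_img = torch.rand(1, 3, 224, 224, device=device, requires_grad=True)
target_feat = cnn(torch.rand(1, 3, 224, 224, device=device)).detach()

optimizer = torch.optim.LBFGS([input_img])  # optimize the image, not the network

def closure():
    optimizer.zero_grad()
    feat = cnn(input_img)                # no torch.no_grad() here
    loss = F.mse_loss(feat, target_feat)
    loss.backward()                      # gradients flow back to input_img
    return loss

optimizer.step(closure)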
