In the VGG network, I often see this line in the neural_style_tutorial:

```python
cnn = models.vgg19(pretrained=True).features.to(device).eval()
```

It extracts VGG features from the pretrained weights. If I train my own network on a custom dataset and then want to extract features from that trained model, can I use `torch.no_grad()` instead of `pretrained=True`?

```python
cnn = my_model.eval()
with torch.no_grad():
    feature = cnn('fc')
```
The `pretrained` argument is passed to the initialization of the `vgg19` model, which will load the pretrained `state_dict`. If you have a custom trained model, you should load this `state_dict` manually via `model.load_state_dict`.
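A minimal sketch of loading a custom checkpoint this way. The model class and the checkpoint path `my_model.pth` are hypothetical stand-ins for your own architecture and file:

```python
import torch
import torch.nn as nn

# Toy custom model standing in for "my_model" from the question
class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.features(x)

model = MyModel()

# Save a trained state_dict (here: just the initialized weights)
torch.save(model.state_dict(), "my_model.pth")

# Later: recreate the architecture and load the weights manually,
# instead of relying on a pretrained=True flag
model = MyModel()
model.load_state_dict(torch.load("my_model.pth"))
model.eval()
```

The key point is that `pretrained=True` only makes sense for the torchvision model zoo; for your own network the two steps are always "build the module, then `load_state_dict`".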
`torch.no_grad` makes sure that all operations in the block won't be tracked by Autograd. E.g. intermediate variables, which are created in your forward pass, won't be stored, so you can save some memory. It's independent of the model and of whether you are using a pretrained model.
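A quick way to see this effect on a tiny tensor: the same operations produce a graph-tracked result outside the block and an untracked one inside it.

```python
import torch

x = torch.randn(2, 3, requires_grad=True)

# Outside no_grad: the result is tracked by Autograd
y = (x * 2).sum()
print(y.requires_grad)  # True

# Inside no_grad: the same ops are not tracked, so no graph
# (and no intermediate buffers) is kept for backward
with torch.no_grad():
    z = (x * 2).sum()
print(z.requires_grad)  # False
```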
Could you explain your issue a bit more and where you are stuck?
@ptrblck: Thanks for the valuable comments. In my case, I want to use a pre-trained model to compute the style loss, which is done by extracting features from a pre-trained model such as VGG. However, instead of the VGG network I will use my custom network, which was also trained on ImageNet.

As you commented, I understand that we just replace `pretrained=True` with `load_state_dict`. Because the style loss will be added to the total loss and `backward` will be called on it, I think I should not use `torch.no_grad()`. Am I right? If I use `torch.no_grad()`, then `styleloss.backward()` will be useless.

@ptrblck: Please confirm my understanding.
Sorry for the late follow-up.
Yes, you are right. If you want to call `backward` on the loss, you shouldn't wrap the forward pass in a `torch.no_grad` block.
`torch.no_grad` is useful for validation, testing, and inference, where no `backward` will be called and thus the intermediate variables can be freed to save memory.
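To illustrate the distinction, here is a minimal sketch with a toy feature extractor standing in for VGG or the custom network. The extractor's parameters are frozen (we don't train it), but the forward pass is *not* wrapped in `torch.no_grad`, so gradients still flow through it to the input image:

```python
import torch
import torch.nn as nn

# Toy "feature extractor" standing in for VGG / the custom network
extractor = nn.Conv2d(3, 8, kernel_size=3, padding=1)

# Freeze the extractor's parameters: we don't want to train it,
# but gradients must still flow THROUGH it to the input image,
# so we must not use torch.no_grad() around the forward pass
for p in extractor.parameters():
    p.requires_grad_(False)

target_features = torch.randn(1, 8, 4, 4)      # pretend style target
image = torch.randn(1, 3, 4, 4, requires_grad=True)  # optimized input

features = extractor(image)
style_loss = nn.functional.mse_loss(features, target_features)
style_loss.backward()

print(image.grad is not None)          # True: gradients reach the image
print(extractor.weight.grad is None)   # True: frozen extractor gets none
```

Freezing the parameters and skipping gradient tracking entirely are different things: the first still builds the graph needed for `styleloss.backward()`, the second (a `no_grad` block) would break it.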