For example, here is a snippet of code from an old version of PyTorch:
import torch.nn as nn

# classifier is the classifier of a torchvision pre-trained model
fc6 = nn.Conv2d(512, 4096, kernel_size=7)
fc6.weight.data.copy_(classifier.weight.data.view(4096, 512, 7, 7))
Is there a better way to write this in version 0.4?
Thanks in advance!
You simply need to remove the .data.
Operations on .data are hidden from autograd. In this case, if you used weight and bias in a graph before this segment and don't use .data, .detach(), or with torch.no_grad(), autograd may complain about tensors needed for gradient computation being modified in-place.
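For reference, here is a minimal sketch of the 0.4-style equivalent, wrapping the in-place copy in torch.no_grad() so autograd does not record it (the stand-in classifier layer is an assumption based on the comment in the question):

import torch
import torch.nn as nn

# stand-in for the first linear layer of a torchvision VGG classifier (assumption)
classifier = nn.Linear(512 * 7 * 7, 4096)

fc6 = nn.Conv2d(512, 4096, kernel_size=7)
with torch.no_grad():
    # copy_ is in-place; no_grad keeps autograd from tracking the write
    fc6.weight.copy_(classifier.weight.view(4096, 512, 7, 7))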
If I understood the migration guide correctly, we can simply replace .data with .detach() when the operation is not in-place, and with torch.no_grad() when it is an in-place operation. Like in this example: Detach and .data
Yes, you can generally do that, unless you are doing some hacks that you want hidden from autograd.
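To illustrate that replacement rule, here is a small sketch (the tensor names are illustrative, not from the thread):

import torch

x = torch.randn(3, requires_grad=True)

# old: y = x.data * 2  -- a read, not in-place
y = x.detach() * 2           # new: use .detach() for non-in-place uses

# old: x.data.add_(1)  -- an in-place write
with torch.no_grad():        # new: use no_grad for in-place updates
    x.add_(1)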
I have a layer that is forward-only, and using .data was the simplest way to implement it. I could use the no_grad context, but that would not have been as elegant, because I do want second_loss, just not its gradient.
Why not use x.detach()?
If I have second_loss(func(x.detach()), func2(x.detach()), y), then it would only matter whether x.detach(), being a function whose implementation I do not know, has overhead or not. Since x.data is a property that does function calls in the background to re-wrap the underlying tensor in a new variable, I would not know the overhead of x.data either (but I believe both are similar, after looking at the implementation).
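Here is a hedged sketch of that pattern; func, func2, and second_loss come from the discussion above, but their implementations below are placeholders for illustration only:

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(4, 8, requires_grad=True)  # pretend output of earlier layers
y = torch.randn(4, 8)

func = nn.Linear(8, 8)   # placeholder branch modules
func2 = nn.Linear(8, 8)

def second_loss(a, b, target):
    # hypothetical loss combining the two branches (illustrative only)
    return F.mse_loss(a + b, target)

loss = second_loss(func(x.detach()), func2(x.detach()), y)
loss.backward()
print(x.grad)            # None: the detach blocked gradients into x
print(func.weight.grad)  # populated: func's parameters still receive gradients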
The migration guide fairly clearly advises preferring .detach() over .data, since .detach() is the safer choice.