import torch

x = torch.tensor([0, 0, 0, 0])
k = torch.tensor([1, 2, 3, 4])
x.data.copy_(k)  # 1
# or
x.copy_(k)  # 2
x
Hi,
The first one should never be used, as .data
should not be used.
The new API for #1 is:
with torch.no_grad():
    x.copy_(k)
And the difference is clearer: use #1 (EDIT: the torch.no_grad() version just above in my comment, not your original one) if the copy should not be tracked by autograd (like initializing some weights), and use #2 if you want gradients to flow through the copy.
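A quick sketch of the difference between the two versions (variable names are made up for illustration, assuming a recent PyTorch):

```python
import torch

# Version #1: untracked copy, e.g. for initializing weights.
w = torch.zeros(4, requires_grad=True)
init = torch.tensor([1., 2., 3., 4.])
with torch.no_grad():
    w.copy_(init)          # not recorded by autograd
assert w.grad_fn is None   # w is still a leaf tensor

# Version #2: tracked copy, gradients flow back to the source.
src = torch.tensor([1., 2., 3., 4.], requires_grad=True)
out = torch.zeros(4).clone()  # in-place op must target a non-leaf
out.copy_(src)                # recorded by autograd
out.sum().backward()
assert torch.equal(src.grad, torch.ones(4))
```

In the first case the copy is invisible to autograd; in the second, `copy_` is part of the graph and `backward()` propagates gradients to `src`.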
I may be tired, but I do not understand. You said:
The first one should never be used as
.data
should not be used.
And later:
use #1 if the copy should not be tracked by the autograd
These two statements seem contradictory.
Sorry, I wasn’t very clear. Use the other equivalent version of #1 that I gave you (the one with torch.no_grad()).
Hi, I have a question regarding this reply.
Why shouldn’t .data.copy_ be used? Is p.copy_ always preferred over p.data.copy_?
.data
should not be used anymore.
It is unsafe and might lead to silently wrong gradients.
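To illustrate the "silently wrong gradients" point with a minimal sketch (hypothetical values, assuming a recent PyTorch): in-place changes made through .data do not bump autograd's version counter, so a value that was saved for the backward pass can be mutated without any error being raised.

```python
import torch

a = torch.tensor([2.0], requires_grad=True)
b = a * a            # multiplication saves `a` for the backward pass

# Mutate `a` behind autograd's back; no version-mismatch error is raised.
a.data.fill_(3.0)

b.backward()
# The true derivative at the original a=2 is 2*a = 4, but backward uses
# the mutated value and silently reports 2*3 = 6.
assert a.grad.item() == 6.0
```

Doing `a.fill_(3.0)` directly (without .data) would instead raise a RuntimeError, which is exactly the safety check that .data bypasses.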
Does that hold regardless of the PyTorch version we are using? Can I get a better understanding of why it is unsafe?