Which copy_ is better?

import torch

x = torch.tensor([0, 0, 0, 0])
k = torch.tensor([1, 2, 3, 4])

x.data.copy_(k)  #1
# or
x.copy_(k)  #2
x

Hi,

The first one should never be used, as .data should not be used anymore.
The new API for #1 is:

with torch.no_grad():
    x.copy_(k)

And the difference is clearer: use #1 (EDIT: the version of #1 given just above in my comment, the one with torch.no_grad(), not your original one) if the copy should not be tracked by the autograd (for example when initializing some weights), and use #2 if you want gradients to flow through the copy.
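To make the distinction concrete, here is a minimal sketch (the names w, k, src and out are just placeholders for this example):

import torch

# #1 in its torch.no_grad() form: the copy is hidden from autograd.
# Typical use: filling in a weight before training starts.
w = torch.zeros(4, requires_grad=True)
k = torch.tensor([1.0, 2.0, 3.0, 4.0])
with torch.no_grad():
    w.copy_(k)      # nothing is recorded in the graph

# #2: the copy is recorded, so gradients flow back to the source tensor.
src = torch.tensor([1.0, 2.0, 3.0, 4.0], requires_grad=True)
out = torch.zeros(4)
out.copy_(src)      # copy_ is differentiable with respect to src
out.sum().backward()
print(src.grad)     # tensor([1., 1., 1., 1.])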

I may be tired, but I do not understand. First you said:

The first one should never be used as .data should not be used.

And later:

use #1 if the copy should not be tracked by the autograd

These two statements seem to contradict each other.

Sorry, I wasn’t very clear. By #1 I meant the equivalent version I gave just above (the one with torch.no_grad()), not your original x.data.copy_(k).

Hi, I have a question regarding this reply.

Why shouldn’t .data.copy_ be used? Is p.copy_ always preferred over p.data.copy_?

.data should not be used anymore.
It is unsafe and might lead to silently wrong gradients.
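For a concrete picture of what “silently wrong” means, here is a minimal sketch; exp() is chosen only because its backward reuses the saved output:

import torch

a = torch.tensor(2.0, requires_grad=True)
b = a.exp()        # the backward of exp() reuses the saved output b

# Changing b through .data also changes the value saved for backward,
# but autograd's version counter is not bumped, so no error is raised.
b.data.mul_(2)
b.backward()
print(a.grad)      # 2 * exp(2) instead of exp(2): silently wrong

# The same change made through .detach() shares the version counter, so
# a later b.backward() fails loudly with a RuntimeError instead of
# silently returning a wrong gradient.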

Regardless of the PyTorch version we are using? Can I get a better understanding of why it is unsafe?

@albanD explains it in more detail in this post.
