import torch

x = torch.tensor([0, 0, 0, 0])
k = torch.tensor([1, 2, 3, 4])
x.data.copy_(k)  # 1
# or
x.copy_(k)  # 2
x
Hi,
The first one should never be used, as .data
should not be used.
The new API for #1 is:
with torch.no_grad():
    x.copy_(k)
And the difference is clearer: use #1 (EDIT: the torch.no_grad() version just above in my comment, not your original one) if the copy should not be tracked by autograd (like initializing some weights), and use #2 if you want gradients to flow through the copy.
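A quick sketch of the difference between the two versions (variable names are made up for illustration, assuming a recent PyTorch):

```python
import torch

# Version #1: untracked copy, e.g. for initializing weights.
w = torch.zeros(4, requires_grad=True)
init = torch.tensor([1., 2., 3., 4.])
with torch.no_grad():
    w.copy_(init)          # not recorded by autograd
assert w.grad_fn is None   # w is still a leaf tensor

# Version #2: tracked copy, gradients flow back to the source.
src = torch.tensor([1., 2., 3., 4.], requires_grad=True)
out = torch.zeros(4).clone()  # in-place op must target a non-leaf
out.copy_(src)                # recorded by autograd
out.sum().backward()
assert torch.equal(src.grad, torch.ones(4))
```

In the first case the copy is invisible to autograd; in the second, `copy_` is part of the graph and `backward()` propagates gradients to `src`.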
I may be tired, but I do not understand. You said:
The first one should never be used as
.data
should not be used.
And later:
use #1 if the copy should not be tracked by the autograd
These two statements seem contradictory.
Sorry, I wasn’t very clear. Use the other equivalent version of #1 that I gave you (the one with torch.no_grad()).
Hi, I have a question regarding this reply.
Why shouldn’t .data.copy_ be used? Is p.copy_ always preferred over p.data.copy_?
.data
should not be used anymore.
It is unsafe and might lead to silently wrong gradients.
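To illustrate the "silently wrong gradients" point with a minimal sketch (hypothetical values, assuming a recent PyTorch): in-place changes made through .data do not bump autograd's version counter, so a value that was saved for the backward pass can be mutated without any error being raised.

```python
import torch

a = torch.tensor([2.0], requires_grad=True)
b = a * a            # multiplication saves `a` for the backward pass

# Mutate `a` behind autograd's back; no version-mismatch error is raised.
a.data.fill_(3.0)

b.backward()
# The true derivative at the original a=2 is 2*a = 4, but backward uses
# the mutated value and silently reports 2*3 = 6.
assert a.grad.item() == 6.0
```

Doing `a.fill_(3.0)` directly (without .data) would instead raise a RuntimeError, which is exactly the safety check that .data bypasses.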
Does that hold regardless of the PyTorch version we are using? Can I get a better understanding of why it is unsafe?