From this link:
I learned that sometimes we cannot use in-place operations in the forward pass because they yield an error during backpropagation, and it is suggested to call .clone() before the in-place operation. However, I don't understand well when .clone() is actually needed.
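To make sure I describe the right problem, this is the kind of error I mean. A minimal toy example I put together myself (it is not from the link, just my own reading of the issue):

import torch

a = torch.ones(3, requires_grad=True)
b = torch.exp(a)    # exp saves its output for the backward pass
b[0] = 0.0          # in-place write into a tensor that backward still needs
b.sum().backward()  # RuntimeError: one of the variables needed for gradient
                    # computation has been modified by an inplace operation

As far as I can tell, changing the torch.exp line to b = torch.exp(a).clone() makes the same snippet run, because the in-place write then only touches the copy and the original exp output stays intact for backward.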
Now, here is the code I am actually asking about:
import torch
import torch.nn.functional as F

def mut(x, w, mask):
    return w * x[mask]

# Weights
w1 = torch.ones(5, requires_grad=True)
w2 = torch.ones(3, requires_grad=True)
w3 = torch.ones(3, requires_grad=True)

x = 2 * torch.ones(5)
mask = torch.tensor([True, False, True, False, True], dtype=torch.bool)

x = w1 * x
x = F.selu(x).clone()
x[mask] = mut(x, w2, mask)
x[mask] = F.selu(x[mask])
x[mask] = mut(x, w3, mask)
x.mean().backward()
I need to add .clone() after the first SELU. If I instead add it in the next line, as in

x[mask] = mut(x.clone(), w2, mask)

it does not work. Why is this?
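To be explicit, here is the failing variant in full (same setup as above; only the two commented lines differ, and for me the error is raised only when backward() is called):

import torch
import torch.nn.functional as F

def mut(x, w, mask):
    return w * x[mask]

w1 = torch.ones(5, requires_grad=True)
w2 = torch.ones(3, requires_grad=True)
w3 = torch.ones(3, requires_grad=True)

x = 2 * torch.ones(5)
mask = torch.tensor([True, False, True, False, True], dtype=torch.bool)

x = w1 * x
x = F.selu(x)                       # no .clone() here any more
x[mask] = mut(x.clone(), w2, mask)  # .clone() moved onto the argument of mut()
x[mask] = F.selu(x[mask])
x[mask] = mut(x, w3, mask)
x.mean().backward()                 # fails with the in-place modification error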
Also, it seems that I only need to use .clone() before the first call to mut(), but not before the second one. Why?
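In other words, of the two calls to mut() in the code above, only the first one seems to depend on the earlier .clone():

x[mask] = mut(x, w2, mask)  # only works if x = F.selu(x).clone() comes before it
x[mask] = mut(x, w3, mask)  # works even though there is no .clone() right before it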
Probably I am missing something. I would be very grateful for an explanation of where and when it is necessary to clone.
Thanks a lot,