In-place operations and autograd

Hello everyone!
I am trying to understand the relationship between in-place operations and autograd. With the first snippet of code below I get "RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation", but the second one runs without this exception.

First snippet:

import torch as T
from torch.autograd import Variable
from torch.nn import functional as F

x = Variable(T.rand(2, 2), requires_grad=True)
h = F.sigmoid(x)
h[:, 0] = 0  # in-place write into the output of sigmoid
loss = h.sum()
loss.backward()

Second snippet:

import torch as T
from torch.autograd import Variable
from torch.nn import functional as F

x = Variable(T.rand(2, 2), requires_grad=True)
h = F.relu(x)
h[:, 0] = 0  # in-place write into the output of relu
loss = h.sum()
loss.backward()

Can someone explain why this is the case? Is it because F.sigmoid calls a C function directly, while F.relu performs some computation before calling the C function, which allows PyTorch to handle the in-place operation? Also, is there a Tensor function that performs a non-in-place setitem? Something like this:

def setitem(self, key, value):
    self = self.clone()  # work on a copy so the original tensor is untouched
    self[key] = value
    return self
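
For context, a minimal sketch of the behavior I am after, using clone() followed by normal indexing on the copy (h2 is just a local name, not an existing API):

import torch as T
from torch.autograd import Variable
from torch.nn import functional as F

x = Variable(T.rand(2, 2), requires_grad=True)
h = F.sigmoid(x)
h2 = h.clone()   # out-of-place version of h[:, 0] = 0: modify a copy, keep h intact
h2[:, 0] = 0
loss = h2.sum()
loss.backward()  # no RuntimeError: sigmoid's saved output h was never modified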

Any help would be appreciated.

sigmoid uses the output to compute gradients: its derivative is y * (1 - y) with y = sigmoid(x), so autograd saves the output tensor for the backward pass.

relu uses the input to compute gradients: its derivative is 1 where the input is positive and 0 elsewhere, so autograd saves the input tensor instead.

Hence the different behaviors you see when modifying the output :). If you change

h[:, 0] = 0

in the second example to

x[:, 0] = 0

you will see a similar error.
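
For completeness, a minimal sketch of that modified second snippet (where exactly the error is raised, and its message, can vary across PyTorch versions):

import torch as T
from torch.autograd import Variable
from torch.nn import functional as F

x = Variable(T.rand(2, 2), requires_grad=True)
h = F.relu(x)
x[:, 0] = 0      # in-place write into the input that relu saved for backward;
                 # newer versions may already raise here (in-place op on a leaf
                 # that requires grad)
loss = h.sum()
loss.backward()  # otherwise the error surfaces here during the backward pass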
