In-place operation error - PyTorch 1.6.0

This is my class definition:

import torch.nn as nn

class Tracker(nn.Module):
    def __init__(self):
        super(Tracker, self).__init__()
        # Bidirectional GRU: 2 input features, 100 hidden units per direction
        self.bigru = nn.GRU(input_size=2, hidden_size=100, batch_first=True, bidirectional=True)
        self.fc1 = nn.Linear(200, 32)
        self.fc2 = nn.Linear(32, 2)

    def forward(self, inputs):
        x, states = self.bigru(inputs)
        # Take the last time step of the GRU output (both directions concatenated)
        x = self.fc1(x[:, -1, :])
        x = self.fc2(x)
        return x

While training I use loss.backward(retain_graph=True), but I get the error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation.

I went through a couple of discussions on the topic and realized it was due to an in-place operation on a tensor. So I switched to the following code.

    def forward(self, inputs):
        x, states = self.bigru(inputs)
        k = x.clone()                # clone the GRU output to avoid in-place view issues
        y = self.fc1(k[:, -1, :])
        z = self.fc2(y.clone())      # clone again before the second linear layer
        return z

The error still persists. Can anyone tell me where I am going wrong? Thanks!

Hi,

Can you enable anomaly mode? That will show you which op had its input changed:
https://pytorch.org/docs/stable/autograd.html#anomaly-detection
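For reference, anomaly detection can be switched on globally before the training code runs; a minimal sketch:

import torch

# Makes the backward pass report which forward op produced the failing value.
# It slows execution down noticeably, so it is meant for debugging only.
torch.autograd.set_detect_anomaly(True)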

Hi,
I have done that but forgot to attach the error. Here it is:

File "venv\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "src/del.py", line 76, in forward
    z = self.fc2(y.clone())
  File "venv\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "venv\lib\site-packages\torch\nn\modules\linear.py", line 91, in forward
    return F.linear(input, self.weight, self.bias)
  File "venv\lib\site-packages\torch\nn\functional.py", line 1674, in linear
    ret = torch.addmm(bias, input, weight.t())

So that means that an input to your linear layer was modified in place.
The stack trace shows that you clone the activation input, so it cannot be that one; my guess is that it is the weights, which are most likely modified by your optimizer when you do optimizer.step()?
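As a minimal, self-contained illustration of that point (a plain nn.Linear and SGD, not your model), stepping the optimizer and then backpropagating through the old graph reproduces the same error:

import torch
import torch.nn as nn

lin = nn.Linear(4, 2)
opt = torch.optim.SGD(lin.parameters(), lr=0.1)

inp = torch.randn(3, 4, requires_grad=True)
loss = lin(inp).sum()
loss.backward(retain_graph=True)

opt.step()        # updates lin.weight in place

# The retained graph still refers to the old weight tensor, whose version
# counter has now changed, so this second backward raises the RuntimeError.
loss.backward()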

tracker_opt.zero_grad()
tracker_out = tracker_model(tracker_in.float())
loss = nn.MSELoss()
loss_tracker = loss(tracker_out, torch.as_tensor(np.concatenate([np.real(temp)[:, None],
                                                                 np.imag(temp)[:, None]], axis=1)).float())
loss_tracker.backward(retain_graph=True)
tracker_opt.step()
tracker_in = tracker_out[:, None, :]

This is my training loop. I don't believe I make any changes before I call optimizer.step(). Do you spot anything wrong?

What I mean is that optimizer.step() is an in-place operation. So if you try to backward again after doing the optimizer step, without re-doing the forward, you will see this error.
Is your code doing something like:

out = model(inp)
first_loss = bar(out, label)
first_loss.backward(retain_graph=True)
opt.step()

opt.zero_grad()
second_loss = baz(out, label)
second_loss.backward()  # This will fail if the model has a Linear,
# because the Linear's weights were modified in place above and
# this backward cannot be done anymore.
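For comparison, the usual way around it is to re-run the forward after the step, so the new graph saves the updated weights. A self-contained toy sketch (not your model):

import torch
import torch.nn as nn

model = nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.MSELoss()
inp, label = torch.randn(3, 4), torch.randn(3, 2)

out = model(inp)
criterion(out, label).backward()
opt.step()                          # weights updated in place

opt.zero_grad()
out = model(inp)                    # re-do the forward with the updated weights
criterion(out, label).backward()    # fine: this graph saved the new weights
opt.step()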

Oh I see.
My training method and loop are attached below:

def train(tracker_model, tracker_opt, epoch, _):
    global train_tracker
    if train_tracker:
        global tracker_inputs

        # Stack real and imaginary parts as the two input features
        tracker_in = torch.as_tensor(np.concatenate([np.real(tracker_inputs)[:, None, None],
                                                     np.imag(tracker_inputs)[:, None, None]], axis=2))
        for i in range(epoch):
            tracker_model.train()
            tracker_opt.zero_grad()
            tracker_out = tracker_model(tracker_in.float())
            loss = nn.MSELoss()
            temp = generate_ar_sequence(16)
            loss_tracker = loss(tracker_out, torch.as_tensor(np.concatenate([np.real(temp)[:, None],
                                                                             np.imag(temp)[:, None]], axis=1)).float())
            loss_tracker.backward(retain_graph=True)
            tracker_opt.step()
            # Feed the model's output back in as the next input
            tracker_in = tracker_out[:, None, :]
        train_tracker = False
        tracker_model.eval()

This looks OK, I think.
Note that you don't need retain_graph=True here, right? You don't actually plan to backward on that graph again?

Ok, maybe I don’t understand retain_graph correctly. Isn’t loss_tracker.backward() required at every iteration?

retain_graph is only needed if you call backward again without calling forward again. If you only call things once, there is no need for it.
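In other words, retain_graph=True only matters when the same graph is backpropagated through more than once, e.g. two losses computed from one forward pass. A small sketch:

import torch
import torch.nn as nn

model = nn.Linear(4, 2)
out = model(torch.randn(3, 4))

loss_a = out.sum()
loss_b = (out ** 2).mean()

loss_a.backward(retain_graph=True)  # keep the graph alive for the second backward
loss_b.backward()                   # reuses the same graph through the Linear

In a training loop where the forward is re-run every iteration, each backward gets a fresh graph, so retain_graph=True can simply be dropped.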