Hi, I’m trying to apply a ReLU to only PART of my model’s output. Here is my code:

import torch.nn as nn
import torch
input = torch.ones(2).cuda()
linear = nn.Linear(2, 2).cuda()
output = linear(input)
output[0] = nn.ReLU(inplace=False)(output[0])
loss = torch.sum(output)
loss.backward()

but I got the error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor []], which is output 0 of SelectBackward, is at version 3; expected version 2 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

I’m sure it is caused by this line:

output[0] = nn.ReLU(inplace=False)(output[0])

but why is this an inplace operation? How should I modify part of the output in a differentiable way? Thanks in advance.

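Since you are assigning the result back into a slice of the same tensor (output[0] = ...), the assignment itself is an inplace operation on output, even though nn.ReLU was created with inplace=False.
You could instead create a new result tensor using torch.cat and slicing.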
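The torch.cat approach could look roughly like the sketch below. It is only an illustration, not the exact code from the reply: the names x, first, rest, and result are made up here, and the tensors are kept on the CPU so the snippet runs without a GPU (add .cuda() as in the original post if you have one).

```python
import torch
import torch.nn as nn

x = torch.ones(2)          # CPU for portability; the original post used .cuda()
linear = nn.Linear(2, 2)
output = linear(x)

# Do not write into `output`. Build a new tensor from its pieces instead.
first = torch.relu(output[:1])   # slice (not index) so the shape stays (1,)
rest = output[1:]                # remaining elements, left unchanged
result = torch.cat([first, rest], dim=0)   # shape (2,)

loss = torch.sum(result)
loss.backward()            # gradients flow through both pieces
```

Because output is never overwritten, autograd can still read every value it saved during the forward pass. For a larger output, the same pattern should work with wider slices, e.g. output[:k] and output[k:] for some hypothetical split index k.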

The error is thrown if the value that was overwritten by the inplace operation is needed for the backward pass, as explained here (the linked topic doesn’t exactly use your model, but might give you an idea).
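To see why the "is at version 3; expected version 2" part of the message appears, the small sketch below inspects the tensor version counter. It assumes the semi-internal attribute _version, which is meant for debugging rather than regular use. The names t, view, before, and after are made up for this illustration.

```python
import torch

t = torch.zeros(3)
view = t[0]            # indexing returns a view that shares storage with t

before = t._version    # version counter before the write
t[0] = 1.0             # slice assignment writes into t in place
after = t._version     # the counter has been bumped by the write

# The view shares the counter with its base, so it sees the bump as well.
print(before, after, view._version)
```

Autograd records the version of each tensor it saves for backward and compares it again when backward() runs. A mismatch means the saved value may have been overwritten, which is what the RuntimeError reports.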