One of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [3, 48, 3, 3]] is at version 2; expected version 1 instead

Hi,

Is the training loop the one here?
Given that it does not happen at the first backward, it can be one of the following:

  • An optimizer.step() call modified a weight inplace after that weight was already used in a forward pass, and you call backward afterwards (a corrected ordering is sketched after this list):
out = model(inp)
opt_model.step()      # modifies the model's weights inplace
out.sum().backward()  # will fail with this error, since the weights used in the forward are now at a newer version
  • You reuse some variables from one iteration to the next by mistake. To check this, the simplest thing is to wrap the content of the for loop in a single function so that everything goes out of scope properly at the end of each iteration:
def one_step(sample):
    # Do one step with sample; everything created here goes out of scope
    # when the function returns, so nothing can leak into the next iteration.
    ...

for sample in dataloader:
    one_step(sample)
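
For the first case, the usual fix is to finish the backward pass before calling optimizer.step(). Here is a minimal sketch of that ordering, reusing the model, opt_model and dataloader names from the snippets above as placeholders (they are assumptions, not taken from your actual code):
def one_step(sample):
    # Forward, backward, then step: the weights are only modified inplace
    # after the graph that used them has been backpropagated through.
    out = model(sample)
    loss = out.sum()
    opt_model.zero_grad()
    loss.backward()   # uses the weights at the version seen during the forward
    opt_model.step()  # now it is safe to modify the weights inplace

for sample in dataloader:
    one_step(sample)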