Hi,
I am getting the following error, and torch.autograd.set_detect_anomaly(True) does not point me to the offending operation:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [512, 65]], which is output 0 of TBackward, is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
The function below is causing the problem: as soon as I add the loss it returns to the rest of the losses I am optimizing, the error is thrown. Does anyone see an in-place operation in it?
I optimize several times (it is a sequential process where I take an optimizer step every t iterations), and I call loss.backward(retain_graph=True). The error is raised specifically on the second backward call.
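For context, here is a minimal standalone sketch of the pattern I suspect: an optimizer step between two backward calls on the same retained graph. The single Linear layer and the shapes are made up purely to mirror the [512, 65] TBackward tensor in the error message; this is not my actual network:

import torch

lin = torch.nn.Linear(512, 65)      # weight is [65, 512]; its transpose [512, 65] matches the error
opt = torch.optim.SGD(lin.parameters(), lr=0.1)

x = torch.randn(8, 512, requires_grad=True)
loss = lin(x).pow(2).mean()         # forward pass: the weight is saved for backward

loss.backward(retain_graph=True)    # first backward succeeds
opt.step()                          # in-place parameter update bumps the weight's version counter

loss.backward()                     # second backward through the same graph raises
                                    # "modified by an inplace operation ... output 0 of TBackward"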
The network I am backpropagating through outputs the variable canvas.
Thanks!
def calculate_random_windows_loss(self, canvas, target, windows):
    """
    Calculate error maps between canvas and target.
    windows: [batch, num_windows, 3, 128, 128]
    canvas and target: [batch, 3, 128, 128]
    """
    # 1) Unsqueeze and expand canvas and target to the same shape as windows
    num_windows = windows.shape[1]
    nonzero = torch.count_nonzero(windows, dim=(2, 3, 4))  # pixels per window, [batch, num_windows]
    canvas_ = canvas.unsqueeze(1).expand(-1, num_windows, -1, -1, -1)
    target_ = target.unsqueeze(1).expand(-1, num_windows, -1, -1, -1)
    # 2) Per-window MSE: sum over channels (dim 2) and pixels (dims 3 and 4),
    #    normalized by the number of nonzero pixels in each window
    error_maps = torch.nn.functional.mse_loss(
        canvas_ * windows, target_ * windows, reduction='none'
    ).sum((2, 3, 4)) / nonzero  # [batch, num_windows]
    # 3) Keep the worst (highest-error) window per batch element
    window_loss, window_idx = torch.topk(error_maps, k=1)
    return window_loss, window_idx
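One way to check for in-place modifications is the tensor's internal _version counter: it is exactly what autograd compares against at backward time, and it only increments when the tensor is modified in place. A tiny standalone illustration (_version is undocumented, so treat it as a debugging aid only):

import torch

t = torch.randn(3, requires_grad=True)
print(t._version)       # 0: fresh tensor
u = t * 2               # out-of-place op; t's version counter is untouched
print(t._version)       # still 0
with torch.no_grad():
    t.add_(1.0)         # in-place update, like the ones optimizer.step() performs
print(t._version)       # 1: this is the version autograd checks during backward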