Hi,
I have a loss function that combines two loss functions that use the same inputs. However, even though I removed all “in-place operations” and tried different things such as “.clone()” or “.clone().detach()”, I still get an “in-place operation” error from autograd for the second part (and I can’t find where the problem is).
Also, each of these two parts works fine on its own, without any “inplace operation” errors, but as soon as they are used together it no longer works.
Here, seg_loss is a “Dice loss” calculation, which also works fine when used alone or when only one of the two parts is used (but not when both parts are used together).
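Since the actual code isn’t reproduced above, here is a minimal sketch of the pattern I mean — the names (`dice_loss`, `combined_loss`, the `second_term` argument) and shapes are hypothetical, not my real code:

```python
import torch

def dice_loss(pred, target, eps=1e-7):
    # Soft Dice loss (hypothetical stand-in for seg_loss).
    pred = torch.softmax(pred, dim=1)
    intersection = torch.sum(pred * target, dim=(0, 2))
    cardinality = torch.sum(pred + target, dim=(0, 2))
    return 1.0 - ((2.0 * intersection + eps) / (cardinality + eps)).mean()

def combined_loss(pred, target, second_term, alpha=1.0, beta=1.0):
    # Both terms read the same pred/target tensors; neither may mutate them.
    return alpha * dice_loss(pred, target) + beta * second_term(pred, target)
```

The point is that both terms consume the same input tensors, so any in-place change to a tensor saved by one term’s graph can break the other term’s backward pass.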
Does anyone have any idea what could be causing the “inplace operation” error and how I could fix it?
I also tried “torch.autograd.set_detect_anomaly(True)”, but either I don’t know where it needs to be set, or it doesn’t tell me exactly where the issue is.
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 21, 1]] is at version 21; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
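For reference, a minimal, self-contained snippet (not my actual training code) that triggers the same class of error — a tensor saved for the backward of one operation is modified in place before `backward()` runs. `set_detect_anomaly(True)` should be enabled once, before the forward pass you want to debug:

```python
import torch

# Enable anomaly detection before the forward pass, e.g. at the top of the
# training script; it then prints the forward-pass traceback of the failing op.
torch.autograd.set_detect_anomaly(True)

x = torch.ones(3, requires_grad=True)
y = x * 2            # y will be saved for the backward of the next multiply
z = (y * y).sum()    # backward of this node needs y at its current version
y.add_(1)            # in-place update bumps y's version counter

try:
    z.backward()     # version mismatch -> the RuntimeError quoted above
except RuntimeError as e:
    print("caught:", "inplace" in str(e))
```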
I’m also strangely getting the same error with another loss function (which also sums multiple losses).
When I enable “torch.autograd.set_detect_anomaly(True)”, it flags the line with the “cardinality” calculation in the main function I use to compute the Jaccard loss (it was already pointing to the same line before):
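The function itself isn’t reproduced here; a typical soft Jaccard score with a “cardinality” line of this kind looks roughly like the sketch below (modeled on the usual `soft_jaccard_score` found in segmentation libraries — treat it as an illustration, not my exact code):

```python
import torch

def soft_jaccard_score(output, target, smooth=0.0, eps=1e-7, dims=None):
    # Per-class soft intersection-over-union (Jaccard index).
    if dims is not None:
        intersection = torch.sum(output * target, dim=dims)
        cardinality = torch.sum(output + target, dim=dims)  # the flagged line
    else:
        intersection = torch.sum(output * target)
        cardinality = torch.sum(output + target)
    union = cardinality - intersection
    return (intersection + smooth) / (union + smooth + eps)
```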
The problem came from the fact that the “soft_jaccard_score_with_weight” function is used in the “forward” of a class that computes the “dynamic weights” to be used at each iteration/training batch.
However, since I was using the same object (a single instance of this loss class) several times in a row, autograd apparently saw this weight tensor (which itself carries no gradient, but is saved for the backward pass) as having been modified “inplace” between calls, which caused the problem.
Simply using different “loss objects” instead, so that each one is separate from the others, solved my problem.
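The failure mode and the fix can be illustrated with a hypothetical, self-contained example (the class name and the weight-update rule are invented for illustration). The first forward saves `self.weight` for its backward; a second forward on the *same* instance mutates `self.weight` in place, so the first term’s backward hits the version mismatch. Separate instances keep the internal states apart:

```python
import torch
import torch.nn as nn

class DynamicWeightLoss(nn.Module):
    # Hypothetical loss that keeps per-class weights as internal state and
    # refreshes them in place on every forward call.
    def __init__(self, num_classes):
        super().__init__()
        self.weight = torch.ones(num_classes)

    def forward(self, pred, target):
        per_class = ((pred - target) ** 2).mean(dim=0)
        # In-place update bumps self.weight's version counter. If an earlier
        # forward's graph saved self.weight, that graph's backward now fails.
        self.weight.mul_(0.9).add_(0.1 * per_class.detach())
        return (self.weight * per_class).sum()

target = torch.zeros(4, 3)

# Buggy pattern: one shared instance used for two terms of the same graph.
shared = DynamicWeightLoss(3)
pred = torch.randn(4, 3, requires_grad=True)
loss = shared(pred, target) + shared(pred, target)  # 2nd call mutates state
# loss.backward() here raises "modified by an inplace operation".

# Fix: give each term its own instance, so their internal states are separate.
loss_a, loss_b = DynamicWeightLoss(3), DynamicWeightLoss(3)
pred2 = torch.randn(4, 3, requires_grad=True)
total = loss_a(pred2, target) + loss_b(pred2, target)
total.backward()  # works: no saved tensor was modified after being saved
```

An alternative to duplicating objects would be to avoid mutating the saved state at all (e.g. rebuild the weight tensor out-of-place each forward), but separate instances are the simplest change.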