Avoiding sqrt of 0 values breaks backpropagation

Dhorka · October 3, 2019, 11:08am

Hi,

I am trying to apply torch.sqrt to a tensor that contains 0 values. My goal is to skip the 0 values, that is, to apply torch.sqrt only to the values that are different from 0. To do that, I have tried the following:

import torch

t = torch.tensor([[0.0, 1.0, 2.0], [3.0, 0.0, 4.0], [5.0, 6.0, 0.0]])
aux = t[t != 0].clone()   # copy of the nonzero entries
aux = torch.sqrt(aux)
t[t != 0] = aux.clone()   # in-place write back into t

This works, but breaks the backpropagation. I am getting the following error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [7822, 1]], which is output 0 of DivBackward0, is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

It seems that the line aux = t[t != 0].clone() is breaking the backpropagation. How can I solve this? Is there an easy way to achieve my goal?

KFrank · October 3, 2019, 1:56pm

Hi Dhorka!

I speculate that you wish to avoid torch.sqrt (0) because the
derivative of sqrt is infinite at 0.
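
To make the infinite derivative concrete, here is a minimal sketch that asks autograd for the gradient of sqrt at a few points. The sample values 0, 1 and 4 are my own choice for illustration and do not come from the original post.

```python
import torch

# Illustrative inputs (an assumption, not the poster's data): 0, 1 and 4.
t = torch.tensor([0.0, 1.0, 4.0], requires_grad=True)

# Summing gives a scalar, so backward() fills t.grad with d sqrt(t) / dt.
torch.sqrt(t).sum().backward()

# d/dt sqrt(t) = 1 / (2 * sqrt(t)), which diverges as t -> 0,
# so the first entry of the gradient is inf.
print(t.grad)
```

The entry for the zero element is inf, while the entries for 1 and 4 are the finite values 1/2 and 1/4.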

If your goal is, in fact, to avoid the infinite derivative, you could
simply add a small “epsilon” to your value before calling sqrt:

epsilon = 1.e-8
t = torch.sqrt (t + epsilon)

Now the infinite derivative for (elements of) t = 0 just becomes
a large derivative (specifically 1 / (2 * sqrt (epsilon))).
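
As a rough check of that formula, the sketch below compares the autograd gradient at t = 0 with 1 / (2 * sqrt (epsilon)). The input values and the name expected_at_zero are assumptions made for this illustration.

```python
import math
import torch

epsilon = 1.e-8

# Illustrative inputs (an assumption): one zero and two positive values.
t = torch.tensor([0.0, 1.0, 4.0], requires_grad=True)

# Out-of-place shift by epsilon before the sqrt; t itself is not modified.
out = torch.sqrt(t + epsilon)
out.sum().backward()

# Analytic derivative at t = 0: 1 / (2 * sqrt(epsilon)).
expected_at_zero = 1.0 / (2.0 * math.sqrt(epsilon))

print(t.grad[0].item(), expected_at_zero)
```

With epsilon = 1.e-8 the expected value is about 5000, and every entry of the gradient is finite. Because t + epsilon creates a new tensor, no in-place modification is involved.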

epsilon should be chosen to be comfortably smaller than the
typical small values that show up in this part of your computation.
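
The effect of a poorly chosen epsilon can be sketched as follows. The two input magnitudes (1.e-2 and 1.e-10) are hypothetical, and float64 is used only so that rounding does not obscure the comparison.

```python
import torch

epsilon = 1.e-8

# Hypothetical inputs: one far above epsilon, one far below it.
t = torch.tensor([1.e-2, 1.e-10], dtype=torch.float64)

exact = torch.sqrt(t)
shifted = torch.sqrt(t + epsilon)

# Relative error introduced by the epsilon shift.
rel_err = ((shifted - exact) / exact).abs()
print(rel_err)
```

For the value well above epsilon the relative error is negligible, whereas for the value well below epsilon the shifted result is dominated by epsilon rather than by the input.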

Best.

K. Frank