Torch.sign breaks autograd dynamic graph

Hi!

I have an autoencoder, and between the encoder and decoder I transform the data using torch.sign. When I do this, the backpropagation of the gradient stops at that point.

If I replace torch.sign with torch.sigmoid, I don’t have that problem and the backpropagation goes all the way back to the beginning.

Do I have to do something different with torch.sign?

Hi,

the sign function is not differentiable at 0, and everywhere else its gradient is 0. So it is expected that you won’t be able to get useful gradients through it.
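For illustration, a minimal check (any recent PyTorch build should behave the same way):

```python
import torch

# torch.sign is flat almost everywhere, so its gradient is 0 and nothing
# flows back to the input.
x = torch.randn(4, requires_grad=True)
torch.sign(x).sum().backward()
print(x.grad)  # tensor([0., 0., 0., 0.])

# torch.sigmoid is smooth, so the gradient does reach the input.
y = torch.randn(4, requires_grad=True)
torch.sigmoid(y).sum().backward()
print(y.grad)  # non-zero values
```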


Thanks. Thinking it over again, it makes sense.

I see the sign function as a non-linear function where small variations of the input give the same output, so the derivative is like the derivative of a constant, 0 (except for values near 0).

A smooth approximation of the function would be a sigmoid with a parameter \beta, \frac{1}{1+e^{-\beta x}}, which has derivative \frac{\beta e^{-\beta x}}{(1+e^{-\beta x})^2}. If I want to create a new function from this sigmoid with \beta, is the parameter received in the backward function, grad_output, the one that has to be evaluated with the formula of the derivative for each element of grad_output?

Hi,

Note that we have hardsigmoid (https://pytorch.org/docs/master/nn.functional.html#torch.nn.functional.hardsigmoid) if you’re using the nightly builds, or similar functions like hardtanh.
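For example, a quick sketch (assuming a build where these functions are available):

```python
import torch
import torch.nn.functional as F

z = torch.randn(4, requires_grad=True)

# Piecewise-linear approximation of sigmoid; non-zero gradient on the
# linear segment around 0.
a = F.hardsigmoid(z)

# hardtanh saturates at -1/+1, so its range matches sign more closely
# while still passing gradients in the linear region.
b = F.hardtanh(z)
```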

If you want to use your custom function:

  • If you want the “true” gradient to be used, then just implement the function you want and autograd will get the gradient for you.
  • If you want the backward pass to compute something other than the gradient of your function, you can see this doc, which explains how to do that with a custom autograd Function in which you specify the backward to use (see the sketch below).
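Here is a rough sketch of such a Function where the forward is torch.sign and the backward uses the derivative of a sigmoid with a \beta parameter as a surrogate gradient; the class name and the way \beta is passed are just made up for this example. Note that in backward you evaluate the derivative at the saved input and multiply it by grad_output (chain rule), rather than evaluating the derivative at grad_output itself:

```python
import torch

class SignWithSigmoidGrad(torch.autograd.Function):
    # Illustrative sketch: forward returns torch.sign(x), backward uses the
    # derivative of sigmoid(beta * x) as a surrogate gradient.

    @staticmethod
    def forward(ctx, x, beta):
        ctx.save_for_backward(x)
        ctx.beta = beta
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        s = torch.sigmoid(ctx.beta * x)
        # Evaluate the surrogate derivative at the saved input x, then
        # multiply by the incoming grad_output (chain rule).
        grad_x = grad_output * ctx.beta * s * (1 - s)
        return grad_x, None  # no gradient with respect to beta

# usage: y = SignWithSigmoidGrad.apply(x, 5.0)
```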

Thanks again for the answer and the references to the doc.

I think that building a custom function is beyond my current knowledge of PyTorch.

I think I will multiply the data by \beta before applying the sigmoid function.
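Something like this minimal sketch (the \beta value and the tensor shape are just examples):

```python
import torch

beta = 5.0  # a larger beta makes the sigmoid steeper, i.e. closer to sign
encoded = torch.randn(8, 16, requires_grad=True)  # stand-in for the encoder output
bottleneck = torch.sigmoid(beta * encoded)        # 1 / (1 + exp(-beta * x))
```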

Most grateful