Torch.heaviside disables gradient

yegane · November 7, 2021, 6:12pm

I am trying to backpropagate through a torch.heaviside function, but it seems that it disables the gradient for its output. How can I include it in backpropagation?

If it is non-differentiable, is there any function that produces only 0 and 1?

KFrank · November 7, 2021, 7:18pm

Hi Yegane!

You can’t use the actual Heaviside function in backpropagation.
Use instead a “soft,” usefully-differentiable approximation to it
such as torch.sigmoid().
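
As a minimal sketch of the suggestion above: the helper name `soft_heaviside` and the steepness parameter `k` are illustrative assumptions, not part of PyTorch. The idea is only that `torch.sigmoid(k * x)` approaches a step as `k` grows while remaining differentiable.

```python
import torch


def soft_heaviside(x: torch.Tensor, k: float = 10.0) -> torch.Tensor:
    """Smooth stand-in for a step function (hypothetical helper).

    Larger k gives a sharper transition around x = 0.
    """
    return torch.sigmoid(k * x)


x = torch.tensor([-1.0, 0.0, 1.0], requires_grad=True)
y = soft_heaviside(x)
y.sum().backward()

# The gradient is nonzero everywhere and peaks at x = 0, where it equals k / 4.
print(y)
print(x.grad)
```

The choice of `k` is a trade-off: a small value gives a blurry step with usable gradients over a wide range, while a large value looks more like a true step but concentrates the gradient near zero.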

There is no such function that would be useful for backpropagation.
Such a function would have zero derivative in regions where it was
either zero or one and would have undefined derivative at those
points where it changed from zero to one. In neither case would
you get any non-trivial backpropagation.
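
One way to see this numerically, as a rough sketch: sharpen a sigmoid toward a step and watch its gradient. The helper `grad_of_soft_step` is a made-up name for illustration. Away from the jump the gradient shrinks toward zero, while at the jump it grows without bound (it equals `k / 4`), so the limiting step function carries no usable gradient.

```python
import torch


def grad_of_soft_step(x0: float, k: float) -> float:
    """Gradient of sigmoid(k * x) with respect to x, evaluated at x0."""
    x = torch.tensor(x0, dtype=torch.float64, requires_grad=True)
    torch.sigmoid(k * x).backward()
    return x.grad.item()


# Columns: steepness k, gradient at x = 1 (flat region), gradient at x = 0 (jump).
for k in (1.0, 10.0, 100.0):
    print(k, grad_of_soft_step(1.0, k), grad_of_soft_step(0.0, k))
```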

Best.

K. Frank

abarbadan · September 11, 2022, 5:14am

I’ve tried to use the heaviside function to build piecewise differentiable functions. What I want is the gradient of the functions being stitched together, so having heaviside's own gradient be a constant zero is useful. If heaviside takes the value 0.5 at the jump, for example, the gradient at the stitch point becomes the average of the gradients of the left and right functions. I’ll be using sigmoids for the time being, but I’d like to add to this discussion in the hope that adding this functionality to heaviside will be reconsidered and implemented in the future.
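
A possible workaround for the stitching use case, sketched under assumptions: the helper `stitch` and its arguments are hypothetical names, and the input is detached before calling `torch.heaviside` so the step is explicitly treated as a constant mask with zero gradient. Gradients then flow only through the left and right functions.

```python
import torch


def stitch(x, left, right, value_at_zero=0.5):
    """Use left(x) where x < 0 and right(x) where x > 0 (hypothetical helper).

    The step is computed on a detached copy of x, so it contributes no gradient.
    """
    values = torch.tensor(value_at_zero, dtype=x.dtype)
    h = torch.heaviside(x.detach(), values)
    return (1 - h) * left(x) + h * right(x)


x = torch.tensor([-1.0, 0.0, 2.0], requires_grad=True)
y = stitch(x, lambda t: t ** 2, lambda t: 3 * t)
y.sum().backward()

# Gradient is 2x on the left, 3 on the right, and their average at x = 0.
print(x.grad)  # tensor([-2.0000,  1.5000,  3.0000])
```

With `value_at_zero=0.5`, the gradient at the stitch point is the average of the left and right gradients, which matches the behaviour described above.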