Lack of gradient flow with torch.ge()

Abhishaike_Mahajan · May 6, 2018, 12:04am

I may have found a bug in torch.ge(). I’m re-posting it here in case it’s not actually a bug but a problem with how I’m defining my model: https://github.com/pytorch/pytorch/issues/7322.

There’s a short, runnable code sample in the link, along with the version numbers of the relevant libraries.

SimonW · May 6, 2018, 12:54am

“Greater than or equal” has zero gradient almost everywhere and is nondifferentiable at the remaining points (where the two inputs are equal). It’s not a bug.
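
A minimal sketch of what this looks like in practice, assuming a reasonably recent PyTorch (the tensor values and the `threshold` name are made up for illustration). The output of the comparison is not tracked by autograd at all, so nothing can flow back through it:

```python
import torch

x = torch.tensor([-1.0, 0.5, 2.0], requires_grad=True)
threshold = 0.0  # hypothetical threshold, chosen only for this example

# The comparison returns a mask tensor that autograd does not track.
mask = torch.ge(x, threshold)
print(mask.requires_grad)   # False

# Casting the mask to float does not bring the graph back.
as_float = mask.float()
print(as_float.grad_fn)     # None
```

Calling `.backward()` on something built only from `as_float` would raise an error, since no part of it requires grad.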

Abhishaike_Mahajan · May 6, 2018, 2:41am

Thank you for your answer! So is it not possible to have stable training with .ge when used as an intermediate layer?

SimonW · May 6, 2018, 6:23am

If there is no path from input to output other than through comparison ops, then there will be no gradients.

It’s not really an issue of stability: if you think about the function, “greater than or equal” is just flat almost everywhere, so there is no slope for the optimizer to follow.
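
To illustrate the point about other paths, here is a small sketch (values are made up) where the mask from the comparison is multiplied by the input. The comparison still contributes no gradient, but the multiplication gives autograd a second path from `x` to the output:

```python
import torch

x = torch.tensor([-1.0, 0.5, 2.0], requires_grad=True)

# The mask is treated as a constant by autograd.
mask = torch.ge(x, 0.0).to(x.dtype)

# Multiplying by x adds a differentiable path from input to output.
y = (mask * x).sum()
y.backward()

# The gradient is just the mask: 1 where x >= 0, 0 elsewhere.
print(x.grad)
```

This is essentially how a ReLU-style gate behaves: gradients pass through the kept elements, but the position of the threshold itself receives no learning signal.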

Alex6D · September 2, 2019, 12:07pm

How did you solve this problem? I’ve run into the same need. Thanks

Abhishaike_Mahajan · September 2, 2019, 8:36pm

Like SimonW said, this isn’t really a bug; it’s just what the gradient of a piecewise-constant operation looks like. I got around it by reworking my function into something that isn’t piecewise. I explain it in a bit more detail here: https://scientificattempts.wordpress.com/2018/06/03/conditionally-thresholded-cnns-for-weakly-supervised-image-segmentation/
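
For anyone who wants a rough idea of what a non-piecewise replacement can look like, one common option is a steep sigmoid used as a smooth stand-in for the comparison. This is only a sketch under my own assumptions, not necessarily the exact formulation in the linked post: `soft_ge` and `temperature` are hypothetical names, not part of PyTorch.

```python
import torch

def soft_ge(x, threshold, temperature=0.1):
    """Smooth stand-in for (x >= threshold). Hypothetical helper, not a PyTorch API."""
    # Smaller temperature gives a sharper step, but smaller gradients far from it.
    return torch.sigmoid((x - threshold) / temperature)

x = torch.tensor([-1.0, 0.5, 2.0])
threshold = torch.tensor(0.0, requires_grad=True)  # e.g. a learnable threshold

gate = soft_ge(x, threshold)   # values between 0 and 1
loss = (gate * x).sum()
loss.backward()

print(gate)
print(threshold.grad)          # nonzero, so the threshold can be trained
```

The trade-off is that the gate is no longer exactly 0 or 1, so the temperature has to be tuned for how hard a threshold the model actually needs.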