I may have found a bug in torch.ge(), but I just wanted to re-post it here in case it’s actually not a bug, but an issue with how I’m defining my model: https://github.com/pytorch/pytorch/issues/7322.
There’s a short, runnable code sample in the link, along with the version numbers of the relevant libraries.
“greater than or equal” has zero gradient almost everywhere, and is nondifferentiable at the remaining points. It’s not a bug.
Thank you for your answer! So is it not possible to have stable training with .ge when used as an intermediate layer?
If there is no path from input to output other than through comparison ops, then there will be no gradients.
It’s not really an issue of stability because, if you think about the function, greater than or equal is just flat almost everywhere.
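To make this concrete, here is a minimal sketch (not from the linked issue) showing that the comparison itself cuts the autograd graph: the boolean output of `torch.ge` has no gradient, so at best it acts as a constant mask on whatever it multiplies.

```python
import torch

x = torch.tensor([-1.0, 0.5, 2.0], requires_grad=True)
mask = torch.ge(x, 0.0)        # bool tensor; requires_grad is False,
                               # so the graph is cut at the comparison
y = (x * mask.float()).sum()   # the mask is treated as a constant
y.backward()
print(mask.requires_grad)      # False
print(x.grad)                  # tensor([0., 1., 1.]) -- the mask only gates
                               # the gradient; d(mask)/dx itself is zero
```

So gradients can still flow through *other* paths (here, the multiplication by `x`), but the comparison contributes nothing to them.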
How did you solve this problem? I’ve run into the same need. Thanks!
Like SimonW said, this isn’t really a bug, just how the gradient of a piecewise-constant operation behaves. I got around it by reformulating my function as something that isn’t piecewise. I explain it in a bit more detail here: https://scientificattempts.wordpress.com/2018/06/03/conditionally-thresholded-cnns-for-weakly-supervised-image-segmentation/