Gradients of torch.where

This seems related to something I noted a while ago: "Bug or feature? NaNs influence other variables in backprop".
In that case there was a multiplication by zero inside PyTorch's backward code that computed 0 * NaN and produced NaN. My workaround has been to replace the NaNs with zeros.
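
To make it concrete, here is a minimal sketch of the kind of case I mean. `sqrt` is just a stand-in for whatever op blows up on the masked-out elements, and the hook-based fix mirrors my replace-NaNs-with-zeros workaround:

```python
import torch

x = torch.tensor([0.0, 1.0], requires_grad=True)

# Even though the sqrt branch is not selected at x == 0, its gradient is still
# computed: d(sqrt)/dx at 0 is inf, and the zero coming out of where's backward
# turns into 0 * inf = NaN instead of 0.
y = torch.where(x > 0, torch.sqrt(x), torch.zeros_like(x))
y.sum().backward()
print(x.grad)  # tensor([nan, 0.5000])

# Workaround: zero out the NaNs in the incoming gradient with a hook.
x2 = torch.tensor([0.0, 1.0], requires_grad=True)
x2.register_hook(lambda g: torch.nan_to_num(g, nan=0.0))
y2 = torch.where(x2 > 0, torch.sqrt(x2), torch.zeros_like(x2))
y2.sum().backward()
print(x2.grad)  # tensor([0.0000, 0.5000])
```

The other workaround I'm aware of is to mask the input before the dangerous op, e.g. `torch.sqrt(torch.where(x > 0, x, torch.ones_like(x)))`, so the non-selected branch never produces an inf/NaN gradient in the first place.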

Anyway, even if this is expected behavior, I don't feel the current PyTorch implementation is the most reasonable one.