I have a custom torch.autograd.Function where I print grad_in directly before returning it. grad_in has non-zero values, but when I look at convLayer.weight.grad it is sometimes all zeros. I am also comparing this against old PyTorch 0.3 code that I am porting to PyTorch 1.0. As far as I can tell, the inputs, outputs, weights, and all intermediate values throughout the network are exactly the same in both versions. However, with the same input and the same grad_in, PyTorch 0.3 assigns a value to convLayer.weight.grad while PyTorch 1.0 does not.
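To make the setup concrete, here is a minimal sketch of the kind of check described above, written against the PyTorch 1.x API. It is not my actual network: `PassThrough`, `conv_layer`, and the tensor shapes are hypothetical stand-ins, and the custom function is just an identity op that prints the gradient it returns. A hook on the weight records the gradient autograd computes so it can be compared with `weight.grad` afterwards.

```python
import torch
import torch.nn as nn


class PassThrough(torch.autograd.Function):
    """Hypothetical identity op that prints the gradient it returns."""

    @staticmethod
    def forward(ctx, x):
        # Return a copy so the output is a distinct tensor from the input.
        return x.clone()

    @staticmethod
    def backward(ctx, grad_out):
        grad_in = grad_out.clone()
        print("grad_in abs sum:", grad_in.abs().sum().item())
        return grad_in


torch.manual_seed(0)
conv_layer = nn.Conv2d(1, 2, kernel_size=3)  # stand-in for convLayer
x = torch.randn(1, 1, 5, 5)                  # assumed input shape

# Record the gradient autograd computes for the weight during backward.
captured = []
conv_layer.weight.register_hook(lambda g: captured.append(g.clone()))

out = PassThrough.apply(conv_layer(x))
out.sum().backward()

weight_grad = conv_layer.weight.grad
print("weight.grad abs sum:", weight_grad.abs().sum().item())
```

In this toy case the printed grad_in and `conv_layer.weight.grad` are both non-zero, and the hooked gradient matches `weight.grad`. Running the same comparison on the real layer should show whether the zeros appear in the gradient autograd computes or only in what ends up stored in `.grad`.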