The conv weight grad from loss.backward() contains NaNs, but only Infs when computed manually

The operation is the Conv2d in the stem stage of ResNet.
The test steps are posted here
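
For readers who want to try a similar comparison, here is a minimal sketch. It is not the original test. It assumes a standard ResNet stem conv (3 input channels, 64 output channels, 7x7 kernel, stride 2, padding 3) and uses random tensors in place of the real activations and upstream gradient. The helper `count_nonfinite` is a hypothetical name introduced only for this illustration. The manual gradient is computed with `torch.nn.grad.conv2d_weight`, which may differ from how the manual computation was actually done here.

```python
import torch
import torch.nn.functional as F
from torch.nn.grad import conv2d_weight


def count_nonfinite(t):
    """Return (number of NaNs, number of Infs) in a tensor. Hypothetical helper."""
    return int(torch.isnan(t).sum()), int(torch.isinf(t).sum())


torch.manual_seed(0)

# Assumed stem conv shapes; random data stands in for the real inputs.
x = torch.randn(2, 3, 32, 32)
w = torch.randn(64, 3, 7, 7, requires_grad=True)

out = F.conv2d(x, w, stride=2, padding=3)
grad_out = torch.randn_like(out)  # stand-in for the upstream gradient

# Weight gradient as produced by autograd.
out.backward(grad_out)

# Weight gradient computed directly from the input and the output gradient.
manual = conv2d_weight(x, w.shape, grad_out, stride=2, padding=3)

autograd_counts = count_nonfinite(w.grad)
manual_counts = count_nonfinite(manual)
print("autograd (nan, inf):", autograd_counts)
print("manual   (nan, inf):", manual_counts)
```

With finite inputs the two gradients should agree and contain no NaNs or Infs. Substituting the real input and upstream gradient captured from the failing step would show whether the two paths diverge on the same data.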