The weights of the convolution kernel become NaN after training for several batches

BrianPulfer · May 2, 2022, 8:10am

As @ptrblck suggested, you could use torch.autograd.set_detect_anomaly(True) to see where the gradients become NaN and debug from there.
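Here is a minimal sketch of what that looks like. It is a toy example, not your model: the function name find_nan_source and the sqrt-at-zero computation are made up just to trigger a NaN in the backward pass. With anomaly detection on, backward() should raise a RuntimeError naming the backward function that returned NaN, and PyTorch also prints the traceback of the forward call that created it.

```python
import torch


def find_nan_source():
    """Toy example (hypothetical): trigger a NaN gradient and return the error text."""
    x = torch.tensor([0.0], requires_grad=True)

    # set_detect_anomaly can be used as a context manager or called as a
    # plain function at the top of the training script.
    with torch.autograd.set_detect_anomaly(True):
        # Forward is fine (0 * sqrt(0) = 0), but the backward of sqrt at 0
        # computes 0 / 0, which is NaN.
        y = torch.sqrt(x) * 0
        try:
            y.backward()
        except RuntimeError as err:
            # The message names the backward function that produced NaN.
            return str(err)
    return None


if __name__ == "__main__":
    print(find_nan_source())
```

In a real training loop you would enable it once before the loop and look at the forward traceback it prints to find the offending layer. It slows training down noticeably, so turn it off again once you have found the problem.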

Hope this helps.