Best way to debug NaNs in gradients?

We are observing NaNs in the gradients of a non-standard recurrent neural network implemented in PyTorch.

Is the best way to debug NaNs in gradients to register a backward hook?

I found this thread while googling: Register_backward_hook on nn.Sequential

It seems like this would let us inspect the values of the gradients.
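To make the idea concrete, here is a rough sketch of what I have in mind, with a placeholder nn.Sequential model standing in for our actual recurrent network: tensor hooks on each parameter flag NaNs in their gradients, and register_full_backward_hook (the non-deprecated replacement for register_backward_hook) flags NaNs in the grad_output of each submodule.

```python
import torch
import torch.nn as nn

# Placeholder model; our real network is a custom recurrent module,
# but the hook logic would be the same.
model = nn.Sequential(nn.Linear(16, 16), nn.Tanh(), nn.Linear(16, 1))

# Tensor hooks: fire during backward with the gradient of each parameter.
def make_param_hook(name):
    def hook(grad):
        if torch.isnan(grad).any():
            print(f"NaN gradient in parameter {name}")
    return hook

for name, param in model.named_parameters():
    param.register_hook(make_param_hook(name))

# Module hooks: inspect the gradients flowing out of each submodule.
def module_hook(module, grad_input, grad_output):
    for g in grad_output:
        if g is not None and torch.isnan(g).any():
            print(f"NaN in grad_output of {module.__class__.__name__}")

for m in model.modules():
    m.register_full_backward_hook(module_hook)

x = torch.randn(8, 16)
loss = model(x).pow(2).mean()
loss.backward()  # hooks print as soon as a NaN gradient appears
```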

Would it also be possible to get the graph for the gradients using something like `grad = torch.autograd.grad(loss, vars, create_graph=True)[0]`, and then print out all the values to see where the NaNs start occurring?
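For example, something along these lines (a sketch only; find_nan_grads is a hypothetical helper, and params would be list(model.parameters()) or whatever tensors we pass as vars):

```python
import torch

def find_nan_grads(loss, params):
    # torch.autograd.grad returns one gradient per input; the [0] in the
    # snippet above would keep only the gradient of the first variable.
    grads = torch.autograd.grad(loss, params, create_graph=True, allow_unused=True)
    for i, g in enumerate(grads):
        if g is None:
            continue  # parameter not used in computing the loss
        if torch.isnan(g).any():
            # With create_graph=True the gradients stay in the autograd graph,
            # so each one's grad_fn records the backward op that produced it
            # (when there is one).
            print(f"NaNs in gradient {i}, shape {tuple(g.shape)}, grad_fn={g.grad_fn}")
    return grads
```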

Thank you!
