This post is an extension to
Second Order Derivative with Nan Value - RuntimeError: Function 'SigmoidBackwardBackward0' returned nan values in its 0th output.
In that post, I computed second-order derivatives in two steps:
`g1 = torch.autograd.grad(loss, params, create_graph=True)`
`g2 = torch.autograd.grad(g1, params, grad_outputs)`
where `params` denotes a list of parameters of a model, and hence `g1` would also be a list of per-parameter gradients. (`create_graph=True` is required in the first call so that `g1` remains differentiable for the second call.)
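For concreteness, here is a minimal runnable sketch of that two-step computation. The toy linear model, input shapes, and sigmoid loss are all stand-ins I made up for illustration; my actual model is much larger:

```python
import torch

# Hypothetical toy stand-in for my actual (much larger) model.
model = torch.nn.Linear(4, 1)
params = list(model.parameters())

x = torch.randn(8, 4)
loss = model(x).sigmoid().mean()

# First-order gradients; create_graph=True keeps g1 differentiable
# so the second torch.autograd.grad call is possible.
g1 = torch.autograd.grad(loss, params, create_graph=True)

# Second-order step: effectively a Hessian-vector product with the
# vector supplied via grad_outputs (here, the parameters themselves).
g2 = torch.autograd.grad(g1, params, grad_outputs=params)
```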
My new finding is that, with `torch.autograd.set_detect_anomaly(True)` enabled, the calculation of `g2` is error-free only when setting `grad_outputs = params`. Any other value of `grad_outputs` leads to the error shown below, even when `grad_outputs` is set to a value that is only a slight perturbation of `params`, meaning that the error still persists when I modify the model only a little bit.
`Function '*Backward0' returned nan values in its 1th output.`
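Below is a hedged, minimal sketch of the comparison I ran. The toy model, the data, and the perturbation scale `1e-3` are all assumptions for illustration; on a toy this small the second case may well run cleanly, and the NaN only shows up in my real model:

```python
import torch

torch.autograd.set_detect_anomaly(True)

# Hypothetical toy stand-in; in my real contrastive Text-Image model,
# only Case 1 below runs without the anomaly error.
model = torch.nn.Linear(4, 1)
params = list(model.parameters())
loss = model(torch.randn(8, 4)).sigmoid().mean()
g1 = torch.autograd.grad(loss, params, create_graph=True)

# Case 1: grad_outputs = params -- error-free in my setup.
# retain_graph=True keeps the graph alive for the second call below.
g2 = torch.autograd.grad(g1, params, grad_outputs=params, retain_graph=True)

# Case 2: a slight perturbation of params (1e-3 is an assumed scale).
# For my model this raises:
# "Function '*Backward0' returned nan values in its 1th output."
perturbed = [(p + 1e-3 * torch.randn_like(p)).detach() for p in params]
g2 = torch.autograd.grad(g1, params, grad_outputs=perturbed)
```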
- The situation in 1 seemed to affect only the model I am currently dealing with, a relatively large contrastive Text-Image model.