Function 'MulBackward0' returned nan values in its 1th output


Hey guys, I have a customized loss function here. When I train the model, the following error occurs. To avoid such issues I always add torch.clamp() to bound the values, especially before divisions and logs, so I don't know why “MulBackward0” returned nan here. Any ideas?
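For reference, this is roughly the kind of clamping I mean (a simplified sketch with made-up tensors, not my actual loss function):

```python
import torch

eps = 1e-8  # small floor, chosen arbitrarily for this sketch

def toy_loss(pred, target):
    # Clamp the denominator so the division never sees an exact zero
    ratio = pred / target.clamp(min=eps)
    # Clamp the log argument so it stays strictly positive
    return torch.log(ratio.clamp(min=eps)).mean()
```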

The error is shown inside the red box I drew.

You can debug your code by using torch.autograd.set_detect_anomaly(True).

Given the error within MulBackward0, you're most likely dividing by zero when calculating the gradient, so using set_detect_anomaly will help you find it.
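For example, here is a toy reproduction (made-up tensors, not your actual loss) where the forward pass is finite but the gradient of the multiplication becomes nan because of a division by zero downstream:

```python
import torch

# With anomaly detection on, backward() stops at the first function whose
# gradient contains nan and prints the forward traceback of that operation.
torch.autograd.set_detect_anomaly(True)

x = torch.tensor([0.0], requires_grad=True)
z = torch.tensor([2.0], requires_grad=True)

y = x * z        # forward value is 0.0, perfectly finite
out = 1.0 / y    # division by zero -> inf, so d(out)/dy is -inf

# MulBackward0 computes grad_z = d(out)/dy * x = -inf * 0 = nan,
# so backward() raises an error like the one in the title.
out.backward()
```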

Finally, when sharing code, make sure to copy and paste the source rather than uploading a picture of it. You can wrap code in 3 backticks ``` to get it formatted correctly. For example,

```
import torch
x = torch.randn(5)
mean = torch.mean(x)
```

Many thanks! I am new to using the forums; next time I will use this format to make my code and questions clear, thanks.