RuntimeError: Function 'SqrtBackward' returned nan values in its 0th output

Hi,
The error in the title is triggered by the following lines during training:

k = (y**2).sum(dim=2,keepdim=True)
r = k.sqrt()

So it appears the derivative of sqrt at 0 can be nan. How can I work around the problem in my case?
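A minimal sketch of how the nan gradient can arise, assuming `y` has at least one slice along `dim=2` that is entirely zero (the tensor values below are made up for illustration and are not from the original training run):

```python
import torch

# Hypothetical input of shape (1, 2, 2); the second slice along dim=2 is all zeros.
y = torch.tensor([[[3.0, 4.0], [0.0, 0.0]]], requires_grad=True)

k = (y ** 2).sum(dim=2, keepdim=True)  # k is 25 for the first slice, 0 for the second
r = k.sqrt()
r.sum().backward()

# d sqrt(k)/dk = 1 / (2 * sqrt(k)) is infinite at k = 0. The chain rule then
# multiplies it by dk/dy = 2 * y = 0, and inf * 0 evaluates to nan.
print(y.grad)
```

The non-zero slice gets the usual gradient `y / r`, while the all-zero slice ends up with nan entries that then propagate through the rest of the backward pass.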

Thanks!

OK, I’ll avoid using sqrt() and formulate the problem in terms of r^2 instead.

You can replace
torch.sqrt(x)
with
torch.sqrt(x + 1e-8)
to avoid this problem.
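Applied to the lines from the question, the epsilon approach might look like the sketch below. The name `EPS` and the example tensor are assumptions for illustration; the value 1e-8 is the one suggested in this reply and may need tuning for your dtype and data scale.

```python
import torch

EPS = 1e-8  # hypothetical constant name; value taken from the suggestion above

# Same made-up input as a zero-norm case: the second slice along dim=2 is all zeros.
y = torch.tensor([[[3.0, 4.0], [0.0, 0.0]]], requires_grad=True)

k = (y ** 2).sum(dim=2, keepdim=True)
r = torch.sqrt(k + EPS)  # the argument is now strictly positive, so the derivative is finite
r.sum().backward()

# The derivative of sqrt stays finite, so multiplying by 2 * y = 0 gives 0, not nan.
print(torch.isfinite(y.grad).all())
```

The trade-off is a small bias: where `k` is exactly 0, `r` becomes sqrt(1e-8) = 1e-4 instead of 0, which is usually negligible but worth keeping in mind if the exact zero matters downstream.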

I ran into the same problem. However, I don’t recommend replacing the sqrt with the squared form, since it can make the values computed in your model grow larger and larger. I suspect this could cause problems if you stack several layers of this kind.

Thanks for the suggestion!

Thanks, that’s it! :rofl: :ghost: