RuntimeError: Function 'DivBackward0' returned nan values in its 1th output

Dear all,
I know this problem has been asked a million times already, but right now I'm facing it myself and really need your help.
I have a snippet as follows:

import torch

torch.autograd.set_detect_anomaly(True)
temp = torch.load('temp.pt')
clean = temp['clean'].clone().detach().requires_grad_(True)
noise = temp['noise'].clone().detach().requires_grad_(True)

sm_mask = torch.exp(clean) / (torch.exp(clean) + torch.exp(noise) + 1e-12)
sq_mask = torch.sqrt(sm_mask)

sq_mask.mean().backward()  # <----- error here
print('')

For the clean matrix, its max value is 0.004 and its min is 0.
For the noise matrix, its max value is 14 and its min is 0.09.

That means there is neither nan nor inf in either of them, and sm_mask will have no negative entries either, since the exponentials are strictly positive.
Can anyone please give me an idea why torch.sqrt(sm_mask) causes the error?
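
(For what it's worth, this is roughly the kind of check I mean — just a sketch, reusing the tensor names from the snippet above:)

print(torch.isnan(clean).any().item(), torch.isinf(clean).any().item())  # both False
print(torch.isnan(noise).any().item(), torch.isinf(noise).any().item())  # both False
print(clean.min().item(), clean.max().item())  # roughly 0 and 0.004
print(noise.min().item(), noise.max().item())  # roughly 0.09 and 14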

PS: If I run

sq_mask = torch.sqrt(sm_mask + 1e-12)  # instead of torch.sqrt(sm_mask)

Then there is no error. Why is that?

PS2: I’m sorry, here is the stacktrace:

[W ..\torch\csrc\autograd\python_anomaly_mode.cpp:104] Warning: Error detected in DivBackward0. Traceback of forward call that caused the error:
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.2\plugins\python-ce\helpers\pydev\pydevd.py", line 2173, in <module>
    main()
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.2\plugins\python-ce\helpers\pydev\pydevd.py", line 2164, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.2\plugins\python-ce\helpers\pydev\pydevd.py", line 1476, in run
    return self._exec(is_module, entry_point_fn, module_name, file, globals, locals)
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.2\plugins\python-ce\helpers\pydev\pydevd.py", line 1483, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.2\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "D:/Master/JAIST/Proposal/Code/Master3/Test.py", line 8, in <module>
    sm_mask = torch.exp(clean) / (torch.exp(clean) + torch.exp(noise) + 1e-12)
 (function _print_stack)
Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.2\plugins\python-ce\helpers\pydev\pydevd.py", line 1483, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.2\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "D:/Master/JAIST/Proposal/Code/Master3/Test.py", line 13, in <module>
    sq_mask.mean().backward()
  File "C:\Users\duyvo\anaconda3\envs\Code\lib\site-packages\torch\_tensor.py", line 307, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "C:\Users\duyvo\anaconda3\envs\Code\lib\site-packages\torch\autograd\__init__.py", line 154, in backward
    Variable._execution_engine.run_backward(
RuntimeError: Function 'DivBackward0' returned nan values in its 1th output.
python-BaseException

Process finished with exit code 1

Can you check that your loss values are ok before calling .backward()? Also, the derivative of sqrt(x) is 0.5 * x**(-0.5), so check that your input is strictly positive too.
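
Something along these lines (just a sketch, reusing your variable names) should tell you whether the input to sqrt ever touches zero:

print(torch.isfinite(sm_mask).all().item())  # True if there is no nan/inf
print(sm_mask.min().item())                  # if this is 0, the gradient 0.5 * x**(-0.5) blows up there
print((sm_mask == 0).sum().item())           # count of exact zeros feeding into sqrt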


Dear @AlphaBetaGamma96 ,
Thank you very much for the reply.

Yes, the input x to sqrt(x) is ok, but its minimum value is exactly zero.
I thought that was fine since sqrt(0) = 0, but now that you mention the derivative of sqrt(x), I think the zeros in x are the issue here. Am I correct?

Most likely: 0^(-0.5) is undefined, so the input should be strictly positive, which is most likely why it works when you include a small offset value.
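
You can see the difference on a toy tensor (minimal sketch, independent of your data):

import torch

x = torch.zeros(1, requires_grad=True)
torch.sqrt(x).sum().backward()
print(x.grad)  # tensor([inf]) -- 1/(2*sqrt(0)) is not finite

y = torch.zeros(1, requires_grad=True)
torch.sqrt(y + 1e-12).sum().backward()
print(y.grad)  # large but finite, roughly 1/(2*sqrt(1e-12)) = 5e5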


Yes, I think so too. But what bothers me is the stack trace: why is the function ‘DivBackward0’ causing the error and not ‘SqrtBackward0’?
And also, when I run this snippet

te = torch.zeros(4, requires_grad=True)
sq_mask = torch.sqrt(te)

sq_mask.mean().backward()

Then it runs without any error. Why is that?
By adding the small offset the model trains fine, but I'm still bothered, because this fix doesn't seem to match the error at all.

Because the sqrt of zero is 0 in the forward pass, but its gradient divides by that zero, and dividing by 0 is undefined; that's why the stack trace complains about a division.
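
As for your smaller snippet: the gradient there comes out as inf rather than nan, and as far as I know anomaly detection only flags nan values, which would explain why that one passes. A quick sketch:

import torch

torch.autograd.set_detect_anomaly(True)

te = torch.zeros(4, requires_grad=True)
torch.sqrt(te).mean().backward()  # no error is raised here
print(te.grad)  # tensor([inf, inf, inf, inf]) -- inf, not nan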


Yeah, I think that clears up my confusion for now.
The actual DivBackward culprit was the 1/(2*sqrt(x)) from the sqrt gradient. I thought

sm_mask = torch.exp(clean) / (torch.exp(clean) + torch.exp(noise) + 1e-12)

this division was causing it.
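
In case it helps anyone else hitting this, something like the following (untested sketch, tensor names as in my first snippet, hooks registered before the backward call) can show where the gradient first goes bad:

# print gradient ranges for the intermediate tensors during backward
sm_mask.register_hook(lambda g: print('grad wrt sm_mask:', g.min().item(), g.max().item()))
sq_mask.register_hook(lambda g: print('grad wrt sq_mask:', g.min().item(), g.max().item()))

sq_mask.mean().backward()  # with zeros in sm_mask, the sm_mask hook reports inf before DivBackward0 fails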

Anyway, thank you very much.
