Function 'SvdHelperBackward' returned nan values in its 0th output

I am using torch.svd() in my code, inside my own pinv() function, and I get this error when backward() is called. I have read about the problem and tried many things, but nothing worked. Please help! The stack trace follows.

/home/kavita/anaconda3/lib/python3.8/site-packages/torch/autograd/__init__.py:145: UserWarning: Error detected in SvdHelperBackward. Traceback of forward call that caused the error:
File "/home/kavita/anaconda3/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/kavita/anaconda3/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/kavita/anaconda3/lib/python3.8/site-packages/ipykernel_launcher.py", line 16, in
app.launch_new_instance()
File "/home/kavita/anaconda3/lib/python3.8/site-packages/traitlets/config/application.py", line 845, in launch_instance
app.start()
File "/home/kavita/anaconda3/lib/python3.8/site-packages/ipykernel/kernelapp.py", line 612, in start
self.io_loop.start()
File "/home/kavita/anaconda3/lib/python3.8/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/home/kavita/anaconda3/lib/python3.8/asyncio/base_events.py", line 570, in run_forever
self._run_once()
File "/home/kavita/anaconda3/lib/python3.8/asyncio/base_events.py", line 1859, in _run_once
handle._run()
File "/home/kavita/anaconda3/lib/python3.8/asyncio/events.py", line 81, in _run
self._context.run(self._callback, *self._args)
File "/home/kavita/anaconda3/lib/python3.8/site-packages/tornado/ioloop.py", line 690, in
lambda f: self._run_callback(functools.partial(callback, future))
File "/home/kavita/anaconda3/lib/python3.8/site-packages/tornado/ioloop.py", line 743, in _run_callback
ret = callback()
File "/home/kavita/anaconda3/lib/python3.8/site-packages/tornado/gen.py", line 787, in inner
self.run()
File "/home/kavita/anaconda3/lib/python3.8/site-packages/tornado/gen.py", line 748, in run
yielded = self.gen.send(value)
File "/home/kavita/anaconda3/lib/python3.8/site-packages/ipykernel/kernelbase.py", line 365, in process_one
yield gen.maybe_future(dispatch(*args))
File "/home/kavita/anaconda3/lib/python3.8/site-packages/tornado/gen.py", line 209, in wrapper
yielded = next(result)
File "/home/kavita/anaconda3/lib/python3.8/site-packages/ipykernel/kernelbase.py", line 268, in dispatch_shell
yield gen.maybe_future(handler(stream, idents, msg))
File "/home/kavita/anaconda3/lib/python3.8/site-packages/tornado/gen.py", line 209, in wrapper
yielded = next(result)
File "/home/kavita/anaconda3/lib/python3.8/site-packages/ipykernel/kernelbase.py", line 543, in execute_request
self.do_execute(
File "/home/kavita/anaconda3/lib/python3.8/site-packages/tornado/gen.py", line 209, in wrapper
yielded = next(result)
File "/home/kavita/anaconda3/lib/python3.8/site-packages/ipykernel/ipkernel.py", line 306, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/home/kavita/anaconda3/lib/python3.8/site-packages/ipykernel/zmqshell.py", line 536, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/home/kavita/anaconda3/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 2877, in run_cell
result = self._run_cell(
File "/home/kavita/anaconda3/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 2923, in _run_cell
return runner(coro)
File "/home/kavita/anaconda3/lib/python3.8/site-packages/IPython/core/async_helpers.py", line 68, in pseudo_sync_runner
coro.send(None)
File "/home/kavita/anaconda3/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3146, in run_cell_async
has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
File "/home/kavita/anaconda3/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3338, in run_ast_nodes
if (await self.run_code(code, result, async_=asy)):
File "/home/kavita/anaconda3/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3418, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 38, in
res2 = pinv2(torch.matmul(zt_U, zt_U_transp))
File "", line 18, in pinv2
u, s, vt = LA.svd(a, full_matrices=True)
(Triggered internally at /pytorch/torch/csrc/autograd/python_anomaly_mode.cpp:104.)
Variable._execution_engine.run_backward(

RuntimeError Traceback (most recent call last)
in
51 # print('$', torch.isnan(param.data).int().sum())
52 optimizer.zero_grad()
---> 53 total_loss.backward()
54 #print('$')
55

~/anaconda3/lib/python3.8/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph, inputs)
243 create_graph=create_graph,
244 inputs=inputs)
--> 245 torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
246
247 def register_hook(self, hook):

~/anaconda3/lib/python3.8/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
143 retain_graph = create_graph
144
--> 145 Variable._execution_engine.run_backward(
146 tensors, grad_tensors_, retain_graph, create_graph, inputs,
147 allow_unreachable=True, accumulate_grad=True) # allow_unreachable flag

RuntimeError: Function 'SvdHelperBackward' returned nan values in its 0th output.

The following is the array of singular values I get from svd():
[4.7464e-02, 2.9351e+03, 2.1174e+05, 2.7355e+05, 2.7720e+05, 3.3497e+05,
3.8153e+05, 4.6172e+05, 5.1771e+05, 6.3530e+05, 7.1788e+05, 8.1157e+05,
8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05,
8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05,
8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05,
8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05,
8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05,
8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05,
8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05,
8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05,
8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05,
8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05,
8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05,
8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05,
8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05,
8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05, 8.1157e+05,
8.1157e+05, 8.8670e+05, 1.3213e+06, 1.3892e+06]

Hi,

You can check the warning in the PyTorch documentation for the svd function (torch.linalg.svd).
In particular, it says that gradients are not well defined when there are repeated singular values, and your array contains many copies of 8.1157e+05.
I think you might want to add a small amount of noise to your matrix to avoid such repeated values.
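
To make the noise suggestion concrete, here is a minimal sketch of a pseudo-inverse built on torch.linalg.svd that perturbs the input before decomposing it. It is not the pinv2() from the question (that code was not posted); the helper name pinv_with_jitter and the values of noise_scale and rcond are assumptions for illustration only, and it assumes a real-valued matrix. Larger noise gives more stable gradients but a less exact pseudo-inverse, so the scale would need tuning for the actual data, and float64 leaves more room than float32.

```python
import torch


def pinv_with_jitter(a, noise_scale=1e-6, rcond=1e-10):
    """Pseudo-inverse via SVD with a small random perturbation.

    Hypothetical helper: noise_scale and rcond are illustrative
    assumptions, not recommended values. Assumes a real-valued matrix.
    """
    # Scale the noise relative to the largest entry so it stays small
    # whatever the magnitude of the matrix is. detach() keeps the scale
    # factor out of the autograd graph.
    scale = noise_scale * a.detach().abs().max()
    a_noisy = a + scale * torch.randn_like(a)

    # Reduced SVD: a_noisy = u @ diag(s) @ vh
    u, s, vh = torch.linalg.svd(a_noisy, full_matrices=False)

    # Invert only singular values above a relative cutoff. The inner
    # where() swaps discarded values for 1 so that 1/s never sees a zero,
    # which would otherwise put nan into the backward pass.
    keep = s > rcond * s.max()
    s_safe = torch.where(keep, s, torch.ones_like(s))
    s_inv = torch.where(keep, 1.0 / s_safe, torch.zeros_like(s))

    # pinv = V @ diag(1/s) @ U^T
    return vh.transpose(-2, -1) @ torch.diag_embed(s_inv) @ u.transpose(-2, -1)


# Example: 3 * I has one singular value (3) repeated four times.
torch.manual_seed(0)
a = (3.0 * torch.eye(4, dtype=torch.float64)).requires_grad_(True)
pinv_with_jitter(a).sum().backward()
print(torch.isfinite(a.grad).all())
```

If the custom function is not strictly needed, the built-in torch.linalg.pinv may also be worth trying, since it has its own backward implementation.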