Runtime Shape errors in the backward pass

Raahul-Singh · November 19, 2020, 5:37pm

Hi folks!
While running an experiment, I keep getting this error in the backward pass.
The forward pass works properly. Could anyone help me understand what this is about?

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-83-0b1a3ff3f774> in <module>
      2     out = net(batch[0])
      3     loss = nn.MSELoss()(out, batch[1])
----> 4     loss.backward(retain_graph=False)

~/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
    183                 products. Defaults to ``False``.
    184         """
--> 185         torch.autograd.backward(self, gradient, retain_graph, create_graph)
    186 
    187     def register_hook(self, hook):

~/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
    125     Variable._execution_engine.run_backward(
    126         tensors, grad_tensors, retain_graph, create_graph,
--> 127         allow_unreachable=True)  # allow_unreachable flag
    128 
    129 

RuntimeError: shape '[2, 1, 4]' is invalid for input of size 4
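
For reference, a message of this form is what a view/reshape raises when the requested shape needs a different number of elements than the tensor holds: [2, 1, 4] needs 2 * 1 * 4 = 8 elements, but the tensor has only 4. Here is a minimal sketch that reproduces the same kind of message outside of any model. The helper `try_view` and the tensors are made up for illustration and are not taken from the network in this thread:

```python
import torch


def try_view(numel, shape):
    """Try to view a flat tensor of `numel` elements as `shape`.

    Returns the error message if the view fails, or None if it succeeds.
    (Hypothetical helper, only for illustration.)
    """
    flat = torch.zeros(numel)
    try:
        flat.view(*shape)
    except RuntimeError as err:
        return str(err)
    return None


# 4 elements cannot fill a [2, 1, 4] tensor, which needs 8.
mismatch_message = try_view(4, (2, 1, 4))
print(mismatch_message)

# 8 elements fit, so no error is raised.
ok_message = try_view(8, (2, 1, 4))
print(ok_message)
```

In a backward pass, the same check can fail when a gradient is reshaped to a shape recorded earlier, so comparing the element counts on both sides is usually a good first step.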

albanD · November 19, 2020, 5:48pm

Hi,

You can enable anomaly mode (see the docs for torch.autograd.detect_anomaly) to find out which forward function caused the issue.
Could you post that information here? That will help us pinpoint the issue.
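
A minimal sketch of what anomaly mode does, assuming a toy computation rather than the model from this thread: the forward pass below is finite, but the gradient of sqrt at 0 becomes 0 / 0 = nan, and anomaly mode turns that into an error during backward. The function name `backward_with_anomaly_check` is hypothetical.

```python
import torch


def backward_with_anomaly_check():
    """Run a tiny forward/backward under anomaly mode.

    Returns the error message raised by anomaly mode, or None if
    backward finished without an error.
    """
    x = torch.zeros(1, requires_grad=True)
    try:
        with torch.autograd.detect_anomaly():
            # sqrt(0) * 0 is finite in the forward pass, but the
            # gradient of sqrt at 0 evaluates to 0 / 0 = nan.
            y = (torch.sqrt(x) * 0).sum()
            y.backward()
    except RuntimeError as err:
        return str(err)
    return None


message = backward_with_anomaly_check()
print(message)
```

Alongside the error, anomaly mode also prints a warning with the traceback of the forward call that created the failing backward function, which is the part that helps locate the problem.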

Raahul-Singh · November 19, 2020, 6:09pm

Hi! Thanks for replying!

So I ran it inside a detect_anomaly block:

from torch import autograd, nn

# net and batch are defined earlier in the notebook.
with autograd.detect_anomaly():
    out = net(batch[0])
    loss = nn.MSELoss()(out, batch[1])
    loss.backward(retain_graph=False)

The error is still:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-28-f52d85a34f90> in <module>
      2     out=net(batch[0])
      3     loss = nn.MSELoss()(out, batch[1])
----> 4     loss.backward(retain_graph=False)

~/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
    183                 products. Defaults to ``False``.
    184         """
--> 185         torch.autograd.backward(self, gradient, retain_graph, create_graph)
    186 
    187     def register_hook(self, hook):

~/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
    125     Variable._execution_engine.run_backward(
    126         tensors, grad_tensors, retain_graph, create_graph,
--> 127         allow_unreachable=True)  # allow_unreachable flag
    128 
    129 

RuntimeError: shape '[2, 10, 4]' is invalid for input of size 40

Raahul-Singh · November 19, 2020, 6:12pm

I did not know about the anomaly detection block. It would also be really useful for another problem I was facing, where NaNs sometimes popped up. Thanks @albanD!

albanD · November 19, 2020, 6:18pm

But you should now have a warning above that error that tells you where it comes from.
If you’re using a non-nightly version of PyTorch, you might want to run it from the command line (not in a notebook) to make sure the notebook does not swallow the warning.

Raahul-Singh · November 19, 2020, 8:14pm

Hi, this is the error I got:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-3-06ac8d953a44> in <module>
     15         out=net(batch[0])
     16         loss = nn.MSELoss()(out, batch[1])
---> 17         loss.backward(retain_graph=False)
     18     print(i)
     19

~/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
    183                 products. Defaults to ``False``.
    184         """
--> 185         torch.autograd.backward(self, gradient, retain_graph, create_graph)
    186
    187     def register_hook(self, hook):

~/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
    125     Variable._execution_engine.run_backward(
    126         tensors, grad_tensors, retain_graph, create_graph,
--> 127         allow_unreachable=True)  # allow_unreachable flag
    128
    129

RuntimeError: Function 'AddmmBackward' returned nan values in its 2th output.

Raahul-Singh · November 19, 2020, 8:57pm

I think this will add more context. I am working with Neural ODEs using the torchdyn library, which is based on torchdiffeq. torchdiffeq recently added support for using SciPy ODE solvers.
Here is the complete error log:

/Users/rasalghul/Desktop/OSS/torchdyn/torchdyn/models/neuralde.py:100: UserWarning: CUDA is not available with SciPy solvers.
warnings.warn(UserWarning("CUDA is not available with SciPy solvers."))
-c:14: UserWarning: Anomaly Detection has been enabled. This mode will increase the runtime and should only be enabled for debugging.
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.5194917220621D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.5194917220621D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.1298729305155D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.1298729305155D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.3246823262888D-17
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.8117058157221D-18
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.8117058157221D-18
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.8117058157221D-18
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.1062687637051D-17
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.2656719092627D-18
lsoda-- above warning has been issued i1 times.
it will not be issued again for this problem
in above message, i1 = 10
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.5194917220621D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.5194917220621D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.1298729305155D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.1298729305155D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.3246823262888D-17
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.8117058157221D-18
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.8117058157221D-18
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.8117058157221D-18
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.1062687637051D-17
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.2656719092627D-18
lsoda-- above warning has been issued i1 times.
it will not be issued again for this problem
in above message, i1 = 10
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.5194917220621D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.5194917220621D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.1298729305155D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.1298729305155D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.3246823262888D-17
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.8117058157221D-18
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.8117058157221D-18
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.8117058157221D-18
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.1062687637051D-17
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.2656719092627D-18
lsoda-- above warning has been issued i1 times.
it will not be issued again for this problem
in above message, i1 = 10
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.5194917220621D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.5194917220621D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.1298729305155D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.1298729305155D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.3246823262888D-17
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.8117058157221D-18
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.8117058157221D-18
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.8117058157221D-18
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.1062687637051D-17
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.2656719092627D-18
lsoda-- above warning has been issued i1 times.
it will not be issued again for this problem
in above message, i1 = 10
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.5194917220621D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.5194917220621D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.1298729305155D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.1298729305155D-16
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.3246823262888D-17
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.8117058157221D-18
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.8117058157221D-18
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.8117058157221D-18
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.1062687637051D-17
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
in above, r1 = 0.8615037536311D+00 r2 = 0.2656719092627D-18
lsoda-- above warning has been issued i1 times.
it will not be issued again for this problem
in above message, i1 = 10
lsoda-- at t (=r1) and step size h (=r2), the
corrector convergence failed repeatedly
or with abs(h) = hmin
in above, r1 = 0.8881958337303D+00 r2 = 0.7908523116782D-36
/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/scipy/integrate/_ode.py:1351: UserWarning: lsoda: Repeated convergence failures (perhaps bad Jacobian or tolerances).
self.messages.get(istate, unexpected_istate_msg)))
capi_return is NULL
Call-back cb_f_in_lsoda__user__routines failed.
[W python_anomaly_mode.cpp:60] Warning: Error detected in AddmmBackward. Traceback of forward call that caused the error:
File "<string>", line 1, in <module>
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/traitlets/config/application.py", line 664, in launch_instance
app.start()
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/IPython/terminal/ipapp.py", line 356, in start
self.shell.mainloop()
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/IPython/terminal/interactiveshell.py", line 564, in mainloop
self.interact()
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/IPython/terminal/interactiveshell.py", line 555, in interact
self.run_cell(code, store_history=True)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 2877, in run_cell
raw_cell, store_history, silent, shell_futures)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 2922, in _run_cell
return runner(coro)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/IPython/core/async_helpers.py", line 68, in pseudo_sync_runner
coro.send(None)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3146, in run_cell_async
interactivity=interactivity, compiler=compiler, result=result)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3337, in run_ast_nodes
if (await self.run_code(code, result, async=asy)):
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3417, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-3-06ac8d953a44>", line 17, in <module>
loss.backward(retain_graph=False)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/tensor.py", line 185, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/autograd/__init__.py", line 127, in backward
allow_unreachable=True) # allow_unreachable flag
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/autograd/function.py", line 78, in apply
return self._forward_cls.backward(self, *args)
File "/Users/rasalghul/Desktop/OSS/torchdyn/torchdyn/sensitivity/adjoint.py", line 208, in backward
rtol=rtol, atol=atol, method=method, options=options)
File "/Users/rasalghul/Desktop/OSS/torchdiffeq/torchdiffeq/_impl/odeint.py", line 67, in odeint
solution = solver.integrate(t)
File "/Users/rasalghul/Desktop/OSS/torchdiffeq/torchdiffeq/_impl/scipy_wrapper.py", line 36, in integrate
atol=self.atol,
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/scipy/integrate/_ivp/ivp.py", line 576, in solve_ivp
message = solver.step()
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/scipy/integrate/_ivp/base.py", line 181, in step
success, message = self._step_impl()
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/scipy/integrate/_ivp/lsoda.py", line 150, in _step_impl
self.t_bound, solver.f_params, solver.jac_params)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/scipy/integrate/_ode.py", line 1346, in run
y1, t, istate = self.runner(*args)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/scipy/integrate/_ivp/base.py", line 138, in fun
return self.fun_single(t, y)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/scipy/integrate/_ivp/base.py", line 20, in fun_wrapped
return np.asarray(fun(t, y), dtype=dtype)
File "/Users/rasalghul/Desktop/OSS/torchdiffeq/torchdiffeq/_impl/scipy_wrapper.py", line 49, in np_func
f = func(t, y)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/Users/rasalghul/Desktop/OSS/torchdiffeq/torchdiffeq/_impl/misc.py", line 163, in forward
return -self.base_func(-t, y)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/Users/rasalghul/Desktop/OSS/torchdiffeq/torchdiffeq/_impl/misc.py", line 153, in forward
f = self.base_func(t, _flat_to_shape(y, (), self.shapes))
File "/Users/rasalghul/Desktop/OSS/torchdyn/torchdyn/sensitivity/adjoint.py", line 167, in solve_backward
dzds = self.f(s, z)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/Users/rasalghul/Desktop/OSS/torchdyn/torchdyn/models/defunc.py", line 51, in forward
else: x = self.m(x)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "<ipython-input-1-16a4f604f5d6>", line 743, in forward
self.kernel_x = self.kernel(x)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward
input = module(input)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 91, in forward
return F.linear(input, self.weight, self.bias)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/nn/functional.py", line 1674, in linear
ret = torch.addmm(bias, input, weight.t())
(function print_stack)
[W python_anomaly_mode.cpp:60] Warning: Error detected in autograd_adjointBackward. Traceback of forward call that caused the error:
File "<string>", line 1, in <module>
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/traitlets/config/application.py", line 664, in launch_instance
app.start()
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/IPython/terminal/ipapp.py", line 356, in start
self.shell.mainloop()
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/IPython/terminal/interactiveshell.py", line 564, in mainloop
self.interact()
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/IPython/terminal/interactiveshell.py", line 555, in interact
self.run_cell(code, store_history=True)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 2877, in run_cell
raw_cell, store_history, silent, shell_futures)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 2922, in _run_cell
return runner(coro)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/IPython/core/async_helpers.py", line 68, in pseudo_sync_runner
coro.send(None)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3146, in run_cell_async
interactivity=interactivity, compiler=compiler, result=result)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3337, in run_ast_nodes
if (await self.run_code(code, result, async=asy)):
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3417, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-3-06ac8d953a44>", line 15, in <module>
out=net(batch[0])
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "<ipython-input-1-16a4f604f5d6>", line 911, in forward
activity_cur = self.activity_NODE(in_NODE)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "<ipython-input-1-16a4f604f5d6>", line 799, in forward
y = self.neural_de(x)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/Users/rasalghul/Desktop/OSS/torchdyn/torchdyn/models/neuralde.py", line 139, in forward
out = odeint(x)
File "/Users/rasalghul/Desktop/OSS/torchdyn/torchdyn/models/neuralde.py", line 165, in _adjoint
return self.adjoint(self.defunc, x, self.s_span, rtol=self.rtol, atol=self.atol, **self.solver)
File "/Users/rasalghul/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/Users/rasalghul/Desktop/OSS/torchdyn/torchdyn/sensitivity/adjoint.py", line 223, in forward
sol = self._wrapped_adjoint_func.apply(h0, flat_params, s_span)
(function print_stack)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-3-06ac8d953a44> in <module>
     15         out=net(batch[0])
     16         loss = nn.MSELoss()(out, batch[1])
---> 17         loss.backward(retain_graph=False)
     18     print(i)
     19

~/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
    183                 products. Defaults to ``False``.
    184         """
--> 185         torch.autograd.backward(self, gradient, retain_graph, create_graph)
    186
    187     def register_hook(self, hook):

~/anaconda3/envs/Juniper/lib/python3.7/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
    125     Variable._execution_engine.run_backward(
    126         tensors, grad_tensors, retain_graph, create_graph,
--> 127         allow_unreachable=True)  # allow_unreachable flag
    128
    129

RuntimeError: Function 'AddmmBackward' returned nan values in its 2th output.
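
A possible next step, sketched under assumptions: since the anomaly traceback points at a Linear layer, it may help to check whether the values flowing through the forward pass are already non-finite before that layer is reached. The sketch below attaches forward hooks that record which submodules produce NaN or Inf outputs. The helper `add_nonfinite_hooks` and the toy model are hypothetical and stand in for the real network. Hooks registered this way will not necessarily cover the function evaluations the adjoint solver makes during backward, so this is only a first check.

```python
import torch
from torch import nn


def add_nonfinite_hooks(model):
    """Attach forward hooks that record modules with non-finite outputs.

    Returns (offenders, handles): `offenders` is filled with module names
    as the model runs; call handle.remove() on each handle to detach.
    """
    offenders = []

    def make_hook(name):
        def hook(module, inputs, output):
            # Only plain tensor outputs are checked in this sketch.
            if isinstance(output, torch.Tensor) and not torch.isfinite(output).all():
                offenders.append(name)
        return hook

    handles = [
        module.register_forward_hook(make_hook(name))
        for name, module in model.named_modules()
    ]
    return offenders, handles


# Toy stand-in for the real network.
toy = nn.Sequential(nn.Linear(2, 2), nn.Tanh())
offenders, handles = add_nonfinite_hooks(toy)

# A NaN input propagates through every layer, so each one is recorded.
toy(torch.tensor([[float("nan"), 1.0]]))
print(offenders)

for handle in handles:
    handle.remove()
```

The first name in the list is the earliest module whose output went non-finite, which is usually the most useful place to start looking.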