Assertion error in loss.backward()

my stack trace is

AssertionError Traceback (most recent call last)
in ()
105 loss = loss_fn(out,target1)
106 print “loss is :”,[0]
–> 107 loss.backward()
108 optimizer.update()

/home/vaijenath/Vaiju/lib/python2.7/site-packages/torch/autograd/variable.pyc in backward(self, gradient, retain_graph, create_graph, retain_variables)
154 Variable.
155 “”"
–> 156 torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
158 def register_hook(self, hook):

/home/vaijenath/Vaiju/lib/python2.7/site-packages/torch/autograd/init.pyc in backward(variables, grad_variables, retain_graph, create_graph, retain_variables)
97 Variable._execution_engine.run_backward(
—> 98 variables, grad_variables, retain_graph)

/home/vaijenath/Vaiju/lib/python2.7/site-packages/torch/autograd/function.pyc in _do_backward(self, gradients, retain_variables)
289 def _do_backward(self, gradients, retain_variables):
290 self.retain_variables = retain_variables
–> 291 result = super(NestedIOFunction, self)._do_backward(gradients, retain_variables)
292 if not retain_variables:
293 del self._nested_output

/home/vaijenath/Vaiju/lib/python2.7/site-packages/torch/autograd/function.pyc in backward(self, *gradients)
297 def backward(self, *gradients):
298 nested_gradients = _unflatten(gradients, self._nested_output)
–> 299 result = self.backward_extended(*nested_gradients)
300 return tuple(_iter_None_tensors(result))

/home/vaijenath/Vaiju/lib/python2.7/site-packages/torch/nn/_functions/rnn.pyc in backward_extended(self, grad_output, grad_hy)
332 output,
333 weight,
–> 334 grad_weight)
335 else:
336 grad_weight = [(None,) * len(layer_weight) for layer_weight in weight]

/home/vaijenath/Vaiju/lib/python2.7/site-packages/torch/backends/cudnn/rnn.pyc in backward_weight(fn, input, hx, output, weight, grad_weight)
467 # copy the weights from the weight_buf into grad_weight
–> 468 grad_params = get_parameters(fn, handle, dw)
469 _copyParams(grad_params, grad_weight)
470 return grad_weight

/home/vaijenath/Vaiju/lib/python2.7/site-packages/torch/backends/cudnn/rnn.pyc in get_parameters(fn, handle, weight_buf)
169 layer_params.append(param)
170 else:
–> 171 assert cur_offset == offset
173 cur_offset = offset + filter_dim_a[0]


What is cur_offset and offset here?

Have you figured it out? I also met the same issues, not sure how to debug the backward…

Could one of you provide a minimal example that triggers the assertion error?

Hi, like the images, just used a simple GRU as encoder model, and when debug, I tried the MSE loss, it triggered the error. Thanks!

just find if I added .float() in the input_sentence_var=Variable(…float()), then it would work.

Thanks. This also works for my situation, which is basically caused by the type problem.

Where do you add the .float()? it’s not clear to me.

Well, it depends on the specific situation. For me, I add .float() to the output of the embedding layer (which was a LongTensor variable). I suggest that you can check the output of each step to see where the type problem might occur.

Thanks. I am having the same error. I will see if this solution applies there.