I am currently testing the following loss function (see link) with gradcheck.
Gradcheck fails with the 'backward not multiplied by grad_output' error.
What does this mean?
Let’s say you have the following (pseudo)code:
probs = Variable(probs, requires_grad=True) # tells autograd to compute gradients for probs
cost = ctc_loss(probs, labels, probs_sizes, label_sizes)
cost_3 = cost * 3
cost_3.backward()
and
probs2 = Variable(probs2, requires_grad=True) # tells autograd to compute gradients for probs2
cost = ctc_loss(probs2, labels, probs_sizes, label_sizes)
cost.backward()
You’d expect probs.grad to be 3 times probs2.grad, because of the final multiply by 3 in cost_3.
That’s what the 'backward not multiplied by grad_output' check verifies: when the output of the function (ctc_loss in this case) is then operated on by some other operator, backpropagation must still scale correctly. In other words, backward must multiply the gradient it computed by the incoming grad_output, rather than ignoring it.
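To make this concrete, here is a minimal self-contained sketch (the Square function and all names are my own, not part of warp-ctc): a custom autograd Function whose backward correctly multiplies its local gradient by grad_output. Because of that, multiplying the output by 3 scales the input gradient by 3, exactly as described above. A buggy backward that returned 2 * x alone would fail gradcheck with the same error.

```python
import torch

# Hypothetical example: a custom function computing sum(x**2),
# with a backward that correctly scales by grad_output.
class Square(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return (x ** 2).sum()

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Correct: local gradient (2*x) multiplied by grad_output.
        # A broken backward would return just 2 * x here.
        return 2 * x * grad_output

x1 = torch.tensor([1.0, 2.0], requires_grad=True)
(Square.apply(x1) * 3).backward()   # final multiply by 3

x2 = torch.tensor([1.0, 2.0], requires_grad=True)
Square.apply(x2).backward()

print(x1.grad)  # 3 times x2.grad, because of the * 3
print(x2.grad)
```

If backward ignored grad_output, x1.grad and x2.grad would come out identical, which is the inconsistency gradcheck detects.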
Thanks for the reply! Does this mean that the loss function is broken?
Yes.
You can fix it pretty easily, though:
https://github.com/SeanNaren/warp-ctc/blob/8719493fbd0dd9b1195d531184d8e8b1d424abe9/pytorch_binding/warpctc_pytorch/init.py#L42
change that to
return self.grads * grad_output, None, None, None
and it’ll probably be fine.
Thanks again! I will try this tomorrow morning.
Hi, it worked, thanks! Who should submit the patch? I think you have done most of the work.
I celebrated too early; the line should be
return self.grads * grad_output.type_as(self.grads), None, None, None
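For context, the shape of the fix looks like the following. This is a hypothetical, simplified stand-in for the warp-ctc binding (the placeholder "gradient" here is not real CTC math, and the names are assumptions); it only illustrates the pattern of storing precomputed gradients and returning them scaled by grad_output, cast with type_as.

```python
import torch

# Hypothetical sketch of a loss whose gradients are precomputed in
# forward (as warp-ctc's C library does) and replayed in backward.
class FakeCTCLoss(torch.autograd.Function):
    @staticmethod
    def forward(ctx, probs):
        # Placeholder for the library's gradient buffer (self.grads in
        # the old-style binding); not actual CTC gradients.
        grads = torch.sign(probs)
        ctx.save_for_backward(grads)
        return probs.abs().sum()

    @staticmethod
    def backward(ctx, grad_output):
        (grads,) = ctx.saved_tensors
        # The fix from this thread: multiply the stored gradients by
        # grad_output, cast to the dtype of the stored gradients.
        return grads * grad_output.type_as(grads)
```

The type_as call matters because grad_output may arrive in a different dtype (e.g. float64 during gradcheck) than the gradients the library computed.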
Go ahead and submit a patch yourself; I have no idea what warp-ctc is and am not involved with it.