Gradient after fill_

I wrote the following code:

k = topk_probs.shape[1]
positions2 = positions.clone().detach()
positions2.data.fill_(k-2)
positions2.data[target_na.data == 1] = k-1
positions2 = positions2.view(-1, 1)
target = topk_probs.gather(1, positions2)
hinge_loss = hinge_crit(target, topk_probs, 0.05)

positions2, of shape (batch_size, 1), holds the indices from which I want to gather values. I fill it with k-2 initially, but wherever target_na == 1 I set the index to k-1. Then I gather from topk_probs, which has shape (batch_size, k).

hinge_crit is a hinge loss I implemented myself.

This gives me the following error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

I am wondering how I can avoid this error.

I tried adding

positions2 = positions2.view(-1, 1).detach()

but it still gives me the same error. I thought detach() would detach the variable from the previous computation graph, so no gradient would flow through it any more. Did I misunderstand it?
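For what it's worth, this error is usually not about the indices at all: it is raised when a tensor that autograd saved for the backward pass is later modified in place. A hypothetical minimal example (not your exact code, just the same failure mode):

import torch

x = torch.ones(3, requires_grad=True)
y = torch.sigmoid(x)   # sigmoid saves its output y for the backward pass
y.mul_(2)              # in-place op bumps y's version counter
try:
    y.sum().backward()
except RuntimeError as e:
    print("caught:", e)  # "one of the variables needed for gradient computation ..."

Detaching positions2 cannot fix this, because the saved tensor that was overwritten is elsewhere in the graph (e.g. something touched via .data or an in-place fill_/masked assignment).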

If there are just two options selected by a boolean criterion, using torch.where to create positions2 might be a good way.
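A sketch of that approach, with made-up shapes (batch_size=4, k=5) in place of your real data — torch.where builds the index tensor out-of-place, so nothing in the graph is modified in place:

import torch

batch_size, k = 4, 5
topk_probs = torch.rand(batch_size, k, requires_grad=True)
target_na = torch.tensor([0, 1, 0, 1])  # hypothetical stand-in for your mask

# Pick index k-1 where target_na == 1, else k-2, without fill_ or
# masked in-place assignment.
positions2 = torch.where(
    target_na == 1,
    torch.full_like(target_na, k - 1),
    torch.full_like(target_na, k - 2),
).view(-1, 1)

target = topk_probs.gather(1, positions2)
target.sum().backward()  # backward runs without the in-place-operation error

Since positions2 is built from target_na (which carries no gradient), there is also no need to detach anything here.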

Best regards

Thomas