In-place operation in CopyNet with coverage mechanism

Hello, I am implementing CopyNet with a coverage mechanism. Here is the relevant part of my decoder code:

# (batch_size * input_len)
batch_indices = batch_indices.expand(input_len, batch_size).transpose(1,0).contiguous().view(-1)
batch_indices = self.to_cuda(batch_indices)
# (batch_size * input_len)
idx_repeat = torch.arange(start=0, end=input_len).repeat(batch_size).long()
idx_repeat = self.to_cuda(idx_repeat)
# (batch_size * input_len)
word_indices = input_extend_vocab.data.contiguous().view(-1)

p_copy = torch.zeros(batch_size, self.trg_vocab_size + max_oov_len)
p_copy = self.to_cuda(Variable(p_copy))
p_copy[batch_indices, word_indices] += attn[batch_indices, idx_repeat]
coverage[batch_indices, idx_repeat] += attn[batch_indices, idx_repeat]
p_copy = torch.mul(p_copy, (1-p_gen))

But when I call loss.backward(), I get this error: “RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation”. How can I avoid it?

These two in-place updates are what trigger the error:

p_copy[batch_indices, word_indices] += attn[batch_indices, idx_repeat]
coverage[batch_indices, idx_repeat] += attn[batch_indices, idx_repeat]

Try something like:

new_coverage = Variable(torch.zeros(coverage.shape))
# or start from a detached copy of the old values:
# new_coverage = Variable(coverage.data.clone())

new_coverage[batch_indices, idx_repeat] = attn[batch_indices, idx_repeat] + coverage[batch_indices, idx_repeat]
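For reference, here is a minimal, self-contained sketch of the whole step written fully out of place. It is only an illustration, not your actual model: it assumes attn and coverage are both (batch_size, input_len), that input_extend_vocab holds extended-vocabulary word ids, and that your PyTorch version has the out-of-place Tensor.scatter_add; ext_vocab_size stands in for trg_vocab_size + max_oov_len.

import torch
from torch.autograd import Variable

# toy sizes, just for illustration
batch_size, input_len, ext_vocab_size = 2, 4, 10

attn = Variable(torch.rand(batch_size, input_len), requires_grad=True)
coverage = Variable(torch.zeros(batch_size, input_len))
input_extend_vocab = Variable(
    torch.LongTensor(batch_size, input_len).random_(0, ext_vocab_size))

# coverage update: plain addition creates a new tensor, nothing in place
new_coverage = coverage + attn

# p_copy: scatter the attention mass onto the extended vocabulary;
# out-of-place scatter_add keeps the graph intact and also sums
# correctly when the same source word appears twice in a sentence
p_copy = Variable(torch.zeros(batch_size, ext_vocab_size)).scatter_add(
    1, input_extend_vocab, attn)

p_copy.sum().backward()  # backward runs without the in-place error

The key point is that new_coverage and p_copy for the current step are brand-new tensors, so nothing autograd has already saved gets overwritten. The final torch.mul(p_copy, 1 - p_gen) is already out of place, so it can stay as it is.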

Does it affect the calculation of gradients?

Well, I don’t know what your computation graph looks like, but an in-place operation cannot be backpropagated through when autograd still needs the values it overwrote.

I think you should create a new variable for the output instead of modifying the old one in place.
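To make that concrete, here is a tiny repro with purely illustrative names (not your model):

import torch
from torch.autograd import Variable

x = Variable(torch.ones(3), requires_grad=True)

# broken: exp saves its output for backward, and the in-place add
# then overwrites that saved value
y = torch.exp(x)
y += 1
try:
    y.sum().backward()
except RuntimeError as err:
    print(err)  # "... has been modified by an inplace operation"

# fixed: write the result into a new variable instead
y = torch.exp(x)
z = y + 1
z.sum().backward()  # works fine

Gradients still flow correctly in the fixed version, because z is built from y and x through normal out-of-place operations.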

Thanks for your patience!