I have a batch of variable-length sequences. To save computation I pad them and pack them with pack_padded_sequence as follows:
input = torch.nn.utils.rnn.pad_sequence(input, batch_first=True)
input = torch.nn.utils.rnn.pack_padded_sequence(input,
batch_first=True,
lengths=lengths)
Because the sequences are long, I use gradient checkpointing to save memory:
output, hiddens = cp.checkpoint(self.gru, *(input, hiddens, self.dummy_tensor))
As a result I get the following error:
File ".../src/sequence_models/gru.py", line 86, in forward
output, hiddens = cp.checkpoint(self.gru, *(input, hiddens, self.dummy_tensor))
File ".../torch/utils/checkpoint.py", line 177, in checkpoint
return CheckpointFunction.apply(function, preserve, *args)
TypeError: CheckpointFunctionBackward.forward: expected Tensor or tuple of Tensor (got PackedSequence) for return value 0
How can I work around this?
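Update: one workaround I'm experimenting with, since checkpoint only accepts plain tensors (not PackedSequence) as inputs and return values, is to move the pack/unpack calls inside the checkpointed function, so only padded tensors plus a lengths tensor cross the checkpoint boundary. This is a sketch, not my real model: CheckpointedGRU and run_gru are made-up names, and the dummy tensor plays the same role as self.dummy_tensor above (it gives the reentrant checkpoint at least one input that requires grad).

```python
import torch
import torch.nn as nn
import torch.utils.checkpoint as cp


class CheckpointedGRU(nn.Module):
    """Sketch: pack/unpack happen inside the checkpointed function,
    so cp.checkpoint only ever sees plain tensors."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.gru = nn.GRU(input_size, hidden_size, batch_first=True)
        # Dummy input with requires_grad=True so the reentrant checkpoint
        # still tracks gradients even if the padded input does not require grad.
        self.dummy = torch.ones(1, requires_grad=True)

    def run_gru(self, padded, hiddens, lengths, dummy):
        # Pack inside the checkpointed region (enforce_sorted=False so the
        # batch does not need to be pre-sorted by length).
        packed = nn.utils.rnn.pack_padded_sequence(
            padded, lengths=lengths, batch_first=True, enforce_sorted=False)
        out, h = self.gru(packed, hiddens)
        # Pad back to a plain tensor before returning to checkpoint.
        out, _ = nn.utils.rnn.pad_packed_sequence(out, batch_first=True)
        return out, h

    def forward(self, padded, hiddens, lengths):
        # Only tensors cross the checkpoint boundary now.
        return cp.checkpoint(self.run_gru, padded, hiddens, lengths, self.dummy)


# Usage sketch: batch of 3 sequences, feature size 4, hidden size 8.
model = CheckpointedGRU(4, 8)
padded = torch.randn(3, 5, 4)                 # padded to max length 5
h0 = torch.zeros(1, 3, 8)
lengths = torch.tensor([5, 3, 2])             # lengths must live on CPU
out, h = model(padded, h0, lengths)
out.sum().backward()                          # gradients reach the GRU weights
```

The downside is an extra pack/unpack on every recomputation during backward, but it keeps both the padding savings and the checkpointing memory savings.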