When to use padding vs packing sequences

himat · April 12, 2018, 4:22pm

I have some text data where the examples are of variable length, and they are currently not padded.

I have seen that using packed sequences seems to be recommended, but I also read in a post that if you want to write a custom RNN cell, adding support for packed sequences takes a lot more effort.

So does it really matter whether I pad or pack my sequences? Is packing recommended because it is more efficient? Will any RNN in PyTorch work with padded sequences by default, or do I have to do some masking on the gradients?
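
For concreteness, here is a minimal sketch of the two options being compared. The tensor sizes, the `nn.LSTM` choice, and the variable names are made up for illustration, and the sequences are listed longest-first because `pack_padded_sequence` expects that ordering by default.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import (
    pad_sequence,
    pack_padded_sequence,
    pad_packed_sequence,
)

torch.manual_seed(0)

# Three hypothetical examples of lengths 5, 3, 2; each step is a 4-dim vector.
seqs = [torch.randn(5, 4), torch.randn(3, 4), torch.randn(2, 4)]
lengths = torch.tensor([s.size(0) for s in seqs])  # must live on the CPU

rnn = nn.LSTM(input_size=4, hidden_size=8, batch_first=True)

# Option 1: pad only. Shorter sequences are filled with zeros up to length 5,
# and the LSTM also runs over those pad steps.
padded = pad_sequence(seqs, batch_first=True)  # shape (3, 5, 4)
out_padded, (h_padded, _) = rnn(padded)

# Option 2: pad, then pack. The LSTM skips the pad steps entirely.
packed = pack_padded_sequence(padded, lengths, batch_first=True)
out_packed, (h_packed, _) = rnn(packed)

# Unpack back to a padded tensor; positions past each length are zeros.
out_unpacked, out_lengths = pad_packed_sequence(out_packed, batch_first=True)
```

In this sketch, `h_packed` holds the state at each sequence's true last step, while `h_padded` holds the state after the pad steps as well. So with plain padding, the pad positions presumably still need to be masked out somewhere, for example in the loss, or the last valid output has to be picked by index.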