You need to customize your own dataloader.
What you need to do is basically pad your variable-length inputs and torch.stack() them together into a single tensor. This tensor is then used as the input to your model.
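Here is a minimal sketch of what such a collate_fn could look like, assuming each dataset item is a (1-D tensor sequence, integer label) pair (that layout is my assumption, adapt it to your dataset):

```python
import torch
from torch.utils.data import DataLoader

def pad_collate(batch):
    # batch: list of (sequence, label), each sequence a 1-D tensor of different length
    seqs, labels = zip(*batch)
    lengths = torch.tensor([len(s) for s in seqs])
    max_len = int(lengths.max())
    # pad every sequence with zeros up to the longest one in the batch, then stack
    padded = torch.stack(
        [torch.cat([s, s.new_zeros(max_len - len(s))]) for s in seqs]
    )
    return padded, torch.tensor(labels), lengths

# loader = DataLoader(my_dataset, batch_size=32, collate_fn=pad_collate)
```

Keeping the original lengths around is useful later, whether you go the pack_padded_sequence route or the masking route.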
I think it’s worth mentioning that using pack_padded_sequence
isn’t absolutely necessary. pack_padded_sequence
is mainly designed to work with the cuDNN LSTM/GRU/RNN implementations, which are optimized to run very fast.
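If you do use the standard modules, the usual pattern looks roughly like this (a sketch with made-up sizes, assuming batch_first tensors and lengths sorted in descending order):

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)

padded = torch.randn(4, 10, 8)          # (batch, max_len, features), zero-padded
lengths = torch.tensor([10, 7, 5, 2])   # true lengths, sorted descending

packed = pack_padded_sequence(padded, lengths, batch_first=True)
packed_out, (h_n, c_n) = lstm(packed)   # the cuDNN kernel skips the padded steps
out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)
```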
But if you have your own method that prevents you from using the standard LSTM/GRU/RNN, you can handle the padding with masking instead, as suggested here:
The easiest way to make a custom RNN compatible with variable-length sequences is to do what this repo does (masking): GitHub - jihunchoi/recurrent-batch-normalization-pytorch: PyTorch implementation of recurrent batch normalization
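The masking idea boils down to: run your custom cell over every (padded) time step, but once a sequence has ended, stop updating its hidden state. A rough sketch (not the repo's actual code, and masked_rnn/step_mask are hypothetical names):

```python
import torch

def step_mask(lengths, max_len):
    # mask[t, b] is 1.0 while step t is still inside sequence b, else 0.0
    steps = torch.arange(max_len).unsqueeze(1)          # (max_len, 1)
    return (steps < lengths.unsqueeze(0)).float()       # (max_len, batch)

def masked_rnn(cell, inputs, lengths, h0):
    # inputs: (max_len, batch, features); cell: any nn.RNNCell-like module
    mask = step_mask(lengths, inputs.size(0)).to(inputs.device)
    h, outputs = h0, []
    for t in range(inputs.size(0)):
        h_new = cell(inputs[t], h)
        m = mask[t].unsqueeze(1)
        # keep the old hidden state once a sequence has ended
        h = m * h_new + (1 - m) * h
        outputs.append(h)
    return torch.stack(outputs), h
```

This way the padded positions never leak into the final hidden state, and you don't need pack_padded_sequence at all.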