Simple working example of how to use packing for variable-length sequence inputs to an RNN

(Justus Schwabedal) #22

Yeah, I think the input for all RNN-type modules needs to have a filter/channel dimension, or however you’d wanna call it.

(Adi R) #23

I have not seen any examples handle padding/packing to compute the loss.

Suppose I have a tagger (i.e. for each input token I have an output label). Can I use padded/packed sequences to compute the loss as well?

(Sherin Thomas) #24

Now that pack_sequence is available in master (it should be in 0.4), you don’t have to worry about padding your input with zeros and calling pack_padded_sequence:

>>> import torch
>>> import torch.nn.utils.rnn as rnn_utils
>>> a = torch.Tensor([1, 2, 3])
>>> b = torch.Tensor([4, 5])
>>> c = torch.Tensor([6])
>>> packed = rnn_utils.pack_sequence([a, b, c])
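
To make the result of pack_sequence less opaque, here is a small sketch that prints the two fields of the returned PackedSequence. It assumes a recent PyTorch release (torch.tensor instead of torch.Tensor, and sequences passed longest-first, which pack_sequence expects by default).

```python
import torch
import torch.nn.utils.rnn as rnn_utils

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([4.0, 5.0])
c = torch.tensor([6.0])

# Sequences are given longest-first, which is what pack_sequence expects by default.
packed = rnn_utils.pack_sequence([a, b, c])

# data interleaves the sequences one time step at a time:
# step 0 of a, b, c, then step 1 of a, b, then step 2 of a.
print(packed.data)         # tensor([1., 4., 6., 2., 5., 3.])

# batch_sizes[t] is the number of sequences that still have a value at step t.
print(packed.batch_sizes)  # tensor([3, 2, 1])
```

No zeros are stored anywhere, which is why an RNN fed a PackedSequence never processes padding.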

But if you are only concerned with padding your sequences, you can use pad_sequence:

>>> import torch
>>> import torch.nn.utils.rnn as rnn_utils
>>> a = torch.Tensor([1, 2, 3])
>>> b = torch.Tensor([4, 5])
>>> c = torch.Tensor([6])
>>> rnn_utils.pad_sequence([a, b, c], batch_first=True)

 1  2  3
 4  5  0
 6  0  0
[torch.FloatTensor of size (3,3)]
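
The two routes can be compared directly. The sketch below, written against a recent PyTorch release, pads with pad_sequence, packs the padded batch with pack_padded_sequence, and checks that the result matches what pack_sequence produces in one call. The variable names are only for illustration.

```python
import torch
import torch.nn.utils.rnn as rnn_utils

seqs = [torch.tensor([1.0, 2.0, 3.0]),
        torch.tensor([4.0, 5.0]),
        torch.tensor([6.0])]
lengths = torch.tensor([len(s) for s in seqs])  # tensor([3, 2, 1])

# Route 1: pad first, then pack the padded batch using the true lengths.
padded = rnn_utils.pad_sequence(seqs, batch_first=True)  # shape (3, 3)
packed_a = rnn_utils.pack_padded_sequence(padded, lengths, batch_first=True)

# Route 2: pack the list of sequences directly.
packed_b = rnn_utils.pack_sequence(seqs)

same = (torch.equal(packed_a.data, packed_b.data)
        and torch.equal(packed_a.batch_sizes, packed_b.batch_sizes))
print(same)  # True
```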

(jpeg729) #25

With PyTorch 0.3.1.post2 I get:

AttributeError: module 'torch.nn.utils.rnn' has no attribute 'pad_sequence'
AttributeError: module 'torch.nn.utils.rnn' has no attribute 'pack_sequence'

(Sherin Thomas) #26

Looks like my mistake. It is available in current master and will probably be in the 0.4 release. Updated my answer!

(Sitara J) #27

Hi, I ran your code and hit some errors. The size of batch_in should be (batch_size, feature_dim, max_length). After I changed it, I got a new error:
“dimension out of range (expected to be in range of [-1, 0], but got 1)”

I don’t know what this means. Could you try it and tell me what is going on? Thank you!

(Sitara J) #28

When I run the simple example that you have provided, I run into the error
“dimension out of range (expected to be in range of [-1, 0], but got 1)”
Is anybody else seeing the same problem? Can someone tell me why it happens and how to fix it?

(Yifan) #29


According to the PyTorch docs for pack_padded_sequence:

Input can be of size T x B x * where T is the length of the longest sequence (equal to lengths[0]), B is the batch size, and * is any number of dimensions (including 0). If batch_first is True B x T x * inputs are expected.

Since we set batch_first to True, batch_in is expected to have size (batch_size, max_length, feature_dim).
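
As a rough illustration of that layout, the sketch below builds a batch of shape (batch_size, max_length, feature_dim), packs it, and runs it through an RNN. The name seq_lengths, the hidden size of 2, and the choice of nn.RNN are assumptions made for this example, not taken from the earlier post.

```python
import torch
import torch.nn as nn
import torch.nn.utils.rnn as rnn_utils

batch_size, max_length, feature_dim = 3, 3, 1

# Zero-padded batch laid out as (batch_size, max_length, feature_dim).
batch_in = torch.zeros(batch_size, max_length, feature_dim)
batch_in[0, :3, 0] = torch.tensor([1.0, 2.0, 3.0])
batch_in[1, :2, 0] = torch.tensor([4.0, 5.0])
batch_in[2, :1, 0] = torch.tensor([6.0])
seq_lengths = [3, 2, 1]  # true lengths, longest first

packed = rnn_utils.pack_padded_sequence(batch_in, seq_lengths, batch_first=True)

rnn = nn.RNN(input_size=feature_dim, hidden_size=2, batch_first=True)
packed_out, h_n = rnn(packed)

# Unpack back to a padded tensor in the same batch-first layout.
out, out_lengths = rnn_utils.pad_packed_sequence(packed_out, batch_first=True)
print(out.shape)  # torch.Size([3, 3, 2])
print(h_n.shape)  # torch.Size([1, 3, 2])
```

Note that h_n keeps the (num_layers, batch_size, hidden_size) layout even when batch_first=True.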

(Sitara J) #30

You’re right, and I finally found why I had the error. You said that vec_1 = torch.FloatTensor([[1], [2], [3]]) and vec_1 = torch.FloatTensor([[1, 2, 3]]) are both fine here, but they are not equivalent: torch.FloatTensor([[1, 2, 3]]) has size [1, 3], while the other has size [3, 1].
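
A quick way to see the difference is to print both shapes. This sketch uses torch.tensor with float values in place of torch.FloatTensor, and the names col and row are made up for the example.

```python
import torch

col = torch.tensor([[1.0], [2.0], [3.0]])  # 3 time steps, 1 feature each
row = torch.tensor([[1.0, 2.0, 3.0]])      # 1 time step, 3 features

print(col.shape)  # torch.Size([3, 1])
print(row.shape)  # torch.Size([1, 3])

# Transposing the row layout gives the (length, feature_dim) layout.
fixed = row.t()
print(torch.equal(fixed, col))  # True
```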

(Yu Ching Lee) #31

Thanks @sitara_J, I encountered the same size problem.

(Barry Plunkett) #32

I’m still wondering how back propagation interacts with the padding. I’m trying to solve a problem where:

  1. I have padded variable length sequences as an input to an RNN layer.
  2. I want to pass the output at each step through a torch.nn.Linear layer
  3. I want to compute loss for each element and sum these for each sequence.

Since the behavior of torch.nn.utils.rnn.PackedSequence is somewhat mysterious right now, I’m not sure how to go about this without ruining my gradients.
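
One common way to handle the three steps above is to unpack the RNN output back to a padded tensor, apply the Linear layer to every step, and multiply the per-step loss by a mask built from the sequence lengths. The sketch below is only an illustration under assumptions: the feature size, hidden size, squared-error loss, and random data are all made up, and a recent PyTorch release is assumed.

```python
import torch
import torch.nn as nn
import torch.nn.utils.rnn as rnn_utils

torch.manual_seed(0)

# Hypothetical data: three sequences of shape (length, feature_dim=4),
# each with one regression target per time step.
seqs = [torch.randn(3, 4), torch.randn(2, 4), torch.randn(1, 4)]
targets = [torch.randn(3, 1), torch.randn(2, 1), torch.randn(1, 1)]
lengths = torch.tensor([len(s) for s in seqs])

rnn = nn.LSTM(input_size=4, hidden_size=8, batch_first=True)
linear = nn.Linear(8, 1)

# 1. Run the packed batch through the RNN, then unpack to (B, T, hidden).
packed = rnn_utils.pack_sequence(seqs)
packed_out, _ = rnn(packed)
out, _ = rnn_utils.pad_packed_sequence(packed_out, batch_first=True)

# 2. Apply the Linear layer at every time step.
pred = linear(out)  # (B, T, 1)

# 3. Mask out padded steps so they contribute nothing to the loss.
padded_targets = rnn_utils.pad_sequence(targets, batch_first=True)  # (B, T, 1)
steps = torch.arange(out.size(1))
mask = (steps[None, :] < lengths[:, None]).unsqueeze(-1).to(pred.dtype)

per_step = (pred - padded_targets) ** 2 * mask
per_sequence = per_step.sum(dim=(1, 2))  # one summed loss per sequence
loss = per_sequence.sum()
loss.backward()
```

Because the masked positions are multiplied by zero, no gradient flows back from the padded steps.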

(Duane Nielsen) #33

Thanks to Sherin, here is my minimal working example of packing

import torch
import torch.nn.utils.rnn as rnn_utils
import torch.nn as nn

a = torch.Tensor([[1], [2], [3]])
b = torch.Tensor([[4], [5]])
c = torch.Tensor([[6]])
packed = rnn_utils.pack_sequence([a, b, c])

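lstm = nn.LSTM(1, 3)  # input_size=1, hidden_size=3

# name the states h_n, c_n so the input tensor c is not overwritten
packed_output, (h_n, c_n) = lstm(packed)

# pad_packed_sequence returns a tuple: (padded_output, lengths)
y, lengths = rnn_utils.pad_packed_sequence(packed_output)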

That is all.
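
To see what comes out of the unpacking step, the sketch below repeats the example and inspects the result. It assumes a recent PyTorch release and a single-layer, unidirectional LSTM. The names h_n, c_n, and last are chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.utils.rnn as rnn_utils

torch.manual_seed(0)

a = torch.tensor([[1.0], [2.0], [3.0]])
b = torch.tensor([[4.0], [5.0]])
c = torch.tensor([[6.0]])
packed = rnn_utils.pack_sequence([a, b, c])

lstm = nn.LSTM(1, 3)
packed_output, (h_n, c_n) = lstm(packed)

# pad_packed_sequence returns the padded output and the original lengths.
output, lengths = rnn_utils.pad_packed_sequence(packed_output)
print(output.shape)  # torch.Size([3, 3, 3]), i.e. (max_length, batch, hidden)
print(lengths)       # tensor([3, 2, 1])

# Steps past the end of a sequence are filled with zeros.
print(output[2, 1])  # all zeros, since b has only 2 steps

# The last valid output of each sequence matches the final hidden state.
last = output[lengths - 1, torch.arange(len(lengths))]
print(torch.allclose(last, h_n[-1]))  # True
```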

(Harsh Trivedi) #34

Here is another minimal example/tutorial for packing and unpacking sequences in pytorch (with diagrams of intermediate steps). Hope it helps!