How to use pack_sequence if we are going to use word embedding and BiLSTM

Hi everyone,

I see that there is a pack_sequence utility function for use with recurrent neural nets, and there is a simple example demonstrating its usage. However, it does not cover word embeddings. When I try the following, I get an error:

import torch
import torch.nn.utils.rnn as rnn_utils
import torch.nn as nn


embeddings = nn.Embedding(6,10)
lstm = nn.LSTM(10, 200,num_layers=1, bidirectional=True)


a = torch.Tensor([[1], [2], [3]])
b = torch.Tensor([[4], [5]])
c = torch.Tensor([[6]])
packed = rnn_utils.pack_sequence([a, b, c])
emb  = embeddings(packed) # This line raises an error

Error message: TypeError: embedding(): argument 'indices' (position 2) must be Tensor, not PackedSequence

If the line above had succeeded, I was going to feed the result into the lstm as below:

packed_output, (h,c) = lstm(emb)

A couple of months ago, the following (not the cleanest, but it even works for multiple args) worked:

def elementwise_apply(fn, *args):
    # Unwrap .data from any PackedSequence argument, apply fn to the
    # flattened elements, and re-wrap using the first argument's batch_sizes.
    return torch.nn.utils.rnn.PackedSequence(
        fn(*[arg.data if isinstance(arg, torch.nn.utils.rnn.PackedSequence) else arg
             for arg in args]),
        args[0].batch_sizes,
    )

and then
emb = elementwise_apply(embeddings, packed)

Best regards

Thomas

Thank you so much for your response, Thomas. If you don't mind, could you explain your function a little bit? For example, could you tell me what fn is, what should be passed into args, etc.? If these are well-known things in PyTorch, I apologize for my request, but I am pretty new to PyTorch. If possible, a simple example demonstrating the usage of this function would be awesome.

I tried following but I got an error:

import torch
import torch.nn.utils.rnn as rnn_utils
import torch.nn as nn
embeddings = nn.Embedding(6,10)
lstm = nn.LSTM(10, 200,num_layers=1, bidirectional=True)

a = torch.Tensor([[1], [2], [3]])
b = torch.Tensor([[4], [5]])
c = torch.Tensor([[6]])
packed = rnn_utils.pack_sequence([a, b, c])
emb = elementwise_apply(embeddings, packed)

return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected tensor for argument #1 'indices' to have scalar type Long; but got CPUFloatTensor instead (while checking arguments for embedding)
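This second error is about dtype rather than packing: nn.Embedding expects Long (int64) indices, but torch.Tensor(...) creates float32 values. A minimal sketch of the fix (note: torch.tensor with integer literals yields int64 automatically, and the table is sized 7 here so that index 6 is in range, since nn.Embedding(6, 10) only accepts indices 0 through 5):

```python
import torch
import torch.nn as nn
import torch.nn.utils.rnn as rnn_utils

embeddings = nn.Embedding(7, 10)  # 7 rows so that index 6 is valid

# Integer literals in torch.tensor() produce int64 (Long) indices,
# which is what nn.Embedding expects; torch.Tensor() would give float32.
a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5])
c = torch.tensor([6])
packed = rnn_utils.pack_sequence([a, b, c])

# Apply the embedding to the flat .data and re-wrap as a PackedSequence.
emb = rnn_utils.PackedSequence(embeddings(packed.data), packed.batch_sizes)
print(emb.data.shape)  # torch.Size([6, 10]): 6 flattened timesteps, dim 10
```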

Fair enough!

Let’s make a simpler version:

def simple_elementwise_apply(fn, packed_sequence):
    """applies a pointwise function fn to each element in packed_sequence"""
    return torch.nn.utils.rnn.PackedSequence(fn(packed_sequence.data), packed_sequence.batch_sizes)

What this does is a) apply fn to the .data (which is where the flattened sequence elements live) and b) return a packed sequence with the result and the "bookkeeping" of .batch_sizes.
The more elaborate version above does the same, but a) takes multiple arguments and b) passes the .data to fn when an argument is a packed sequence, and the full argument otherwise.
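Putting the pieces together, a minimal end-to-end sketch using simple_elementwise_apply (with Long indices, and the embedding table bumped to 7 rows so index 6 is in range):

```python
import torch
import torch.nn as nn
import torch.nn.utils.rnn as rnn_utils

def simple_elementwise_apply(fn, packed_sequence):
    """Applies a pointwise function fn to each element in packed_sequence."""
    return rnn_utils.PackedSequence(fn(packed_sequence.data),
                                    packed_sequence.batch_sizes)

embeddings = nn.Embedding(7, 10)
lstm = nn.LSTM(10, 200, num_layers=1, bidirectional=True)

# Long (int64) index sequences of lengths 3, 2, 1.
a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5])
c = torch.tensor([6])
packed = rnn_utils.pack_sequence([a, b, c])

emb = simple_elementwise_apply(embeddings, packed)
packed_output, (h_n, c_n) = lstm(emb)
print(packed_output.data.shape)  # torch.Size([6, 400]): 6 timesteps, 2*200 features
print(h_n.shape)                 # torch.Size([2, 3, 200]): 2 directions, batch 3
```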

Best regards

Thomas


Is this still the standard, 3 years later? Why is working with sequences so hard?

Hi Rylan!

I am afraid it is.
As much as I sympathize, a utility function like "apply" would not hurt.

Best regards

Thomas

Is there a clear tutorial about how to properly call the sequence of torch.nn.utils.rnn.pad_sequence, pad_packed_sequence and pack_padded_sequence? I can’t find one in the documentation.

There might not be…
So there are three equivalent representations here with different types:

  1. List[Tensor] a list of sequences.
  2. Tensor a single tensor of sequences with padding. Crucial extra information: pad value, whether it is seq first or batch first.
  3. PackedSequence an object containing packed sequences. Contains the extra information: batch sizes, indices from reordering.

So then the conversion functions all go between them, and you can just go by the type signatures to see which is appropriate:

  pad_sequence: 1 → 2
  pad_packed_sequence: 3 → 2
  pack_padded_sequence: 2 → 3
  pack_sequence: 1 → 3
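For concreteness, a small sketch exercising the three representations and the converters between them (sequence-first layout, lengths sorted descending as pack_padded_sequence expects by default):

```python
import torch
import torch.nn.utils.rnn as rnn_utils

# Representation 1: List[Tensor] of variable-length sequences.
seqs = [torch.tensor([1., 2., 3.]), torch.tensor([4., 5.]), torch.tensor([6.])]

# 1 -> 2: pad into a single (max_len, batch) tensor.
padded = rnn_utils.pad_sequence(seqs)  # shape (3, 3), zero-padded

# 2 -> 3: pack the padded tensor, supplying the true lengths.
packed = rnn_utils.pack_padded_sequence(padded, lengths=[3, 2, 1])

# 1 -> 3: or pack the list directly; gives the same packed data.
packed2 = rnn_utils.pack_sequence(seqs)

# 3 -> 2: unpack back to a padded tensor plus the lengths.
unpadded, lengths = rnn_utils.pad_packed_sequence(packed)
print(unpadded.shape, lengths)  # torch.Size([3, 3]) tensor([3, 2, 1])
```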
