I see that there is a pack_sequence utility function used with recurrent neural nets. There is a simple example demonstrating its usage, but it does not cover word embeddings. When I try the following, I get an error:
import torch
import torch.nn.utils.rnn as rnn_utils
import torch.nn as nn
embeddings = nn.Embedding(6,10)
lstm = nn.LSTM(10, 200,num_layers=1, bidirectional=True)
a = torch.Tensor([[1], [2], [3]])
b = torch.Tensor([[4], [5]])
c = torch.Tensor([[6]])
packed = rnn_utils.pack_sequence([a, b, c])
emb = embeddings(packed) # This line raises an error
Error message: TypeError: embedding(): argument 'indices' (position 2) must be Tensor, not PackedSequence
If I had been successful with the line above, I was going to give its output as an input to the lstm.
A couple of months ago, the following (not the cleanest, but it even works for multiple args) worked:
def elementwise_apply(fn, *args):
    # pass .data for PackedSequence arguments, the argument itself otherwise
    return torch.nn.utils.rnn.PackedSequence(
        fn(*[(arg.data if isinstance(arg, torch.nn.utils.rnn.PackedSequence) else arg) for arg in args]),
        args[0].batch_sizes,
    )
Thank you so much for your response, Thomas. If you don't mind, could you explain your function a little bit? For example, could you tell me what fn is, what should be passed into args, etc.? If they are well-known things in PyTorch, I am sorry for my request, but I am also pretty new to PyTorch. If possible, a simple example demonstrating usage of this function would be awesome.
I tried the following but got an error:
import torch
import torch.nn.utils.rnn as rnn_utils
import torch.nn as nn
embeddings = nn.Embedding(6,10)
lstm = nn.LSTM(10, 200,num_layers=1, bidirectional=True)
a = torch.Tensor([[1], [2], [3]])
b = torch.Tensor([[4], [5]])
c = torch.Tensor([[6]])
packed = rnn_utils.pack_sequence([a, b, c])
emb = elementwise_apply(embeddings, packed)
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected tensor for argument #1 'indices' to have scalar type Long; but got CPUFloatTensor instead (while checking arguments for embedding)
def simple_elementwise_apply(fn, packed_sequence):
    """applies a pointwise function fn to each element in packed_sequence"""
    return torch.nn.utils.rnn.PackedSequence(fn(packed_sequence.data), packed_sequence.batch_sizes)
What this does is a) apply fn to the .data (which is where the flattened sequence elements live) and b) return a packed sequence with the result and the "bookkeeping" of .batch_sizes.
The more elaborate version above does the same, but a) takes multiple arguments and b) when an argument is a packed sequence it passes the .data to fn, and otherwise the full argument.
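Putting this together, here is a sketch of the full pipeline. Note two fixes relative to the snippet above, which explain the RuntimeError: embedding indices must be long tensors, not float ones, and since the indices go up to 6, the embedding needs at least 7 rows rather than 6.

```python
import torch
import torch.nn as nn
import torch.nn.utils.rnn as rnn_utils

def simple_elementwise_apply(fn, packed_sequence):
    """applies a pointwise function fn to each element in packed_sequence"""
    return rnn_utils.PackedSequence(fn(packed_sequence.data), packed_sequence.batch_sizes)

embeddings = nn.Embedding(7, 10)  # indices below go up to 6, so we need 7 rows
lstm = nn.LSTM(10, 200, num_layers=1, bidirectional=True)

# embedding indices must be long tensors; sequences sorted by decreasing length
a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5])
c = torch.tensor([6])

packed = rnn_utils.pack_sequence([a, b, c])
emb = simple_elementwise_apply(embeddings, packed)  # PackedSequence of embeddings
output, (h_n, c_n) = lstm(emb)                      # output is again a PackedSequence
```

The LSTM accepts the PackedSequence directly, so only the embedding lookup needs the helper.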
I am afraid it is.
As much as I sympathize, and a utility function "apply" would not hurt:
It seems that the use of packed sequences is relatively narrow, and in particular it is probably not terribly common to need to intermix operations on packed sequences a lot with other things. For example, I don't think HuggingFace's transformers library - quite popular for processing sequences - is using PackedSequence at all.
Quite likely, the "figuring out what is the right thing to do" is still much more difficult than "implementing it".
Is there a clear tutorial about how to properly call the sequence of torch.nn.utils.rnn.pad_sequence, pad_packed_sequence and pack_padded_sequence? I can't find one in the documentation.
There might not be…
So there are three equivalent representations here with different types:
1. List[Tensor]: a list of sequences.
2. Tensor: a single tensor of sequences with padding. Crucial extra information: the pad value, and whether it is seq-first or batch-first.
3. PackedSequence: an object containing packed sequences. Contains the extra information: batch sizes, indices from reordering.
So then the conversion functions all go between them, and you can just go by the type signatures to see which is appropriate: pad_sequence: 1 → 2, pad_packed_sequence: 3 → 2, pack_padded_sequence: 2 → 3, pack_sequence: 1 → 3.
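A minimal sketch of all four conversions on a toy batch (the sequences here are just illustrative, sorted longest-first as the functions expect by default):

```python
import torch
import torch.nn.utils.rnn as rnn_utils

# 1: List[Tensor], sequences of different lengths, longest first
seqs = [torch.tensor([1., 2., 3.]), torch.tensor([4., 5.]), torch.tensor([6.])]

# 1 -> 2: pad to a single (max_len, batch) tensor, pad value 0 by default
padded = rnn_utils.pad_sequence(seqs)

# 1 -> 3: pack the list directly
packed = rnn_utils.pack_sequence(seqs)

# 2 -> 3: pack a padded tensor; the lengths must be supplied explicitly
packed_from_padded = rnn_utils.pack_padded_sequence(padded, lengths=[3, 2, 1])

# 3 -> 2: unpad; also recovers the lengths
padded_again, lengths = rnn_utils.pad_packed_sequence(packed)
```

Both routes into a PackedSequence produce the same packed data, and unpadding round-trips back to the padded tensor.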