In my network, I pack the variable-length input sequences before feeding them to the GRU (please see the code below). I still have my indices from sorting (variable ‘order’).
I do the unpacking and unsorting as in the following code:
new_s, new_s_lengths = nn.utils.rnn.pad_packed_sequence(s)  # s is the PackedSequence
output = unscramble(new_s , new_s_lengths, order, batch_size)
The function unscramble does the unsorting and is defined as follows:
def unscramble(output, lengths, original_indices, batch_size):
    """
    Takes the output from the model, the lengths, the original indices, and the batch size.
    Unscrambles the data, which had been sorted to make pack_padded_sequence work.
    Returns the unscrambled and unpadded outputs.
    """
    # Index of the last valid time step of every sequence, shaped [1, batch, hidden] for gather
    final_ids = (Variable(torch.from_numpy(np.array(lengths) - 1))).view(-1,1).expand(output.size(1),output.size(2)).unsqueeze(0)
    if cuda:
        final_ids = final_ids.to('cuda')
    final_outputs = output.cpu().gather(0, final_ids.cpu()).squeeze()
    # Restore the order the batch had before sorting by length
    unscrambled_outputs = final_outputs[original_indices]
    return unscrambled_outputs
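To illustrate how the sorting indices relate to the unsorting step, here is a minimal sketch. It assumes the batch was sorted by descending length with `Tensor.sort`; the names `lengths`, `data`, and `inverse_order` are hypothetical and not taken from my actual code.

```python
import torch

lengths = torch.tensor([3, 5, 2, 4])      # hypothetical sequence lengths
# Sort longest first, which pack_padded_sequence expects by default.
sorted_lengths, order = lengths.sort(descending=True)
# order[i] is the original position of the i-th row of the sorted batch.
inverse_order = torch.argsort(order)      # original position -> row in the sorted batch

data = torch.arange(4) * 10               # stand-in for one output row per sequence
sorted_data = data[order]                 # what the model sees
restored = sorted_data[inverse_order]     # back in the original order
```

Under this assumption, indexing with the sort permutation itself would in general not restore the original order, so `original_indices` in `unscramble` has to be the inverse permutation.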
However, this is not my problem. The problem is that this function returns only the last hidden state of the GRU. Let’s say I have defined a GRU in the following way:
self.lstm = nn.GRU(300, 400, 1)  # input_size=300, hidden_size=400, num_layers=1
and an input of shape [48, 128, 300] (48 is the length of the longest sequence and 128 is the batch size). After packing the sequence, unpacking it, and calling the unscramble function defined above, I get a tensor of shape [128, 400] (the last hidden state of each sequence, restored to the original order from before sorting by length).
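To make the shapes concrete, here is a minimal sketch of the setup described above. The random input and the random `lengths` are assumptions made only for illustration (one sequence is forced to length 48 so the padded length matches), and the variable names are hypothetical.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)
max_len, batch_size = 48, 128
gru = nn.GRU(300, 400, 1)                      # same sizes as in the post

x = torch.randn(max_len, batch_size, 300)      # [seq_len, batch, features]
# Hypothetical lengths; one sequence is forced to the full length of 48.
lengths = torch.randint(1, max_len + 1, (batch_size,))
lengths[0] = max_len

# Sort by descending length, pack, run the GRU, and unpack again.
sorted_lengths, order = lengths.sort(descending=True)
packed = pack_padded_sequence(x[:, order], sorted_lengths)
packed_out, h_n = gru(packed)
padded_out, out_lengths = pad_packed_sequence(packed_out)
print(padded_out.shape)                        # torch.Size([48, 128, 400])

# Pick the last valid time step of every sequence, as unscramble does.
last_ids = (out_lengths - 1).view(1, -1, 1).expand(1, batch_size, 400)
last = padded_out.gather(0, last_ids).squeeze(0)
print(last.shape)                              # torch.Size([128, 400])
```

In this sketch the rows are still in sorted order; restoring the original order is a separate indexing step.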
My question is: how can I get the output of all of the hidden states (in my example, a tensor of shape [128, 48, 400])?
Thanks.