In my network, I pack the variable-length input sequences before feeding them to the GRU (see the code below).
I still have the indices from sorting (the variable ‘order’).
I do the unpacking and unsorting as in the following code:
new_s, new_s_lengths = nn.utils.rnn.pad_packed_sequence(s)  # s is the PackedSequence
output = unscramble(new_s, new_s_lengths, order, batch_size)
unscramble does the unsorting and is defined as follows:
def unscramble(output, lengths, original_indices, batch_size):
    """
    Takes the output from the model, the lengths, the original_indices,
    and the batch size. Unscrambles the data, which had been sorted to
    make pack_padded_sequence work. Returns the unscrambled and unpadded
    outputs.
    """
    final_ids = (Variable(torch.from_numpy(np.array(lengths) - 1))).view(-1, 1).expand(output.size(1), output.size(2)).unsqueeze(0)
    if cuda:
        final_ids = final_ids.to('cuda')
    final_outputs = output.cpu().gather(0, final_ids.cpu()).squeeze()
    unscrambled_outputs = final_outputs[original_indices]
    return unscrambled_outputs
The unsorting itself works. My problem is that this function returns only the output of the last hidden state of the GRU. Let’s say I have defined a GRU in the following way:
self.lstm = nn.GRU(300 , 400 , 1)
and I have an input of shape
[48, 128, 300] (48 is the length of the longest sequence and 128 is the batch size). After packing the sequence, unpacking it, and calling the unscramble function defined above, I get a new tensor of shape
[128, 400] (the last hidden state of each sequence, restored to the original order from before sorting by length).
My question is: how can I get the output of all of the hidden states? In my example that would be a tensor of shape
[128, 48, 400].
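For reference, here is a minimal, self-contained sketch that reproduces the shapes described above. It is not my actual network: the data and lengths are random, and the names `perm` and `unsort` are made up for illustration (`unsort` is assumed to play the role of ‘order’, i.e. the index that restores the original batch order). The last two lines show one possible way to get the [128, 48, 400] tensor, based on the fact that the padded tensor returned by pad_packed_sequence already holds every time step; I am not sure it is the recommended approach.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)
max_len, batch_size, in_dim, hid_dim = 48, 128, 300, 400

gru = nn.GRU(in_dim, hid_dim, 1)
x = torch.randn(max_len, batch_size, in_dim)            # [48, 128, 300]
lengths = torch.randint(1, max_len + 1, (batch_size,))  # random lengths (assumption)
lengths[0] = max_len                                    # make sure the longest sequence is 48

# Sort by length (descending), as pack_padded_sequence expects by default.
sorted_lengths, perm = lengths.sort(descending=True)
unsort = perm.argsort()  # hypothetical stand-in for 'order': undoes the sort

packed = pack_padded_sequence(x[:, perm], sorted_lengths)
packed_out, _ = gru(packed)
padded_out, out_lengths = pad_packed_sequence(packed_out)  # [48, 128, 400]

# Last valid time step of each sequence, back in the original order.
last_idx = (out_lengths - 1).view(1, -1, 1).expand(1, batch_size, hid_dim)
last_out = padded_out.gather(0, last_idx).squeeze(0)[unsort]  # [128, 400]

# All time steps: unsort along the batch dimension, then put batch first.
all_out = padded_out[:, unsort].transpose(0, 1)  # [128, 48, 400]
```

In this sketch, positions past the end of a sequence in `all_out` are zero padding, so they would still need to be masked or ignored downstream.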