I’m using pack_padded_sequence() to preprocess my input x for an RNN model. This requires me to sort the sequences in a batch by length and permute x accordingly. Does this mean that when I compute the loss using self.criterion(output, answer) I also need to permute the answers the same way, i.e. answer = answer[perm_idx]? Or should I undo the permutation on the output instead (is there an easier way to do this)?

Reversing index:

reverse_idx = to_var(torch.zeros(len(perm_idx)))
for i in range(len(perm_idx)):
    reverse_idx[i] = perm_idx[i]
output = output[reverse_idx]

Trying to compute loss:

output = self.net(x, x_len)
loss = self.criterion(output, answer)
# DEFINITION OF NET.forward()
def forward(self, x, x_len):
    # SORT YOUR TENSORS BY LENGTH! (required by pack_padded_sequence)
    x_len, perm_idx = x_len.sort(0, descending=True)
    x = x[perm_idx]
    # pack the sorted, padded batch
    packed_input = pack_padded_sequence(x, x_len.data.cpu().numpy(), batch_first=True)
    h0 = to_var(torch.randn(self.rnn_num_layers, x.size(0), self.rnn_hidden_size))
    hiddens, last_hidden = self.gru(packed_input, h0)
    preactivations = self.fc(last_hidden.squeeze())
    return self.softmax(preactivations)

No. That is wrong. perm_idx maps each sorted position to the original index it came from, so your loop just copies perm_idx into reverse_idx. To unsort with the line output = output[reverse_idx], you need the inverse of that permutation.
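A minimal sketch of building that inverse permutation (the sample tensors here are illustrative): since perm_idx[i] gives the original position of the element now at sorted position i, scattering arange into those slots produces the map back to the original order.

```python
import torch

# perm_idx maps sorted position -> original position
# (e.g. the second return value of x_len.sort(0, descending=True))
perm_idx = torch.tensor([2, 0, 1])

# Build the inverse map: inv_idx[original position] = sorted position
inv_idx = torch.empty_like(perm_idx)
inv_idx[perm_idx] = torch.arange(perm_idx.numel())

# Equivalently: _, inv_idx = perm_idx.sort(0)

# Indexing the sorted output with inv_idx restores the original batch order,
# so the loss can be computed against answer without permuting it.
sorted_batch = torch.tensor([30., 10., 20.])  # i.e. x[perm_idx] for x = [10, 20, 30]
restored = sorted_batch[inv_idx]
print(restored)  # tensor([10., 20., 30.])
```

Doing this unsort at the end of forward() keeps the sorting an internal detail, so the caller never has to touch answer.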