Hi,
I am new to PyTorch and I have a problem implementing a model. Unfortunately, the model doesn't work correctly, and I suspect the way I have implemented the sorting and reordering part. A snippet of the function used in the forward function is below:
def encoder(self, data, rnn):
    if self.use_cuda:
        masks = torch.eq(data, 0).cuda()
        lens = torch.tensor(list(masks.data.eq(constant.PAD_ID).long().sum(1).squeeze())).cuda()
    else:
        masks = torch.eq(data, 0)
        lens = torch.tensor(list(masks.data.eq(constant.PAD_ID).long().sum(1).squeeze()))
    lens, perm_index = lens.sort(0, descending=True)
    data = data[perm_index]
    batch_size = data.size()[0]
    inputs = self.emb(data)
    h0, c0 = self.zero_state(batch_size)  # a function, defined below
    inputs = nn.utils.rnn.pack_padded_sequence(inputs, lens, batch_first=True)
    outputs, (ht, ct) = rnn(inputs, (h0, c0))
    outputs, output_lens = nn.utils.rnn.pad_packed_sequence(outputs, batch_first=True)
    hidden = ht[-1, :, :]  # get the outermost layer h_n
    odx = perm_index.view(-1, 1).expand(batch_size, hidden.size(-1))
    decoded = hidden.gather(0, odx)
    return decoded
def zero_state(self, batch_size):
    state_shape = (self.opt['num_layers'], batch_size, self.opt['hidden_dim'])
    h0 = c0 = Variable(torch.zeros(*state_shape), requires_grad=False)
    if self.use_cuda:
        return h0.cuda(), c0.cuda()
    else:
        return h0, c0
rnn is an LSTM network. (It is passed in as an argument because the model has two different LSTM networks.)
I sort inside the encoder function because pack_padded_sequence expects its inputs to be sorted by length in descending order, but I need to restore the original batch ordering at the end.
Is this the correct way of doing this? Do the weights also get reordered accordingly?
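To check my understanding of the sort/unsort pattern in isolation, I put together this toy sketch. It has no embedding or LSTM (the model parts are replaced by an identity stand-in), and the tensors are made up; the inverse-permutation step at the end is the part I am unsure my gather call reproduces:

```python
import torch

# Toy batch: 3 sequences with lengths 2, 3, 1 (0 = PAD)
data = torch.tensor([[5, 6, 0],
                     [1, 2, 3],
                     [9, 0, 0]])
lens = data.ne(0).long().sum(1)  # lengths per sequence: [2, 3, 1]

# Sort by length, descending, as pack_padded_sequence requires
lens_sorted, perm_index = lens.sort(0, descending=True)
data_sorted = data[perm_index]

# ... emb / pack_padded_sequence / rnn / pad_packed_sequence would go here ...
hidden_sorted = data_sorted.float()  # identity stand-in for the LSTM output

# Restore the original batch order with the INVERSE permutation:
# sorting perm_index itself yields the indices that undo the sort
_, inv_index = perm_index.sort(0)
restored = hidden_sorted[inv_index]

assert torch.equal(restored.long(), data)  # original order is recovered
```

In this sketch, restoring the order requires the inverse of perm_index rather than perm_index itself, which is why I am skeptical about my gather-based version above.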