I'm trying to build a word-to-word (seq2seq) model, but I'm having a hard time understanding the tensor shapes involved.
I believe Y_hat should have shape [batch_size, target_sequence_length, vocab_size] when the target has shape [batch_size, sequence_length].
In my implementation, I can't get the output into that shape.
Please help.
Here is a snippet of my model's forward pass:
```python
x = self.embed(input_x)                                           # [batch, seq_len, embed_dim]
packed_x = pack_padded_sequence(x, seq, batch_first=True)
packed_h, (packed_h_t, packed_c_t) = self.rnn(packed_x, (h0, c0))
unpacked_x, _ = pad_packed_sequence(packed_h, batch_first=True)   # [batch, seq_len, hidden_dim]
_output = self.linear(unpacked_x)                                 # [batch, seq_len, vocab_size]
output = F.log_softmax(_output, dim=-1)  # normalize over vocab (dim=-1), not over time (dim=1)
```
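For reference, here is a self-contained sketch of the same pipeline with made-up sizes and standalone modules standing in for `self.embed` / `self.rnn` / `self.linear` (those names and all dimensions below are assumptions for illustration). It shows that the embed → pack → LSTM → unpack → linear path does yield the desired [batch_size, sequence_length, vocab_size] shape, provided `log_softmax` is applied over the last (vocab) dimension:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Hypothetical sizes, chosen only for this demo
batch_size, max_len, vocab_size = 4, 7, 50
embed_dim, hidden_dim = 16, 32

embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
linear = nn.Linear(hidden_dim, vocab_size)

# Fake batch: token ids plus per-sequence lengths (sorted descending)
input_x = torch.randint(1, vocab_size, (batch_size, max_len))
seq = torch.tensor([7, 6, 4, 3])

x = embed(input_x)                                        # [batch, max_len, embed_dim]
packed_x = pack_padded_sequence(x, seq, batch_first=True)
packed_h, (h_t, c_t) = rnn(packed_x)                      # zero initial state by default
unpacked_x, _ = pad_packed_sequence(packed_h, batch_first=True)  # [batch, max_len, hidden_dim]
logits = linear(unpacked_x)                               # [batch, max_len, vocab_size]
output = F.log_softmax(logits, dim=-1)                    # log-probabilities over the vocab

print(output.shape)  # torch.Size([4, 7, 50])
```

With `dim=-1`, each time step's scores are normalized across the vocabulary, so `output.exp()` sums to 1 along the last axis, which is what `NLLLoss` expects when scoring against a [batch_size, sequence_length] target.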