Hello all,
I’ve managed to implement a simple LSTM for multi-label classification.
Now I want to see if I can improve the results by making it Bidirectional, but I have some questions:
- I understand why I have to change the first dimension of the tensors that represent the hidden state, but I still don’t know HOW (and IF) I should change my input.
Right now it’s a sequence of word indices:
sent = autograd.Variable(torch.cuda.LongTensor([word_to_ix[w] for w in sent.split(' ')]))
Do I have to change it?
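To make the question concrete, here is a minimal CPU sketch of what I think the bidirectional change looks like (all sizes and names here are made up for illustration, not from my actual model):

```python
import torch
import torch.nn as nn

# Hypothetical sizes, CPU tensors so the snippet runs anywhere
embedding_dim, hidden_dim, vocab_size, seq_len = 8, 16, 100, 5

embed = nn.Embedding(vocab_size, embedding_dim)
lstm = nn.LSTM(embedding_dim, hidden_dim, bidirectional=True)

# The input stays a plain sequence of indices, same as the unidirectional case
sent = torch.randint(0, vocab_size, (seq_len,))
x = embed(sent).view(seq_len, 1, embedding_dim)  # (seq_len, batch=1, embedding_dim)

# Hidden/cell state: first dim doubles to num_layers * num_directions = 2
h0 = torch.zeros(2, 1, hidden_dim)
c0 = torch.zeros(2, 1, hidden_dim)

lstm_out, (hn, cn) = lstm(x, (h0, c0))
print(lstm_out.shape)  # torch.Size([5, 1, 32]) -- last dim is 2 * hidden_dim
```

So as far as I can tell the input itself doesn’t change, only the state tensors and the output width do; please correct me if that’s wrong.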
- I’m pretty sure I have to change my output shape, since I’m getting a
size mismatch
error when calling
y = self.hidden2label(lstm_out[-1])
This is how I’m instantiating my label variable:
label = autograd.Variable(torch.cuda.FloatTensor([int(e) for e in label_to_ix[label].tolist()]))
I have the impression that I would get two outputs (one from each direction), but I don’t know how to modify my label tensor to the correct shape. Can someone help me?
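If it helps, my current understanding (which may be wrong) is that the label tensor doesn’t change at all; only the linear layer’s input size doubles, because the two directions are concatenated in the last dimension of the output. A hypothetical sketch with made-up sizes:

```python
import torch
import torch.nn as nn

hidden_dim, label_size, seq_len = 16, 4, 5

# Bidirectional output concatenates both directions in the last dim,
# so the classifier's in_features doubles; the labels stay as they were.
hidden2label = nn.Linear(hidden_dim * 2, label_size)

# Stand-in for the bidirectional LSTM output: (seq_len, batch=1, 2 * hidden_dim)
lstm_out = torch.randn(seq_len, 1, hidden_dim * 2)
y = hidden2label(lstm_out[-1])
print(y.shape)  # torch.Size([1, 4])
```

That would explain the size mismatch: the old `hidden2label` was built with `in_features=hidden_dim`.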
- Last but not least, is there a way to train in batches without defining a fixed
sentence_length
?
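From what I’ve read, padding plus `pack_padded_sequence` might be the answer here. A minimal sketch of what I mean by variable-length batching (sizes are made up, and I’m feeding pre-embedded tensors directly just to keep it short):

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

embedding_dim, hidden_dim = 8, 16

# Two "sentences" of different lengths, already embedded,
# sorted longest-first as pack_padded_sequence expects by default
seqs = [torch.randn(5, embedding_dim), torch.randn(3, embedding_dim)]
lengths = torch.tensor([5, 3])

padded = pad_sequence(seqs)                     # (max_len=5, batch=2, embedding_dim)
packed = pack_padded_sequence(padded, lengths)  # LSTM skips the padded steps

lstm = nn.LSTM(embedding_dim, hidden_dim, bidirectional=True)
packed_out, _ = lstm(packed)
out, out_lengths = pad_packed_sequence(packed_out)
print(out.shape, out_lengths)  # torch.Size([5, 2, 32]) tensor([5, 3])
```

Is this the idiomatic way to do it, or is there something simpler I’m missing?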
Best,
Lucas.