Hello all,
I’ve managed to implement a simple LSTM for multi-label classification.
Now I want to see if I can improve the results by making it Bidirectional, but I have some questions:
- I understand why I have to change the first dimension of the tensors that represent the hidden state, but I still don’t know HOW (and IF) I should change my input.
Right now it’s a sequence of word indices:
sent = autograd.Variable(torch.cuda.LongTensor([word_to_ix[w] for w in sent.split(' ')]))
Do I have to change it?
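To make the question concrete, here is a minimal CPU sketch of what I think the bidirectional change looks like (all sizes and names here are made up for illustration, not from my actual model):

```python
import torch
import torch.nn as nn

# Hypothetical sizes, CPU tensors so the snippet runs anywhere
embedding_dim, hidden_dim, vocab_size, seq_len = 8, 16, 100, 5

embed = nn.Embedding(vocab_size, embedding_dim)
lstm = nn.LSTM(embedding_dim, hidden_dim, bidirectional=True)

# The input stays a plain sequence of indices, same as the unidirectional case
sent = torch.randint(0, vocab_size, (seq_len,))
x = embed(sent).view(seq_len, 1, embedding_dim)  # (seq_len, batch=1, embedding_dim)

# Hidden/cell state: first dim doubles to num_layers * num_directions = 2
h0 = torch.zeros(2, 1, hidden_dim)
c0 = torch.zeros(2, 1, hidden_dim)

lstm_out, (hn, cn) = lstm(x, (h0, c0))
print(lstm_out.shape)  # torch.Size([5, 1, 32]) -- last dim is 2 * hidden_dim
```

So as far as I can tell the input itself doesn’t change, only the state tensors and the output width do; please correct me if that’s wrong.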
- I’m pretty sure I have to change my output shape, since I’m getting a
size mismatch
error when calling
y = self.hidden2label(lstm_out[-1])
This is how I’m instantiating my label variable:
label = autograd.Variable(torch.cuda.FloatTensor([int(e) for e in label_to_ix[label].tolist()]))
I have the impression that I would get two outputs (one from each direction), but I don’t know how to modify my label tensor to the correct shape. Can someone help me?
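If it helps, my current understanding (which may be wrong) is that the label tensor doesn’t change at all; only the linear layer’s input size doubles, because the two directions are concatenated in the last dimension of the output. A hypothetical sketch with made-up sizes:

```python
import torch
import torch.nn as nn

hidden_dim, label_size, seq_len = 16, 4, 5

# Bidirectional output concatenates both directions in the last dim,
# so the classifier's in_features doubles; the labels stay as they were.
hidden2label = nn.Linear(hidden_dim * 2, label_size)

# Stand-in for the bidirectional LSTM output: (seq_len, batch=1, 2 * hidden_dim)
lstm_out = torch.randn(seq_len, 1, hidden_dim * 2)
y = hidden2label(lstm_out[-1])
print(y.shape)  # torch.Size([1, 4])
```

That would explain the size mismatch: the old `hidden2label` was built with `in_features=hidden_dim`.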
- Last but not least, is there a way to train in batches without defining a fixed
sentence_length
?
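From what I’ve read, padding plus `pack_padded_sequence` might be the answer here. A minimal sketch of what I mean by variable-length batching (sizes are made up, and I’m feeding pre-embedded tensors directly just to keep it short):

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

embedding_dim, hidden_dim = 8, 16

# Two "sentences" of different lengths, already embedded,
# sorted longest-first as pack_padded_sequence expects by default
seqs = [torch.randn(5, embedding_dim), torch.randn(3, embedding_dim)]
lengths = torch.tensor([5, 3])

padded = pad_sequence(seqs)                     # (max_len=5, batch=2, embedding_dim)
packed = pack_padded_sequence(padded, lengths)  # LSTM skips the padded steps

lstm = nn.LSTM(embedding_dim, hidden_dim, bidirectional=True)
packed_out, _ = lstm(packed)
out, out_lengths = pad_packed_sequence(packed_out)
print(out.shape, out_lengths)  # torch.Size([5, 2, 32]) tensor([5, 3])
```

Is this the idiomatic way to do it, or is there something simpler I’m missing?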
Best,
Lucas.