Mini-batching for a model with two-level RNNs

dpernes · June 6, 2017, 1:59pm

Hi all,

I have an NLP model that includes two RNNs: one of them, say RNN_Word, works at the word level and the other, say RNN_Char, works at the character level. The model receives a sentence as input and outputs a label for each word in the sentence.

For each word, the final state of RNN_Char is concatenated with the word embedding and then this concatenated tensor is fed as input to RNN_Word.

I wonder how I can use mini-batching in training. I could group sentences with the same length (in number of words) and then, within each batch of sentences, group words with the same length (in number of characters), but this procedure seems rather inefficient.
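
The two-stage grouping described above can be sketched in plain Python. This only illustrates the bookkeeping; the helper name `bucket_by_length` and the toy sentences are hypothetical and not part of any library.

```python
from collections import defaultdict


def bucket_by_length(items, length_fn=len):
    """Group items into lists keyed by their length (hypothetical helper)."""
    buckets = defaultdict(list)
    for item in items:
        buckets[length_fn(item)].append(item)
    return dict(buckets)


# Toy data: each sentence is a list of words.
sentences = [
    ["the", "cat", "sat"],
    ["a", "dog", "barked"],
    ["hello", "world"],
]

# Stage 1: group sentences by number of words.
sentence_buckets = bucket_by_length(sentences)

# Stage 2: inside each sentence bucket, group words by number of characters.
word_buckets = {
    n_words: bucket_by_length([w for s in group for w in s])
    for n_words, group in sentence_buckets.items()
}
```

Even in this tiny example the three-word bucket splits into three separate word-length groups, which hints at why the approach may leave many small batches on realistic data.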

I think I cannot use padding to force all words to have the same length, because my loss is defined at the word level and the state of RNN_Char would keep changing while I fed padding characters to it.
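
This concern can be made concrete with a toy example, along with one common workaround: pad anyway, keep the true lengths, and skip the state update on padded steps (in PyTorch, `torch.nn.utils.rnn.pack_padded_sequence` serves a similar purpose). The sketch below is plain Python; the recurrence `step`, the padding id `PAD`, and the character ids are made up for illustration and only stand in for RNN_Char.

```python
PAD = 0  # assumed padding id; real character ids start at 1


def step(state, char_id):
    """Toy recurrence standing in for one RNN_Char update (not a real RNN)."""
    return (31 * state + char_id) % 1000003


def final_state_naive(padded_word):
    """Run over every position, padding included."""
    state = 0
    for char_id in padded_word:
        state = step(state, char_id)
    return state


def final_state_masked(padded_word, length):
    """Update the state only on the first `length` (real) positions."""
    state = 0
    for t, char_id in enumerate(padded_word):
        if t < length:  # padded steps leave the state untouched
            state = step(state, char_id)
    return state


word = [3, 1, 20]            # hypothetical character ids for a 3-letter word
padded = word + [PAD, PAD]   # padded to length 5

reference = final_state_naive(word)             # unpadded run
naive = final_state_naive(padded)               # padding changes the state
masked = final_state_masked(padded, len(word))  # matches the unpadded run
```

Here `naive` differs from `reference`, which is exactly the problem described, while `masked` reproduces it, so the final state used for each word is unaffected by how much padding follows.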

What do you suggest?

smth · June 21, 2017, 11:25pm

Your best approach is to group sentences of the same length together (as you mentioned).
Alternatively, you can mini-batch the character level and the word level separately.
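
One possible reading of the second suggestion, sketched in plain Python: flatten every word of every sentence into a single character-level batch, remember which sentence each word came from, and regroup the per-word results afterwards. The helpers `flatten_words` and `regroup`, the padding symbol, and the stand-in `fake_char_rnn` are all hypothetical; a real model would run RNN_Char on the padded batch instead.

```python
PAD = "_"  # assumed padding symbol


def flatten_words(sentences):
    """Return (padded_words, lengths, owners) for all words in the batch."""
    words = [w for sent in sentences for w in sent]
    # owners[k] is the index of the sentence that word k belongs to.
    owners = [i for i, sent in enumerate(sentences) for _ in sent]
    lengths = [len(w) for w in words]
    max_len = max(lengths)
    padded = [w + PAD * (max_len - len(w)) for w in words]
    return padded, lengths, owners


def regroup(per_word, owners, n_sentences):
    """Scatter per-word results back into one list per sentence."""
    grouped = [[] for _ in range(n_sentences)]
    for value, owner in zip(per_word, owners):
        grouped[owner].append(value)
    return grouped


def fake_char_rnn(padded, lengths):
    """Stand-in for RNN_Char: returns the real (unpadded) characters, reversed."""
    return [w[:n][::-1] for w, n in zip(padded, lengths)]


sentences = [["the", "cat", "sat"], ["hello", "world"]]
padded, lengths, owners = flatten_words(sentences)
char_features = regroup(fake_char_rnn(padded, lengths), owners, len(sentences))
```

Under this reading, the character-level batch is formed over words regardless of which sentence they belong to, and the per-sentence lists in `char_features` would then be combined with the word embeddings for the word-level batch.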

dpernes · June 21, 2017, 11:39pm

Thank you for your reply! Yes, but even if I group sentences of the same length together, then, for each group of sentences, I need to group words of the same length together, right?