Stop backpropagation of sequence models for padded sequences

Thomas_Ricatte · January 11, 2021, 9:59am

Hello,

I am working on NLP tagging models (i.e. I have to produce one tag per input token). In this context, I am working with:

padded sequences
BertModel / BiLSTM stack

Ideally, I would like to prevent any padding side effects (i.e. no backpropagation through padded states). What is the correct way to handle this? I found the PackedSequence object, but it’s not clear how it behaves, especially when several models are stacked.

Thanks for your help or any pointers on best practices for this topic!

Abhilash_Srivastava · January 12, 2021, 12:39am

Use masking over the padded sequence.
You can either create your own mask or use masked_fill.
Check out this blog post.
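For what it's worth, here is a minimal sketch of how masking and PackedSequence can be combined for a tagging setup like the one described. It is only an illustration under assumptions: a random tensor stands in for the BertModel output, the sizes and the PAD_TAG label id are made up, and the padding is assumed to be on the right. The mask is built from the sequence lengths, masked_fill marks padded labels so the loss skips them, and the BiLSTM runs on a packed sequence so it never reads padded steps.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)

PAD_TAG = -100  # hypothetical label id for padded positions (ignored by the loss)
batch, max_len, emb_dim, hidden, num_tags = 2, 5, 8, 6, 3
lengths = torch.tensor([5, 3])  # true number of tokens in each sequence

# Stand-in for the encoder output (assumption: replaces BertModel's hidden states)
embeddings = torch.randn(batch, max_len, emb_dim, requires_grad=True)

# Boolean mask: True for real tokens, False for (right-)padded positions
mask = torch.arange(max_len).unsqueeze(0) < lengths.unsqueeze(1)

# Random gold tags; padded positions are overwritten with PAD_TAG via masked_fill
tags = torch.randint(0, num_tags, (batch, max_len)).masked_fill(~mask, PAD_TAG)

lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
classifier = nn.Linear(2 * hidden, num_tags)

# Pack so the BiLSTM only processes real time steps
packed = pack_padded_sequence(
    embeddings, lengths, batch_first=True, enforce_sorted=False
)
packed_out, _ = lstm(packed)
# Unpack back to (batch, max_len, 2 * hidden); padded steps come back as zeros
out, _ = pad_packed_sequence(packed_out, batch_first=True, total_length=max_len)

logits = classifier(out)
# ignore_index drops padded positions from the loss
loss = nn.functional.cross_entropy(
    logits.view(-1, num_tags), tags.view(-1), ignore_index=PAD_TAG
)
loss.backward()

# Gradients that reached the padded input positions
pad_grad = embeddings.grad[~mask]
print(pad_grad.abs().sum().item())
```

In this sketch the gradient on the padded input positions is zero, since the packed BiLSTM never reads them and the loss ignores their labels. With a real BertModel you would also pass the same mask as attention_mask so that real tokens do not attend to padding; the padded rows of its output are then simply discarded by the packing step.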