LSTM layer with masking like in Lasagne/Theano

I’m converting some Theano/Lasagne code to PyTorch, but I’m unsure if an LSTM with masking is possible in PyTorch.
The relevant line in the Theano/Lasagne code is:
l_box_lstm = lasagne.layers.LSTMLayer(l_box, num_units=d_word, mask_input=l_bbmask, only_return_final=True)
where d_word = 256.

Here, l_box has shape mb_size x seq_len x feature_size = 192 x 3 x 256, and l_bbmask has shape mb_size x seq_len = 192 x 3.

I looked at how mask_input is used inside Lasagne’s LSTMLayer, and it does not look like a simple elementwise operation.

Does anyone know whether I can just expand l_bbmask to 192 x 3 x 256 and multiply it elementwise with l_box to get the same result? It seems like the Lasagne code applies the mask at every time step, but I’m not sure, since I don’t understand the Lasagne code that well.
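
For comparison, here is a minimal sketch of one way the masked behaviour might be reproduced in PyTorch. It rests on assumptions that are not confirmed above: that Lasagne carries the previous hidden and cell state through masked steps (which is how its LSTMLayer source reads), that each row of l_bbmask is a run of ones followed by zeros, and that every sequence has at least one valid step. Under those assumptions, packing the batch with pack_padded_sequence and taking the final hidden state should correspond to mask_input plus only_return_final=True. The helper name masked_final_state and the random data are hypothetical, and the weights are PyTorch's own initialisation, not the Lasagne ones.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence

torch.manual_seed(0)

mb_size, seq_len, d_word = 192, 3, 256

# Stand-ins for the real inputs (random data, for illustration only).
l_box = torch.randn(mb_size, seq_len, d_word)
lengths = torch.randint(1, seq_len + 1, (mb_size,))
# Mask of shape (mb_size, seq_len): ones for valid steps, zeros for padding.
l_bbmask = (torch.arange(seq_len)[None, :] < lengths[:, None]).float()

lstm = nn.LSTM(input_size=d_word, hidden_size=d_word, batch_first=True)


def masked_final_state(lstm, x, mask):
    """Hidden state at the last valid step of each sequence (hypothetical helper)."""
    # Assumes a right-padded mask, so the row sum is the sequence length.
    seq_lengths = mask.sum(dim=1).long().cpu()
    packed = pack_padded_sequence(
        x, seq_lengths, batch_first=True, enforce_sorted=False
    )
    _, (h_n, _) = lstm(packed)
    return h_n[-1]  # shape: (mb_size, hidden_size)


with torch.no_grad():
    # Packed version: padded steps are never fed to the LSTM.
    final_packed = masked_final_state(lstm, l_box, l_bbmask)

    # Elementwise-multiply version: padded steps become zero inputs,
    # but the LSTM still updates its state on them.
    out, _ = lstm(l_box * l_bbmask.unsqueeze(-1))
    final_multiplied = out[:, -1]

print(final_packed.shape, final_multiplied.shape)
```

In this sketch the two results agree only for sequences with no padding. For shorter sequences the multiplied version keeps updating the state on zero inputs through the biases and recurrent weights, so zeroing the input alone does not look equivalent to a per-step mask.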