Use MaskedTensor with Embedding layer

Hi all,

I want to train a binary classifier that takes sequences of variable length as input. The sequences are padded when loading the batches using torch.nn.utils.rnn.pad_sequence(inputs) inside a user-defined function _collate_fn_padd used with DataPipes. I can create the data loading module with MaskedTensor successfully. The problem comes when defining the architecture and trying to do a forward pass.
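For context, my collate function looks roughly like the sketch below (names and the exact batch structure are illustrative, not my actual code): it pads the variable-length sequences and builds a boolean mask marking the real positions.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Rough sketch of a padding collate function like my _collate_fn_padd.
def collate_fn_padd(batch):
    # batch: list of dicts, each with a variable-length 1-D "sequence" tensor
    seqs = [item["sequence"] for item in batch]
    lengths = torch.tensor([len(s) for s in seqs])
    padded = pad_sequence(seqs, batch_first=True, padding_value=0)  # (B, T_max)
    # True where a position holds real data, False where it is padding
    mask = torch.arange(padded.size(1))[None, :] < lengths[:, None]
    return {"sequence": padded, "mask": mask, "lengths": lengths}

batch = [
    {"sequence": torch.tensor([1, 2, 3])},
    {"sequence": torch.tensor([4, 5])},
]
out = collate_fn_padd(batch)
# out["sequence"] is (2, 3); the second row is padded with 0 at the last position
```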

My architecture is a transformer model that starts with an embedding layer, but I can't find a way to use a MaskedTensor object in this setup. Running

model_args = dict(
    ntoken=len(_MOL_WEIGHTS) + 1 - 8,
    # ... (remaining arguments omitted)
)
model = TransformerModule(**model_args)
summary(model, example_masked[0]["sequence"], batch_size=-1)


.../env/lib/python3.10/site-packages/torch/masked/maskedtensor/...: UserWarning: embedding is not implemented in __torch_dispatch__ for MaskedTensor.
If you would like this operator to be supported, please file an issue for a feature request at ... with a minimal reproducible code snippet.
In the case that the semantics for the operator are not trivial, it would be appreciated to also include a proposal for the semantics.

Is there something I can do, or do I simply have to forget about MaskedTensor and use regular masks kept in a buffer?
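For reference, the regular-mask fallback I have in mind looks roughly like the sketch below: index nn.Embedding with the plain padded tensor and pass a boolean padding mask as src_key_padding_mask so attention ignores the padded positions. All names and sizes here (vocab_size, d_model, etc.) are made up for illustration, not my actual model.

```python
import torch
import torch.nn as nn

vocab_size, d_model, pad_idx = 100, 16, 0
embedding = nn.Embedding(vocab_size, d_model, padding_idx=pad_idx)
encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)

padded = torch.tensor([[1, 2, 3], [4, 5, pad_idx]])      # (B=2, T=3), 0 is padding
key_padding_mask = padded.eq(pad_idx)                    # True at padded positions

x = embedding(padded)                                    # works on a regular tensor
out = encoder(x, src_key_padding_mask=key_padding_mask)  # attention skips padding
```

This avoids MaskedTensor entirely: the mask travels alongside the data tensor through the batch instead of being fused into one object.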

Any help is greatly appreciated,