Use MaskedTensor with Embedding layer

Hi all,

I want to train a binary classifier that takes variable-length sequences as input. The sequences are padded when the batches are loaded, using torch.nn.utils.rnn.pad_sequence(inputs) inside a user-defined collate function _collate_fn_padd used with DataPipes. I can create the data loading module with MaskedTensor successfully. The problem comes when I define the architecture and try to do a forward pass.
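For context, the collate function does roughly the following (a simplified sketch; the dict key, padding value and variable names are just illustrative):

import torch
from torch.nn.utils.rnn import pad_sequence
from torch.masked import masked_tensor

def _collate_fn_padd(batch):
    # each item carries a variable-length 1-D LongTensor of token ids
    seqs = [item["sequence"] for item in batch]
    lengths = torch.tensor([len(s) for s in seqs])

    # pad to the longest sequence in the batch -> shape (batch, max_len)
    padded = pad_sequence(seqs, batch_first=True, padding_value=0)

    # True where a position holds a real token, False where it is padding
    mask = torch.arange(padded.size(1))[None, :] < lengths[:, None]

    return {"sequence": masked_tensor(padded, mask)}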

My architecture is a transformer model that starts with an embedding layer. The problem is that I can't find a way to pass a MaskedTensor object through it. The output of

model_args = dict(
    ntoken=len(_MOL_WEIGHTS) + 1 - 8,
    d_model=512,
    nhead=8,
    nlayers=2,
    d_hid=128,
    dropout=0.2,
)
model = TransformerModule(**model_args)
summary(model, example_masked[0]["sequence"], batch_size=-1)  # the "sequence" entry is a MaskedTensor batch

is

.../env/lib/python3.10/site-packages/torch/masked/maskedtensor/core.py:299: UserWarning: embedding is not implemented in __torch_dispatch__ for MaskedTensor.
If you would like this operator to be supported, please file an issue for a feature request at https://github.com/pytorch/maskedtensor/issues with a minimal reproducible code snippet.
In the case that the semantics for the operator are not trivial, it would be appreciated to also include a proposal for the semantics.
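A minimal snippet along these lines should hit the same unimplemented embedding fallback (sketch only, not my actual model):

import torch
from torch.masked import masked_tensor

emb = torch.nn.Embedding(num_embeddings=30, embedding_dim=512)

tokens = torch.tensor([[1, 2, 3, 0]])              # 0 is the padding id here
mask = torch.tensor([[True, True, True, False]])   # False marks padding

out = emb(masked_tensor(tokens, mask))             # should trigger the warning above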

Is there something I can do, or do I simply have to forget about MaskedTensor and use regular masks with a buffer?
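In case it helps frame the question, this is roughly the fallback I have in mind (a sketch assuming a padding id of 0 and src_key_padding_mask, with the mask computed on the fly rather than kept in a buffer; it is not my actual TransformerModule):

import torch
import torch.nn as nn

class PaddedTransformer(nn.Module):
    def __init__(self, ntoken, d_model, nhead, d_hid, nlayers, dropout):
        super().__init__()
        self.embedding = nn.Embedding(ntoken, d_model, padding_idx=0)
        layer = nn.TransformerEncoderLayer(d_model, nhead, d_hid, dropout,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, nlayers)
        self.classifier = nn.Linear(d_model, 1)

    def forward(self, tokens):                      # (batch, seq_len), padded with 0
        pad_mask = tokens.eq(0)                     # True at padded positions
        x = self.encoder(self.embedding(tokens), src_key_padding_mask=pad_mask)
        # mean-pool only over real (non-padded) positions
        keep = (~pad_mask).unsqueeze(-1)
        pooled = (x * keep).sum(dim=1) / keep.sum(dim=1).clamp(min=1)
        return self.classifier(pooled)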

Any help is greatly appreciated,

Cheers.
