I want to train a binary classifier that takes variable-length sequences as input. The sequences are padded when loading the batches, using `torch.nn.utils.rnn.pad_sequence(inputs)` inside a user-defined function `_collate_fn_padd` that is used with DataPipes. I can create the data loading module with MaskedTensor successfully; the problem comes when defining the architecture and trying to do a forward pass.
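The collate function looks roughly like this (a minimal sketch of the idea, not my exact code):

```python
import torch
from torch.masked import masked_tensor
from torch.nn.utils.rnn import pad_sequence

def _collate_fn_padd(batch):
    # batch: list of variable-length 1D LongTensors of token indices
    lengths = torch.tensor([seq.size(0) for seq in batch])
    padded = pad_sequence(batch, batch_first=True, padding_value=0)  # (batch, max_len)

    # True for real tokens, False for padded positions
    mask = torch.arange(padded.size(1))[None, :] < lengths[:, None]

    return {"sequence": masked_tensor(padded, mask)}
```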
My architecture is a transformer model that starts with an embedding layer. The problem is that I can’t find a way to pass a MaskedTensor object through this setup. The following code:
```python
model_args = dict(
    ntoken=len(_MOL_WEIGHTS) + 1 - 8,
    d_model=512,
    nhead=8,
    nlayers=2,
    d_hid=128,
    dropout=0.2,
)
model = TransformerModule(**model_args)
summary(model, example_masked["sequence"], batch_size=-1)
```
raises this warning:

```text
.../env/lib/python3.10/site-packages/torch/masked/maskedtensor/core.py:299: UserWarning: embedding is not implemented in __torch_dispatch__ for MaskedTensor. If you would like this operator to be supported, please file an issue for a feature request at https://github.com/pytorch/maskedtensor/issues with a minimal reproducible code snippet. In the case that the semantics for the operator are not trivial, it would be appreciated to also include a proposal for the semantics.
```
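For context, the forward pass that triggers this starts with `nn.Embedding` receiving the MaskedTensor of token indices. The module looks roughly like this (a simplified sketch, not the exact code; positional encoding and the classification head are stripped down):

```python
import math
import torch
from torch import nn

class TransformerModule(nn.Module):
    def __init__(self, ntoken, d_model, nhead, d_hid, nlayers, dropout=0.2):
        super().__init__()
        self.d_model = d_model
        self.embedding = nn.Embedding(ntoken, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model, nhead, dim_feedforward=d_hid, dropout=dropout, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, nlayers)
        self.classifier = nn.Linear(d_model, 1)  # binary classifier head

    def forward(self, src):
        # src: (batch, seq_len) token indices -- passing a MaskedTensor here
        # is what triggers the "embedding is not implemented" warning
        x = self.embedding(src) * math.sqrt(self.d_model)
        x = self.encoder(x)
        return self.classifier(x.mean(dim=1))
```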
Is there something I can do, or do I have to simply forget about MaskedTensor and use regular masks (e.g., kept as a buffer)?
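For reference, the fallback I have in mind is to carry the padded data and the boolean mask as two plain tensors and feed the mask to the encoder through `src_key_padding_mask`, roughly like this (a sketch using the simplified module above):

```python
# Split the MaskedTensor back into plain tensors (or build them directly in the collate fn)
padded = example_masked["sequence"].get_data()   # regular LongTensor of token indices
mask = example_masked["sequence"].get_mask()     # True for real tokens, False for padding

emb = model.embedding(padded)                    # works: padded is an ordinary tensor
# TransformerEncoder ignores positions where src_key_padding_mask is True,
# so the padding mask has to be inverted
out = model.encoder(emb, src_key_padding_mask=~mask)
```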
Any help is greatly appreciated,