I want to train a binary classifier that takes variable-length sequences as input. The sequences are padded when loading the batches, using `torch.nn.utils.rnn.pad_sequence(inputs)` inside a user-defined function `_collate_fn_padd` that is used with DataPipes. I can create the data loading module with MaskedTensor successfully; the problem comes when defining the architecture and trying to do a forward pass.
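The collate function looks roughly like this (a minimal sketch of the idea, not my exact code):

```python
import torch
from torch.masked import masked_tensor
from torch.nn.utils.rnn import pad_sequence

def _collate_fn_padd(batch):
    # batch: list of variable-length 1D LongTensors of token indices
    lengths = torch.tensor([seq.size(0) for seq in batch])
    padded = pad_sequence(batch, batch_first=True, padding_value=0)  # (batch, max_len)

    # True for real tokens, False for padded positions
    mask = torch.arange(padded.size(1))[None, :] < lengths[:, None]

    return {"sequence": masked_tensor(padded, mask)}
```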
My architecture is a transformer model that starts with an embedding layer. The problem is that I can’t find a way to pass a MaskedTensor object through this setup. The following code:
```python
model_args = dict(
    ntoken=len(_MOL_WEIGHTS) + 1 - 8,
    d_model=512,
    nhead=8,
    nlayers=2,
    d_hid=128,
    dropout=0.2,
)
model = TransformerModule(**model_args)
summary(model, example_masked["sequence"], batch_size=-1)
```
raises this warning:

```text
.../env/lib/python3.10/site-packages/torch/masked/maskedtensor/core.py:299: UserWarning: embedding is not implemented in __torch_dispatch__ for MaskedTensor. If you would like this operator to be supported, please file an issue for a feature request at https://github.com/pytorch/maskedtensor/issues with a minimal reproducible code snippet. In the case that the semantics for the operator are not trivial, it would be appreciated to also include a proposal for the semantics.
```
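For context, the forward pass that triggers this starts with `nn.Embedding` receiving the MaskedTensor of token indices. The module looks roughly like this (a simplified sketch, not the exact code; positional encoding and the classification head are stripped down):

```python
import math
import torch
from torch import nn

class TransformerModule(nn.Module):
    def __init__(self, ntoken, d_model, nhead, d_hid, nlayers, dropout=0.2):
        super().__init__()
        self.d_model = d_model
        self.embedding = nn.Embedding(ntoken, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model, nhead, dim_feedforward=d_hid, dropout=dropout, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, nlayers)
        self.classifier = nn.Linear(d_model, 1)  # binary classifier head

    def forward(self, src):
        # src: (batch, seq_len) token indices -- passing a MaskedTensor here
        # is what triggers the "embedding is not implemented" warning
        x = self.embedding(src) * math.sqrt(self.d_model)
        x = self.encoder(x)
        return self.classifier(x.mean(dim=1))
```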
Is there something I can do, or do I have to simply forget about MaskedTensor and use regular masks (e.g., kept as a buffer)?
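For reference, the fallback I have in mind is to carry the padded data and the boolean mask as two plain tensors and feed the mask to the encoder through `src_key_padding_mask`, roughly like this (a sketch using the simplified module above):

```python
# Split the MaskedTensor back into plain tensors (or build them directly in the collate fn)
padded = example_masked["sequence"].get_data()   # regular LongTensor of token indices
mask = example_masked["sequence"].get_mask()     # True for real tokens, False for padding

emb = model.embedding(padded)                    # works: padded is an ordinary tensor
# TransformerEncoder ignores positions where src_key_padding_mask is True,
# so the padding mask has to be inverted
out = model.encoder(emb, src_key_padding_mask=~mask)
```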
Any help is greatly appreciated,