Layer support for the MaskedTensor type

Hi there!

I’m currently working with tensors of different shapes. The MaskedTensor prototype looks like a great solution: it offers everything I need to concatenate tensors of different shapes while masking the padded elements, i.e. the entries added to inputs that are smaller than the maximum shape along the corresponding dimension.
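For readers who want to follow along, here is a minimal sketch of the setup described above. The tensors `a` and `b` and the zero-padding scheme are made up for illustration; `masked_tensor` is the prototype constructor from `torch.masked`, so its behavior may change between releases.

```python
import torch
from torch.masked import masked_tensor

# Two hypothetical inputs of different lengths.
a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([4.0, 5.0])

# Pad both to the maximum length and record which entries are real.
max_len = max(a.shape[0], b.shape[0])
data = torch.zeros(2, max_len)
mask = torch.zeros(2, max_len, dtype=torch.bool)
for i, t in enumerate((a, b)):
    data[i, : t.shape[0]] = t
    mask[i, : t.shape[0]] = True  # True marks a valid (non-padded) element

batch = masked_tensor(data, mask)

# Reductions only consider the valid elements of each row.
row_sums = torch.sum(batch, 1)
```

Calling `batch.get_data()` and `batch.get_mask()` returns the underlying dense tensor and the boolean mask, respectively.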

However, when I forward them through standard PyTorch layers, I get errors like:

In convolution:
TypeError: no implementation found for 'torch._ops.aten.convolution.default' on types that implement __torch_dispatch__: [<class 'torch.masked.maskedtensor.core.MaskedTensor'>]

In linear:
TypeError: no implementation found for '' on types that implement __torch_dispatch__: [<class 'torch.masked.maskedtensor.core.MaskedTensor'>]

I saw that most (if not all, at least for the linear layer) of the tensor operations required to build these layers are supported. Yet the layers fail to operate on MaskedTensor. I assume this is because the layers are not implemented in terms of the basic operations that MaskedTensor supports.

If this is the case, is there any work going into making layers support MaskedTensor? This would relieve users of having to worry about ill-conditioned gradients when a tensor is mostly padded with zeros and only plain autograd is used. (NB: My original issue is a division by zero in a weight normalization backward hook.)
Otherwise, how can we use MaskedTensor in basic layers such as convolution and linear?
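The gradient problem mentioned above can be reproduced in a few lines with plain tensors. This is only an illustration with a made-up batch and a hand-written L2 normalization, not the actual weight normalization hook from the post: a fully padded row divides by zero, and the NaN leaks into the gradient even though that row never contributes to the loss. Selecting the valid rows before normalizing is one manual way around it.

```python
import torch

# Hypothetical padded batch: the second row is pure padding (all zeros).
x = torch.tensor([[3.0, 4.0], [0.0, 0.0]], requires_grad=True)
valid = torch.tensor([True, False])  # True marks rows that hold real data

# Naive version: normalize every row, then drop the padded row from the loss.
norms = torch.sqrt((x ** 2).sum(dim=1, keepdim=True))
naive_loss = (x / norms)[valid].sum()
naive_loss.backward()
naive_grad = x.grad.clone()  # the padded row still receives NaN gradients

# Manual workaround: drop the padded row before normalizing.
x.grad = None
xv = x[valid]
safe_loss = (xv / torch.sqrt((xv ** 2).sum(dim=1, keepdim=True))).sum()
safe_loss.backward()
safe_grad = x.grad  # the padded row receives exact zeros
```

Both losses have the same value; only the gradients for the padded row differ.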

Thanks! PyTorch is an amazing tool. 99% of the time I can do what I want without overthinking (everything has been thought of already for my use cases). This is just on the 1% wishlist :smile: and is probably being worked on already.

You may be interested in NestedTensor, which may support the operators you need; see torch.nested in the PyTorch 2.0 documentation.
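As a rough sketch of what that suggestion could look like for the linear case: the shapes and layer sizes below are invented for illustration, and `torch.nested` is itself a prototype API, so operator coverage (convolution in particular) should be checked against the documentation for your version.

```python
import torch

torch.manual_seed(0)

# Two hypothetical sequences of different lengths, each with 4 features.
a = torch.randn(3, 4)
b = torch.randn(2, 4)

# A nested tensor stores both without any padding.
nt = torch.nested.nested_tensor([a, b])

layer = torch.nn.Linear(4, 2)
out = layer(nt)  # applied to each component along its last dimension

# Recover the per-sequence results, or pad explicitly when a dense batch is needed.
parts = out.unbind()
padded = torch.nested.to_padded_tensor(out, 0.0)
```

Here `parts` is a tuple of ordinary tensors, one per input sequence, and `padded` is a dense tensor with zeros in the padded positions.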


This is very interesting, thank you!

Hello, I am experiencing the same issue. Have you found a solution yet?