Type Error while Scripting Learned Positional Embedding

I am trying to script LearnedPositionalEmbedding from Fairseq:

import torch.nn as nn
import torch
import torch.jit
from fairseq import utils

class LearnedPositionalEmbedding(nn.Embedding):
    def __init__(self, num_embeddings, embedding_dim, padding_idx, left_pad):
        super().__init__(num_embeddings, embedding_dim, padding_idx)
        self.left_pad = left_pad
        # self.register_buffer('padding_idx_', padding_idx)

    def forward(self, input_, incremental_state=None):
        """Input is expected to be of size [bsz x seqlen]."""
        if incremental_state is not None:
            positions = input_.data.new(1, 1).fill_(self.padding_idx_ + input_.size(1))
        else:
            positions = utils.make_positions(input_.data, self.padding_idx_, self.left_pad)
        return super().forward(positions)

    def max_positions(self):
        """Maximum number of supported positions."""
        return self.num_embeddings - self.padding_idx_ - 1

padding_idx = torch.tensor([[1]], dtype=torch.long)
model = LearnedPositionalEmbedding(4, 5, padding_idx, False)
model_scripted = torch.jit.script(model)

And I get the following error:

TypeError: 
'Tensor' object for attribute 'padding_idx' is not a valid constant.
Valid constants are:
  1. a nn.ModuleList
  2. a value of type {bool, float, int, str, NoneType, function, device, layout, dtype}
  3. a list or tuple of (2)

Even issue #16284 did not help.

In the nn.Embedding provided by PyTorch, padding_idx is an int that doesn't change, so we have added it to __constants__ in nn.Embedding (code here). Since LearnedPositionalEmbedding does not override __constants__, it inherits the list from nn.Embedding.
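
You can check the inherited list directly; a quick sketch (the exact contents may differ slightly between PyTorch versions):

import torch.nn as nn

# Subclasses that don't define their own __constants__ inherit this list,
# which includes padding_idx.
print(nn.Embedding.__constants__)
# e.g. ['num_embeddings', 'embedding_dim', 'padding_idx', 'max_norm',
#       'norm_type', 'scale_grad_by_freq', 'sparse']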

If you want padding_idx to be a Tensor (which is not a supported constant type), you'll have to provide your own __constants__ that removes padding_idx but keeps the rest, something like:

import torch.nn as nn
from fairseq import utils

class LearnedPositionalEmbedding(nn.Embedding):
    # Same list as nn.Embedding's __constants__, minus padding_idx, so the
    # TorchScript compiler stops treating the Tensor attribute as a constant.
    __constants__ = ['num_embeddings', 'embedding_dim', 'max_norm',
                     'norm_type', 'scale_grad_by_freq', 'sparse']

    def __init__(self, num_embeddings, embedding_dim, padding_idx, left_pad):
        super().__init__(num_embeddings, embedding_dim, padding_idx)
        self.left_pad = left_pad
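        # Sketch (assumption, mirroring the register_buffer line commented out
        # in your snippet): keep the tensor around as a buffer, which the
        # TorchScript compiler treats as a regular tensor attribute rather
        # than a constant, so forward can still read it after scripting.
        self.register_buffer('padding_idx_', padding_idx)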