PyTorch equivalent for tf.sequence_mask

I’m dealing with variable-length sequences and I need to apply the mask to a bunch of different tensors. Is there a PyTorch API that provides the same functionality as tf.sequence_mask?

Thanks a lot!

import torch

def sequence_mask(lengths, maxlen=None, dtype=torch.bool):
    if maxlen is None:
        maxlen = lengths.max()
    # Row i is True for the first lengths[i] positions, False afterwards
    mask = ~(torch.ones((len(lengths), maxlen)).cumsum(dim=1).t() > lengths).t()
    return mask.type(dtype)

Would this work for you? torch.bool is only available from version 1.2 onward. If you use something older, try a good old int type.
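A quick self-contained check of the cumsum trick above (the lengths here are just made-up example values):

```python
import torch

lengths = torch.tensor([2, 4, 1])
maxlen = 5
# cumsum over a row of ones gives positions 1..maxlen; comparing against
# each length (after transposing so broadcasting lines up) marks the padding
mask = ~(torch.ones((len(lengths), maxlen)).cumsum(dim=1).t() > lengths).t()
print(mask)
# tensor([[ True,  True, False, False, False],
#         [ True,  True,  True,  True, False],
#         [ True, False, False, False, False]])
```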


You can use torch.masked_select.
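For instance, combining a sequence mask with torch.masked_select pulls out only the valid (non-padded) elements; note that masked_select returns a flattened 1-D tensor (the values below are illustrative):

```python
import torch

x = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.]])
lengths = torch.tensor([2, 1])
# True for the first lengths[i] positions of row i
mask = torch.arange(x.size(1))[None, :] < lengths[:, None]
# masked_select flattens: a 1-D tensor of the kept elements, row-major order
selected = torch.masked_select(x, mask)
print(selected)  # tensor([1., 2., 4.])
```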

This implementation gives a different output from tf.sequence_mask when the lengths input is 2-dimensional,

e.g. when lengths is array([[1], [3], [2], [4]]).

So I improved the implementation as follows:

    def sequence_mask(lengths, maxlen=None, dtype=torch.bool):
        if maxlen is None:
            maxlen = lengths.max()
        row_vector = torch.arange(0, maxlen, 1)
        # Add a trailing dim so the comparison broadcasts over positions,
        # which also handles lengths of rank > 1 like tf.sequence_mask
        matrix = torch.unsqueeze(lengths, dim=-1)
        mask = row_vector < matrix
        return mask.type(dtype)

Hope this helps those who want to use tf.sequence_mask in PyTorch as well.
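A self-contained sketch of why the broadcasting version matches tf.sequence_mask on 2-D lengths: the mask dimension is appended after the existing dims, so lengths of shape (4, 1) yield a mask of shape (4, 1, maxlen):

```python
import torch

lengths = torch.tensor([[1], [3], [2], [4]])  # 2-D lengths, as in the example above
maxlen = 4
row_vector = torch.arange(0, maxlen, 1)        # shape (4,)
matrix = torch.unsqueeze(lengths, dim=-1)      # shape (4, 1, 1)
mask = row_vector < matrix                     # broadcasts to (4, 1, 4)
print(mask.shape)  # torch.Size([4, 1, 4])
```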


mask = torch.arange(maxlen)[None, :] < lengths[:, None]
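This broadcasting one-liner is the most compact version of the same idea. A quick self-contained check (the example lengths are made up):

```python
import torch

lengths = torch.tensor([1, 3, 2])
maxlen = int(lengths.max())
# (1, maxlen) positions compared against (batch, 1) lengths broadcasts
# to a (batch, maxlen) boolean mask
mask = torch.arange(maxlen)[None, :] < lengths[:, None]
print(mask)
# tensor([[ True, False, False],
#         [ True,  True,  True],
#         [ True,  True, False]])
```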
