Creating a mask tensor from an index tensor

I’m trying to create a mask based on an index tensor.

The mask size is [6, 1, 25]
The index size is [6, 1, 12]

First I have an index tensor indices:

print(indices)
tensor([[[ 0,  1,  2,  5,  6,  7, 12, 17, 18, 22, 23, 21]],

        [[ 2,  3,  4,  7,  8,  9, 15, 16, 20, 21, 22, 13]],

        [[ 0,  1,  5,  6, 10, 11, 15, 16, 17, 20, 21, 12]],

        [[ 1, 10, 15, 16, 17, 18, 20, 21, 22, 23, 24,  2]],

        [[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 13]],

        [[ 3,  4,  8,  9, 13, 14, 18, 19, 22, 23, 24, 17]]], device='cuda:0')

Then I allocate a mask tensor

mask = torch.cuda.ByteTensor(6, 1, 25).zero_()

Now I want to set mask[0, 0, j] = 1 for every j in indices[0, 0] (i.e. positions 0, 1, 2, 5, …, 21), and likewise for the other rows, leaving all other entries 0.

I have tried:

mask[indices] = 1

I have also tried adapting solutions given in similar threads to this problem, without success.

Any help would be appreciated, thank you.


I think you could use scatter_:

mask = torch.zeros(6, 1, 25)
mask.scatter_(2, indices, 1.)
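As a sanity check, here is a minimal runnable sketch of this approach on CPU, using just the first row of the indices from the question (an assumption for illustration); on GPU you would allocate the mask on the same device and with the dtype you need, e.g. torch.zeros(6, 1, 25, dtype=torch.uint8, device=indices.device):

```python
import torch

# First row of the indices from the question (assumed here for illustration).
indices = torch.tensor([[[0, 1, 2, 5, 6, 7, 12, 17, 18, 22, 23, 21]]])  # shape [1, 1, 12]

mask = torch.zeros(1, 1, 25, dtype=torch.uint8)
mask.scatter_(2, indices, 1)  # write 1 at each listed position along dim 2

print(mask[0, 0])  # 1s at exactly the twelve listed positions
```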

Wowowow that works, thank you so much @ptrblck!


Hi,

How can I do the reverse operation, meaning I want to convert a mask into indices.
For example: [0, 1, 0, 1] -> [1, 3]
Thank you,


.nonzero() should do the trick.
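For the example above, a quick sketch: nonzero() returns one row per nonzero element, while as_tuple=True gives plain 1-D index tensors.

```python
import torch

mask = torch.tensor([0, 1, 0, 1])

idx = mask.nonzero()                   # tensor([[1], [3]]), shape [2, 1]
flat = mask.nonzero(as_tuple=True)[0]  # tensor([1, 3]), 1-D
```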


Hi @ptrblck,

How do I convert a 2-D mask to indices while retaining the dimensions?
For example, if my mask is of shape Nx1000, how do I use it to create an index tensor of size (N,), given that each sample in the batch may not have the same number of masked elements?

Thanks!

You won’t be able to retain the same shape or dimensions.
nonzero will return a 2-dimensional tensor where each row holds the indices of one nonzero element (one column per dimension); alternatively, with as_tuple=True, you can get these indices as a tuple of 1-D tensors.

How should the result tensor look for your use case?
E.g. assuming that not all rows in N contain a nonzero value, what should be stored at this particular index in dim0?

Let me first describe my use case. I have a tensor of shape NxCxseq_len and an index tensor of shape NxK (K indices per batch sample), where the values index into the seq_len axis, and I need to pick out those specific positions. I am currently looping over the columns of the K axis to retrieve one sample at a time, but I feel there must be an easier way. Note that the indices are not the same for each sample in the batch.

So my resultant tensor should be of the shape NxCxK

It seems you want to index a tensor of shape [N, C, seq_len] with another tensor of shape [N, K], so I’m unsure where a mask comes in.
If I understand the use case correctly, this should work:

N, C, seq_len, K = 2, 3, 4, 5

x = torch.randn(N, C, seq_len)
idx = torch.randint(0, seq_len, (N, K))
result = x[torch.arange(N)[:, None, None], torch.arange(C)[None, :, None], idx.unsqueeze(1)]
print(result.shape)
> torch.Size([2, 3, 5])
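An equivalent formulation (a sketch, assuming the same shapes as above) uses torch.gather, which avoids building the two arange index tensors by expanding idx across the channel dimension instead:

```python
import torch

N, C, seq_len, K = 2, 3, 4, 5
x = torch.randn(N, C, seq_len)
idx = torch.randint(0, seq_len, (N, K))

# idx.unsqueeze(1) -> [N, 1, K]; expand across C -> [N, C, K]; gather along seq_len.
result = x.gather(2, idx.unsqueeze(1).expand(-1, C, -1))
print(result.shape)  # torch.Size([2, 3, 5])
```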

Hi @ptrblck,

Is it possible to use scatter_ with some probability? I want to do the same, but instead of masking all indices from the indices matrix, I want to mask each index only with probability 15%.

If it can’t be done using scatter, how could I do it?

Thanks!

scatter_ doesn’t directly take a probability, but you could probably create the indices by sampling with a specific probability. Would this work for you?
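One way to sketch that (assuming “with probability 15%” means each index is kept independently with p = 0.15) is to draw a Bernoulli keep-mask over the index tensor and scatter its values:

```python
import torch

torch.manual_seed(0)
indices = torch.tensor([[[0, 1, 2, 5, 6, 7, 12, 17, 18, 22, 23, 21]]])  # example indices
mask = torch.zeros(1, 1, 25)

keep = torch.rand(indices.shape) < 0.15  # True with probability 0.15 per index
mask.scatter_(2, indices, keep.float())  # writes 1. at kept positions, 0. elsewhere
```

Since the positions in each row are distinct here, mask.sum() equals the number of kept indices.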
