Mean Value Along Dim with Mask

J_Johnson · September 10, 2022, 3:51pm

Suppose we have a matrix as follows:

A=torch.arange(100).view(20,5)

And suppose we also have a mask to apply, such as:

B=torch.rand(100).view(20, 5)

mask=B>0.1

Now suppose we wish to get the mean along dim=1, but with the mask applied:

print(torch.mean(A[mask], dim=1))

But this gives an error, because indexing with a boolean mask flattens the matrix into a 1-D vector, so there is no dim=1 left to reduce over.
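
To make the shape problem concrete, here is a minimal sketch using the same A and a mask built the same way. The name `flat` is only for illustration:

```python
import torch

A = torch.arange(100).view(20, 5)
mask = torch.rand(20, 5) > 0.1

# Boolean indexing keeps only the selected elements and returns them
# as a 1-D tensor, so the row structure of A is lost.
flat = A[mask]

print(A.dim())     # 2
print(flat.dim())  # 1
```

Since `flat` has only one dimension, a reduction over dim=1 has nothing to work on.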

What is the best way to apply a mask and get dim-wise mean values?

srishti-git1110 · September 10, 2022, 7:22pm

Use this -

D = torch.where(mask, A, 0).type(torch.float32)
torch.mean(D, dim=1)

KFrank · September 10, 2022, 11:09pm

Hi Srishti (and J)!

This replaces masked elements with 0.0 that then get mixed in with the
mean() computation, diluting it. Based on the pseudocode J posted, I don’t
think this is what he wants.
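
A small sketch of the dilution, using a made-up single row rather than J's data (the values and the names `zero_filled` and `masked` are just for illustration):

```python
import torch

row = torch.tensor([[1.0, 2.0, 3.0]])
mask = torch.tensor([[True, True, False]])

# Zero-fill approach: the masked-out 3.0 becomes 0.0 but still counts
# toward the denominator, so the mean is (1 + 2 + 0) / 3.
zero_filled = torch.where(mask, row, torch.zeros_like(row)).mean(dim=1)

# Masked sum divided by masked count: (1 + 2) / 2.
masked = (row * mask).sum(dim=1) / mask.sum(dim=1)

print(zero_filled)  # tensor([1.])
print(masked)       # tensor([1.5000])
```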

Computing the mean in terms of a masked sum and a masked count gives
what I think is the desired result:

>>> import torch
>>> print (torch.__version__)
1.12.0
>>>
>>> _ = torch.manual_seed (2022)
>>>
>>> A = torch.arange (100).view (20, 5)
>>> B = torch.rand (100).view (20, 5)
>>> mask = B > 0.1
>>>
>>> masked_mean = (mask * A).sum (dim = 1) / mask.sum (dim = 1)
>>>
>>> masked_mean
tensor([ 1.5000,  7.0000, 12.0000, 17.0000, 22.0000, 27.2500, 32.0000, 37.0000,
        42.0000, 47.0000, 52.5000, 57.0000, 62.0000, 67.0000, 71.7500, 76.7500,
        82.0000, 87.0000, 92.0000, 97.0000])

Note, if an entire row is masked out, the masked sum and the masked count
are both zero, so the resulting mean will be nan (0 / 0), which is as good
an exceptional value as any.
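
Here is a sketch of that edge case with made-up values. The clamp-based fallback at the end is only one possible choice, assuming you would rather get 0.0 than nan for an empty row; it is not needed if nan is acceptable:

```python
import torch

A = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])
mask = torch.tensor([[True, False, True],
                     [False, False, False]])  # second row fully masked out

total = (A * mask).sum(dim=1)  # masked sum per row
count = mask.sum(dim=1)        # number of kept elements per row

# The second row divides 0 by 0 and comes out as nan.
masked_mean = total / count

# Optional fallback (an assumption, not from the thread): clamp the count
# to at least 1 so a fully masked row yields 0.0 instead of nan.
safe_mean = total / count.clamp(min=1)

print(masked_mean)
print(safe_mean)
```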

Best.

K. Frank

J_Johnson · September 11, 2022, 2:00am

This is exactly what I was looking for. The actual implementation will not have any empty rows, so NaNs won’t be an issue. Thank you.