J_Johnson
(J Johnson)
September 10, 2022, 3:51pm
1
Suppose we have a matrix as follows:
A=torch.arange(100).view(20,5)
And suppose we also have a mask to apply, such as:
B=torch.rand(100).view(20, 5)
mask=B>0.1
Now suppose we want the mean along dim=1, computed over only the elements the mask keeps:
print(torch.mean(A[mask], dim=1))
But this gives an error, because indexing with a boolean mask flattens the result into a 1-D tensor, so there is no dim=1 left to reduce over.
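A minimal sketch of the flattening, using a made-up 2x3 tensor and a hand-written mask rather than the 20x5 tensors above, so the result is easy to check by eye:

```python
import torch

# Hypothetical small example (not the tensors from the question).
A = torch.arange(6).view(2, 3)          # [[0, 1, 2], [3, 4, 5]]
mask = torch.tensor([[True, False, True],
                     [False, True, True]])

# Boolean-mask indexing returns only the selected elements, as a 1-D tensor.
selected = A[mask]
print(selected.shape)     # torch.Size([4])
print(selected.tolist())  # [0, 2, 4, 5]
```

The row structure is gone after indexing, which is why a reduction over dim=1 cannot work on the result.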
What is the best way to apply a mask and get dim-wise mean values?
Use this:
D = torch.where(mask, A, 0).type(torch.float32)
torch.mean(D, dim=1)
KFrank
(K. Frank)
September 10, 2022, 11:09pm
3
Hi Srishti (and J)!
This replaces masked elements with 0.0, which then get mixed into the
mean() computation and dilute it. Based on the pseudocode J posted, I don’t
think this is what he wants.
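To make the dilution concrete, here is a small sketch on a made-up 2x3 tensor (the values and mask are illustrative assumptions, not taken from the thread):

```python
import torch

# Hypothetical example, chosen so the arithmetic can be checked by hand.
A = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.]])
mask = torch.tensor([[True, True, False],
                     [True, True, True]])

# Zero-filling keeps the masked element in the denominator.
zero_filled = torch.where(mask, A, torch.zeros_like(A))
diluted = zero_filled.mean(dim=1)      # row 0: (1 + 2 + 0) / 3 = 1.0

# Mean over only the kept elements of row 0, for comparison.
kept_row0 = A[0][mask[0]].mean()       # (1 + 2) / 2 = 1.5

print(diluted.tolist())    # [1.0, 5.0]
print(kept_row0.item())    # 1.5
```

Row 1 has nothing masked out, so both approaches agree there; they differ only in rows where the mask removes something.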
Computing the mean in terms of a masked sum and a masked count gives
what I think is the desired result:
>>> import torch
>>> print (torch.__version__)
1.12.0
>>>
>>> _ = torch.manual_seed (2022)
>>>
>>> A = torch.arange (100).view (20, 5)
>>> B = torch.rand (100).view (20, 5)
>>> mask = B > 0.1
>>>
>>> masked_mean = (mask * A).sum (dim = 1) / mask.sum (dim = 1)
>>>
>>> masked_mean
tensor([ 1.5000,  7.0000, 12.0000, 17.0000, 22.0000, 27.2500, 32.0000, 37.0000,
        42.0000, 47.0000, 52.5000, 57.0000, 62.0000, 67.0000, 71.7500, 76.7500,
        82.0000, 87.0000, 92.0000, 97.0000])
Note, if an entire row is masked out, the resulting mean will be nan,
which is as good an exceptional value as any.
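A short sketch of the empty-row case, again on a made-up tensor. The fallback at the end is only one possible choice: it assumes 0.0 is an acceptable fill value for fully masked rows, which may not suit every use.

```python
import torch

# Hypothetical example: the second row is fully masked out.
A = torch.arange(6).view(2, 3)          # [[0, 1, 2], [3, 4, 5]]
mask = torch.tensor([[True, False, True],
                     [False, False, False]])

total = (mask * A).sum(dim=1)           # masked sum per row: [2, 0]
count = mask.sum(dim=1)                 # kept elements per row: [2, 0]

# 0 / 0 in the second row produces nan.
masked_mean = total / count             # first row 1.0, second row nan

# Assumption: replace the mean of empty rows with 0.0 instead of nan.
# clamp(min=1) avoids the division by zero; where() selects the fill value.
safe_mean = torch.where(count > 0,
                        total / count.clamp(min=1),
                        torch.zeros_like(masked_mean))
print(safe_mean.tolist())               # [1.0, 0.0]
```

Keeping the nan is often the more honest result, since it cannot be mistaken for a real mean; the fallback is only useful when downstream code cannot tolerate nan.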
Best.
K. Frank
J_Johnson
(J Johnson)
September 11, 2022, 2:00am
4
This is exactly what I was looking for. The actual implementation will not have any empty rows, so NaNs won’t be an issue. Thank you.