I would like to get the mean and standard deviation from a tensor with shape (H,W) along the dimension 1 (so the output would be a pair of tensors with shape (H) or (H,1)). My problem is: I need to filter each row and only use the values selected by a given mask and the mask selects a different number of values from each row.
My current approach is to build another tensor with NaNs in the positions I don't care about and use torch.nanmean to get the mean while ignoring those values, but I can't find an analogous function for the std.
Example of behaviour.
>>> input = torch.randn(2, 4)
>>> mask = input >= 0
>>> input_nans = torch.where(mask, input, torch.nan)
>>> mean = torch.nanmean(input_nans, dim=1)
This works for the mean (the mean of the values selected by the mask), but there is no nanstd function to get the standard deviation while ignoring NaNs. It's important to note that the first value comes from the mean over 3 values while the second comes from the mean over 2 values (that is why I cannot simply filter to get a tensor with only the desired values: the row lengths would not match).
If you already have a mask, you can use torch.masked_select or just index with it.
>>> a = torch.randn(3, 3)
>>> a
tensor([[-1.6599, -0.0141, -0.9498],
        [ 0.5469, -0.0643, -2.0145],
        [ 0.4915, -1.7717, -1.9315]])
>>> mask = torch.randn(3, 3) > 0
>>> mask
tensor([[False, False, False],
        [ True, False, False],
        [ True, False,  True]])
>>> a[mask]
tensor([ 0.5469,  0.4915, -1.9315])
Maybe implement your own nanstd?
or something similar.
That is not my desired behaviour. I've extended the question with an example. I hope it is more understandable now. Thanks!
With that function, the output would include NaNs because x can include NaNs. I've extended the question with an example. Thanks
I think that jayz's answer needed a small modification:
return torch.sqrt(torch.nanmean(torch.pow(x - torch.nanmean(x, dim=-1).unsqueeze(-1), 2), dim=-1))
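For completeness, here is a minimal sketch of that one-liner wrapped in a function and applied to a small tensor. The name nanstd and the dim parameter are my own choices for illustration, not an existing PyTorch API. Note that this computes the population standard deviation (dividing by the number of non-NaN values), which is not what torch.std returns by default.

```python
import torch

def nanstd(x: torch.Tensor, dim: int = -1) -> torch.Tensor:
    # Hypothetical helper: std along `dim`, ignoring NaN entries.
    # Mean over the non-NaN entries, kept broadcastable against x.
    mean = torch.nanmean(x, dim=dim, keepdim=True)
    # Squared deviations stay NaN wherever x is NaN,
    # so the second nanmean skips those positions as well.
    var = torch.nanmean((x - mean) ** 2, dim=dim)
    return torch.sqrt(var)

nan = float("nan")
x = torch.tensor([[1.0, 2.0, 3.0, nan],
                  [4.0, nan, 8.0, nan]])
result = nanstd(x)  # one value per row, shape (2,)
print(result)
```

A row that contains only NaNs will produce NaN in the output, since there is nothing to average.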
Wouldn’t this work for you?
This should work.
The output has shape (bs, dim): the seq dimension is reduced, giving the std along the seq axis with NaNs ignored.
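As a rough sketch of that kind of reduction, assuming an input of shape (bs, seq, dim) where NaN marks the entries to ignore: the function name nanstd_over_seq and the unbiased flag are hypothetical, added only for illustration. It counts the valid entries explicitly, so you can choose between dividing by n or by n - 1.

```python
import torch

def nanstd_over_seq(x: torch.Tensor, unbiased: bool = False) -> torch.Tensor:
    # Hypothetical helper. x has shape (bs, seq, dim); NaN marks ignored entries.
    valid = ~torch.isnan(x)
    count = valid.sum(dim=1)                       # (bs, dim) number of valid entries
    mean = torch.nanmean(x, dim=1, keepdim=True)   # (bs, 1, dim)
    # Replace squared deviations at NaN positions with zero so they do not contribute.
    sq = torch.where(valid, (x - mean) ** 2, torch.zeros_like(x))
    denom = count - 1 if unbiased else count
    return torch.sqrt(sq.sum(dim=1) / denom)       # (bs, dim)

x = torch.arange(30, dtype=torch.float32).reshape(2, 5, 3)
x[0, 1, 2] = float("nan")   # ignore one entry
out = nanstd_over_seq(x)
print(out.shape)
```

Positions with no valid entries (or a single one when unbiased=True) come out as NaN, because the denominator is zero there.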