Why am I getting different results when taking the mean over a list of dimensions vs. step by step?

For example:

>>> x
tensor([[[[ 1.,  3.],
          [ 4.,  6.]],

         [[-1., -3.],
          [-4., -7.]]]])
>>> x.mean([3, 0, 1])
tensor([ 0.0000, -0.2500])
>>> x.mean([3, 0])
tensor([[ 2.0000,  5.0000],
        [-2.0000, -5.5000]])
>>> x.mean([3, 0]).mean([1])
tensor([ 3.5000, -3.7500])

I had believed that x.mean([3, 0, 1]) is equivalent to x.mean([3, 0]).mean([1]).

If we do

x.mean(3).mean(0).mean(0)

then it gives the same result.
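A minimal check, using the same x as in the question:

import torch

x = torch.tensor([[[[ 1.,  3.],
                    [ 4.,  6.]],

                   [[-1., -3.],
                    [-4., -7.]]]])

print(x.mean([3, 0, 1]))          # tensor([ 0.0000, -0.2500])
print(x.mean(3).mean(0).mean(0))  # tensor([ 0.0000, -0.2500]), same values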
Basically, if we have a tensor

x = torch.randn(a, b, c, d)

and we apply

torch.mean(x, dim=0)

then its shape would be

torch.Size([b, c, d])
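A quick illustration (the sizes here are just example values picked for the demo):

import torch

x = torch.randn(5, 3, 4, 2)        # a=5, b=3, c=4, d=2
print(torch.mean(x, dim=0).shape)  # torch.Size([3, 4, 2]), i.e. [b, c, d]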

So here, when we do

x = torch.tensor([[[[ 1.,  3.],
                    [ 4.,  6.]],

                   [[-1., -3.],
                    [-4., -7.]]]])
x.shape

torch.Size([1, 2, 2, 2])

Let us assume a=1, b=2, c=2, d=2.

x.mean([3, 0, 1])

the resulting shape would be,

torch.Size([c]) # [2]
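We can confirm this with the x defined above:

x.mean([3, 0, 1]).shape # torch.Size([2])
x.mean([3, 0, 1])       # tensor([ 0.0000, -0.2500])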

Whereas if we do

x.mean([3, 0]).mean([1])

then after

x.mean([3, 0])

shape would be

torch.Size([b, c]) # [2, 2]

and when we do

x.mean([3, 0]).mean([1])

the shape would be

torch.Size([b]) # [2]
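Checking each step with the same x:

x.mean([3, 0]).shape           # torch.Size([2, 2]), i.e. [b, c]
x.mean([3, 0]).mean([1]).shape # torch.Size([2]), i.e. [b]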

So we get different results. If we did

x.mean([3, 0]).mean(0)

instead, the results would be the same.
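This is because after x.mean([3, 0]) the remaining dimensions are renumbered: the original dim 1 (b) becomes dim 0 of the result, and the original dim 2 (c) becomes dim 1. So .mean([1]) averages over c, while .mean(0) averages over b, which is what x.mean([3, 0, 1]) does:

x.mean([3, 0]).mean(0)   # tensor([ 0.0000, -0.2500]), same as x.mean([3, 0, 1])
x.mean([3, 0]).mean([1]) # tensor([ 3.5000, -3.7500]), averages over c instead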


This is exactly what I was looking for!