Is there any difference between x[:, None] and x.unsqueeze(1)?

Hi All,

I’ve got a quick question about reshaping tensors. In order to broadcast tensors I’ve been using x[:, None] to add a new dim. However, this seems to be the same as x.unsqueeze(1). Is this technically true, or is it slightly different in some way? For example, is one slower or quicker than the other?
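
For what it’s worth, a quick check along these lines (just a sketch, assuming a recent PyTorch) seems to show identical shapes and values:

>>> import torch
>>> x = torch.ones(2, 3)
>>> torch.equal(x[:, None], x.unsqueeze(1))
True
>>> x[:, None].shape, x.unsqueeze(1).shape
(torch.Size([2, 1, 3]), torch.Size([2, 1, 3]))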

Thank you!

Hi Alpha!

Autograd indicates that PyTorch is smart enough to convert x[:, None]
to x.unsqueeze (1):

>>> import torch
>>> torch.__version__
'1.9.0'
>>> x = torch.ones (2, 3, requires_grad = True)
>>> x.unsqueeze (1)
tensor([[[1., 1., 1.]],

        [[1., 1., 1.]]], grad_fn=<UnsqueezeBackward0>)
>>> x[:, None]
tensor([[[1., 1., 1.]],

        [[1., 1., 1.]]], grad_fn=<UnsqueezeBackward0>)

But this seems to be something of a special case. Consider:

>>> x[:, None, :]
tensor([[[1., 1., 1.]],

        [[1., 1., 1.]]], grad_fn=<SliceBackward>)
>>> x[None, :]
tensor([[[1., 1., 1.],
         [1., 1., 1.]]], grad_fn=<SliceBackward>)
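
That said, the different grad_fn shouldn’t matter for the gradients
themselves; a quick sketch (reusing the x from above) gives the same
grad either way:

>>> x.unsqueeze (1).sum().backward()
>>> g1 = x.grad.clone()
>>> x.grad = None
>>> x[:, None, :].sum().backward()
>>> torch.equal (g1, x.grad)
True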

Best.

K. Frank

Hi @KFrank!

Thanks for the response! So, it seems that it should be OK when using it to unsqueeze new dims at the end, but not any in between? (Is that the special case?) For example, x[:, None] is fine but x[:, None, :] isn’t?

I’m only using it at the moment in order to broadcast a batch of scalars over a batch of matrices. For example,

import torch

A = torch.randn(100, 2, 4, 4)  # some matrices
scalars = torch.randn(100, 2)  # one scalar per matrix
scaled_A = scalars[:, :, None, None] * A

Perhaps it might be safer to do this?

A = torch.randn(100, 2, 4, 4)  # some matrices
scalars = torch.randn(100, 2)  # one scalar per matrix
scaled_A = scalars.unsqueeze(2).unsqueeze(3) * A
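
(As a quick sanity check, just a sketch using the corrected name scalars, both spellings appear to give exactly the same result:)

>>> torch.equal(scalars[:, :, None, None] * A, scalars.unsqueeze(2).unsqueeze(3) * A)
True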

Thank you for the help!