# Understanding unsqueeze

Let us assume that we want to multiply 2 tensors:

```
import torch

t_a = torch.randn(2, 3, 5, 5)  # 2 batches, 3 channels, 5 rows, 5 cols
t_b = torch.tensor([0.26, 0.28, 0.45])  # 1D tensor of shape (3,)
```

In the problem I am reading about, they call `unsqueeze` and then the in-place `unsqueeze_` before multiplying the tensors, like so:

```
t_c = t_b.unsqueeze(-1).unsqueeze_(-1)  # first expand the dimensions

result = t_a * t_c  # then multiply
```

I assume we are expanding the tensor `t_b` so that it matches `t_a` in at least one dimension, for the purpose of broadcasting. So what shape should `t_c` end up with?

What does `t_b.unsqueeze(-1)` do? Where does it add a dimension?

Why are we using the in-place `unsqueeze_(-1)` here? Is it because the first `unsqueeze` creates a new object in memory, and the in-place call then modifies that object instead of creating a third?

Thanks for the help!

`unsqueeze(-1)` adds a new dimension after the last one, so the shape of `t_b` transforms as follows:

```
t_b.unsqueeze(-1).unsqueeze(-1)
(3,)  (3,1)       (3,1,1)      <- shapes
```
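
To make the intermediate shapes concrete, you can print them directly (a quick sketch reusing `t_b` from above):

```
import torch

t_b = torch.tensor([0.26, 0.28, 0.45])
print(t_b.shape)                              # torch.Size([3])
print(t_b.unsqueeze(-1).shape)                # torch.Size([3, 1])
print(t_b.unsqueeze(-1).unsqueeze(-1).shape)  # torch.Size([3, 1, 1])
```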

For comparison, if we wanted to add the new dimensions in "front", we would do:

```
t_b.unsqueeze(0).unsqueeze(0)
(3,)  (1,3)      (1,1,3)      <- shapes
```
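
The same shape check for the "front" direction:

```
import torch

t_b = torch.tensor([0.26, 0.28, 0.45])
print(t_b.unsqueeze(0).shape)               # torch.Size([1, 3])
print(t_b.unsqueeze(0).unsqueeze(0).shape)  # torch.Size([1, 1, 3])
```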

Chaining two `unsqueeze` calls (or any Python method calls, for that matter) is equivalent to:

```
t_b = t_b.unsqueeze(-1)
t_b = t_b.unsqueeze(-1)
```
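
As for the in-place question: your guess is right. `unsqueeze` returns a new view object, while `unsqueeze_` changes the tensor it is called on and returns that same object, so the chained call avoids creating a third tensor object. A small sketch to verify this:

```
import torch

t = torch.tensor([0.26, 0.28, 0.45])
out = t.unsqueeze(-1)    # out-of-place: returns a new view object
print(out is t)          # False
same = t.unsqueeze_(-1)  # in-place: modifies t and returns t itself
print(same is t)         # True
print(t.shape)           # torch.Size([3, 1]) -- t itself was changed
```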

Thanks.

So in this scenario we would be broadcasting `0.26` to the first channel, `0.28` to the second, and `0.45` to the third channel?

Why don't we need to add a 4th dimension, in front, for the batch?

Lastly, would `t_b = t_b[None]` do the same as `t_b = t_b.unsqueeze(-1)`?

According to the docs:

Two tensors are "broadcastable" if the following rules hold:

1. Each tensor has at least one dimension.
2. When iterating over the dimension sizes, starting at the trailing dimension, the dimension sizes must either be equal, one of them is 1, or one of them does not exist.

In this case we have the shapes `(2,3,5,5)` and `(3,1,1)`. The trailing dimension is the rightmost one, so the shapes are aligned from the right, and the leading `2` of `t_a` has no counterpart in `t_c`:

```
 Valid according to point 2, since "one of them does not exist"
 v
(2,3,5,5)
  (3,1,1)
```

My intuition is that when you match the channels, each number in `t_b` gets distributed across its channel. Since `t_b` does not have a fourth dimension, the process simply gets repeated for each "item" along the batch dimension.
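
A quick sketch to confirm that intuition, using `torch.ones` so the scaled values are easy to read:

```
import torch

t_a = torch.ones(2, 3, 5, 5)
t_c = torch.tensor([0.26, 0.28, 0.45]).unsqueeze(-1).unsqueeze(-1)  # shape (3, 1, 1)
result = t_a * t_c                         # shape (2, 3, 5, 5)
print(result[0, 0, 0, 0])                  # tensor(0.2600) -- channel 0 scaled by 0.26
print(result[0, 1, 0, 0])                  # tensor(0.2800) -- channel 1 scaled by 0.28
print(torch.equal(result[0], result[1]))   # True -- repeated for each batch item
```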

I would suggest you play around with tensors that are easier to sanity-check (print them and see what happens), e.g. with this code:

```
import torch

a = torch.ones(3, 4, 4)
b = torch.tensor([1, 2, 3])
print(a * b.unsqueeze(-1).unsqueeze(-1))  # each 4x4 slice filled with 1, 2, 3

a = torch.ones(2, 3, 4, 4)
b = torch.tensor([1, 2, 3])
print(a * b.unsqueeze(-1).unsqueeze(-1))  # same pattern, repeated for both batch items
```

> Lastly, would `t_b = t_b[None]` do the same as `t_b = t_b.unsqueeze(-1)`?

No, it would not; it would be equal to `t_b.unsqueeze(0)`. Doing `t_b[..., None]` would be equivalent to `.unsqueeze(-1)`, and `t_b[..., None, None]` would be equal to `t_b.unsqueeze(-1).unsqueeze(-1)`.
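
A quick way to verify these indexing equivalences:

```
import torch

t_b = torch.tensor([0.26, 0.28, 0.45])
print(t_b[None].shape)             # torch.Size([1, 3]), same as t_b.unsqueeze(0)
print(t_b[..., None].shape)        # torch.Size([3, 1]), same as t_b.unsqueeze(-1)
print(t_b[..., None, None].shape)  # torch.Size([3, 1, 1])
```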


Awesome answers. Thanks for taking the time to thoughtfully give me that insight.
