How does broadcasting 3-D tensors work?

Preyas · November 6, 2019, 8:09pm

Let’s begin with a simple 2-D tensor:
import torch
t1 = torch.tensor([1, 2, 3, 5], dtype=torch.int16)
t1 = t1.reshape((2, 2))

Next I create two 3-D tensors by re-shaping t1, and then try to perform an element-wise operation on them.
t11 = t1.unsqueeze(1)  # shape: (2, 1, 2)
t12 = t1.unsqueeze(2)  # shape: (2, 2, 1)

final= t12 + t11

[Screenshot from 2019-11-06 21-04-58: printed output of final]

I expected the final output to be of shape (2,2,2), but I don’t understand how PyTorch broadcast t12 and t11 to perform the element-wise addition. Could someone please explain how broadcasting happens with 3-D tensors, and how it resulted in the final tensor shown above?

P.S.: I am aware of how broadcasting works with 2-D tensors; I just can’t seem to visualise it happening with 3-D tensors.

Thanks!

albanD · November 6, 2019, 8:18pm

When both tensors have the same number of dimensions and, in a given dimension, the sizes don’t match but one of them is 1, that dimension is expanded to the other tensor’s size.
In your case, this is equivalent to final = t12.expand(2,2,2) + t11.expand(2,2,2).
This basically repeats the existing Tensor along the dimension that was of size 1.
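
To make this concrete, here is a small sketch that rebuilds the tensors from the question and checks the broadcast sum against the explicit expand version. The names explicit and manual are made up for this illustration, and the triple loop is only there to spell out the index arithmetic; it is not how PyTorch computes the result internally.

```python
import torch

t1 = torch.tensor([1, 2, 3, 5], dtype=torch.int16).reshape(2, 2)
t11 = t1.unsqueeze(1)  # shape (2, 1, 2): t11[i, 0, k] == t1[i, k]
t12 = t1.unsqueeze(2)  # shape (2, 2, 1): t12[i, j, 0] == t1[i, j]

# Implicit broadcasting and the explicit expand-based equivalent.
final = t12 + t11
explicit = t12.expand(2, 2, 2) + t11.expand(2, 2, 2)

# Element-wise reading of the same sum (illustration only):
# final[i, j, k] = t12[i, j, 0] + t11[i, 0, k] = t1[i, j] + t1[i, k]
manual = torch.empty(2, 2, 2, dtype=torch.int16)
for i in range(2):
    for j in range(2):
        for k in range(2):
            manual[i, j, k] = t1[i, j] + t1[i, k]

print(final.shape)                   # torch.Size([2, 2, 2])
print(torch.equal(final, explicit))  # True
print(torch.equal(final, manual))    # True
print(final.tolist())                # [[[2, 3], [3, 4]], [[6, 8], [8, 10]]]
```

One way to read the result: for each index i along the first dimension, final[i] is the 2x2 table of all pairwise sums of the entries in row i of t1. Row 0 is [1, 2], giving [[2, 3], [3, 4]], and row 1 is [3, 5], giving [[6, 8], [8, 10]].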