Equivalence of torch.tensor([[1]]) and torch.tensor(1)

I’ve encountered the following and am looking for an explanation:

import torch
a = torch.tensor([[1]])
b = torch.tensor(1)
print(a.shape, b.shape)
# torch.Size([1, 1]) torch.Size([])
print(a == b)
# tensor([[True]])

If you want to check that two tensors have the same size and the same elements, torch.equal is the right thing to use.
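To make the difference concrete, here is a minimal sketch that contrasts == with torch.equal on the two tensors from the question. The variable names (elementwise, same_as_is, same_after_squeeze) are only illustrative.

```python
import torch

a = torch.tensor([[1]])  # shape [1, 1]
b = torch.tensor(1)      # shape []

# == broadcasts the operands and compares elementwise.
elementwise = a == b

# torch.equal does not broadcast: it returns a plain Python bool that is
# True only when both the sizes and the elements match.
same_as_is = torch.equal(a, b)

# Dropping the size-1 dimensions from a gives it the same shape as b.
same_after_squeeze = torch.equal(a.squeeze(), b)

print(elementwise.shape, same_as_is, same_after_squeeze)
```

So the shape mismatch that == hides through broadcasting is exactly what torch.equal reports.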

In your example above, a == b first broadcasts a and b to a common shape and then performs the comparison elementwise (see https://pytorch.org/docs/stable/notes/broadcasting.html).

The steps look roughly like the following:

  1. a has size [1, 1], b has size []
  2. If the sizes are broadcastable (they are), both tensors are expanded to the same size before evaluation. So a_broadcasted has size [1, 1] and b_broadcasted has size [1, 1].
  3. We compare a_broadcasted with b_broadcasted elementwise. This produces a tensor with size [1, 1] whose single element is True.
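The steps above can be reproduced by hand as a rough sketch. It uses torch.broadcast_tensors to do the expansion explicitly. The names a_broadcasted and b_broadcasted mirror the ones in the list and are not anything PyTorch exposes internally.

```python
import torch

# Step 1: the original operands.
a = torch.tensor([[1]])  # size [1, 1]
b = torch.tensor(1)      # size []

# Step 2: expand both operands to their common broadcast shape.
a_broadcasted, b_broadcasted = torch.broadcast_tensors(a, b)
print(a_broadcasted.shape, b_broadcasted.shape)

# Step 3: compare elementwise; the result has the broadcast shape.
result = a_broadcasted == b_broadcasted
print(result.shape)

# The explicit version agrees with what a == b does implicitly.
print(torch.equal(result, a == b))
```

In practice you would not write this out, since == performs the same expansion for you.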

Understood. Is there a proper name for a torch.Size([])?

OK, it looks like it’s called a scalar, i.e. a 0-dimensional tensor: https://discuss.pytorch.org/t/what-does-torch-size-0-means.
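For anyone who wants to check this on their own tensor, here is a small sketch of how a scalar (0-dimensional) tensor can be recognised. The variable names are illustrative.

```python
import torch

b = torch.tensor(1)

# A scalar tensor has no dimensions but still holds exactly one element.
num_dims = b.dim()
num_elements = b.numel()

# .item() extracts that single element as a plain Python number.
value = b.item()

print(b.shape, num_dims, num_elements, value)
```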


Thanks for answering.