Tensor multiplication along a certain axis

I have two tensors. A has shape (N, C, H, W) and B has shape (C). Now I want to multiply both tensors along C.

Currently I use torch.einsum("ijkl,j->ijkl", A, B) and it seems to work.

Is there a better or more intuitive way to do this? Maybe with .view()?



Yes, you can use view() safely here, or reshape.

See this for the differences between view and reshape:

view is quick to perform but can fail on non-contiguous tensors.
reshape always works. When possible, the returned tensor is a view of the input. Otherwise, it is a copy.

A 1D tensor is contiguous unless it comes from a strided slice such as x[::2], and even then adding trailing dimensions of size 1 with view still works, so you should not have to care about that here.

Broadcasting aligns shapes starting from the last dimension, therefore B needs the shape (C, 1, 1) before the multiplication so that its C axis lines up with the C axis of A.

So basically, torch.einsum("ijkl,j->ijkl", A, B) should be equivalent to A*(B.view(-1, 1, 1)) and equivalent to A*(B.unsqueeze(-1).unsqueeze(-1)).

I would tend to use A*(B.view(-1, 1, 1)) here, which is short, clear, and efficient.
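To check the equivalence yourself, here is a minimal sketch. The sizes N, C, H, W and the variable names are made up for illustration, and B_strided is just a hypothetical example of a non-contiguous 1D tensor.

```python
import torch

# Arbitrary example sizes (assumption, not from the question)
N, C, H, W = 2, 3, 4, 5
A = torch.randn(N, C, H, W)
B = torch.randn(C)

# The original einsum formulation
via_einsum = torch.einsum("ijkl,j->ijkl", A, B)

# Give B the shape (C, 1, 1) so broadcasting lines it up with axis 1 of A
via_view = A * B.view(-1, 1, 1)
via_unsqueeze = A * B.unsqueeze(-1).unsqueeze(-1)
via_index = A * B[:, None, None]  # same idea written with indexing

print(torch.allclose(via_einsum, via_view))

# A strided slice gives a non-contiguous 1D tensor;
# view(-1, 1, 1) only appends size-1 dimensions, so it still works
B_strided = torch.randn(2 * C)[::2]
via_strided = A * B_strided.view(-1, 1, 1)
print(B_strided.is_contiguous(), via_strided.shape)
```

All four products have shape (N, C, H, W). Which spelling to prefer is mostly a matter of taste.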


Thank you! That looks much better to my eyes! But I don’t understand when to use .view() and when to use .reshape(). The explanation didn’t help. Can you please explain the difference in layman’s terms?

Without going into details, the data of a tensor (whatever its shape is) is always stored internally as a kind of 1D array. The shape of the tensor is determined by the mapping from indices to memory locations, which we call the memory layout. For instance, when you do b = a.view(some_shape), a and b share the same data to avoid a costly copy, but they may have different memory layouts. In PyTorch, “contiguous” refers to one specific memory layout, so a tensor can be “contiguous” or “non-contiguous”.
Contiguous tensors are usually more convenient because a few operations won’t work with non-contiguous tensors. view is one example of an operation that can fail on a non-contiguous input. reshape behaves like view, except that it also works on non-contiguous data, in which case the data is copied if needed.

But you should not worry too much about the contiguous property of your tensor. Most of the time you won’t have any problems, and if an error is raised at some point because an operation was expecting a contiguous input tensor, you can simply fix it by changing operation(tensor) to operation(tensor.contiguous()).
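As a small illustration of the above, here is a sketch using a transposed tensor, which is a common way to end up with non-contiguous data. The variable names are only for the example.

```python
import torch

a = torch.arange(6).reshape(2, 3)  # contiguous, strides (3, 1)
t = a.t()                          # shape (3, 2), strides (1, 3): same data, different layout

print(a.is_contiguous(), t.is_contiguous())  # True False

# view cannot flatten t without moving data, so it raises an error
try:
    t.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True
print(view_failed)  # True

# reshape copies when it has to; contiguous() makes the copy explicit
flat_reshape = t.reshape(6)
flat_contig = t.contiguous().view(6)
print(flat_reshape.tolist())  # [0, 3, 1, 4, 2, 5]
```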

So, in short when you want a reshaped tensor b from a:

  • If a is contiguous and you want a and b to share the same storage, use b = a.view(some_shape). So if you do an in-place operation afterward on a, b will also be modified.
  • If a is contiguous and you don’t want a and b to share the same storage, use b = a.view(some_shape).clone() or b = a.reshape(some_shape).clone().
  • If a is not contiguous, use b = a.reshape(some_shape) or b = a.contiguous().view(some_shape). In general, a and b won’t share the same storage.
  • If a may be contiguous or non-contiguous, use b = a.reshape(some_shape), but you cannot know in advance whether a and b will share the same data.
  • If a may be contiguous or non-contiguous and you don’t want a and b to possibly share the same data, use b = a.reshape(some_shape).clone().
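The cases in this list can be checked with a short sketch. The helper same_start is hypothetical (not a PyTorch function). It compares data_ptr(), which is enough here because every tensor in the example starts at the first element of its storage.

```python
import torch

def same_start(x, y):
    # Hypothetical helper: True if both tensors start at the same memory address
    return x.data_ptr() == y.data_ptr()

a = torch.zeros(2, 3)

b_view = a.view(6)          # shares storage with a
b_copy = a.view(6).clone()  # independent copy

a[0, 0] = 1.0               # in-place change on a
print(b_view[0].item(), b_copy[0].item())  # 1.0 0.0

# reshape on a contiguous tensor returns a view
print(same_start(a, a.reshape(6)))    # True

# reshape on this non-contiguous tensor has to copy
nc = a.t()
print(same_start(nc, nc.reshape(6)))  # False
```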

For more information on what contiguous means in PyTorch, you can look at this thread. I also invite you to read this doc page about view.

I know the contiguous thing can be confusing. I hope this answer helped.


Thank you so much! This helped me a lot!