Optimizing the difference of two outer products

Hi.
I have the following computation:
res = torch.outer(po, x) - torch.outer(o, px)

Is there a way to compute it with only one outer product?

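Sure, you can stack po and o as the columns of an (n, 2) matrix and x and -px as the rows of a (2, m) matrix, then do a single matrix multiply instead of two outer products. It's probably only more efficient if your tensors are really small, e.g. when kernel launch overhead is the main bottleneck.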
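
A minimal sketch of that idea, assuming po and o are 1-D tensors of the same length and x and px are 1-D tensors of the same length. The function name `outer_diff_single_matmul` and the example sizes are made up for illustration:

```python
import torch

def outer_diff_single_matmul(po, x, o, px):
    # Stack the left-hand vectors as columns: shape (n, 2).
    left = torch.stack((po, o), dim=1)
    # Stack the right-hand vectors as rows, negating px: shape (2, m).
    right = torch.stack((x, -px), dim=0)
    # (n, 2) @ (2, m) gives po[i] * x[j] - o[i] * px[j] for every (i, j).
    return left @ right

torch.manual_seed(0)
po, o = torch.randn(5), torch.randn(5)
x, px = torch.randn(7), torch.randn(7)

reference = torch.outer(po, x) - torch.outer(o, px)
fused = outer_diff_single_matmul(po, x, o, px)
print(torch.allclose(reference, fused, atol=1e-6))
```

The two `torch.stack` calls and the negation launch kernels of their own, so whether this beats the original version depends on tensor sizes and device. It is worth benchmarking on your actual shapes before switching.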
