For example, numpy's einsum has an "optimize" parameter, which lets you either pass an explicit contraction path (the exact order in which to eliminate dimensions) or choose how one should be found. Passing optimize='greedy' uses a simple, fast greedy algorithm to find a good ordering.
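A minimal numpy sketch of the two modes (the array shapes are just illustrative):

```python
import numpy as np

# A chain of three matrix products written as one einsum call.
a = np.random.rand(8, 100)
b = np.random.rand(100, 100)
c = np.random.rand(100, 5)

# Let numpy find a cheap contraction order with the greedy heuristic.
out = np.einsum('ij,jk,kl->il', a, b, c, optimize='greedy')

# Or specify the pairwise contraction order yourself as an explicit path:
# contract operands 1 and 2 first, then the result with operand 0.
out_manual = np.einsum('ij,jk,kl->il', a, b, c,
                       optimize=['einsum_path', (1, 2), (0, 1)])

assert np.allclose(out, out_manual)
```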

Does torch.einsum optimize the elimination ordering at all?

No. It has been on my list of things to do when I have time ever since I implemented einsum, but it never happened. You’d make a lot of friends if you implemented that.

The (third party) opt_einsum package supports PyTorch, so you could use that, too.

I'm not brave enough to try my hand at it >.<. But if you want my armchair suggestion: implementing at least the greedy approach (even defaulting to it) would be great, since it gets you roughly 95% of the way from prohibitively slow to blazing fast.

Thanks so much for making the einsum function in the first place!

The computation time can be orders of magnitude worse in a bunch of everyday cases. The contraction order that einsum uses should at least be greedily optimized (which takes hardly any time). I underestimated this before, and it cost me an enormous performance hit on what should have been a simple computation.
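A toy numpy illustration of the kind of blow-up I mean (the shapes are contrived to make the naive order bad):

```python
import numpy as np

# 'ij,jk,kl->il' with a huge shared dimension: contracting y with z
# first yields a tiny 2x2 intermediate, while the naive evaluation
# touches every (i, j, k, l) index combination.
x = np.random.rand(2000, 2)
y = np.random.rand(2, 2000)
z = np.random.rand(2000, 2)

naive = np.einsum('ij,jk,kl->il', x, y, z)                      # no path optimization
greedy = np.einsum('ij,jk,kl->il', x, y, z, optimize='greedy')  # reorders contractions

assert np.allclose(naive, greedy)
```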

It may be prudent to test in numpy first, to check whether the default einsum contraction order is fine compared to the optimized order numpy lets you use.
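np.einsum_path is a cheap way to run that check: it reports the naive vs. optimized FLOP counts without actually performing the contraction (shapes here are illustrative):

```python
import numpy as np

x = np.random.rand(2000, 2)
y = np.random.rand(2, 2000)
z = np.random.rand(2000, 2)

# Returns the chosen contraction path plus a human-readable cost
# report comparing the naive and optimized orders.
path, report = np.einsum_path('ij,jk,kl->il', x, y, z, optimize='greedy')
print(report)
```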