A few of us have been working on the draft of PEP 646, enabling support for variadic generics in Python. This is relevant to PyTorch because the primary use case is to support array types that are generic in their shapes, e.g. Tensor[Batch, Height, Width]. (For a more detailed example of how this might work, see our experimental tensor_annotations library. We currently only support TensorFlow and JAX, but we’d love to support PyTorch in the future too.)
Before setting the PEP in stone, we’d like to get some feedback from folks working on libraries like PyTorch to reduce the chances of making design decisions that are going to bite us later on. If you’re interested, please leave some comments on the current draft in this doc.
Thank you for posting the draft here and asking for feedback!
Just from skimming through the code snippets in tensor_annotations, it looks similar to PyTorch’s named tensors, but this utility would now be built into the Python standard library?
Right, thanks for asking. The problem we’re trying to solve is indeed very similar to what named tensors is trying to solve.
I say “similar”, because (as I understand it - I haven’t used named tensors in anger) the feature sets are slightly different:
Pro named tensors: Named tensors go a step further by allowing semantic axis labels to be used within function calls too. With named tensors, we can do sum('C'). In the tensor_annotations approach, we still have to specify axes by index, e.g. sum(dim=3) (even if the resulting type, e.g. Tensor[Height, Width], allows us to check that we really did sum over the Channels axis).
Pro the tensor_annotations approach: tensor_annotations allows expected shapes to be checked statically, and also lets us specify the expected shapes of, e.g., function arguments and return values (def foo(x: Tensor[Height, Width]): ...).
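To make that contrast concrete, here's a runnable sketch in the tensor_annotations style, written with today's fixed-rank workaround so it runs on current Python. Tensor2, Tensor3 and sum_over are hypothetical stubs for illustration, not the real library API:

```python
from typing import Generic, TypeVar

# Hypothetical axis labels.
class Height: ...
class Width: ...
class Channels: ...

H = TypeVar("H")
W = TypeVar("W")
C = TypeVar("C")

# Pre-PEP-646 workaround: one Tensor class per rank.
class Tensor2(Generic[H, W]): ...
class Tensor3(Generic[H, W, C]): ...

# The axis to reduce is still passed by index (dim=2), but the declared
# return type drops that axis, so a static checker knows the result is
# Tensor2[H, W] -- i.e. that we really did sum over the last axis.
def sum_over(x: Tensor3[H, W, C], dim: int = 2) -> Tensor2[H, W]:
    ...  # stub: a real version would dispatch to the array library

t: Tensor3[Height, Width, Channels] = Tensor3()
result = sum_over(t, dim=2)
# A checker would now infer Tensor2[Height, Width] for `result`, and flag
# any code that treats it as still having a Channels axis.
```

The dim argument itself stays an unlabelled int here, which is exactly the gap named tensors' sum('C') closes.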
Our original motivation was that, seeing how named tensors were never taken up properly by TensorFlow, we were worried that the named tensors approach wouldn’t take off in general because of the amount of work that library authors would have to do to support it. We wanted to see whether we could solve roughly the same problem in a library-agnostic way, without changing existing APIs.
I’m still not personally sure whether the tensor_annotations approach is the right way to go (though of course I wouldn’t be championing the tensor_annotations approach if I didn’t feel somewhat more optimistic about it than named tensors). It’s also a bit of an experiment.
Our intention with PEP 646 is to add the single feature necessary to let us run the experiment and gather some data on how well it works. (Without PEP 646, we have to use different Tensor classes for each rank, e.g. Tensor1[Logits], Tensor2[Height, Width], etc., and that feels cumbersome enough that I suspect it’d put people off.) Even if tensor_annotations doesn’t work out, the TypeVarTuple of PEP 646 is also a useful language feature to have in general, allowing the signatures of, e.g., map and zip to be specified properly (in combination with one extra typing operator, which we’ll likely be introducing in a future PEP).
I hope that gives some useful context. Do let me know if you have more questions.