If we would like to work on a custom backend using the canonicalized operators in PT2.0’s PrimTorch, are there any examples or tutorials on how to go about doing this?
Hi, there!
To call the primTorch operators directly, you can do the following; the rest of your code stays the same.
In [1]: import torch
In [2]: import torch._prims as prims
In [3]: prims.add(torch.tensor([1, 2, 3]), torch.tensor([6, 5, 4]))
Out[3]: tensor([7, 7, 7])
PS: If you are interested in how primTorch works, there’s a great post by @mruberry on Tracing with primitives. That said, it’s more of a developer discussion than a tutorial or example.
Out of curiosity, why do you want to use primTorch operators?
Thanks!
Thanks for the reply!
I was really asking about the process for plugging in a custom backend at the ATen IR level. According to a previous PT2.0 announcement, it seems that writing a backend that minimally implements the ~200 PrimTorch operators would be sufficient. However, I cannot find any further directions on how to go about doing this.
Hi, just wanted to bump this post again and see if there is any info available on custom backend integration through primtorch.
It would be interesting to get a pointer here, agreed. I guess the general question is “how do we best interpose some graph transforms before our custom backend”?
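For what it's worth, one route that seems to fit this question is a `torch.compile` backend wrapped in AOTAutograd, which hands a compiler function an FX graph of ATen ops after a decomposition table has been applied. The following is only a minimal sketch, assuming PyTorch 2.x; `toy_compiler`, `toy_backend` and `seen_ops` are hypothetical names, and a real backend would lower the graph instead of just recording it.

```python
import torch
from torch._dynamo.backends.common import aot_autograd
from torch._decomp import core_aten_decompositions
from functorch.compile import make_boxed_func

seen_ops = []  # hypothetical: ATen ops handed to our toy compiler


def toy_compiler(gm: torch.fx.GraphModule, example_inputs):
    # Graph transforms or lowering to a custom backend would go here.
    for node in gm.graph.nodes:
        if node.op == "call_function":
            seen_ops.append(str(node.target))
    # Fall back to running the captured graph as-is.
    return make_boxed_func(gm.forward)


# Apply the core ATen decompositions before the graph reaches toy_compiler.
toy_backend = aot_autograd(
    fw_compiler=toy_compiler,
    decompositions=core_aten_decompositions(),
)


def f(x, y):
    return torch.nn.functional.relu(x + y)


compiled_f = torch.compile(f, backend=toy_backend)
x = torch.tensor([1.0, -2.0, 3.0])
y = torch.tensor([1.0, 1.0, 1.0])
result = compiled_f(x, y)
print(seen_ops)  # the ATen-level ops the backend would have to support for f
```

The compiler function is the natural place to interpose graph passes: it receives the whole `torch.fx.GraphModule`, so it can rewrite nodes before handing the result to backend-specific code.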
+1. I’m experimenting with an out-of-tree backend using PrivateUse1 as a dispatch key. I’ve started with eager mode and ATen kernels.
At first it looked as if it would be sufficient to implement only those declared as “dispatch=True” & “default=False”, based on Extending dispatcher for a new backend in C++ — PyTorch Tutorials 1.13.1+cu117 documentation. There are ~1k of them. However, this ended in a RuntimeError. More kernels need to be implemented to run a neural network, e.g.:
aten::convolution_overrideable … “dispatch”: “True”, “default”: “True”
aten::cross_entropy_loss … “dispatch”: “False”, “default”: “True”
And now it feels like I’ll need to implement all ~3k kernels to make it work.
I noticed the PrimTorch project, which aims to reduce the number of must-have kernels. Is there any way to use PrimTorch in eager mode without modifying the original Python code of a neural network?
I am also thinking about trying to write a new PrimTorch backend, particularly using Kompute.
Where can I find the API that needs to be implemented, so I can gauge the complexity of the task ahead?
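One rough way to gauge which ops a backend would actually see is to trace a function with `make_fx` and compare the graph with and without a decomposition table. This is a sketch under assumptions, not an official procedure: it assumes PyTorch 2.x, uses `silu` purely as an illustration, and `ops` is a hypothetical helper.

```python
import torch
from torch.fx.experimental.proxy_tensor import make_fx
from torch._decomp import get_decompositions


def f(x):
    return torch.nn.functional.silu(x)


def ops(gm):
    # Hypothetical helper: list the operator targets in a traced graph.
    return [str(n.target) for n in gm.graph.nodes if n.op == "call_function"]


x = torch.randn(4)

# Trace without decompositions: the graph contains whatever ATen ops f calls.
plain = make_fx(f)(x)

# Trace with a table holding the registered decomposition for aten.silu,
# so that op is rewritten in terms of simpler ones.
table = get_decompositions([torch.ops.aten.silu])
decomposed = make_fx(f, decomposition_table=table)(x)

print(ops(plain))
print(ops(decomposed))
```

Running the same comparison on a real model should give a concrete list of the ops a backend has to cover once decompositions are applied.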
I am also very interested!