Info on how to make custom backend with primtorch

If we would like to work on a custom backend using the canonicalized operators in PT2.0’s PrimTorch, are there any examples or tutorials on how to go about doing this?


Hi there!
To call the primTorch operators directly, you can do the following, and the rest of the story remains the same.

In [1]: import torch

In [2]: import torch._prims as prims

In [3]: prims.add(torch.tensor([1, 2, 3]), torch.tensor([6, 5, 4]))
Out[3]: tensor([7, 7, 7])

PS: If you are looking at how primTorch works, there’s a great post by @mruberry on Tracing with primitives, though it’s more of a developer discussion than a tutorial or example.

Out of curiosity, why do you want to use primTorch operators?

Thanks for the reply!

I was asking more about the process for plugging in a custom backend at the ATen IR level. According to a previous PT2.0 announcement, it seems that if I were to write a backend that minimally implements the ~200 PrimTorch operators, that would be sufficient. However, I cannot seem to find any further directions on how to go about doing this.
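In case it helps anyone reading later: one way to experiment with this today is to register a custom `torch.compile` backend and wrap it with AOTAutograd, which lowers the captured graph to ATen IR before your compiler sees it. This is a minimal sketch, assuming the internal `torch._dynamo.backends.common.aot_autograd` helper (present in PT2.0, but not a stable public API); `my_compiler` is a hypothetical name.

```python
import torch
from torch._dynamo.backends.common import aot_autograd

# Hypothetical compiler hook: receives an ATen-level fx.GraphModule
# after AOTAutograd has traced the function.
def my_compiler(gm: torch.fx.GraphModule, example_inputs):
    print(gm.graph)    # inspect the ATen ops your backend would need to handle
    return gm.forward  # fall back to eager execution for now

# aot_autograd lowers the dynamo-captured graph to ATen IR before
# invoking our compiler; a decomposition table can shrink the op set further.
my_backend = aot_autograd(fw_compiler=my_compiler)

def f(x, y):
    return torch.relu(x) + y

compiled_f = torch.compile(f, backend=my_backend)
x, y = torch.randn(8), torch.randn(8)
out = compiled_f(x, y)
```

A real backend would return a callable that executes the graph on the target hardware instead of `gm.forward`.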


Hi, just wanted to bump this post again and see if there is any info available on custom backend integration through primtorch.

It would be interesting to get a pointer here, agreed. I guess the general question is “how do we best interpose some graph transforms before our custom backend”?
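To make "interpose some graph transforms" concrete, here is a toy sketch using `torch.fx` alone: trace a function, rewrite nodes in its graph, and recompile, exactly the kind of pass a backend could run before code generation. The add-to-sub substitution is purely illustrative.

```python
import torch
import torch.fx

def fn(x):
    return torch.add(x, 1.0)

gm = torch.fx.symbolic_trace(fn)

# Toy transform: replace every torch.add call with torch.sub, just to
# show the graph can be rewritten before handing it to a backend.
for node in gm.graph.nodes:
    if node.op == "call_function" and node.target is torch.add:
        node.target = torch.sub
gm.recompile()

x = torch.full((3,), 2.0)
result = gm(x)  # now computes x - 1.0 instead of x + 1.0
```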

+1. I’m experimenting with an out-of-tree backend using PrivateUse1 as a dispatch key. I’ve started with eager mode and ATen kernels.
At first it looked like it would be sufficient to implement only the operators declared with “dispatch”: “True” and “default”: “False”, based on Extending dispatcher for a new backend in C++ — PyTorch Tutorials 1.13.1+cu117 documentation. There are ~1k of them. However, this ended in a RuntimeError: more kernels need to be implemented to run a neural network, e.g.:
aten::convolution_overrideable … “dispatch”: “True”, “default”: “True”
aten::cross_entropy_loss … “dispatch”: “False”, “default”: “True”
Now it feels like I’ll need to implement all ~3k kernels to make it work.

I noticed the PrimTorch project, which aims to reduce the number of must-have kernels. Is there any way to use PrimTorch in eager mode without modifying the original Python code of a neural network?
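Not quite eager mode, but one way to get a decomposed graph without touching the model code is to trace it with `make_fx` and a decomposition table, so composite ATen ops are rewritten into a smaller primitive set. A sketch, assuming the internal `make_fx` and `core_aten_decompositions` helpers (available in PT2.0, but subject to change); `net` is just a stand-in for an unmodified model function.

```python
import torch
from torch.fx.experimental.proxy_tensor import make_fx
from torch._decomp import core_aten_decompositions

# An unmodified model function; a real nn.Module's forward works the same way.
def net(x):
    return torch.nn.functional.gelu(x).sum()

x = torch.randn(4)

# Trace to ATen ops, applying the core-ATen decomposition table so
# composite ops are rewritten into the reduced operator set.
gm = make_fx(net, decomposition_table=core_aten_decompositions())(x)
print(gm.graph)  # the decomposed graph a minimal backend would consume
```

Note this captures a graph for a specific input, so it is graph capture rather than true op-by-op eager execution.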