Hi, I want to add a package that allows tensors and modules to be sent to a non-GPU accelerator, similar to pytorch/xla, by creating a new device type.
I believe this device would need its own allocator, but I am not sure which memory management primitives are required to make this work.
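For context, here is roughly the kind of thing I imagine the allocator would look like, a minimal sketch assuming the `PrivateUse1` device type that PyTorch reserves for out-of-tree backends. The `my_accel_malloc`/`my_accel_free` calls are placeholders for a hypothetical accelerator runtime, and exact `c10::Allocator` signatures (e.g. constness of `allocate`) vary across PyTorch versions:

```cpp
#include <c10/core/Allocator.h>
#include <c10/core/Device.h>
#include <c10/core/DeviceType.h>

// Hypothetical runtime entry points for the accelerator (placeholders).
void* my_accel_malloc(size_t nbytes);
void my_accel_free(void* ptr);

// Allocator for the new device, keyed on the PrivateUse1 device type
// that PyTorch reserves for out-of-tree backends.
struct MyAccelAllocator final : c10::Allocator {
  c10::DataPtr allocate(size_t nbytes) const override {
    void* data = my_accel_malloc(nbytes);  // hypothetical runtime call
    // DataPtr carries the raw pointer, a context pointer, a deleter,
    // and the device the memory lives on.
    return {data, data, &raw_delete,
            c10::Device(c10::DeviceType::PrivateUse1, /*index=*/0)};
  }

  static void raw_delete(void* ptr) {
    my_accel_free(ptr);  // hypothetical runtime call
  }

  c10::DeleterFnPtr raw_deleter() const override {
    return &raw_delete;
  }
};

static MyAccelAllocator g_my_accel_allocator;
// Make this allocator the one used for PrivateUse1 tensors.
REGISTER_ALLOCATOR(c10::DeviceType::PrivateUse1, &g_my_accel_allocator);
```

Is something along these lines the right direction, or are there more primitives (pinned memory, caching, streams, copy kernels) that the device layer has to provide beyond the allocator itself?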
Appreciate any help