Multiprocessing with custom class and tensors on GPU

I am writing a DDPG reinforcement learning agent and want to offload the training to another process. So that the worker can place experiences in the replay buffer and the optimiser can train on those experiences, I am looking to share the ReplayBuffer object between processes. In Python multiprocessing this is done by subclassing BaseManager (class MyManager(BaseManager)), and I’m using it as follows:

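from multiprocessing.managers import BaseManager
class MyManager(BaseManager): pass
MyManager.register("ReplayBuffer", ReplayBuffer)
myManager = MyManager()
myManager.start()  # the server process must be running before a proxy can be created
replay_proxy = myManager.ReplayBuffer(size=100_000)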

After this I can call replay_proxy.add(.... However, while this passes numpy arrays and other objects without problems, it won’t pass a tensor. torch.multiprocessing extends the standard multiprocessing module, but it doesn’t include BaseManager, so I’m not sure how to get around this.
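For anyone who wants to reproduce the setup, here is a minimal self-contained sketch of the manager pattern described above. The ReplayBuffer shown is a hypothetical stand-in, not my real class: its internals, the single-argument add, sample and count are assumptions made only so the example runs. Every argument passed to a proxy method is pickled and sent to the manager's server process, which is where the real buffer object lives.

```python
import random
from collections import deque
from multiprocessing.managers import BaseManager


class ReplayBuffer:
    """Hypothetical stand-in for the real buffer: a bounded FIFO of transitions."""

    def __init__(self, size):
        self._storage = deque(maxlen=size)

    def add(self, transition):
        # Oldest entries are dropped automatically once `size` is reached.
        self._storage.append(transition)

    def sample(self, batch_size):
        # Return up to `batch_size` randomly chosen transitions.
        items = list(self._storage)
        return random.sample(items, min(batch_size, len(items)))

    def __len__(self):
        return len(self._storage)

    def count(self):
        # Auto-generated proxies expose only public methods, so len() gets a wrapper.
        return len(self)


class MyManager(BaseManager):
    pass


# Register at module level so the server process knows how to build the object.
MyManager.register("ReplayBuffer", ReplayBuffer)


def main():
    # The context manager starts the server process and shuts it down on exit.
    with MyManager() as manager:
        replay_proxy = manager.ReplayBuffer(size=100_000)
        # Plain Python data (and numpy arrays) pickle without trouble.
        replay_proxy.add(([0.1, 0.2], 1, 0.5, [0.3, 0.4], False))
        print(replay_proxy.count())
        print(replay_proxy.sample(1))


if __name__ == "__main__":
    main()
```

The `if __name__ == "__main__"` guard matters on platforms that start processes with spawn, since the server process re-imports the main module.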

If anyone has a clever trick I’d appreciate it, because I hate doing the whole ‘move to CPU and convert to numpy’ routine to write the tensor to the replay buffer, only to convert it back to a tensor and move it to the GPU as soon as I need it.