How to use a TorchRL replay buffer with multiprocessing?

There are multiple processes and I want them all to use the same replay buffer. For example, 5 of them produce and send data, and 1 of them samples from that data.

I can do this with the built-in queue, which is process safe, but then I need to get all the data, sample from it, and put it back again.

I wanted to use the replay buffer from TorchRL and tried multiple things, especially with LazyMemmapStorage, but always got:
RuntimeError: Cannot sample from an empty storage.

Is there a way to use the replay buffer from TorchRL with Python multiprocessing in a process-safe way?

@vmoens said: "Regarding the distributed side of things, LazyMemmapStorage paves the way to shared storages across can be placed anywhere on the available partitions: if nodes have access to a common space with reasonably fast common storage, they will all be capable of writing items to the buffer and access it at low cost. Stay tuned to hear more about this!"

But I couldn’t find a way to make it work.

Source: [RFC] TorchRL Replay buffers: Pre-allocated and memory-mapped experience replay

Thank you for reading!

Hey! Do you have an example of what you’re doing to share?

Hi, thank you for your attention.

Here is a basic example: there are 6 CPU processes. Five of them create data and feed it to the buffer. One of them learns from that data by sampling from the buffer. I need a shared array, preferably one that can hold a dictionary. I want to sample from all of the data, so I need access to the whole buffer at once. I thought of using LazyMemmapStorage with a disk location specified, but I couldn’t make it work.

In the paper they used Reverb from DeepMind.
Is there a way to do this with TorchRL currently?

I think I see where the problem might be:
the data of the replay buffer will be accessible to all processes, but not the metadata of the buffer (cursor etc).
So when each process feeds the buffer, it will overwrite the data written by other processes!
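To make the overwrite concrete, here is a minimal plain-Python sketch of the situation described above. The `Writer` class is a hypothetical stand-in, not a TorchRL class, and the list stands in for the shared data. Each writer keeps its own private cursor, so the second writer starts again at index 0. The same effect may also be behind the "empty storage" error, since the sampling process's own bookkeeping never advances, but that part is an assumption.

```python
class Writer:
    """Toy writer with a private cursor (hypothetical, not a TorchRL class)."""

    def __init__(self, storage):
        self.storage = storage  # data visible to every writer
        self.cursor = 0         # metadata private to this writer

    def extend(self, items):
        for item in items:
            # Write at this writer's own cursor, wrapping around when full.
            self.storage[self.cursor % len(self.storage)] = item
            self.cursor += 1


storage = [None] * 4            # stands in for the shared (e.g. memory-mapped) data
writer_a = Writer(storage)
writer_b = Writer(storage)

writer_a.extend(["a0", "a1"])
writer_b.extend(["b0", "b1"])   # writer_b's cursor is still 0, so it overwrites

print(storage)  # ['b0', 'b1', None, None]
```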

The solution we use is to rely on RPC, where each process calls the replay buffer extend method in a centralized way.
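The centralized idea can be sketched without RPC. The example below is not the RPC setup mentioned above: it swaps in a standard `multiprocessing.Queue`, and `ToyBuffer`, `producer` and `learner` are hypothetical names used only for illustration. Five producers push batches of dictionaries to the queue, and a single process owns the buffer, so only one cursor ever exists. With TorchRL, the `extend` and `sample` calls would presumably go to a real replay buffer instead of the toy class.

```python
import multiprocessing as mp
import random


class ToyBuffer:
    """Plain-Python stand-in for a replay buffer (hypothetical, not TorchRL)."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.data = []
        self.cursor = 0

    def extend(self, items):
        for item in items:
            if len(self.data) < self.max_size:
                self.data.append(item)
            else:
                self.data[self.cursor] = item  # overwrite the oldest slot
            self.cursor = (self.cursor + 1) % self.max_size

    def sample(self, batch_size):
        # Uniform sampling with replacement over everything stored so far.
        return random.choices(self.data, k=batch_size)


def producer(worker_id, queue, n_batches):
    for step in range(n_batches):
        # Each batch is a list of dict transitions; dicts pickle through the queue.
        queue.put([{"worker": worker_id, "step": step}])
    queue.put(None)  # sentinel: this producer is done


def learner(queue, n_producers, batch_size=4):
    buffer = ToyBuffer(max_size=1000)  # only this process touches the buffer
    finished = 0
    while finished < n_producers:
        batch = queue.get()
        if batch is None:
            finished += 1
            continue
        buffer.extend(batch)
        sample = buffer.sample(batch_size)  # a training step would use `sample`
    return len(buffer.data)


if __name__ == "__main__":
    n_producers = 5
    queue = mp.Queue()
    workers = [
        mp.Process(target=producer, args=(i, queue, 10)) for i in range(n_producers)
    ]
    for w in workers:
        w.start()
    stored = learner(queue, n_producers)  # the learner runs in the main process
    for w in workers:
        w.join()
    print(stored)  # 50
```

Because the learner keeps everything it receives in its own buffer, there is no need to drain the queue, sample, and put the data back.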

There is an example here

That being said, we could think about a pure mp-based RB implementation if you think that would be useful!

This PR or a version of it should solve your issue!
If you could test the changes in test/ and tell me if that is the usage you wanted to make it’d be awesome!