Hello, while working on an RL project, I need a way to share a model’s state dict across multiple processes.
I found an A3C implementation that has this feature, but I have some questions about it.
In the above repo, the author defines a model instance in CPU shared memory, which all the subprocesses share. Each subprocess has its own model instance on the GPU, and it loads the state from the globally shared CPU model instance without any lock.
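As I understand it, the pattern looks roughly like the sketch below (a minimal single-process illustration, not the repo’s actual code; the worker function and model shape are made up, and I keep the local copy on CPU instead of GPU for simplicity):

```python
import torch
import torch.nn as nn

# Globally shared model: share_memory() moves the parameter storage
# into shared memory, so forked workers see the very same tensors.
shared_model = nn.Linear(4, 2)
shared_model.share_memory()  # in-place; call before spawning workers

def worker(shared_model):
    # Each worker keeps its own private copy (on GPU in the repo).
    local_model = nn.Linear(4, 2)
    # Lock-free read of the shared parameters into the local copy.
    local_model.load_state_dict(shared_model.state_dict())
    return local_model

local = worker(shared_model)
assert shared_model.weight.is_shared()            # parameters live in shared memory
assert torch.equal(local.weight, shared_model.weight)  # local copy matches
```

In the real code this `worker` body runs inside each `torch.multiprocessing` process, and the load happens at the start of every rollout.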
So my question is: is it guaranteed to be safe to access the parameters of the model in CPU shared memory simultaneously from multiple processes without any lock or synchronization?
I’m pretty sure it’s fine, since the model parameters are treated as read-only here, but I just want to double-check.
The next question is my main interest. Here, the author simply copies the gradients of the sub-process model to the shared model without any lock, and the optimization step is also done without synchronization. How is this possible? Why doesn’t everything get messed up when multiple processes overwrite the gradients of the shared model without any locking?
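To make the question concrete, here is a minimal single-process sketch of the gradient-sharing step as I read it (my own toy model and loss, not the repo’s code; the `_grad` assignment mirrors the common A3C pattern of pointing the shared parameters’ gradients at the worker’s gradient tensors):

```python
import torch
import torch.nn as nn

# Shared model and an optimizer that updates the shared parameters.
shared_model = nn.Linear(4, 2)
shared_model.share_memory()
optimizer = torch.optim.SGD(shared_model.parameters(), lr=0.1)

# Worker's private copy, synced from the shared model.
local_model = nn.Linear(4, 2)
local_model.load_state_dict(shared_model.state_dict())

# Compute gradients on the local copy only.
loss = local_model(torch.randn(8, 4)).pow(2).mean()
loss.backward()

before_step = shared_model.weight.detach().clone()

# Hand the local gradients to the shared model and step -- no lock.
# With many workers doing this concurrently, writes can race and an
# update can occasionally be lost or mixed (Hogwild!-style training).
for local_p, shared_p in zip(local_model.parameters(),
                             shared_model.parameters()):
    shared_p._grad = local_p.grad
optimizer.step()

assert not torch.equal(shared_model.weight, before_step)  # shared weights moved
```

My understanding is that this tolerates races on purpose: each write is only slightly stale or partially overwritten noise on top of SGD, which is already stochastic, but I’d like confirmation of why that’s considered acceptable here.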
If I’ve missed any important points, please let me know.