How to Vectorize/Parallelize Reinforcement Learning Environments?

I feel like this is such an obvious problem, but I can't find any clear answers. I have a Python class that conforms to OpenAI's environment API, but it's written to receive one action per step and return one reward per step. How do I parallelize this environment? I haven't been able to find any clear answer online. A few people suggested baselines or stable_baselines, but these don't appear to work with PyTorch, and they're currently broken by the switch to TensorFlow 2.0.

There are some other RL libraries (e.g., …), but they don't appear to have professional support, so I'm concerned that if I use one, it'll quickly become unusable.

I’m not an RL expert, but Catalyst is in the PyTorch ecosystem, so it might get some future support.
Have you had a look at this library already, and would it fit your needs?

I decided to simply vectorize my environments, and that appears to have given me the speed boost I need. I used the code provided here:
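(The original link isn't preserved here.) For reference, a minimal sketch of this kind of vectorization, assuming the classic Gym `reset()`/`step()` API: a wrapper holds N environment instances, takes a batch of actions, steps each sub-environment in a loop, and returns batched observations, rewards, and dones. `ToyEnv` and `SequentialVecEnv` are hypothetical names for illustration, not part of any library.

```python
class ToyEnv:
    """Hypothetical stand-in for a Gym-style env: reset() returns an
    observation; step(action) returns (obs, reward, done, info)."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t  # observation is just the timestep here

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return self.t, float(action), done, {}


class SequentialVecEnv:
    """Vectorized wrapper: steps each sub-env in a Python loop and
    returns batched results. Sub-envs are auto-reset on done so the
    batch always contains valid observations."""

    def __init__(self, env_fns):
        # env_fns: list of zero-argument callables that build an env
        self.envs = [fn() for fn in env_fns]

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        obs, rewards, dones, infos = [], [], [], []
        for env, action in zip(self.envs, actions):
            o, r, d, info = env.step(action)
            if d:
                o = env.reset()  # auto-reset finished sub-env
            obs.append(o)
            rewards.append(r)
            dones.append(d)
            infos.append(info)
        return obs, rewards, dones, infos


vec = SequentialVecEnv([ToyEnv for _ in range(4)])
batch_obs = vec.reset()
batch_obs, batch_rewards, batch_dones, _ = vec.step([1.0, 1.0, 1.0, 1.0])
```

This sequential version still steps envs one at a time in Python; it mainly lets the agent consume batched tensors. For CPU-bound environments, the same interface can be backed by `multiprocessing` workers (this is roughly what libraries call a "subproc" vec env) for real parallel speedups.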