PPO + LSTM working code


I am looking for a PPO + LSTM implementation.
Can someone point me to available, working PyTorch code for PPO + LSTM?


I am not sure if it’s too late to answer this, but I came across this PPO + LSTM implementation: https://github.com/seungeunrho/minimalRL/blob/master/ppo-lstm.py
The code is quite simple and easy to follow.
Hope it helps.

Hi @granth_jain,
did you find a suitable implementation?

Unfortunately, the implementation proposed above is not really a good choice: it uses truncated BPTT with a sequence length of 1, so the recurrent policy never learns to carry information across more than one step during training. CartPole is also not a good environment for testing whether a recurrent policy actually works, even if you mask out the velocities in the agent’s observation space.
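For reference, here is a minimal sketch of the velocity-masking trick mentioned above. It assumes the standard CartPole observation layout `[cart position, cart velocity, pole angle, pole angular velocity]`, where zeroing indices 1 and 3 removes the velocity information and turns the task into a POMDP (the function name `mask_velocities` is my own, not from any library):

```python
import numpy as np

# CartPole observations are [cart position, cart velocity,
# pole angle, pole angular velocity]. Zeroing the velocity
# entries (indices 1 and 3) makes the task partially observable,
# which is the usual motivation for adding an LSTM to the policy.
VELOCITY_INDICES = [1, 3]

def mask_velocities(obs):
    """Return a copy of a CartPole observation with velocities zeroed."""
    obs = np.asarray(obs, dtype=np.float32).copy()
    obs[VELOCITY_INDICES] = 0.0
    return obs
```

In practice you would apply this inside a `gym.ObservationWrapper` so every observation the agent sees is masked; but as the post above notes, even a memoryless policy can score reasonably well on velocity-masked CartPole, so passing this test is weak evidence that the recurrence is doing anything.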