My DQN agent is not learning

Hi! I tried to implement my first DQN agent for Gym CartPole, but it doesn't seem to learn: the score at the end of training is worse than random play.
Here is what I have tried so far:

  • changing some parameters: learning rate, epsilon-greedy parameters, discount rate
  • changing the network architecture by making it much bigger
  • removing the target network

None of these seem to work, and I am very confused about what I'm doing wrong.
Thanks in advance for your help!

Do you have a chart of the training progress? In my experience, DQNs often improve up to a point and then get much worse if you keep training them, so it helps to save checkpoints at milestones.
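To make the checkpoint idea concrete, here is a minimal sketch in plain Python. It assumes you log one score per episode; the `BestSnapshot` class, its `window` parameter, and the dictionary standing in for the network weights are all hypothetical names for illustration, not part of any library. It keeps a copy of the parameters from the point where the moving-average score was highest, so a later collapse does not cost you the good agent.

```python
import copy
from collections import deque


class BestSnapshot:
    """Keep a copy of the parameters from the best moving-average score (hypothetical helper)."""

    def __init__(self, window=20):
        self.scores = deque(maxlen=window)  # most recent episode scores
        self.best_avg = float("-inf")
        self.best_params = None
        self.best_episode = None

    def update(self, episode, score, params):
        """Record a score; snapshot params if the moving average is a new best."""
        self.scores.append(score)
        if len(self.scores) < self.scores.maxlen:
            return False  # not enough episodes yet for a full window
        avg = sum(self.scores) / len(self.scores)
        if avg > self.best_avg:
            self.best_avg = avg
            self.best_params = copy.deepcopy(params)
            self.best_episode = episode
            return True
        return False


# Made-up scores that rise and then collapse, as described above.
scores = [10, 20, 30, 40, 50, 40, 20, 10, 10, 10]
tracker = BestSnapshot(window=3)
for episode, score in enumerate(scores):
    params = {"w": episode}  # stand-in for the real network weights
    tracker.update(episode, score, params)
```

In a PyTorch training loop you would probably pass `model.state_dict()` as `params`, or write it to disk with `torch.save`, instead of the stand-in dictionary used here.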

The reward handling and the Bellman target are often the weak point and may need some tweaking. This developer was having a similar issue (albeit in Keras): DQN debugging using Open AI gym Cartpole - ADG Efficiency
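For reference, here is a minimal sketch of the standard one-step Bellman target, written in plain Python so it runs without any framework. The function name `td_targets` and its arguments are hypothetical, and this is not taken from the original poster's code. One common mistake it guards against is bootstrapping from the next state after the episode has ended: for terminal transitions the target should be just the reward.

```python
def td_targets(rewards, next_q_values, dones, gamma=0.99):
    """One-step DQN targets: r + gamma * max_a Q(s', a), or just r if s' is terminal.

    rewards:        list of floats, one per transition
    next_q_values:  list of lists, Q-values of the next state for every action
                    (from the target network, if one is used)
    dones:          list of bools, True when the transition ended the episode
    """
    targets = []
    for r, q_next, done in zip(rewards, next_q_values, dones):
        # No bootstrapping from the next state once the episode has ended.
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(r + bootstrap)
    return targets


# Two made-up transitions: the second one ends the episode.
batch_targets = td_targets(
    rewards=[1.0, 1.0],
    next_q_values=[[0.5, 2.0], [3.0, 1.0]],
    dones=[False, True],
)
```

With these inputs the first target is 1.0 + 0.99 * 2.0 and the second is just the reward, 1.0. In a tensor-based implementation, the same masking is usually written as multiplying the bootstrap term by `(1 - done)`, and the next-state Q-values should be computed without tracking gradients.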

So you might need to review and tweak accordingly. DQNs are an ongoing area of research.

Last comment: PyTorch has a DQN tutorial with code you could try. When I ran it, the agent did improve over time.

Minor note here:
We’re working on improving the DQN tutorial, you can check it there:
