My DQN doesn't learn

moskomule · September 29, 2017, 4:59am

Hi, I’m new to reinforcement learning and am trying to implement DQN as proposed in the original paper.

But as the title says, this DQN doesn’t seem to learn even after 1 million steps. The target value and the Q value do get close to each other, but the accumulated reward (and the loss) doesn’t increase.
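To be clear about which quantities are meant by "target value" and "Q value", here is a minimal sketch of the standard one-step DQN target and its squared error. It is not the code in question: the function names `dqn_target` and `squared_td_error`, the default `gamma`, and the use of plain Python lists are assumptions made only for illustration.

```python
def dqn_target(reward, next_q_values, done, gamma=0.99):
    """One-step DQN target (hypothetical helper for illustration).

    reward:        reward received for the transition
    next_q_values: target-network Q values for every action in the next state
    done:          True if the transition ended the episode
    """
    if done:
        # No bootstrapping past the end of an episode.
        return reward
    # Bootstrap from the best next-state action under the target network.
    return reward + gamma * max(next_q_values)


def squared_td_error(q_value, target):
    """Squared difference between the predicted Q value and the target."""
    return (target - q_value) ** 2


# Example transition: the Q value is compared against the bootstrapped target.
target = dqn_target(reward=1.0, next_q_values=[0.5, 2.0, 1.0], done=False, gamma=0.9)
loss = squared_td_error(q_value=2.0, target=target)
```

Because the target is built from the network's own estimates, the Q value and the target can agree closely even when the policy is poor, so a small gap between them does not by itself show that the agent is learning.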

I can’t figure out what is wrong. I would be grateful if you could point out the problem in my code. Any advice is also welcome.

Thank you in advance.