A similar error has been posted in several other topics, but I don’t see how the recommendations from those threads apply to my case. In short, I was following this example and tried to use my own environment class. I’d appreciate it if someone could point out the mistake I am making. Please find the source t002.py, requirements.txt and a csv file with data in this GitHub project. Thank you for your time.
You might want to double check your code to make sure that all the Tensors and models are of the same dtype. In this case it looks like you expect float everywhere.
You should be especially careful if you create Tensors from numpy arrays: numpy floating-point arrays are double precision (float64) by default, so the resulting Tensors are double as well, and you will have to explicitly convert them to float (float32).
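To illustrate the point about numpy arrays, here is a minimal sketch. The array values and the `Linear` layer are made up for the example and are not taken from t002.py; it only assumes that PyTorch's default dtype has not been changed from float32.

```python
import numpy as np
import torch

# NumPy creates float64 (double precision) arrays from Python floats by default.
arr = np.array([0.5, 1.0, 1.5])

# torch.from_numpy keeps the NumPy dtype, so this Tensor is torch.float64.
t_double = torch.from_numpy(arr)

# Explicit cast to float32, which is what model parameters use by default.
t_float = torch.from_numpy(arr).float()

# Hypothetical model for illustration: its weights are float32.
layer = torch.nn.Linear(3, 1)

# Passing the float32 Tensor works; passing t_double here would raise
# a dtype mismatch error instead.
out = layer(t_float)

print(t_double.dtype, t_float.dtype, out.dtype)
```

Printing `.dtype` for every Tensor that reaches the model or the loss is usually the quickest way to find the one that is still double.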
For anyone interested: explicitly casting the reward returned by the environment’s step function to float32 allowed the dqn.update function to succeed.
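A rough sketch of that kind of fix is shown here. `ToyEnv`, its `prices` array and the reward formula are hypothetical stand-ins, not the actual class from t002.py; the only part that mirrors the fix is the cast of the reward to float32 before it is returned from `step`.

```python
import numpy as np


class ToyEnv:
    """Hypothetical stand-in for a custom environment (not the original class)."""

    def __init__(self):
        # Data loaded via NumPy is float64 unless a dtype is given.
        self.prices = np.array([1.0, 1.5, 1.25])
        self.t = 0

    def step(self, action):
        self.t += 1
        # Cast the observation to float32 as well, for consistency.
        obs = np.float32(self.prices[self.t])
        # Arithmetic on float64 values yields a float64 reward...
        raw_reward = self.prices[self.t] - self.prices[self.t - 1]
        # ...so cast it explicitly before returning it.
        reward = np.float32(raw_reward)
        done = self.t == len(self.prices) - 1
        return obs, reward, done, {}


env = ToyEnv()
obs, reward, done, info = env.step(0)
print(type(reward), reward)
```

Casting once at the environment boundary keeps every value that later ends up in a Tensor in float32, so the update step does not have to convert anything.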