How can I avoid this error?

iteration 803…
Traceback (most recent call last):
File “”, line 229, in
File “”, line 134, in train
policy_loss = policy_loss.sum()
RuntimeError: value cannot be converted to type double without overflow: -inf


The error occurs because your loss is -inf. Is that expected? If not, you should look into why your loss goes to -inf.
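To track down where the -inf first appears, it can help to check intermediate tensors before calling `.sum()`. Below is a minimal sketch, not your actual training code: `check_finite`, `probs`, and `log_probs` are hypothetical names, and the assumption that the -inf comes from taking the log of a zero probability is only one common cause in policy-gradient losses, not something confirmed in this thread.

```python
import torch

def check_finite(name, tensor):
    """Raise a readable error if any element of `tensor` is inf or NaN."""
    bad = ~torch.isfinite(tensor)
    if bad.any():
        # Flat indices of the offending elements, to help locate the bad step.
        idx = bad.flatten().nonzero().flatten().tolist()
        raise ValueError(f"{name} has non-finite values at flat indices {idx}")
    return tensor

# Hypothetical action probabilities; one of them is exactly 0.
probs = torch.tensor([0.5, 0.0, 0.25])

# log(0) is -inf, so any sum that includes this term is -inf as well.
log_probs = torch.log(probs)

try:
    check_finite("log_probs", log_probs)
except ValueError as err:
    print(err)

# One possible workaround (an assumption, adjust eps to your problem):
# clamp probabilities away from 0 before taking the log.
eps = 1e-8
safe_log_probs = torch.log(torch.clamp(probs, min=eps))
check_finite("safe_log_probs", safe_log_probs)
```

If the check fires on an earlier tensor than `policy_loss`, that tensor is the one to investigate. Clamping only hides the symptom, so it is still worth finding out why the probability reaches 0.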

The loss is about 200000 at the last step.

After 800 steps, the loss goes from 700000 to 200000.