Based on the error message, it seems the actor is producing NaN outputs after a few iterations of training. Are you seeing the value range of its output grow during training, which could eventually overflow?
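
One way to check this is to log the minimum and maximum of the actor's output at every iteration and note when the first non-finite value shows up. The sketch below is only an illustration in plain Python: `output_range`, `first_bad_iteration`, and the numbers in `history` are made up for the example and are not taken from your code or from any particular library. In practice you would pass in the flattened actor outputs from your own training loop.

```python
import math


def output_range(values):
    """Return (min, max, all_finite) for a flat iterable of numbers.

    min and max are computed over the finite entries only, so the range
    stays readable even after NaN or inf values start to appear.
    """
    vals = [float(v) for v in values]
    finite = [v for v in vals if math.isfinite(v)]
    all_finite = len(finite) == len(vals)
    lo = min(finite) if finite else float("nan")
    hi = max(finite) if finite else float("nan")
    return lo, hi, all_finite


def first_bad_iteration(history):
    """Return the index of the first iteration containing NaN or inf, else None."""
    for i, outputs in enumerate(history):
        _, _, ok = output_range(outputs)
        if not ok:
            return i
    return None


# Hypothetical actor outputs per training iteration (illustrative values only):
# the range grows steadily, then non-finite values appear.
history = [
    [0.1, -0.2, 0.3],
    [5.0, -4.0, 7.5],
    [1e20, -3e19, 2e20],
    [float("inf"), float("nan"), 1e38],
]

for i, outputs in enumerate(history):
    lo, hi, ok = output_range(outputs)
    print(f"iter {i}: min={lo:.3g} max={hi:.3g} all_finite={ok}")

print("first non-finite iteration:", first_bad_iteration(history))
```

If the logged range keeps widening in the iterations before the first non-finite one, that points to the outputs blowing up and overflowing rather than to a single bad input.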