According to the DeepMind DQN paper, the error term is clipped between -1 and 1. I am using clamp for that, but doing so prevents me from using a standard loss function like MSELoss.
I don’t know how to do this correctly. If I use the built-in losses, they take two arguments (prediction, target), so I have no way to clamp the per-element difference before the loss is computed. On the other hand, if I compute clamp(target - prediction, min, max) myself, I am left with a single tensor, which I can no longer pass to the built-in losses.
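To make the dilemma concrete, here is a minimal PyTorch sketch of the two approaches (the tensors are made-up dummy values, not from my actual network):

```python
import torch
import torch.nn as nn

prediction = torch.tensor([0.5, 2.0, -1.5], requires_grad=True)
target = torch.tensor([3.0, 1.8, -1.0])

# Approach 1: built-in loss -- takes (prediction, target) directly,
# leaving no place to clamp the per-element error first
loss_mse = nn.MSELoss()(prediction, target)

# Approach 2: clamp the difference myself -- but now I only have a
# single tensor, so I can't feed it into a two-argument loss; the
# best I can do is square and average it by hand
clipped_error = torch.clamp(target - prediction, -1.0, 1.0)
loss_manual = (clipped_error ** 2).mean()
```

Note that approach 2 also feels wrong, because clamp has zero gradient outside [-1, 1], so large errors would stop contributing to the update entirely.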
If I understand correctly, the paper says to clip the update derived from the difference, which probably means the gradient. So this would be another case of gradient clipping, not clipping the loss directly.
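If that reading is right, my guess (an assumption on my part, not something the paper states in these words) is that the Huber loss achieves exactly this: it is quadratic for small errors and linear for large ones, so its gradient with respect to the prediction is the error clipped to [-1, 1]. A quick check with PyTorch's SmoothL1Loss (which with its default beta=1.0 is the Huber loss):

```python
import torch
import torch.nn as nn

prediction = torch.tensor([0.5, 2.0, -1.5], requires_grad=True)
target = torch.tensor([3.0, 1.8, -1.0])

# SmoothL1Loss with beta=1 (the default): 0.5 * err**2 for |err| < 1,
# |err| - 0.5 otherwise, summed over elements
loss = nn.SmoothL1Loss(reduction="sum")(prediction, target)
loss.backward()

# The gradient wrt prediction should equal
# -clamp(target - prediction, -1, 1), i.e. the clipped error
print(prediction.grad)
```

Is this the right way to interpret the paper, or should I instead clip the gradients directly (e.g. with a backward hook on the output)?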