Hi,
To summarize: you only need to call `.cuda()` on the criterion if the criterion has parameters (or other tensors of its own, such as class weights).
Here are the posts related to this situation: "Move the loss function to GPU" and "Why `criterion.cuda()` is not needed but `model.cuda()` is?"
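A small sketch of the difference, in case it helps. The names `plain` and `weighted` and the weight values are just made up for illustration, and I use `.to(device)` instead of `.cuda()` so the snippet also runs on a machine without a GPU:

```python
import torch
import torch.nn as nn

# Fall back to CPU so the example runs anywhere (assumption for illustration).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A loss with no tensors of its own: there is nothing to move to the GPU.
plain = nn.CrossEntropyLoss()
plain_state = list(plain.parameters()) + list(plain.buffers())

# A loss that holds a tensor: the class weights are stored inside the module,
# so the module has to be moved to the same device as the inputs.
weighted = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 2.0, 3.0]))
weighted = weighted.to(device)  # same effect as .cuda() when device is a GPU
weighted_state = list(weighted.parameters()) + list(weighted.buffers())

# Dummy batch: 4 samples, 3 classes, created directly on the target device.
logits = torch.randn(4, 3, device=device)
targets = torch.tensor([0, 2, 1, 2], device=device)

loss_plain = plain(logits, targets)        # works without moving the criterion
loss_weighted = weighted(logits, targets)  # needs the weights on `device`

print(len(plain_state), len(weighted_state))
```

If `weighted` were left on the CPU while `logits` and `targets` live on the GPU, the forward call would fail with a device mismatch, which is the case where moving the criterion matters.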
Best