I want to train the model on the GPU. Besides moving the input and the model to CUDA, what else needs to be on the GPU?
All data needed for an operation should be on the same device.
So besides the model and input, you would have to push the target to the same device in order to calculate the loss.
This should be sufficient for a vanilla training loop.
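Here is a minimal sketch of what that could look like. The model, the random data, and the hyperparameters are placeholders I made up for illustration, not taken from your code, and the device falls back to the CPU if no GPU is available:

```python
import torch
import torch.nn as nn

# Use the GPU if one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical toy model; .to(device) moves its parameters and buffers.
model = nn.Linear(10, 2).to(device)
criterion = nn.CrossEntropyLoss()
# Create the optimizer after moving the model, so it references the moved parameters.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Placeholder data created on the CPU, as a DataLoader would typically yield it.
inputs = torch.randn(8, 10)
targets = torch.randint(0, 2, (8,))

for step in range(3):
    # Push both the input and the target to the same device as the model.
    x = inputs.to(device)
    y = targets.to(device)

    optimizer.zero_grad()
    output = model(x)             # output lives on `device`
    loss = criterion(output, y)   # the target must be on `device` too
    loss.backward()
    optimizer.step()
```

Note that `tensor.to(device)` returns a new tensor rather than moving it in place, so the result has to be assigned, while `model.to(device)` moves the module's parameters in place.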
Your reply helped me a lot!