But then I get this error message: RuntimeError: Expected object of device type cuda but got device type cpu for argument #3 'index' in call to _th_index_select
Could you run your code with CUDA_LAUNCH_BLOCKING=1 set (i.e. CUDA_LAUNCH_BLOCKING=1 python your_code.py) and see where the error comes from, please?
The CUDA API is asynchronous by default, so the stack trace above might be wrong.
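If prefixing the command is awkward (for example when launching from an IDE or through a batch scheduler), one alternative is to set the variable from inside the script itself. This is only a minimal sketch, assuming the lines sit at the very top of the entry script so they run before any CUDA work starts:

```python
import os

# Assumption: this runs at the very top of the entry script, before any CUDA
# work (ideally before `import torch`), so the setting is already in place
# when CUDA is initialized.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# Print the value so the job log confirms the variable reached this process.
print("CUDA_LAUNCH_BLOCKING =", os.environ.get("CUDA_LAUNCH_BLOCKING"))
```

With the variable set, kernel launches run synchronously, so the Python stack trace should point at the call that actually failed. Expect the run to be slower, so it is best used only while debugging.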
I access the command line via PuTTY.
I have just run sbatch main.sh CUDA_LAUNCH_BLOCKING=1
Not sure this is what you meant. I normally run my code from VS Code rather than the command line. However, I also run it on a remote server, as it takes time to run.
Great. So this means that the problem comes from the nll loss.
The most common reason here is that Y_d contains invalid indices: either negative, or greater than or equal to the number of scores (classes) in the output.
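A quick way to check this is to scan the targets before they reach the loss. The helper below is a hypothetical sketch in plain Python (find_invalid_targets is not a PyTorch function); it assumes the targets have been converted to a flat list of ints and that num_classes is the size of the class dimension of the model output. The default ignore_index of -100 matches the default used by nll_loss, so that value is skipped:

```python
def find_invalid_targets(targets, num_classes, ignore_index=-100):
    """Return (position, value) pairs for targets outside [0, num_classes)."""
    return [
        (pos, t)
        for pos, t in enumerate(targets)
        if t != ignore_index and not (0 <= t < num_classes)
    ]


# Made-up example data: 3 classes, so the valid labels are 0, 1 and 2.
example_targets = [0, 2, 3, -1, 1]
bad = find_invalid_targets(example_targets, num_classes=3)
print(bad)  # [(2, 3), (3, -1)]
```

With tensors, something like find_invalid_targets(Y_d.flatten().tolist(), output.size(1)) should work, assuming output has shape (batch, num_classes). An empty list means the targets are in range and the problem lies elsewhere.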