Iter.device(arg).is_cuda() INTERNAL ASSERT FAILED

Pranavk · November 27, 2020, 10:00am

Hi, I am using the PPO algorithm.

During back-propagation, this step is executed:
" return torch._C._nn.mse_loss(expanded_input, expanded_target, _Reduction.get_enum(reduction))"

and the error is
RuntimeError: iter.device(arg).is_cuda() INTERNAL ASSERT FAILED at "/opt/conda/conda-bld/pytorch_1603728993639/work/aten/src/ATen/native/cuda/Loops.cuh":94, please report a bug to PyTorch.

Can anyone help me with this? Thank you.

ptrblck · November 28, 2020, 7:26am

Could you rerun your code with CUDA_LAUNCH_BLOCKING=1 python script.py args and check whether you get any error message?
If so, could you post it here please?
Also, make sure to use the latest PyTorch version.

Pranavk · November 28, 2020, 4:07pm

Thank you for the reply. I am using PyTorch 1.7.0 and running it in a notebook, so I do not know how to pass this argument there.

ptrblck · November 29, 2020, 7:43am

You could try to use os.environ to set this env var at the beginning of the notebook.
Note that you would have to set it before importing PyTorch, as it won’t have any effect otherwise.
Alternatively, you could export the notebook as a Python script and run it in the terminal, which should work as well.
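
As a minimal sketch of the os.environ approach described above, a first notebook cell could look like the following. The helper name enable_blocking_launches is hypothetical and only used for illustration; the check against sys.modules is an added convenience to warn when PyTorch was already imported in the running kernel.

```python
import os
import sys


def enable_blocking_launches():
    """Set CUDA_LAUNCH_BLOCKING=1 for this process (hypothetical helper).

    Returns True if torch had not been imported yet when the variable was set,
    False if it was already imported (restart the kernel in that case).
    """
    torch_already_imported = "torch" in sys.modules
    os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
    return not torch_already_imported


set_in_time = enable_blocking_launches()
if not set_in_time:
    print("torch is already imported; restart the kernel and run this cell first")

# Only after this point should the notebook run:
#   import torch
```

If the message is printed, restarting the kernel and running this cell before any other cell should put the variable in place before PyTorch is loaded.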

Pranavk · November 30, 2020, 9:16am

Hi, I did that, but I still get the same error.

Error: RuntimeError: iter.device(arg).is_cuda() INTERNAL ASSERT FAILED at "/opt/conda/conda-bld/pytorch_1603728993639/work/aten/src/ATen/native/cuda/Loops.cuh":94, please report a bug to PyTorch.

ptrblck · December 1, 2020, 4:49am

OK, I think the blocking launch didn’t work or maybe Jupyter is not forwarding the stack trace.
Could you post an executable code snippet to reproduce this issue and give us information about your setup, i.e. the PyTorch, CUDA, and cuBLAS versions, how you installed PyTorch, and which GPU you are using?
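
One possible way to gather part of this information from inside the notebook is sketched below. The function name describe_setup is hypothetical; the sketch reports the PyTorch version, the CUDA version the build was compiled against, the cuDNN version, and the GPU name, and it does not cover the cuBLAS version or the install method.

```python
def describe_setup():
    """Collect basic version and GPU information for a bug report."""
    try:
        import torch
    except ImportError:
        return {"torch": None}  # PyTorch is not installed in this environment

    info = {
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,         # None for CPU-only builds
        "cudnn": torch.backends.cudnn.version(),  # None if cuDNN is unavailable
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        info["gpu"] = torch.cuda.get_device_name(0)  # name of the first GPU
    return info


for key, value in describe_setup().items():
    print(f"{key}: {value}")
```

Running python -m torch.utils.collect_env in a terminal prints a more complete report that can be pasted into a post.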

Pranavk · December 1, 2020, 9:07am

Hi, thank you, it was solved. I had a different issue, I guess.

berlin · August 17, 2023, 9:08am

How did you solve the problem?