RuntimeError: CUDA error: device-side assert triggered

I’m putting my code here:

with torch.no_grad():
    retrieval_one_hot = torch.zeros(k, 10).cuda()
    for batch_idx, (inputs, targets, indexes) in enumerate(testloader):
        inputs = inputs.cuda()
        targets = targets.cuda()
        batchSize = inputs.size(0)
        features = net(inputs)
        zz = torch.zeros(batchSize * k, 10).cuda()

It fails at the line zz = torch.zeros(batchSize * k, 10).cuda() with the error message “RuntimeError: CUDA error: device-side assert triggered”. Any suggestions?


Please post the full error message!

This suggestion might apply to your problem.


The code runs fine on cpu, but fails to run on gpu.

Debugging CUDA device-side assert in PyTorch
This article would help you to debug your code. If you get a better traceback setting CUDA_LAUNCH_BLOCKING=1, post it.


I am also getting the error on CPU

I think you unintentionally used CUDA tensors. Please move every tensor x to CPU memory:

x = x.cpu()

You should read this post:


The reason you hit this problem is that your “target” labels include negative values such as “-1”.
I found this cause when I ran into the same problem. Hope it helps!
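Not from the original post, just a plain-Python sketch of the kind of sanity check that catches this: classification targets must lie in [0, num_classes - 1], and any label outside that range (like -1) will trigger the device-side assert. The same idea applies to a tensor via targets.min() / targets.max().

```python
# Return the labels that would trigger a device-side assert on GPU.
# (Hypothetical helper for illustration; with tensors you would check
# targets.min() >= 0 and targets.max() < num_classes instead.)
def check_targets(targets, num_classes):
    return [t for t in targets if not 0 <= t < num_classes]

print(check_targets([0, 3, 9, -1], 10))  # -> [-1]
print(check_targets([0, 3, 9], 10))      # -> [] (all labels valid)
```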


Very helpful! Thanks a lot!

In my case, this error occurred because my loss function only accepts values in [0, 1], and I was passing values outside that range.

So normalizing the input to my loss function solved it:

    # shift each row so its minimum is 0, then scale so its maximum is 1
    saida_G -= saida_G.min(1, keepdim=True)[0]
    saida_G /= saida_G.max(1, keepdim=True)[0]
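For anyone who wants to see the same per-row min-max normalization spelled out, here is a plain-Python sketch (not the poster's code); note it guards against a constant row, which the tensor version above would turn into a division by zero:

```python
# Map each row into [0, 1]: subtract the row minimum, divide by the range.
def minmax_rows(rows):
    out = []
    for row in rows:
        lo, hi = min(row), max(row)
        span = (hi - lo) or 1.0   # avoid 0/0 when the row is constant
        out.append([(x - lo) / span for x in row])
    return out

print(minmax_rows([[2.0, 4.0, 6.0]]))  # -> [[0.0, 0.5, 1.0]]
```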

Read this: link

Another issue that can raise this error is a mismatch in the last layer of the network. Verify that the number of outputs of the network equals the number of labels.
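To make the mismatch concrete, here is a plain-Python sketch (illustrative only, not from the thread): if the final layer produces fewer outputs than there are classes, any label greater than or equal to the output width indexes past the logits, which fires the same device-side assert.

```python
# The classifier's output width must cover every label value,
# i.e. every label must satisfy 0 <= label < out_features.
def output_matches_labels(out_features, labels):
    return all(0 <= l < out_features for l in labels)

print(output_matches_labels(10, [0, 5, 9]))  # -> True
print(output_matches_labels(5, [0, 5, 9]))   # -> False: labels 5 and 9 are out of range
```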


Yes, the same situation caused this error for me. I just fixed that. Thanks!

For noobs like me: add

import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

at the top of your script and run it again to get a much better stack trace. In my case, the CUDA error originally pointed at a completely different line, which cost me an hour of debugging.