`torch.mm()` returns RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasSetStream(handle, stream)`

Dylan_Hanna · June 3, 2022, 12:54pm

Steps to reproduce:

import torch

# Create a test tensor on the CPU
x = torch.tensor([[1.0, 2.0]])
x = x.cpu()

# Works on the CPU, so the syntax and shapes are fine
torch.mm(x, x.T)

# Move the tensor to the CUDA device (no error here)
x = x.cuda()

# The same call on the GPU raises the error below
torch.mm(x, x.T)

RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasSetStream(handle, stream)

Some specs:

Ubuntu 20.04
torch==1.10.1
RTX 3090

Does anyone have any idea what’s going on? Most discussions of this error attribute it to shape mismatches or CUDA out-of-memory problems, but neither is the case in this example. Happy to provide more detail about my environment if necessary.

ptrblck · June 4, 2022, 6:13am

Which CUDA runtime is your PyTorch installation using? If it’s 10.2, please switch to the binaries built with the CUDA 11 runtime, as this is needed for your Ampere GPU.
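
A rough way to check this is sketched below. The helper names (`parse_cuda_version`, `runtime_ok_for_ampere`, `report`) are made up for illustration, and the "major version at least 11" rule is just the rule of thumb from this post, not an official compatibility table. `torch.version.cuda` is `None` on CPU-only builds, so the sketch treats that as "not OK".

```python
import re


def parse_cuda_version(version):
    """Return (major, minor) from a string such as '10.2', or None if unparsable."""
    if not version:
        return None  # torch.version.cuda is None for CPU-only builds
    match = re.match(r"(\d+)\.(\d+)", version)
    return (int(match.group(1)), int(match.group(2))) if match else None


def runtime_ok_for_ampere(version):
    """Hypothetical check: True if the CUDA runtime major version is at least 11."""
    parsed = parse_cuda_version(version)
    return parsed is not None and parsed[0] >= 11


def report():
    """Print what the installed PyTorch binary was built with (needs torch)."""
    import torch

    print("torch:", torch.__version__)
    print("CUDA runtime:", torch.version.cuda)
    print("OK for Ampere:", runtime_ok_for_ampere(torch.version.cuda))
    if torch.cuda.is_available():
        print("device:", torch.cuda.get_device_name(0))
        print("compute capability:", torch.cuda.get_device_capability(0))
        print("compiled arch list:", torch.cuda.get_arch_list())
```

Calling `report()` in the affected environment should show the runtime version the binary ships with, which is independent of any CUDA toolkit installed system-wide.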

Dylan_Hanna · June 4, 2022, 8:02am

It turns out that I had built torch with the 10.2 runtime.
I uninstalled it and reinstalled the nightly binaries for 11.6 using this command:

pip install torch --pre --extra-index-url https://download.pytorch.org/whl/nightly/cu116

But now the code above is giving me a different error:

RuntimeError: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero.

CUDA_VISIBLE_DEVICES is set to 0.
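
For reference, this is roughly how the variable and device visibility can be inspected. It is only a sketch: `visible_devices` and `report` are hypothetical helper names, and it assumes the variable was exported before Python started, since it has to be set before the process first initializes CUDA.

```python
import os


def visible_devices(env=None):
    """Return the IDs listed in CUDA_VISIBLE_DEVICES, or None if it is unset."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None  # unset: no restriction on visible devices
    # An empty string yields an empty list, which hides every device
    return [item.strip() for item in raw.split(",") if item.strip()]


def report():
    """Print the variable next to what PyTorch can actually see (needs torch)."""
    import torch

    print("CUDA_VISIBLE_DEVICES:", visible_devices())
    print("is_available:", torch.cuda.is_available())
    print("device_count:", torch.cuda.device_count())
```

If `visible_devices()` returns `['0']` but `report()` still shows no available device, the environment variable is probably not the cause.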

Dylan_Hanna · June 5, 2022, 8:33am

Somehow I fixed the issue by playing around with a bunch of CUDA stuff and restarting my machine. I’ll close this issue now.