NVIDIA Quadro 4000 too old

SYSTEM
OS: Ubuntu 18.04
Python version: 3.6.8
Torch version: 1.0.0
GPU: Quadro 4000
Cuda: 8.0
NVIDIA Driver: 390

I am using the transformers framework to retrain an NLP model. Ultimately, I get the following error:

RuntimeError: cuda runtime error (48) : no kernel image is available for execution on the device at /pytorch/aten/src/THC/generic/THCTensorMath.cu:238

From what I can gather, pytorch finds the GPU, but says it is too old.

 Found GPU0 Quadro 4000 which is of cuda capability 2.0.
    PyTorch no longer supports this GPU because it is too old.

However, that is only a warning. What I read from it is that the gpu is no longer supported by pytorch (even though cuda works with the gpu, and this version of pytorch is the one recommended for cuda 8.0). Is there a remedy to this problem (other than getting a new gpu)? I have read on here that it might be possible to build pytorch from source, but I have no idea how to do that.

Any guidance is much appreciated. Thanks in advance!

Hi Alex!

I don’t have much of an answer for you, but I can share my experience.

I found myself in a similar situation as discussed in this thread:

The short story is that with some help from the experts here, I was
able to find a pre-built pytorch that supported my creaky gpu. But
this came at the cost of running the old, 0.3.0 version of pytorch.

Since (if I understand you correctly) you’re looking for compute
capability 2.0, the outlook for you is probably bleaker still. But
maybe there is something out there that will work for you. (You
might take a look at the links suggested in the aforementioned
thread.)

Building pytorch myself was also suggested as a possible solution,
but I elected not to try it. I have no idea how easy or hard it would be,
but my expectation is that doing so would be a substantive project
in its own right. (Maybe it would be easier than I think, or maybe you
are more courageous than I …).

Good luck.

K. Frank

Hi KFrank,

Thanks for your message. Let me elaborate more on what I have done.

For my particular gpu, I was able to get the gpu and cuda working together (as evidenced by running the deviceQuery binary that was installed along with cuda). This gpu, which has cuda compute capability 2.0, works with cuda 8.

The most recent version of pytorch will not work with this gpu. So I went to the pytorch website and looked for previous versions of pytorch that work with cuda 8 (those versions can be found here), and that turned out to be version 1.0.0. I installed it with

pip install torch==1.0.0 torchvision==0.2.1

then tested pytorch to see if it could find and identify my gpu:

Python 3.6.8 (default, Oct  7 2019, 12:59:55)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.current_device()
/home/user/.local/lib/python3.6/site-packages/torch/cuda/__init__.py:117: UserWarning:
    Found GPU0 Quadro 4000 which is of cuda capability 2.0.
    PyTorch no longer supports this GPU because it is too old.

  warnings.warn(old_gpu_warn % (d, name, major, capability[1]))
0
>>> torch.cuda.device(0)
<torch.cuda.device object at 0x7fb833751d30>
>>> torch.cuda.device_count()
1
>>> torch.cuda.get_device_name(0)
'Quadro 4000'
>>> torch.cuda.is_available()
True

At this point, I am assuming that cuda, my gpu, and this older version of pytorch are all playing nicely. However, when I go to train my model I get that RuntimeError I posted originally (copied below):

RuntimeError: cuda runtime error (48) : no kernel image is available for execution on the device at /pytorch/aten/src/THC/generic/THCTensorMath.cu:238

Here is the full output of the error, if that would be more useful:

THCudaCheck FAIL file=/pytorch/aten/src/THC/generic/THCTensorMath.cu line=238 error=48 : no kernel image is available for execution on the device
Traceback (most recent call last):
  File "run_lm_finetuning.py", line 548, in <module>
    main()
  File "run_lm_finetuning.py", line 500, in main
    global_step, tr_loss = train(args, train_dataset, model, tokenizer)
  File "run_lm_finetuning.py", line 206, in train
    output_device=args.local_rank)
  File "/home/user/.local/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 215, in __init__
    self.broadcast_bucket_size)
  File "/home/user/.local/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 377, in _dist_broadcast_coalesced
    dist._dist_broadcast_coalesced(self.process_group, tensors, buffer_size, False)
RuntimeError: cuda runtime error (48) : no kernel image is available for execution on the device at /pytorch/aten/src/THC/generic/THCTensorMath.cu:238

EDIT:
One last bit of information that might be useful. I am calling the python script as follows:

python -m torch.distributed.launch run_lm_finetuning.py
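
If it is useful, a minimal standalone sketch like the following (the file name is just illustrative, and I have not run it) should hit the same error without going through torch.distributed.launch, if the installed build simply has no kernels for this gpu:

# check_gpu_kernels.py -- illustrative name, not part of the transformers repo
import torch

# The device-query calls in my interactive session above do not launch any
# cuda kernels. A plain tensor operation does, so it should raise the same
# "no kernel image is available for execution on the device" error if the
# build was not compiled for this gpu's compute capability.
x = torch.ones(1000, device="cuda")
print((x * 2).sum().item())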

Hello Alex!

I don’t remember in detail how things unfolded when I got an old
build of pytorch to work with my old gpu. However, my understanding
of this (which could well be wrong) is that it is perfectly possible for
you to have your cuda version / cuda driver work properly with
your gpu, and have your version of pytorch work properly with your
cuda driver, but have pytorch not work with the specific compute
capability of your gpu.

I believe that a given cuda version / cuda driver typically works with
a range of compute capabilities.

In your case, your version of pytorch explicitly warns you that it needs
a higher compute capability.

So my interpretation is that yes, in a sense, pytorch and cuda and your
gpu are working together, and pytorch correctly recognizes your gpu.
But your version of pytorch knows that it uses some post-capability-2.0
gpu computation features, and warns you of this. (You could argue
that pytorch could simply say “no gpu” at this point, but it chooses to
admit that there is a recognized gpu while warning you that it’s not good enough.)

Then when you actually try to use the gpu (or at least some specific
“kernel” that uses an unsupported feature), you get the runtime error.

My recollection and understanding is that somewhere in the pytorch
code is the required cuda compute capability, but that this has not
been systematically recorded for the various pytorch builds. So you can
either try installing a build and see what it thinks of your gpu, or if
you can somehow match up a build to the version of the source from
which it was built, you could look in the code to check the required
compute capability.
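
As a rough sketch of the first option: the calls below only
query the driver and the device, so (as far as I understand it)
they should run even on a build that cannot actually launch
kernels on your gpu.

import torch

print("pytorch version:", torch.__version__)
print("built against cuda:", torch.version.cuda)
print("device name:", torch.cuda.get_device_name(0))
print("compute capability:", torch.cuda.get_device_capability(0))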

What I did when I wanted to use pytorch with my old gpu was to keep
trying older and older archived pytorch builds until I found one that
worked.

But the bottom line is that torch.cuda.is_available() == True is not
equivalent to “this pytorch works with this gpu.”
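
(A sketch of what I mean: the first print below can show True
on your setup, while the second line is an actual kernel launch
and is where the “no kernel image” error would show up.)

import torch

print(torch.cuda.is_available())            # can be True even for an unsupported gpu
print(torch.randn(4, device="cuda").sum())  # a real kernel launch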

Have fun.

K. Frank

This is exactly what I was wondering about. I agree with your recommendation and will keep trying older versions of pytorch to see if I find one that will work with my gpu. Thank you for your detailed response!

Does anyone know where I can download older versions of pytorch for linux?