How to check torch gpu compatibility without initializing CUDA?

Older GPUs don’t seem to be supported by torch, in spite of recent CUDA versions being installed.

In my case the crash has the following error:

/home/maxs/dev/mdb/venv38/lib/python3.8/site-packages/torch/cuda/__init__.py:83: UserWarning: 
    Found GPU%d %s which is of cuda capability %d.%d.
    PyTorch no longer supports this GPU because it is too old.
    The minimum cuda capability supported by this library is %d.%d.
    
  warnings.warn(old_gpu_warn.format(d, name, major, minor, min_arch // 10, min_arch % 10))
WARNING:lightwood-16979:Exception: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1. when training model: <lightwood.model.neural.Neural object at 0x7f9c34df1e80>
Process LearnProcess-1:13:
Traceback (most recent call last):
  File "/home/maxs/dev/mdb/venv38/sources/lightwood/lightwood/model/helpers/default_net.py", line 59, in forward
    output = self.net(input)
  File "/home/maxs/dev/mdb/venv38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/maxs/dev/mdb/venv38/lib/python3.8/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/home/maxs/dev/mdb/venv38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/maxs/dev/mdb/venv38/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 96, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/maxs/dev/mdb/venv38/lib/python3.8/site-packages/torch/nn/functional.py", line 1847, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

This happens in spite of:

assert torch.cuda.is_available()
assert torch.version.cuda == '10.2'

How can I check for an older GPU that doesn’t support torch without actually try/catching a tensor-to-GPU transfer? The transfer initializes CUDA, which wastes around 2GB of memory, something I can’t afford since I’d be running this check in dozens of processes, each of which would then waste an extra 2GB of memory due to the initialization.
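
For reference, this is the kind of probe I’m trying to avoid, since merely running it initializes CUDA in the calling process:

import torch

try:
    # Forces an actual kernel launch; fails with "no kernel image is
    # available" on GPUs this torch build doesn't ship kernels for.
    torch.ones(1, device="cuda").sum().item()
    use_cuda = True
except RuntimeError:
    use_cuda = False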

Since the answer here might not be very torch-related and might instead involve e.g. nvidia-specific tools, I also asked here: python - How to check torch gpu compatibility without initializing CUDA? - Stack Overflow. Putting in the link in case anyone stumbles upon this question later and finds the SO answers useful.

I think the latest CUDA version available is 11.3; use the command provided in the PyTorch installation guide at https://pytorch.org
It automatically installs a CUDA-compatible PyTorch build.
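
At the time of writing, the selector on that page generates something like this for a pip install with CUDA 11.3 (an example, not the canonical command; check the site for your OS and package manager):

pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113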
Pardon if I misunderstood your request! :upside_down_face:

It seems like you are correct in assuming:

Since the answer here might not be very torch-related and instead implicate usage of e.g. nvidia specific tools

According to this comment, there is a relationship between the CUDA version and the compute capabilities that have been deprecated and that you won’t be able to use. You can check the CUDA 10.2 Toolkit docs, which list which compute capabilities are deprecated.

I am referring more to doing this check on someone else’s machine that already has torch installed.

I.e., for a package that a user installs, where I have to check: “given this environment, should I run this with cuda or on the cpu?”

It might not be the best solution, but you could build a look-up table relating CUDA versions to deprecated GPUs.
Then, using torch.cuda.get_device_name(torch.cuda.current_device()), you could check whether the code should be executed on the GPU or the CPU, as in the sketch below.
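
A minimal sketch of that idea, assuming a hand-maintained table (the GPU names below are placeholders, not a real deprecation list):

import torch

# Hypothetical mapping: CUDA version -> GPU names that version has dropped.
DEPRECATED_GPUS = {
    "10.2": {"GeForce GT 710", "GeForce GT 730"},  # placeholder entries
}

def pick_device():
    if torch.cuda.is_available():
        # Caveat: get_device_name() may itself initialize CUDA in some
        # torch versions, so this addresses compatibility, not the
        # memory-overhead concern from the original question.
        name = torch.cuda.get_device_name(torch.cuda.current_device())
        if name not in DEPRECATED_GPUS.get(torch.version.cuda, set()):
            return "cuda"
    return "cpu"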

Hi George!

As far as I know, the only airtight way to check cuda / gpu compatibility
is torch.cuda.is_available() (and to be completely sure, actually
perform a tensor operation on the gpu). That’s what I do on my own
machines (but once I check that a given version of pytorch works with
my gpu, I don’t have to keep doing it).

You want to check somebody else’s pytorch with somebody else’s gpu,
so I would say it’s doubly important to actually run the gpu.

I would run a separate python process that runs a simple gpu test
script before running your “real” program. (It could store its result in
a text file or environment variable or simply inform the user.) When
that process exits, it will release any cuda overhead.

If you want to get fancy, you could have your “real” program spawn
your test script in a separate process, and proceed with or without
the gpu depending on the test script’s results. Again, when the
spawned process exits, its cuda overhead will be released.
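
Something like this minimal sketch (the probe program is just an
illustration; any operation that launches a real kernel would do):

import subprocess
import sys

# Probe program: allocate a tensor on the gpu and force a kernel launch.
# .item() synchronizes, so "no kernel image" surfaces as a nonzero exit code.
_PROBE = "import torch; x = torch.ones(1, device='cuda'); print((x + x).item())"

def gpu_probe_ok():
    # The CUDA context lives and dies with the child process, so its
    # initialization overhead is released as soon as the probe exits.
    result = subprocess.run([sys.executable, "-c", _PROBE], capture_output=True)
    return result.returncode == 0

device = "cuda" if gpu_probe_ok() else "cpu"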

Best.

K. Frank

Based on the code in torch.cuda.__init__ that was actually throwing the error, the following check seems to work:

import torch
from torch.cuda import device_count, get_device_capability


def is_cuda_compatible():
    compatible_device_count = 0
    # Only check the devices if this torch build was compiled with CUDA.
    if torch.version.cuda is not None:
        for d in range(device_count()):
            capability = get_device_capability(d)
            major = capability[0]
            minor = capability[1]
            # Compute capability as an integer, e.g. (6, 1) -> 61.
            current_arch = major * 10 + minor
            # Oldest architecture this binary ships kernels for, e.g. sm_35 -> 35.
            min_arch = min((int(arch.split("_")[1]) for arch in torch.cuda.get_arch_list()), default=35)
            if (not current_arch < min_arch
                    and not torch._C._cuda_getCompiledVersion() <= 9000):
                compatible_device_count += 1

    return compatible_device_count > 0
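
For example, a caller could then pick the device once, up front:

device = "cuda" if is_cuda_compatible() else "cpu"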

Not sure if it’s 100% correct but putting it out here for feedback and in case anybody else needs it. Will be PRing into torch itself later since it seems like the kind of functionality it ought to have.
