Hi,
I am using PyTorch 1.2.0, self-compiled with CUDA compute capability 5.2, from C++, and everything works as expected.
I read somewhere that everything down to compute capability 3.5 is supported. Since we aim to support as many graphics cards as possible, I tried to compile PyTorch with compute capability 3.5. However, I get an error when using this build:
THCudaCheck FAIL file=D:/tools/pytorch-v1.2.0/aten/src\THC/generic/THCTensorMath.cu line=16 error=209 : no kernel image is available for execution on the device
exception message: cuda runtime error (209) : no kernel image is available for execution on the device at D:/tools/pytorch-v1.2.0/aten/src\THC/generic/THCTensorMath.cu:16
The above operation failed in interpreter, with the following stack trace:
at code/model-input_rgbip-output_14classes_best_train_2019_08_03_cpu-eval-mode-export_latest_pytorch.py:292:12
_135 = getattr(_131, "1")
_136 = _135.weight
_137 = _135.bias
_138 = getattr(self.decoder0, "0")
_139 = _138.weight
_140 = _138.bias
_141 = getattr(self.logit, "0")
_142 = _141.weight
_143 = _141.bias
input0 = torch._convolution(input, 1, None, [2, 2], [3, 3], [1, 1], False, [0, 0], 1, True, False, True)
~~~~~~~~~~~~~~~~~~ <--- HERE
input1 = torch.batch_norm(input0, weight, bias, running_mean, running_var, False, 0.10000000000000001, 1.0000000000000001e-05, True)
input2 = torch.relu(input1)
input3 = torch.max_pool2d(input2, [3, 3], [2, 2], [1, 1], [1, 1], False)
input4 = torch._convolution(input3, 5, None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1, True, False, True)
input5 = torch.batch_norm(input4, weight0, bias0, running_mean0, running_var0, False, 0.10000000000000001, 1.0000000000000001e-05, True)
input6 = torch.relu(input5)
input7 = torch._convolution(input6, 7, None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1, True, False, True)
out = torch.batch_norm(input7, weight1, bias1, running_mean1, running_var1, False, 0.10000000000000001, 1.0000000000000001e-05, True)
input8 = torch.add(out, input3, alpha=1)
Compiled from code /opt/conda/lib/python3.6/site-packages/torch/nn/modules/conv.py(340): forward
/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py(523): _slow_forward
/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py(537): __call__
/opt/conda/lib/python3.6/site-packages/torch/nn/modules/container.py(92): forward
/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py(523): _slow_forward
/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py(537): __call__
…/dl/models/unet.py(153): forward
/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py(523): _slow_forward
/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py(537): __call__
/opt/conda/lib/python3.6/site-packages/torch/jit/__init__.py(883): trace_module
/opt/conda/lib/python3.6/site-packages/torch/jit/__init__.py(751): trace
(5):
/opt/conda/lib/python3.6/site-packages/IPython/core/interactiveshell.py(3296): run_code
/opt/conda/lib/python3.6/site-packages/IPython/core/interactiveshell.py(3214): run_ast_nodes
/opt/conda/lib/python3.6/site-packages/IPython/core/interactiveshell.py(3049): run_cell_async
/opt/conda/lib/python3.6/site-packages/IPython/core/async_helpers.py(67): _pseudo_sync_runner
/opt/conda/lib/python3.6/site-packages/IPython/core/interactiveshell.py(2874): _run_cell
/opt/conda/lib/python3.6/site-packages/IPython/core/interactiveshell.py(2848): run_cell
/opt/conda/lib/python3.6/site-packages/ipykernel/zmqshell.py(536): run_cell
/opt/conda/lib/python3.6/site-packages/ipykernel/ipkernel.py(294): do_execute
/opt/conda/lib/python3.6/site-packages/tornado/gen.py(209): wrapper
/opt/conda/lib/python3.6/site-packages/ipykernel/kernelbase.py(534): execute_request
/opt/conda/lib/python3.6/site-packages/tornado/gen.py(209): wrapper
/opt/conda/lib/python3.6/site-packages/ipykernel/kernelbase.py(267): dispatch_shell
/opt/conda/lib/python3.6/site-packages/tornado/gen.py(209): wrapper
/opt/conda/lib/python3.6/site-packages/ipykernel/kernelbase.py(357): process_one
/opt/conda/lib/python3.6/site-packages/tornado/gen.py(742): run
/opt/conda/lib/python3.6/site-packages/tornado/gen.py(781): inner
/opt/conda/lib/python3.6/site-packages/tornado/ioloop.py(743): _run_callback
/opt/conda/lib/python3.6/site-packages/tornado/ioloop.py(690):
/opt/conda/lib/python3.6/asyncio/events.py(145): _run
/opt/conda/lib/python3.6/asyncio/base_events.py(1451): _run_once
/opt/conda/lib/python3.6/asyncio/base_events.py(438): run_forever
/opt/conda/lib/python3.6/site-packages/tornado/platform/asyncio.py(148): start
/opt/conda/lib/python3.6/site-packages/ipykernel/kernelapp.py(505): start
/opt/conda/lib/python3.6/site-packages/traitlets/config/application.py(658): launch_instance
/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py(16):
/opt/conda/lib/python3.6/runpy.py(85): _run_code
/opt/conda/lib/python3.6/runpy.py(193): _run_module_as_main
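As I understand it, "no kernel image is available" means the device's compute capability (which `torch.cuda.get_device_capability()` reports) was not among the architectures the binary was compiled for. A rough illustration of that check, with a hypothetical helper and made-up arch lists (not PyTorch's actual internal logic):

```python
# Illustrative sketch only: PyTorch performs this check internally;
# the helper and the arch-list values below are assumptions.

def capability_covered(device_capability, arch_list):
    """Return True if a device's (major, minor) compute capability is
    among the SASS targets the binary was compiled for. (Embedded PTX
    for a lower arch could also be JIT-compiled, ignored here.)"""
    return any(
        device_capability == tuple(int(x) for x in arch.split("."))
        for arch in arch_list
    )

# A build targeting only 5.2 cannot serve a 3.5 device:
print(capability_covered((3, 5), ["5.2"]))  # False -> "no kernel image"
print(capability_covered((5, 2), ["5.2"]))  # True  -> works as expected
```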
So I assume compute capability 3.5 is no longer fully supported?
Down to which compute capability should PyTorch 1.2.0 work correctly?
My system:
Windows 10
Visual Studio 2019 - CUDA 10.1
Python 3.7
Self-Compiled PyTorch 1.2.0
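For reference, the build was configured roughly like this (a sketch with placeholder values, not the literal commands I ran; `TORCH_CUDA_ARCH_LIST` is the variable PyTorch's build scripts read to select target architectures):

```shell
rem Sketch of the build configuration (values are placeholders).
rem TORCH_CUDA_ARCH_LIST selects the compute capabilities to compile
rem kernels for; listing only 3.5 reproduces the failing build, while
rem e.g. "3.5;5.2" would compile kernels for both targets.
set TORCH_CUDA_ARCH_LIST=3.5
set USE_CUDA=1
python setup.py install
```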
Thanks!
Best,
Thomas