A PTX JIT compilation failed

PTX JIT compilation only kicks in when your hardware's CUDA compute arch isn't among those compiled into the binary you are running; the JIT then bridges that gap by compiling the embedded PTX for your arch.
So the questions would be:

  • What is the compute arch of your hardware? (If you don’t know, Wikipedia has a mapping from product name to arch.)
  • What are the compute archs included in your PyTorch binary?
  • Is something up with the CUDA installation that makes it fail?

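The logic behind the first two questions can be sketched in plain Python. This is a simplified model of what the driver decides, not a PyTorch API; `needs_ptx_jit` is a hypothetical helper:

```python
# Simplified sketch: the driver can run precompiled SASS only if the binary
# contains code for the device's exact compute arch; otherwise it falls back
# to JIT-compiling the embedded PTX. needs_ptx_jit is a hypothetical helper.

def needs_ptx_jit(device_capability, binary_archs):
    """device_capability: (major, minor) tuple, e.g. (6, 1) for a GTX 1080 Ti.
    binary_archs: arch numbers compiled into the binary, e.g. [30, 35, ...]."""
    major, minor = device_capability
    return major * 10 + minor not in binary_archs

# A GTX 1080 Ti (6.1) with the official 1.2 wheel: matching SASS is present.
print(needs_ptx_jit((6, 1), [30, 35, 50, 60, 61, 70, 75]))  # False
# The same card with a binary compiled only for arch 7.0: PTX JIT is needed.
print(needs_ptx_jit((6, 1), [70]))  # True
```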
Typically, the official PyTorch binaries ship with code compiled for all supported archs, while by default a self-compiled PyTorch only includes the arch of the hardware you compile on.
You can run cuobjdump /usr/local/lib/python3.7/dist-packages/torch/lib/libtorch.so | grep 'arch' | sort | uniq to check which archs your PyTorch has.
For example, on my GTX 1080 Ti, a self-compiled PyTorch has only arch = 6.1, while the 1.2 wheel from the PyTorch site has 30, 35, 50, 60, 61, 70, 75.
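If you prefer to post-process the cuobjdump output in Python rather than with the shell pipeline, a minimal sketch (the sample string below only illustrates the "arch = sm_XX" lines cuobjdump prints; the exact surrounding output varies by version):

```python
# Hypothetical helper mirroring the grep 'arch' | sort | uniq pipeline:
# collect the distinct "arch = sm_XX" values from cuobjdump's output.

def unique_archs(cuobjdump_output):
    archs = set()
    for line in cuobjdump_output.splitlines():
        line = line.strip()
        if line.startswith("arch ="):
            archs.add(line.split("=", 1)[1].strip())
    return sorted(archs)

# Illustrative fragment of cuobjdump output (assumed format):
sample = """
arch = sm_61
code version = [1,7]
arch = sm_61
arch = sm_70
"""
print(unique_archs(sample))  # ['sm_61', 'sm_70']
```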

The easiest remedy is likely to deploy a PyTorch that includes the right arch binaries.

Best regards

Thomas
