With PyTorch 2.0 and CUDA 11.6, my code segfaults when the model is wrapped in torch.compile(), even with just some sample code. When I remove torch.compile(), the code executes just fine. Any insight into what might be going on would be greatly appreciated.
Here is my CUDA environment:
NVIDIA-SMI 470.161.03 Driver Version: 470.161.03 CUDA Version: 11.6
import torch
import torchvision.models as models

if __name__ == "__main__":
    model = models.resnet18().cuda()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    # compiled_model = model  # Works fine when not actually compiled
    compiled_model = torch.compile(model)

    x = torch.randn(16, 3, 224, 224).cuda()
    optimizer.zero_grad()
    out = compiled_model(x)
    out.sum().backward()
    optimizer.step()
@ptrblck As far as I can see from the cu116 index, the latest version of torch for cu116 is the one from 2023-02-02 which is what I already have installed: https://download.pytorch.org/whl/nightly/torch/. (And that’s the one that’s installed when I do pip3 install numpy --pre torch torchvision torchaudio --force-reinstall --index-url https://download.pytorch.org/whl/nightly/cu116)
Is there a newer one that you know of?
The strange part is that when I tried this about a month ago, this code worked (no segfault) with the same CUDA and NVIDIA driver setup. But I’m not sure which versions of the different Torch libraries I had installed then…
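To make this kind of before/after comparison possible in the future, here is a minimal sketch for recording the installed package versions using only the standard library. The package list is an assumption about what a nightly install pulls in (in particular, `pytorch-triton` may be named differently or be absent in your environment), and the helper names are hypothetical.

```python
from importlib import metadata

# Assumed package names; adjust to whatever your environment actually installs.
PACKAGES = ("torch", "torchvision", "torchaudio", "pytorch-triton", "numpy")


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def format_report(versions):
    """Render the mapping as one line per package, pip-freeze style."""
    lines = []
    for name, version in versions.items():
        if version is None:
            lines.append(f"{name}: not installed")
        else:
            lines.append(f"{name}=={version}")
    return "\n".join(lines)
```

Calling `print(format_report(installed_versions()))` and saving the output next to a working run gives something concrete to diff against when a later nightly starts failing.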
You are right and the binaries using CUDA 11.6 were deprecated in favor of 11.7 and 11.8.
Could you install one of these newer nightly releases and rerun your code, please?
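When rerunning, it might also help to narrow down which compile stage crashes. A segfault cannot be caught with try/except, so the sketch below runs the repro from the first post in a fresh interpreter once per backend and reports how each process ended. This is only a rough sketch: `run_isolated`, `describe`, and `bisect_backends` are hypothetical helper names, and it assumes the `eager` and `aot_eager` debug backends are accepted by `torch.compile(backend=...)` in your build.

```python
import subprocess
import sys

# Repro adapted from the first post; the backend name arrives as sys.argv[1].
REPRO = """
import sys
import torch
import torchvision.models as models

model = models.resnet18().cuda()
compiled_model = torch.compile(model, backend=sys.argv[1])
out = compiled_model(torch.randn(16, 3, 224, 224).cuda())
out.sum().backward()
"""


def run_isolated(code, *args):
    """Run `code` in a separate Python process and return its exit status."""
    proc = subprocess.run(
        [sys.executable, "-c", code, *args],
        capture_output=True,
        text=True,
    )
    return proc.returncode


def describe(returncode):
    """Turn an exit status into a short human-readable description."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative status means the child was killed by that signal.
        return f"killed by signal {-returncode}"
    return f"exited with status {returncode}"


def bisect_backends(backends=("eager", "aot_eager", "inductor")):
    """Try each backend in its own process so a crash in one does not stop the rest."""
    for backend in backends:
        print(f"{backend}: {describe(run_isolated(REPRO, backend))}")
```

Calling `bisect_backends()` prints one line per backend. On Linux a segfault would typically show up as "killed by signal 11". If only `inductor` fails, that would point at code generation rather than graph capture, though this is a heuristic and not a definitive diagnosis.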
Yes, the binaries with CUDA 11.6 will be deprecated soon and the PyTorch 2.0 binary release will support CUDA 11.7 and 11.8 as described here.
Note that you will still be able to build PyTorch from source with other CUDA versions and the “deprecation” only means the pip wheels and conda binaries will ship with CUDA 11.7 and 11.8 in the next release.
The ptxas issue seems to be a regression and is tracked here and here.