When I run a model wrapped in torch.compile() with PyTorch 2.0 and CUDA 11.6, my code segfaults, even on the minimal sample code below. When I remove torch.compile(), the code executes just fine. Any insight into what might be going on would be greatly appreciated.
Here is my CUDA environment:
NVIDIA-SMI 470.161.03 Driver Version: 470.161.03 CUDA Version: 11.6
The exact python packages:
pytorch-triton 2.0.0+0d7e753227
torch 2.0.0.dev20230202+cu116
torchaudio 2.0.0.dev20230201+cu116
torchvision 0.15.0.dev20230201+cu116
And the code that I’m trying to execute:
import torch
import torchvision.models as models
if __name__ == "__main__":
    model = models.resnet18().cuda()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    # compiled_model = model  # Works fine when not actually compiled
    compiled_model = torch.compile(model)
    x = torch.randn(16, 3, 224, 224).cuda()
    optimizer.zero_grad()
    out = compiled_model(x)
    out.sum().backward()
    optimizer.step()
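
One possible way to narrow down where the segfault comes from is to run the same repro in a child process once per compile backend, with Python's faulthandler enabled so a crash leaves a traceback on stderr. This is only a sketch under assumptions: the helper names (`run_isolated`, `probe_backends`, `CHILD_TEMPLATE`) are made up for illustration, and it assumes the "eager", "aot_eager", and "inductor" backend names are accepted by this nightly build. If "eager" passes but "aot_eager" or "inductor" crashes, that would point at the corresponding stage of the compile stack rather than at the model code.

```python
import subprocess
import sys

# Hypothetical child script: the repro above, parameterised by backend name.
CHILD_TEMPLATE = """
import torch
import torchvision.models as models

model = models.resnet18().cuda()
compiled_model = torch.compile(model, backend={backend!r})
x = torch.randn(16, 3, 224, 224).cuda()
compiled_model(x).sum().backward()
"""


def run_isolated(source, timeout=600):
    """Run `source` in a fresh interpreter so a segfault cannot kill the caller.

    Returns (returncode, stderr). On POSIX a negative return code means the
    child was terminated by a signal (e.g. -11 for SIGSEGV).
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


def probe_backends(backends=("eager", "aot_eager", "inductor")):
    """Try each backend in its own process and report the exit status."""
    results = {}
    for backend in backends:
        code, stderr = run_isolated(CHILD_TEMPLATE.format(backend=backend))
        results[backend] = code
        status = "ok" if code == 0 else f"failed (return code {code})"
        print(f"{backend}: {status}")
        if code != 0:
            # Show the tail of stderr, where faulthandler writes its traceback.
            print(stderr[-2000:])
    return results
```

Calling `probe_backends()` on the affected machine prints one status line per backend; the faulthandler traceback from the first failing backend is usually the most useful thing to attach to a bug report.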