Hi,
I am getting the following warning when using the max-autotune mode with torch.compile.
"""
torch._inductor.utils: [WARNING] not enough SMs to use max_autotune_gemm mode
skipping cudagraphs due to ['non-cuda device in graph']
"""
Does this mean that I am automatically falling back to the max-autotune mode without CUDA graphs? If so, can someone explain how much performance I could be losing?
Also, I am using an NVIDIA A100 GPU, so I don't think this warning should appear at all.
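
In case it helps narrow this down, here is a minimal sketch of a check for the two things the messages mention: how many SMs PyTorch sees on the current device, and whether any tensor in the model or inputs is not on the GPU. The helper names (`find_non_cuda`, `describe_setup`) are made up for illustration, and the sketch assumes `example_inputs` is a plain list of tensors; it is not how Inductor itself decides anything.

```python
def find_non_cuda(named_tensors):
    """Return the names of tensors whose device type is not 'cuda'."""
    return [name for name, t in named_tensors if t.device.type != "cuda"]


def describe_setup(model, example_inputs):
    """Print the SM count of the current GPU and list tensors left off the GPU.

    Hypothetical helper: `model` is an nn.Module, `example_inputs` a list of tensors.
    """
    import torch  # imported here so find_non_cuda stays usable on its own

    if torch.cuda.is_available():
        props = torch.cuda.get_device_properties(torch.cuda.current_device())
        print(f"{props.name}: {props.multi_processor_count} SMs")
    else:
        print("CUDA is not available to this process")

    # Collect parameters, buffers, and inputs under readable names.
    named = list(model.named_parameters()) + list(model.named_buffers())
    named += [(f"input[{i}]", x) for i, x in enumerate(example_inputs)]
    print("non-CUDA tensors:", find_non_cuda(named))
```

Calling `describe_setup(model, [x])` right before `torch.compile(model, mode="max-autotune")` should show whether the process is actually seeing the A100 (which has 108 SMs) and whether something was left on the CPU, which is what the 'non-cuda device in graph' message seems to point at.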