NvFuser with torch.compile

Hi,
What is the status of Dynamo support for NvFuser? I could not find many references/tutorials on the topic and the ones I found seem deprecated and/or use TorchScript as the graph format.
Is it possible to use NvFuser with torch.compile?

Thanks!

nvFuser is deprecated and the default backend for TorchScript (which is also in maintenance mode) was switched back to NNC in this PR.

Hi @ptrblck, thanks for the clarification. I see nvFuser listed as a prospective codegen backend for PyTorch 2.0 here (PyTorch 2.0 | PyTorch), but I believe that support is not available yet.
Can you elaborate on what either nvFuser or NNC can offer in this context given that we already have Triton generating code for GPUs?

This is correct and the nvFuser support in TorchDynamo was removed here.

As mentioned before, nvFuser is also deprecated in TorchScript and NNC is used as the default backend. Note that TorchScript is in “maintenance mode” and I believe it won’t get any major updates or fixes anymore.

nvFuser development continues out-of-core in a standalone repository and we are exploring different directions.
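If you want to check what your own install exposes, here is a minimal sketch that lists the backends registered with TorchDynamo via `torch._dynamo.list_backends()`. The helper name `available_dynamo_backends` is just something made up for this example, and the result depends entirely on your PyTorch version and installed extras, so treat it as a quick sanity check rather than an authoritative answer.

```python
import importlib.util


def available_dynamo_backends():
    """Return the sorted names of registered TorchDynamo backends.

    Hypothetical helper for illustration; returns [] when torch (or the
    torch._dynamo module) is not available in the current environment.
    """
    if importlib.util.find_spec("torch") is None:
        return []
    try:
        import torch._dynamo as dynamo
        # By default, list_backends() leaves out debug/experimental backends.
        return sorted(dynamo.list_backends())
    except (ImportError, AttributeError):
        # Older PyTorch builds that predate torch._dynamo end up here.
        return []


backends = available_dynamo_backends()
print("registered backends:", backends)
# Look for any backend whose name mentions nvfuser.
print("nvfuser registered:", any("nvfuser" in name.lower() for name in backends))
```

Any name that shows up in that list can be passed as `torch.compile(model, backend="<name>")`; if nothing containing `nvfuser` is listed, that build has no nvFuser backend to select.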


Is there any TorchDynamo backend that generates CUDA code (.cu) and compiles it with nvcc (similar to how Inductor uses g++ to compile fused C++)? AFAIK, Triton generates PTX code directly.


AFAIK, the backend you’re describing doesn’t exist. Can you explain the motivation for having this capability, given we can already generate PTX?

Was just curious if this ever was a part of the project. I guess one advantage would have been to use the nvcc compiler optimizations.

Hi @ptrblck, I think this is not accurate: the PR has been reverted. Also, I frequently see a warning in Megatron-LM that nvFuser is no longer supported in TorchScript. It does not tell me whether NNC is used when compilation optimizations are applied through Dynamo.

The PR was initially reverted but then relanded in c913f385, as shown further down in the same PR.