What is the difference between FusionGroup and CudaFusionGroup?

Different APIs result in different fusion group nodes, such as FusionGroup and CudaFusionGroup. Why is that, and how do they differ?

These are actually different fuser generations.
The “classic” 1st-gen fuser only handled pointwise ops and created FusionGroup nodes. The newer fuser developed by a team at NVIDIA creates CudaFusionGroup nodes. To round off the trio, there are TensorExprGroup nodes, created by the TensorExpr/NNC fuser developed by a team at FB. The latter two also support some reductions.
A while ago, I wrote a blog post on the various fusers.
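
To make the mapping above concrete, here is a minimal sketch that scans the text of a TorchScript graph dump and reports which fuser's group nodes show up in it. The helper name `fusers_in_graph` and the sample graph text are made up for illustration (the sample is a simplified, hand-written dump, not real output); in practice you would probably feed it something like `str(scripted_fn.graph_for(*inputs))`, and the exact dump format may differ between PyTorch versions.

```python
import re

# Node kind -> fuser that creates it, as described in the post above.
FUSER_BY_NODE_KIND = {
    "prim::FusionGroup": "classic (1st-gen) fuser",
    "prim::CudaFusionGroup": "NVIDIA fuser",
    "prim::TensorExprGroup": "TensorExpr/NNC fuser",
}

# Group nodes are assumed to appear with a numeric suffix in dumps
# (e.g. prim::TensorExprGroup_0), so we match on the prefix only.
_KIND_RE = re.compile(r"prim::(?:FusionGroup|CudaFusionGroup|TensorExprGroup)")


def fusers_in_graph(graph_text):
    """Return the sorted names of the fusers whose group nodes occur in graph_text."""
    kinds = set(_KIND_RE.findall(graph_text))
    return sorted(FUSER_BY_NODE_KIND[kind] for kind in kinds)


# Hypothetical, simplified graph dump used only for illustration.
sample = """graph(%x : Tensor, %y : Tensor):
  %2 : Tensor = prim::TensorExprGroup_0(%x, %y)
  return (%2)
with prim::TensorExprGroup_0 = graph(%a : Tensor, %b : Tensor):
  %c : Tensor = aten::mul(%a, %b)
  return (%c)"""

print(fusers_in_graph(sample))  # ['TensorExpr/NNC fuser']
```

If the list comes back empty, no fusion group was formed at all, which usually just means the fuser did not kick in for that graph.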

Best regards

Thomas

Thanks for your reply. I read your blog first. It seems that the mechanism is not easy to figure out.

The JIT optimization steps probably are among the most sophisticated bits in PyTorch (along with the dispatcher,…). For a deep dive on one of the fusers, I can also enthusiastically recommend Christian Sarofeen’s talk (I think you need to register to see it).

Best regards

Thomas

Excellent! I will look into it and will email you if I run into any questions. Many thanks.