I am looking into PyTorch JIT fusion now. I tried to set a breakpoint in fuser_kernel.cpp at FusedKernelCPU::FusedKernelCPU(…) while running a jit-trace script. Can anyone tell me when and how this function gets invoked? Any reply will be appreciated. Thanks.
Last I looked, CPU fusion was disabled by default, so you need to enable it with
myfn.graph_for(inp) will show whether fusion actually happened for your inputs.
Note that optimized graphs are cached, so if you previously ran the function with CPU fusion disabled, you need to re-define the traced/scripted function after enabling CPU fusion in order to clear the cache.
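To make the steps above concrete, here is a minimal sketch. The enable call is not quoted in the post, so the toggle used here, torch._C._jit_override_can_fuse_on_cpu, is my assumption: it is a private API that may change or be missing between versions, which is why the code guards the lookup. The function body and tensor shapes are made up for illustration.

```python
import torch


def myfn(x, y):
    # A short chain of pointwise ops, the kind of pattern a fuser can merge.
    return torch.relu(x * y + y)


# Assumption: private toggle for CPU fusion; it is not quoted in the post
# and may not exist in every build, so look it up defensively.
enable_cpu_fusion = getattr(torch._C, "_jit_override_can_fuse_on_cpu", None)
if enable_cpu_fusion is not None:
    enable_cpu_fusion(True)

# Script the function only AFTER changing the setting, so no optimized
# graph cached under the old setting is reused.
scripted = torch.jit.script(myfn)

x = torch.randn(8, 8)
y = torch.randn(8, 8)

# Run a few times: the executor may profile before it optimizes.
for _ in range(3):
    out = scripted(x, y)

# Inspect the optimized graph for these inputs.
graph_text = str(scripted.graph_for(x, y))
has_fusion_group = (
    "FusionGroup" in graph_text or "TensorExprGroup" in graph_text
)
print(graph_text)
print("fusion group present:", has_fusion_group)
```

If no fusion group shows up in the printed graph, no fused kernel was built for these inputs, so a breakpoint in the fused kernel constructor would presumably not be hit. Which fuser backend runs (and so which node name appears) depends on the PyTorch version and build.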
Note that CPU fusion is disabled by default because of performance and flakiness issues. Likewise, AVX & co. are disabled in the CPU fuser because they sometimes cause problems (see the commentary in