In the PyTorch Inductor source code there is a variable named extern_kernels, which stores kernels to be called; there is also torch.ops, which is monkey-patched to include the ops that get called. Here "op" and "kernel" mean roughly the same thing; strictly speaking, the kernel in this context is actually an op.
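(For context, my reading of the source, which may be simplified or out of date: extern_kernels is just a plain namespace object that Inductor fills with existing ATen-backed callables, around torch/_inductor/select_algorithm.py. A minimal sketch of that pattern, with the details simplified rather than the exact source:)

import torch

class KernelNamespace:
    # empty holder; attributes are attached dynamically at registration time
    pass

# the generated wrapper code imports this object and calls through it
extern_kernels = KernelNamespace()

# registration just binds an existing ATen-backed callable under a name,
# so extern_kernels.addmm is torch.addmm underneath
extern_kernels.addmm = torch.addmm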
Some nodes are converted to ExternKernelSchedulerNode, which generates code like:
extern_kernels.addmm(primals_2, primals_3, reinterpret_tensor(primals_1, (2, 3), (1, 2), 0), alpha=1, beta=1, out=buf0)
For other ops, a SchedulerNode is created instead, and the generated code looks like:
torch.ops.aten.bernoulli_.float(buf2, 0.8)
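Note that this generated line is an ordinary dispatcher call and runs standalone; a quick repro (buf2 here is just a scratch tensor I allocate myself):

import torch

buf2 = torch.empty(8, 8)
# the same call Inductor emits: fills buf2 in place with Bernoulli(p=0.8) samples
torch.ops.aten.bernoulli_.float(buf2, 0.8)
print(buf2.mean())  # roughly 0.8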
My question is: why not handle both cases the same way? E.g., treat addmm as an op too and generate code like:

torch.ops.aten.addmm()
We know that the implementation of extern_kernels.addmm is actually aten.addmm.
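This is easy to sanity-check: both entry points produce the same result (the shapes below are arbitrary):

import torch

bias = torch.randn(2, 3)
mat1 = torch.randn(2, 4)
mat2 = torch.randn(4, 3)

out1 = torch.addmm(bias, mat1, mat2)           # what extern_kernels.addmm wraps
out2 = torch.ops.aten.addmm(bias, mat1, mat2)  # the torch.ops path
assert torch.allclose(out1, out2)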
What is the point of having essentially the same function duplicated?