How is PythonOp introduced?

My network contains:

import torch
import torch.nn as nn

# A memory-efficient implementation of the Swish function
class SwishImplementation(torch.autograd.Function):
    @staticmethod
    def forward(ctx, i):
        result = i * torch.sigmoid(i)
        # Save only the input; the sigmoid is recomputed in backward
        ctx.save_for_backward(i)
        return result

    @staticmethod
    def backward(ctx, grad_output):
        i = ctx.saved_tensors[0]
        sigmoid_i = torch.sigmoid(i)
        # d/di [i * sigmoid(i)] = sigmoid(i) * (1 + i * (1 - sigmoid(i)))
        return grad_output * (sigmoid_i * (1 + i * (1 - sigmoid_i)))

class MemoryEfficientSwish(nn.Module):
    def forward(self, x):
        return SwishImplementation.apply(x)

After torch.jit.trace, the traced graph contains PythonOp nodes.
I know that grad introduces PythonOp, but why does that happen, and how?
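
A minimal sketch to reproduce what I am seeing, assuming a reasonably recent PyTorch. PlainSwish and node_kinds are hypothetical names added here only for comparison; they are not part of my network. Passing check_trace=False is my own choice, since I only want to inspect the graph. The sketch traces both variants and lists the node kinds in each graph.

```python
import torch
import torch.nn as nn


class SwishImplementation(torch.autograd.Function):
    @staticmethod
    def forward(ctx, i):
        ctx.save_for_backward(i)
        return i * torch.sigmoid(i)

    @staticmethod
    def backward(ctx, grad_output):
        i = ctx.saved_tensors[0]
        sigmoid_i = torch.sigmoid(i)
        return grad_output * (sigmoid_i * (1 + i * (1 - sigmoid_i)))


class MemoryEfficientSwish(nn.Module):
    def forward(self, x):
        return SwishImplementation.apply(x)


class PlainSwish(nn.Module):
    # Hypothetical comparison module: same math, no custom autograd.Function.
    def forward(self, x):
        return x * torch.sigmoid(x)


def node_kinds(module, example):
    # Hypothetical helper: trace the module and collect the kind of every node.
    # check_trace=False skips the post-trace consistency check.
    traced = torch.jit.trace(module, example, check_trace=False)
    return [node.kind() for node in traced.inlined_graph.nodes()]


example = torch.randn(2, 3)
custom_kinds = node_kinds(MemoryEfficientSwish(), example)
plain_kinds = node_kinds(PlainSwish(), example)

print("custom Function:", custom_kinds)
print("plain ops:      ", plain_kinds)
```

In my runs the first list contains prim::PythonOp, while the second contains only aten ops such as aten::sigmoid and aten::mul, so the node seems to be tied to the call to SwishImplementation.apply.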