They’re not documented much aside from this page. Is it recommended to use them in practice? Specifically, if I want to replicate the functionality invoked during the backward pass, I can look up the backward op associated with a given forward op and call it directly, and the result should be exactly the same. For example, the following replicates the backward pass of F.embedding (approximately):
import torch
import torch.nn.functional as F


class EmbeddingTest(torch.autograd.Function):
    # replicate the behavior of the embedding backward
    @staticmethod
    def forward(ctx, input, weight):
        ctx.save_for_backward(input, weight)
        return F.embedding(input, weight)

    @staticmethod
    def backward(ctx, grad_output):
        input, weight = ctx.saved_tensors
        grad_input = grad_weight = None
        if ctx.needs_input_grad[0]:
            # integer indices are not differentiable
            raise NotImplementedError('non-differentiable in general')
        if ctx.needs_input_grad[1]:
            # aten signature: embedding_dense_backward(grad_output, indices,
            #                 num_weights, padding_idx, scale_grad_by_freq)
            grad_weight = torch.ops.aten.embedding_dense_backward(
                grad_output, input, weight.size(0), -1, False)
        return grad_input, grad_weight
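As a quick sanity check (a minimal sketch; the shapes and values below are arbitrary and just for illustration), I compare the grad_weight produced by this custom Function against the gradient F.embedding produces through its own autograd backward:

# compare the custom Function's grad_weight with F.embedding's autograd grad
indices = torch.tensor([0, 2, 2, 1])
weight_a = torch.randn(5, 3, requires_grad=True)
weight_b = weight_a.detach().clone().requires_grad_(True)

out_a = EmbeddingTest.apply(indices, weight_a)
out_b = F.embedding(indices, weight_b)

grad_out = torch.randn_like(out_a)
out_a.backward(grad_out)
out_b.backward(grad_out)

print(torch.allclose(weight_a.grad, weight_b.grad))  # expect True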
What are the potential pitfalls of calling torch.ops.aten...?