Custom torch.autograd.Function Subclass

I want to implement my own quantized and clipped ReLU. This is how I implemented it:

import torch
import torch.nn as nn

class _quantAct(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input, clip_low=0., clip=6., bits=8, inplace=False):
        output = input if inplace else input.clone()
        # save the out-of-range mask so backward can zero those gradients
        ctx.save_for_backward((input < clip_low) | (input > clip))
        output = output.clamp(clip_low, clip)
        # quantize to 2**bits - 1 uniform levels over [0, clip]
        levels = (2 ** bits) - 1
        output = output.div(clip).mul(levels).round().div(levels).mul(clip)

        return output

    @staticmethod
    def backward(ctx, grad_output):
        # saved tensors - a tuple with one element
        mask, = ctx.saved_tensors
        grad_input = grad_output.masked_fill(mask, 0)
        # one return value per forward argument: input, clip_low, clip, bits, inplace
        return grad_input, None, None, None, None

class quantReLU(nn.ReLU):
    def __init__(self, clip=6., bits=8, inplace=False):
        super(quantReLU, self).__init__()
        self.clip = clip
        self.bits = bits
        self.inplace = inplace

    def forward(self, inputs):
        # apply is called on the class, not on an instance
        return _quantAct.apply(inputs, 0., self.clip, self.bits, self.inplace)

How many gradients do I have to return from the static backward method of my torch.autograd.Function subclass? Why does it expect me to return 5 of them?

Appreciate your inputs, thanks!

You need to return as many values from backward as were passed to forward, and this includes any non-tensor arguments (like clip_low etc.). For non-tensor arguments that don't have an input gradient you can return None, but you still need to return a value. So, as there were 5 inputs to forward, you need 5 outputs from backward. Technically, I gather that if the user only passed input and left the others at their defaults, you could return just one gradient, but then you'd have to track that; instead you can always return the extra Nones, which will be ignored.
This is explained in the docs on extending autograd.
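To illustrate the rule with something smaller than your quantReLU, here's a hypothetical Function (the name ScaleClamp and its arguments are made up for this sketch) whose forward takes one tensor and two non-tensor arguments, so backward must return three values, with None for the non-differentiable ones:

```python
import torch

class ScaleClamp(torch.autograd.Function):
    """Hypothetical example: y = clamp(scale * x, max=max_val)."""

    @staticmethod
    def forward(ctx, x, scale=2.0, max_val=6.0):
        ctx.scale = scale
        # save the mask of clamped entries for backward
        ctx.save_for_backward(x * scale > max_val)
        return (x * scale).clamp(max=max_val)

    @staticmethod
    def backward(ctx, grad_output):
        clamped, = ctx.saved_tensors
        grad_x = grad_output.masked_fill(clamped, 0.0) * ctx.scale
        # forward took 3 arguments (after ctx), so backward returns 3 values:
        # a gradient for x, and None for the non-tensor scale and max_val
        return grad_x, None, None

x = torch.tensor([1.0, 2.0, 4.0], requires_grad=True)
y = ScaleClamp.apply(x)   # tensor([2., 4., 6.])
y.sum().backward()
print(x.grad)             # tensor([2., 2., 0.]) - zero where clamped
```

The gradient for the clamped entry is zero, just like the masked_fill in your backward, and the two Nones satisfy the "one return value per forward input" requirement.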

Thanks @TomB, this answers my question. Appreciate it.