A question about autograd.Function

Hello everyone,
I'm new here and this is my first topic.
I have a problem with my code, and I've searched but couldn't find a solution.

The problem is that my function's output shape differs from its input shape.
For example, if the input shape is (batch_no, 1), my output shape will be (batch_no, 2, 10).
torch.autograd.Function passes backward a grad with the output's shape, but expects the returned gradient to have the input's shape.
I thought this shouldn't be an issue, because we should have n_output * n_input gradients.

I'm sharing a dummy snippet below that reproduces the same error.
I would appreciate any help.

import numpy as np
import torch


class test2(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        # output has shape (len(x), 5), while the input has shape (len(x),)
        result = [np.ones(5) for i in x]
        return torch.tensor(np.array(result))

    @staticmethod
    def backward(ctx, grad):
        (x,) = ctx.saved_tensors
        # grad arrives with the output's shape (len(x), 5) and is returned as-is
        return grad

    
inp = torch.tensor([0.5, 0.5, 0.5, 0.5], requires_grad=True)
x = test2.apply(inp)
print(x.shape)  # torch.Size([4, 5])
x.sum().backward()

And I get this error:
RuntimeError: Function test2Backward returned an invalid gradient at index 0 - got [4, 5] but expected shape compatible with [4]


I recommend this guide: PyTorch: Defining New autograd Functions — PyTorch Tutorials 2.3.0+cu121 documentation.
It specifies that backward should return the gradient with respect to the input, and that it must have the same shape as the input. In practice, if an input variable is used in multiple places (i.e., contributes to multiple outputs), its gradient is the total derivative: the sum of the gradient contributions from each of those outputs. See also: Total derivative - Wikipedia
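To illustrate, here is a minimal sketch (the name Test2Fixed and the forward are made up for this example, not the original poster's real function): the forward copies each input element into 5 outputs, so every output's partial derivative with respect to its input element is 1, and backward sums the incoming grad over the extra output dimension so the returned gradient has the input's shape.

import numpy as np
import torch


class Test2Fixed(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        # dummy forward: copy each input element into 5 outputs -> shape (len(x), 5)
        result = x.detach().numpy()[:, None] * np.ones((1, 5))
        return torch.from_numpy(result).to(x.dtype)

    @staticmethod
    def backward(ctx, grad):
        (x,) = ctx.saved_tensors
        # grad has the output's shape (len(x), 5); each output depends on its input
        # element with derivative 1, so the total derivative is the sum of the incoming
        # grad over the output dimension -> shape (len(x),), matching the input
        return grad.sum(dim=1)


inp = torch.tensor([0.5, 0.5, 0.5, 0.5], requires_grad=True)
out = Test2Fixed.apply(inp)
out.sum().backward()
print(inp.grad)  # tensor([5., 5., 5., 5.])

With this backward, the returned gradient has shape [4], matching the input, and the error from above no longer occurs.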


Thanks, that’s right :pray: