Weird difference between function.forward(input) and function(input)

If I create a Function and apply it by calling its forward method directly, the computed gradient seems to be independent of my backward() method: it comes out correct even when backward() is wrong.

For example, with this code:

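import torch
from torch.autograd import Function, Variable

class Cube(Function):
    def forward(self, input):
        # Save the input so that backward() can read it from saved_tensors.
        self.save_for_backward(input)
        return input * input * input

    def backward(self, grad_output):
        input, = self.saved_tensors
        # Deliberately wrong backward: the correct gradient is 3 * input * input * grad_output.
        return grad_output

cube = Cube()
input = Variable(torch.ones(2, 2).double(), requires_grad=True)

output = cube(input).sum()
output.backward()
print(input.grad)  # gives [[1,1],[1,1]]: this is what my backward() returns

input.grad.data.zero_()  # reset the accumulated gradient before the second run
output = cube.forward(input).sum()
output.backward()
print(input.grad)  # gives [[3,3],[3,3]]: the correct gradient?!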

You can’t apply a Function by calling its forward method directly. Doing so just runs the body of forward on the Variable objects, as if it were the forward of a Module, so a standard autograd graph is built from the individual multiplications and the function’s backward implementation is never used. Apply the function by calling it, as in cube(input), and your backward will be the one that runs.
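To make the difference concrete, here is a minimal sketch of the same experiment. It assumes a newer PyTorch release in which a custom Function defines static forward and backward methods and is applied through `Cube.apply` rather than by calling an instance. The helper names `grad_via_apply` and `grad_via_plain_ops` are made up for this illustration, and the second helper only imitates what calling forward directly amounts to: running the multiplications on the tensor itself.

```python
import torch
from torch.autograd import Function


class Cube(Function):
    @staticmethod
    def forward(ctx, input):
        # Save the input so backward() can read it back from ctx.saved_tensors.
        ctx.save_for_backward(input)
        return input * input * input

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        # Deliberately wrong: the correct result is 3 * input * input * grad_output.
        return grad_output


def grad_via_apply():
    """Apply the Function properly, so autograd calls Cube.backward."""
    x = torch.ones(2, 2, dtype=torch.double, requires_grad=True)
    Cube.apply(x).sum().backward()
    return x.grad


def grad_via_plain_ops():
    """Run the body of forward as ordinary tensor ops; Cube.backward is never involved."""
    x = torch.ones(2, 2, dtype=torch.double, requires_grad=True)
    (x * x * x).sum().backward()
    return x.grad


if __name__ == "__main__":
    print(grad_via_apply())      # all ones: the (wrong) custom backward was used
    print(grad_via_plain_ops())  # all threes: autograd differentiated x*x*x itself
```

Each helper creates a fresh input, so gradients from one run do not accumulate into the other.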