Question regarding differentiation

Hi guys, I'm normalizing a tensor with

a = torch.tensor([1., 2., 3.], requires_grad=True)
y = a / a.max()
y.backward(torch.ones_like(y))

I don't get why the third element of a.grad is -0.3333. Could anyone fill me in?

The gradients you are seeing are accumulated from two paths: the division and the max operation. Here is a step-by-step breakdown that isolates each contribution by detaching the other:

import torch

a = torch.tensor([1., 2., 3.], requires_grad=True)

# Gradients through the division only (max is detached from the graph)
y = a / a.max().detach()
y.backward(torch.ones_like(y))
print('div grad: ', a.grad)
a.grad.zero_()

# Gradients through the max only (the numerator is detached from the graph)
y = a.detach() / a.max()
y.backward(torch.ones_like(y))
print('max grad: ', a.grad)
a.grad.zero_()

# Both paths together: the contributions sum
y = a / a.max()
y.backward(torch.ones_like(y))
print('both: ', a.grad)

div grad:  tensor([0.3333, 0.3333, 0.3333])
max grad:  tensor([ 0.0000,  0.0000, -0.6667])
both:  tensor([ 0.3333,  0.3333, -0.3333])
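To see where -0.3333 comes from analytically: with m = a.max() = a[2] = 3, the output is y_i = a_i / m, and backward with all-ones sums the rows of the Jacobian. For a[0] and a[1] the only contribution is 1/m = 0.3333. For a[2], the max element, you get the division term 1/m plus the max term -sum(a_i)/m**2 = -6/9 = -0.6667, giving 1/3 - 2/3 = -1/3. A small sketch checking this against autograd:

```python
import torch

a = torch.tensor([1., 2., 3.], requires_grad=True)
y = a / a.max()
y.backward(torch.ones_like(y))

# Analytic gradient of sum_i(a_i / m) w.r.t. a_j, where m = max(a) = a[2]:
#   1/m                        for j != argmax
#   1/m - sum_i(a_i) / m**2    for j == argmax (division term + max term)
m = 3.0
expected = torch.tensor([1 / m, 1 / m, 1 / m - (1.0 + 2.0 + 3.0) / m**2])
print(expected)                          # tensor([ 0.3333,  0.3333, -0.3333])
print(torch.allclose(a.grad, expected))  # True
```

This also shows why detaching one side in the snippets above isolates each term: `detach()` simply removes that path from the Jacobian.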

Much appreciated. I get it now.