Hi guys, please see the following image.
I don’t get why the third element of a.grad is -0.333. Could anyone fill me in?
The gradient you are seeing accumulates contributions from both the division and the max operation.
Here is a step-by-step way to separate the two paths (detaching one path at a time):
import torch

a = torch.tensor([1., 2., 3.], requires_grad=True)
# Get div gradients
y = a / a.max().detach()
y.backward(torch.ones_like(y))
print('div grad: ', a.grad)
a.grad.zero_()
# Get max gradients
y = a.detach() / a.max()
y.backward(torch.ones_like(y))
print('max grad: ', a.grad)
a.grad.zero_()
# Get both
y = a / a.max()
y.backward(torch.ones_like(y))
print('both: ', a.grad)
div grad: tensor([0.3333, 0.3333, 0.3333])
max grad: tensor([ 0.0000, 0.0000, -0.6667])
both: tensor([ 0.3333, 0.3333, -0.3333])
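To see where the numbers come from, here is a quick analytical check. With y_i = a_i / m and m = a.max() = 3, the division path gives dy_i/da_i = 1/m = 1/3 for every element, and the max path adds sum_i(-a_i / m^2) = -6/9 = -2/3 to the max element only. So the third element gets 1/3 - 2/3 = -1/3:

```python
import torch

# y_i = a_i / m, with m = a.max() = 3 (the third element).
# Division path: dy_i/da_i = 1/m for every i.
# Max path: dy_i/dm = -a_i / m**2, and dm/da_3 = 1,
#           so the max element also receives sum_i(-a_i / m**2) = -6/9.
a = torch.tensor([1., 2., 3.], requires_grad=True)
y = a / a.max()
y.backward(torch.ones_like(y))

m = 3.0
expected = torch.tensor([1/m, 1/m, 1/m - (1.0 + 2.0 + 3.0) / m**2])
print(a.grad)                             # tensor([ 0.3333,  0.3333, -0.3333])
print(torch.allclose(a.grad, expected))   # True
```

This matches the two partial results above: 0.3333 + (-0.6667) = -0.3333 for the third element.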
Much appreciated. I get it now.