By default, if a tensor x is used multiple times in a computation graph and we back-propagate through that graph, then x.grad stores the sum of the gradients arriving from each outgoing edge of x. For example:
```python
import torch

x = torch.ones((1), requires_grad=True)
a = torch.ones((1), requires_grad=True)
b = a + x
c = b + x
d = c + x
d.backward()
print(x.grad)  # tensor([3.]), i.e. 1 + 1 + 1 summed
```
However, my use case requires that x.grad hold only the gradient coming from the branch d = c + x (that is, only the immediate edge from d to x). By default, gradient also flows into x through the c = b + x and b = a + x edges.
Note also that I cannot detach x while computing b and c. My use case involves an LSTM where I compute a loss and back-propagate at every time step.
Hence, is there a way to collect the per-edge gradients for x separately, something like x.grad = [1, 1, 1], instead?
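One direction I have been experimenting with (just a sketch, and I am not sure it is the intended approach): route each use of x through x.clone() and call retain_grad() on each clone. Gradients still flow back into x (so b and c are not detached from it), but each clone's .grad then records the contribution of its own edge:

```python
import torch

x = torch.ones((1), requires_grad=True)
a = torch.ones((1), requires_grad=True)

# One differentiable clone per use of x; gradients still reach x through them.
x1, x2, x3 = x.clone(), x.clone(), x.clone()
for xi in (x1, x2, x3):
    xi.retain_grad()  # keep .grad on these non-leaf tensors

b = a + x1
c = b + x2
d = c + x3
d.backward()

per_edge = torch.cat([x1.grad, x2.grad, x3.grad])
print(per_edge)  # tensor([1., 1., 1.]) — one entry per edge into x
print(x.grad)    # tensor([3.]) — the usual accumulated sum is still there
```

Here x3.grad alone would give me the gradient of the d = c + x edge, which is what I am after, but I do not know whether this is idiomatic or whether it breaks down in the LSTM setting.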