Grad_fn hidden by inplace operations

import torch
x = torch.tensor([0.,0.,0.], requires_grad=True)
y = x * 3
print(f'grad_fn before inplace op: {y.grad_fn}')
y[:2] = 100  # in-place assignment into a slice of y; overwrites y.grad_fn
print(f'grad_fn after inplace op: {y.grad_fn}')
y.sum().backward()
print(f'grad at x: {x.grad}')

# grad_fn before inplace op: <MulBackward0 object at 0x7fad0404e3a0>
# grad_fn after inplace op: <CopySlices object at 0x7fad0404e3a0>
# grad at x: tensor([0., 0., 3.])

As shown above, for a tensor y that already has the grad_fn MulBackward0, doing an in-place operation on it overwrites its grad_fn with CopySlices. Yet PyTorch still manages to backpropagate correctly to x. I have two questions:

  1. How many tensors are actually created in the computational graph? If there are only two (x and y), then:

  2. How come PyTorch still knows there is a MulBackward0 associated with tensor y, even though its grad_fn has been overwritten to CopySlices? If you say that MulBackward0 is not lost but merely hidden, how can I show it?

  1. This does not apply to mul, because the tensor modified in-place is not actually needed in the backward pass. But if it IS needed, a clone has to be made to save the original tensor before it is modified, so it can still be used for backward. So yes, in general an extra tensor is created (see the first sketch below).
  2. The CopySlices node actually holds the original MulBackward node as a field, and during the backward pass it uses that node to compute the gradient (see the second sketch below).
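
To make the first point concrete, here is a minimal sketch (my own illustration, using exp as an example of an op whose output is saved for backward): for mul, only the inputs matter for the gradient, so overwriting the output y is harmless; for exp, the output itself is needed (d/dx exp(x) = exp(x)), so the same in-place write breaks backward unless a clone preserves the original values.

import torch

# exp() saves its output for the backward pass, so an in-place write
# into that output invalidates the saved tensor.
x = torch.tensor([0., 0., 0.], requires_grad=True)
y = x.exp()
y[:2] = 100
try:
    y.sum().backward()
except RuntimeError as e:
    print(e)  # "... has been modified by an inplace operation ..."

# Cloning first keeps an unmodified copy around for backward,
# at the cost of one extra tensor in the graph.
x = torch.tensor([0., 0., 0.], requires_grad=True)
y = x.exp()
z = y.clone()
z[:2] = 100
z.sum().backward()
print(x.grad)  # tensor([0., 0., 1.])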
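
For the second point, as far as I know the field holding the original node is not exposed as a Python attribute, but every grad_fn does expose next_functions, and on recent PyTorch versions the MulBackward0 node is still reachable from the CopySlices node through those edges. A small sketch (the exact edge layout may vary between versions):

import torch

x = torch.tensor([0., 0., 0.], requires_grad=True)
y = x * 3
y[:2] = 100  # y.grad_fn is now CopySlices

def walk(node, depth=0):
    # Recursively print the backward graph reachable from `node`.
    if node is None:
        return
    print('  ' * depth + type(node).__name__)
    for child, _ in node.next_functions:
        walk(child, depth + 1)

walk(y.grad_fn)
# Expected output (version dependent), showing MulBackward0 is not lost:
# CopySlices
#   MulBackward0
#     AccumulateGrad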