Could anyone tell me how torch.roll works in the backward pass? I was trying to check how the gradients change during backpropagation, but it seems there is no gradient information for the torch.roll layer.
The backward operation of roll would just undo the roll operation on the tensor:
x = torch.randn(10, requires_grad=True)
y = x * torch.arange(10)
y.mean().backward()
print(x.grad)
# > tensor([0.0000, 0.1000, 0.2000, 0.3000, 0.4000, 0.5000, 0.6000, 0.7000, 0.8000, 0.9000])
x.grad = None  # reset the gradient before the second backward pass
y = x.roll(1) * torch.arange(10)
y.mean().backward()
print(x.grad)
# > tensor([0.1000, 0.2000, 0.3000, 0.4000, 0.5000, 0.6000, 0.7000, 0.8000, 0.9000, 0.0000])
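To make the "undo" behavior explicit, here is a minimal check (assuming a recent PyTorch) that the gradient flowing through roll(1) is the upstream gradient rolled back by -1:

```python
import torch

x = torch.randn(10, requires_grad=True)
w = torch.arange(10, dtype=torch.float32)

# Forward: roll x by 1, then scale elementwise by w
y = x.roll(1) * w
y.mean().backward()

# The gradient of y.mean() w.r.t. y is w / 10; the roll backward
# should apply the inverse shift, i.e. roll it by -1.
expected = (w / 10).roll(-1)
print(torch.allclose(x.grad, expected))
```

This matches the printed gradients above: each entry of w / 10 just shows up shifted one position to the left.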