Confused by torch.roll

Could anyone tell me how torch.roll works in the backward pass? I was trying to check how the gradients change during the backward pass, but it seems that there is no gradient information for the torch.roll layer.

The backward pass of roll just undoes the roll operation, i.e. it rolls the incoming gradient by the negative shift:

import torch

# baseline without roll: d(mean(y))/dx = arange(10) / 10
x = torch.randn(10, requires_grad=True)

y = x * torch.arange(10)
y.mean().backward()
print(x.grad)
# > tensor([0.0000, 0.1000, 0.2000, 0.3000, 0.4000, 0.5000, 0.6000, 0.7000, 0.8000, 0.9000])

# with roll: the gradient is the baseline gradient rolled back by one position
x.grad = None
y = x.roll(1) * torch.arange(10)
y.mean().backward()
print(x.grad)
# > tensor([0.1000, 0.2000, 0.3000, 0.4000, 0.5000, 0.6000, 0.7000, 0.8000, 0.9000, 0.0000])
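
To make the "undo" explicit, here is a minimal sketch (not part of the original answer) that backpropagates a known upstream gradient through roll alone and compares it against the same gradient rolled by the negative shift:

import torch

# sanity check: the backward of roll(1) applies roll(-1) to the upstream gradient
x = torch.randn(10, requires_grad=True)
upstream = torch.arange(10, dtype=torch.float32)

y = x.roll(1)
y.backward(upstream)  # feed `upstream` as dL/dy
print(torch.equal(x.grad, upstream.roll(-1)))
# > True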