As shown in the code below, why is the gradient of tt
[0., 4., 6., 8.]
rather than [2., 4., 6., 8.]
?
Is this caused by the in-place assignment a[0] = value
? Why is it designed this way, and is the result correct?
import torch
import numpy as np
def func(t, value):
    a = t * t
    a[0] = value  # in-place write: a[0] no longer depends on t[0]
    return a.sum()
array = np.array([1, 2, 3, 4], dtype='float32')
value = np.array([5.], dtype='float32')
tt = torch.tensor(array, requires_grad=True)
tvalue = torch.tensor(value, requires_grad=True)
l1 = func(tt, tvalue)
l1.backward()
print('torch:array', tt.grad) # torch:array tensor([0., 4., 6., 8.])
print('torch:value', tvalue.grad) # torch:value tensor([1.])
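For comparison, here is a minimal sketch of the same computation without the in-place write. Since nothing overwrites a[0], every element of a still depends on the corresponding element of tt, and backward produces the full gradient 2 * tt:

```python
import torch

# Same squaring computation, but with no in-place assignment.
tt = torch.tensor([1., 2., 3., 4.], requires_grad=True)
a = tt * tt
a.sum().backward()
print(tt.grad)  # tensor([2., 4., 6., 8.])
```

This suggests the zero in the first position of the original gradient comes from a[0] = value replacing the squared term: after that write, the sum no longer depends on tt[0] at all, while it depends on tvalue with coefficient 1 (hence tvalue.grad == tensor([1.])).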