As shown in the code below, why is the gradient of tt
[0., 4., 6., 8.]
rather than [2., 4., 6., 8.]
?
Is this caused by the in-place assignment a[0] = value
? Why is it designed this way, and is the result correct?
import torch
import numpy as np
def func(t, value):
    a = t * t
    a[0] = value  # in-place write: a[0] no longer depends on t[0]
    return a.sum()
array = np.array([1, 2, 3, 4], dtype='float32')
value = np.array([5.], dtype='float32')
tt = torch.tensor(array, requires_grad=True)
tvalue = torch.tensor(value, requires_grad=True)
l1 = func(tt, tvalue)
l1.backward()
print('torch:array', tt.grad) # torch:array tensor([0., 4., 6., 8.])
print('torch:value', tvalue.grad) # torch:value tensor([1.])
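For comparison, here is a minimal sketch of the same computation without the in-place write. Since nothing overwrites a[0], every element of a still depends on the corresponding element of tt, and backward produces the full gradient 2 * tt:

```python
import torch

# Same squaring computation, but with no in-place assignment.
tt = torch.tensor([1., 2., 3., 4.], requires_grad=True)
a = tt * tt
a.sum().backward()
print(tt.grad)  # tensor([2., 4., 6., 8.])
```

This suggests the zero in the first position of the original gradient comes from a[0] = value replacing the squared term: after that write, the sum no longer depends on tt[0] at all, while it depends on tvalue with coefficient 1 (hence tvalue.grad == tensor([1.])).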