I don’t think you need to initialize W
with data and requires_grad=True
, if you are overwriting the values in the next line.
Would this work for you?
w1=torch.tensor(0.1, requires_grad=True)
w2=torch.tensor(0.1, requires_grad=True)
w3=torch.tensor(0.1, requires_grad=True)
W=torch.empty(3, requires_grad=False)
W[0]=w1 * w2; W[1]=w2 * w3; W[2]=w3 * w1
Yp=torch.sum(X*W, dim=1)
loss = torch.nn.MSELoss()(Yp, Y)
loss.backward()
print(w1.grad)