```python
import torch
from torch import nn, optim

net1 = nn.Linear(1, 2)
net2 = nn.Linear(1, 1)
x = torch.tensor([[1.]])
z = torch.tensor([[2.]])
optimizer = optim.Adam(net1.parameters(), lr=1e-3)

for _ in range(50):
    optimizer.zero_grad()
    y = net1(x)                          # get the output from the first network
    net2.weight.data = y.reshape(-1, 1)  # use the first net's output as the second net's parameters
    net2.bias.data = y.reshape(-1, 1)
    pred = net2(z)
    # the above can also be written as
    # pred = y.reshape(-1, 1) @ z + y.reshape(-1, 1), which I think should work
    loss = (pred - 2).pow(2).sum()
    loss.backward()
    optimizer.step()
    print(loss.grad)  # prints None for either approach above
```
The method above is simply `pred = [z, 1] @ (W1*x + b1)`, where `@` is the dot product of two vectors, and I am trying to make the network output 2.
But clearly, `loss.backward()` doesn't propagate gradients properly with the first approach. How do I fix it? Ultimately I want to make this work for `nn.Conv2d`, so I can avoid computing the convolutional layer manually.
`loss.grad` is expected to remain `None`, as only the gradients for leaf variables are saved in the `.grad` field.
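A minimal standalone sketch of that behavior (the numbers here are my own illustration):

```python
import torch

w = torch.tensor([3.0], requires_grad=True)  # leaf tensor
loss = (w * 2).pow(2).sum()                  # non-leaf result of the computation
loss.backward()

print(loss.grad)  # None (recent PyTorch also emits a UserWarning on this access)
print(w.grad)     # tensor([24.]): gradients are stored on leaves only
```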
These two lines

```python
net2.weight.data = y.reshape(-1,1)  # using the output from first net as parameter to second net
net2.bias.data = y.reshape(-1,1)
```

use `.data` and thus break the computational graph, meaning that no gradients will flow back to `net1`.
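To see the breakage in isolation, here is a minimal sketch (I use `Linear(1, 1)` for both nets so the shapes line up; that simplification is mine):

```python
import torch
from torch import nn

net1 = nn.Linear(1, 1)
net2 = nn.Linear(1, 1)

y = net1(torch.tensor([[1.]]))  # y carries a grad_fn pointing back to net1
net2.weight.data = y            # .data copies the values; the graph link is dropped
net2(torch.tensor([[2.]])).sum().backward()

print(net2.weight.grad)  # populated: net2.weight is a leaf Parameter
print(net1.weight.grad)  # None: backward never reaches net1
```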
If you want to set these weights in a way that keeps gradients flowing, the right way to do it is to delete the existing parameter with `del net2.weight` and then set the attribute to the tensor that you want: `net2.weight = y.reshape(-1, 1)`.
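Applied to your example, a sketch of the whole fix could look like the following. I am assuming, per your `[z, 1] @ (W1*x + b1)` formulation, that the first output of `net1` is meant to be `net2`'s weight and the second its bias:

```python
import torch
from torch import nn, optim

net1 = nn.Linear(1, 2)
net2 = nn.Linear(1, 1)
x = torch.tensor([[1.]])
z = torch.tensor([[2.]])
optimizer = optim.Adam(net1.parameters(), lr=1e-3)

# Delete the Parameters once, so plain (non-leaf) tensors can be assigned later.
del net2.weight
del net2.bias

for _ in range(50):
    optimizer.zero_grad()
    y = net1(x)                          # shape (1, 2)
    net2.weight = y[:, 0].reshape(1, 1)  # stays attached to the graph
    net2.bias = y[:, 1].reshape(1)
    pred = net2(z)
    loss = (pred - 2).pow(2).sum()
    loss.backward()
    optimizer.step()

print(net1.weight.grad)  # now populated: gradients flow through net2 into net1
```

The same pattern should carry over to `nn.Conv2d`: delete the layer's parameters once, then assign the tensor predicted by the other network, reshaped to the `(out_channels, in_channels, kH, kW)` kernel layout. A hypothetical sketch, where `kernel_net` and `images` are placeholder names of mine:

```python
kernel_net = nn.Linear(1, 9)                  # predicts a 3x3 kernel from x
conv = nn.Conv2d(1, 1, kernel_size=3, bias=False)
del conv.weight

images = torch.randn(1, 1, 8, 8)              # dummy input batch
conv.weight = kernel_net(x).reshape(1, 1, 3, 3)  # kernel predicted by the other net
out = conv(images)                            # gradients flow back into kernel_net
```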