Hi guys,

I am having some difficulty with backpropagation through the function nn.utils.vector_to_parameters().

Say I have a parameter mu, perform some operations on it, and finally want to insert the result as the weights and biases of a network (net).

Then I pass a batch of data x through the network, compute a loss and backpropagate the error back to mu (the original, flattened parameter).

Here is a minimal code example:

```
import torch
import torch.nn as nn
import torch.optim as optim

x = torch.ones((1, 8))  # input example
net = nn.Sequential(nn.Linear(8, 8))

# flattened parameter with one entry per network parameter
size = sum(p.numel() for p in net.parameters() if p.requires_grad)
mu = nn.Parameter(torch.ones(size) * 0.05)
optimizer = optim.SGD([mu], lr=1e-3)

# perform some operations on mu, then write it into the network
nn.utils.vector_to_parameters(mu, net.parameters())

y = net(x)
loss = y.sum()
loss.backward()
optimizer.step()

print(mu)       # unchanged after the step
print(mu.grad)  # None
```

After the update step, mu is unchanged, i.e. it is not being optimized. Also, mu.grad is None.

Is nn.utils.vector_to_parameters(mu, net.parameters()) stopping the gradient flow, and if so, is there an alternative way to insert mu as an "external", flattened parameter into the weights of the network?
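For reference, one workaround I have been experimenting with (not sure it is the intended approach) is to avoid mutating net.parameters() altogether: slice mu into differentiable views shaped like each parameter and run the forward pass with torch.func.functional_call() (assuming a recent PyTorch version that provides torch.func). With this, mu.grad is populated and the optimizer updates mu, but I would still like to know whether vector_to_parameters itself can be made to work:

```
import torch
import torch.nn as nn
import torch.optim as optim

x = torch.ones((1, 8))
net = nn.Sequential(nn.Linear(8, 8))
size = sum(p.numel() for p in net.parameters() if p.requires_grad)
mu = nn.Parameter(torch.ones(size) * 0.05)
optimizer = optim.SGD([mu], lr=1e-3)

# build a dict of views into mu, one per parameter, in the same
# order that parameters() iterates (weight first, then bias here)
pointer = 0
params = {}
for name, p in net.named_parameters():
    params[name] = mu[pointer:pointer + p.numel()].view_as(p)
    pointer += p.numel()

# functional_call runs net with these tensors in place of its own
# parameters, so the autograd graph reaches back to mu
y = torch.func.functional_call(net, params, (x,))
loss = y.sum()
loss.backward()
print(mu.grad)  # populated now, not None
optimizer.step()
```

The slices are views, not copies, so no gradient information is lost between mu and the per-parameter tensors.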

Thank you all!