I'm having some difficulties with backpropagation and the function nn.utils.vector_to_parameters().
Say I have a parameter mu, perform some operations on it, and finally want to insert it as the weights and biases of a network (net).
Then I pass a batch of data x through the network, compute a loss, and backpropagate the error back to mu (the original, flattened parameter).
Here is a minimal code example:
```python
import torch
import torch.nn as nn
import torch.optim as optim

x = torch.ones((1, 8))  # input example
net = nn.Sequential(nn.Linear(8, 8))

size = sum(p.numel() for p in net.parameters() if p.requires_grad)
mu = nn.Parameter(torch.ones((size,)) * 0.05, requires_grad=True)

# perform some operations on mu
nn.utils.vector_to_parameters(mu, net.parameters())

y = net(x)
optimizer = optim.SGD([mu], lr=1e-3)
loss = y.sum()
loss.backward()
optimizer.step()

print(mu)
print(mu.grad)
```
After the update step, mu remains the same as before, i.e. it is not being optimized. Also, mu.grad returns None.
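For what it's worth, if I append the following check right after loss.backward() in the example above, the network's own parameters do receive gradients while mu does not, which makes me suspect the graph is cut between mu and net.parameters():

```python
# appended after loss.backward() in the example above
for name, p in net.named_parameters():
    print(name, p.grad is not None)  # True for the Linear weight and bias
print(mu.grad is not None)           # False: mu never receives a gradient
```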
Is the function nn.utils.vector_to_parameters(mu, net.parameters()) stopping the gradient flow, and if so, is there an alternative way to insert mu as an “external”, flattened parameter into the weights of the network?
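For context, one workaround I have been sketching (assuming torch.func.functional_call, available in recent PyTorch versions, keeps the autograd graph intact) is to build the parameter tensors as views of mu and run the forward pass functionally. This is just a minimal sketch, and I would like to know whether it is the intended approach:

```python
import torch
import torch.nn as nn
from torch.func import functional_call

x = torch.ones((1, 8))
net = nn.Sequential(nn.Linear(8, 8))

size = sum(p.numel() for p in net.parameters() if p.requires_grad)
mu = nn.Parameter(torch.ones(size) * 0.05)

# slice mu into views shaped like each parameter, without detaching
params = {}
offset = 0
for name, p in net.named_parameters():
    n = p.numel()
    params[name] = mu[offset:offset + n].view_as(p)
    offset += n

# forward pass using the views of mu instead of the module's own parameters
y = functional_call(net, params, (x,))
loss = y.sum()
loss.backward()
print(mu.grad)  # now populated instead of None
```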
Thank you all!