I have a trained model with some parameters.
What I would like to do is to assign to some of those parameters variables that I create, so that I can backpropagate through the variables.
Up until now I have done something like
for p in model.parameters():
    p.data = ...  # whatever
But this no longer works, because assigning to .data bypasses autograd, so nothing can be backpropagated through the assignment. I need to do something like
myParameter = functionToBackward(parameterToSubstitute)
module.parameterToSubstitute = myParameter
l = loss(module.forward(input))
#use gradient on myParameter
How can I achieve this? Also note that I want to substitute every parameter in the model, so I would like something that works in a for loop over the parameters.
Update: I have found something that almost works, namely:
for name, parameter in self.named_parameters():
    newParam = myFunction(parameter)
    setattr(self, name, newParam)
The issue now is that setattr does not work, as it complains that newParam is a Variable as opposed to a Parameter. I guess that even if we pass a Parameter to a function, the result is always a variable. How can I then set the parameter equal to the variable?
Doing something like nn.Parameter(newParam.data) clearly won’t work, because we won’t be able to backpropagate the gradients through myFunction. What do you suggest?
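To make the two failure modes described above concrete, here is a minimal sketch. It assumes a current PyTorch release, where Variable and Tensor have been merged; the layer and the scalar `scale` are hypothetical stand-ins for the model and for myFunction.

```python
import torch
import torch.nn as nn

layer = nn.Linear(3, 2)
scale = torch.tensor(2.0, requires_grad=True)  # hypothetical learnable quantity

# Result of an operation on a Parameter: a plain non-leaf tensor, not an nn.Parameter
new_weight = scale * layer.weight

try:
    setattr(layer, "weight", new_weight)
    assigned = True
except TypeError as err:
    # nn.Module refuses to replace a registered Parameter with a plain tensor
    assigned = False
    print(err)

# Re-wrapping creates a fresh leaf tensor, so the graph back to `scale` is cut
rewrapped = nn.Parameter(new_weight.detach())
print(new_weight.grad_fn is not None, rewrapped.grad_fn is None)
```

The first attempt fails with a TypeError, and the second succeeds but loses the connection to `scale`, which is exactly the dilemma in the question.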
Have a look at the answer in Why can't model parameters be variables?
Let me know if something is not clear
Thanks for your answer. But I am not sure how to use the functional interface in my case.
I have a certain trained model; now I want to modify its weights in a certain way (using a function with learnable parameters) and be able to backpropagate to optimize the parameters of the function that modifies the weights.
Is there a “clean” way to achieve this? The “dirty” option, if I am not mistaken, would be to call backward() manually; so doing something like
for p in self.parameters():
    p.data = myFunction.forward(p.data)
for p in self.parameters():
    p.grad.data = myFunction.backward(p.grad.data)
I am not sure this would work though. What do you think?
Suppose you have your trained model m with a convolution layer m.conv, which contains a weight and a bias. You want to perform some function on the weight and bias, and be able to backprop through it.
Consider the following snippet in the forward of your model:
# define __init__ etc.
def forward(self, input):
    new_weight = self.net_weight(self.conv.weight, input)
    new_bias = self.net_bias(self.conv.bias, input)
    # pass the transformed tensors, not the stored parameters
    result = F.conv2d(input, new_weight, new_bias, self.conv.stride)
    return result
Note that instead of using result = self.conv(input), I called the functional interface, F.conv2d, with the weight and bias passed in explicitly.
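A self-contained sketch of this idea follows. The wrapper class `ModulatedConv` and its per-channel scale and shift are hypothetical choices for illustration, not the transformation from the post; it also assumes the wrapped convolution has a bias and uses zero padding.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModulatedConv(nn.Module):
    """Wraps a trained nn.Conv2d and transforms its weight and bias on the fly."""

    def __init__(self, conv):
        super().__init__()
        self.conv = conv
        # Hypothetical learnable transform: per-output-channel scale and shift
        self.weight_scale = nn.Parameter(torch.ones(conv.out_channels, 1, 1, 1))
        self.bias_shift = nn.Parameter(torch.zeros(conv.out_channels))

    def forward(self, input):
        # Both results are ordinary tensors that stay attached to the graph
        new_weight = self.weight_scale * self.conv.weight
        new_bias = self.conv.bias + self.bias_shift
        return F.conv2d(input, new_weight, new_bias,
                        stride=self.conv.stride, padding=self.conv.padding,
                        dilation=self.conv.dilation, groups=self.conv.groups)


conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)  # stands in for the trained layer
layer = ModulatedConv(conv)
x = torch.randn(2, 3, 16, 16)
out = layer(x)
out.sum().backward()
print(layer.weight_scale.grad.shape, layer.bias_shift.grad.shape)
```

With the scale initialised to one and the shift to zero, the wrapper reproduces the original layer's output, and gradients reach the transform's parameters through the functional call.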
Thank you for your answer! I was looking for something a little more general though. In your example if I want to modify the weights of a certain model (and use backward() on them) I need to rewrite the whole model using the functional interface.
If all I receive as input is the model itself and I treat it like a black box, I can’t do that; and since I can’t assign variables to the model parameters, I will have to do it using forward() and backward() manually.
Not necessarily. You can write an nn.Module class for your operation and reuse it, e.g., https://github.com/szagoruyko/diracnets/blob/master/diracconv.py#L15-L41
But how will that help? I still can’t assign variables to the network’s weights, so I can’t backprop through it unless I modify the network to support a functional interface.
I can create a custom module class for my operation, but then what?
You don’t want to assign Variables to the network weights. You want the network weights to be Parameters, and you can perform any operation you want on them, using those generated Variables as weights if you want.
Without further information, it’s difficult to understand why the proposed approach is not sufficient.
"using those generated Variables as weights if you want"
Exactly, I want to use them as weights in the same network, without rewriting the network using the functional interface.
The proposed approach works if I can use the transformed parameters to do my computations, but I can’t do that, because the computations I want to do on them are the ones that the network would do, and there is no way to do that without rewriting the network in functional form.
Hi, @antspy. Have you found a solution to this problem? I’m facing the exact same issue now. It would be really helpful if you could share it.
Unfortunately no, I haven’t found a solution.
I know that convnets could use F.conv2d, but how about LSTMs, RNNs?
Is there a functional interface for LSTMs?
EDIT: It looks like 2nd order gradients are not supported for LSTMs anyway. So it is of no use to assign weights to LSTMs for now.
Hi, thank you for your question. I got the same issue as you. Have you fixed this problem now?
I’m facing the same problem.
setattr(self, name, newParam) requires an nn.Parameter, but once you convert the Variable to an nn.Parameter, you lose the gradient history that created that Variable.
There must be a way that we can bypass this!
Oh yeah! Rewriting all nn.Module layers (using register_buffer) would definitely solve the problem.
But is it worth it?
This is old, but I was able to work around this issue by accessing module attributes using __dict__ instead of setattr:
some_nn_module.__dict__['weight'] = myFunction(x)
This requires using self.named_modules() instead of self.named_parameters(), but doesn’t require writing any custom modules or changing existing ones.
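A minimal sketch of this workaround, looping over every parameter of a small model. It relies on the fact that an entry placed in a module's `__dict__` is found by normal attribute lookup before nn.Module falls back to its registered parameters. The model and the scalar `scales` are hypothetical stand-ins for the trained network and for myFunction.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))

# Hypothetical learnable transform: one scalar scale per parameter tensor
scales = {}

for mod_name, module in model.named_modules():
    # recurse=False yields only the parameters owned directly by this module
    for p_name, param in list(module.named_parameters(recurse=False)):
        scale = torch.ones(1, requires_grad=True)
        scales[f"{mod_name}.{p_name}"] = scale
        # Shadow the registered Parameter with a tensor that carries a graph;
        # the original Parameter stays untouched in module._parameters
        module.__dict__[p_name] = scale * param.detach()

x = torch.randn(5, 4)
loss = model(x).pow(2).mean()  # the unmodified forward() now uses the shadowed tensors
loss.backward()
print({name: s.grad for name, s in scales.items()})
```

The shadowing tensors have to be recomputed before every forward pass during training, since each backward pass frees the graph that produced them. Deleting the `__dict__` entries restores the original parameters.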
When I first encountered this thread (over a year ago) I was doing some meta-learning, and this (quite recent) library https://github.com/facebookresearch/higher does exactly what I was trying to achieve, maybe it will help you as well!
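For readers on a recent PyTorch, a related built-in option is `torch.func.functional_call`, which runs an unmodified module with a substitute dictionary of tensors. The sketch below assumes PyTorch 2.0 or later; the model and the scalar `scales` are again hypothetical, and this is an alternative to the library above rather than a description of how it works.

```python
import torch
import torch.nn as nn
from torch.func import functional_call

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 3), nn.Tanh(), nn.Linear(3, 1))

# Hypothetical learnable transform: one scalar scale per parameter tensor
scales = {name: torch.ones(1, requires_grad=True)
          for name, _ in model.named_parameters()}

# Transformed weights are plain tensors attached to the autograd graph
new_params = {name: scales[name] * p.detach()
              for name, p in model.named_parameters()}

x = torch.randn(5, 4)
# Run the module's own forward() with the substitute tensors in place of its parameters
out = functional_call(model, new_params, (x,))
loss = out.pow(2).mean()
loss.backward()
print({name: s.grad for name, s in scales.items()})
```

The model is treated as a black box here: its stored parameters are left as they are, and the gradients land on the tensors that produced the substitutes.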