What I would like to do is assign variables that I create to some of those parameters, so that I can backpropagate through the variables.
Up until now I have done something like
for p in model.parameters():
    p.data = ...  # whatever new value
But now this won’t work, because writing through .data bypasses autograd, so backpropagation never sees the change. I need to do something like
myParameter = functionToBackward(parameterToSubstitute)
module.parameterToSubstitute = myParameter
l = loss(module.forward(input))
l.backward()
# use the gradient on myParameter
How can I achieve this? Also note that I want to substitute every parameter in the model, so I would like something that works in a loop over the parameters (or similar).
Update: I have found something that almost works, namely:
for name, parameter in self.named_parameters():
    newParam = myFunction(parameter)
    setattr(self, name, newParam)
The issue now is that setattr does not work: it complains that newParam is a Variable rather than a Parameter. I guess that even if we pass a Parameter to a function, the result is always a Variable. How can I then set the parameter equal to the variable?
Doing something like nn.Parameter(newParam.data) clearly won’t work, because then we could not backpropagate the gradients through myFunction. What do you suggest?
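For reference, one workaround that is sometimes used for exactly this (my own sketch, not an official API): deregister the Parameter from the module's _parameters dict and then set the transformed tensor as a plain attribute. nn.Module.__setattr__ only rejects the assignment while the name is still registered as a Parameter; once it is removed, the module's forward picks up the plain tensor and autograd tracks the transformation. The scaling function and the name substitute_parameters below are hypothetical.

```python
import torch
import torch.nn as nn

def substitute_parameters(module, fn):
    # fn maps a Parameter to a differentiable tensor of the same shape
    for name, param in list(module.named_parameters(recurse=False)):
        new_value = fn(param)
        del module._parameters[name]       # drop the Parameter registration
        setattr(module, name, new_value)   # plain attribute; graph preserved
    for child in module.children():
        substitute_parameters(child, fn)

# hypothetical transformation with a learnable parameter `scale`
scale = nn.Parameter(torch.tensor(2.0))
lin = nn.Linear(3, 2)
substitute_parameters(lin, lambda p: p * scale)

out = lin(torch.randn(4, 3)).sum()
out.backward()
print(scale.grad is not None)  # True: the gradient flows to `scale`
```

Note that this mutates the module in place (its weights are no longer Parameters afterwards), so it is best done on a copy of the model.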
Thanks for your answer, but I am not sure how to use the functional interface in my case.
I have a certain trained model; now I want to modify its weights in a certain way (using a function with learnable parameters) and be able to backpropagate in order to optimize the parameters of the function that modifies the weights.
Is there a “clean” way to achieve this? The “dirty” option, if I am not mistaken, would be to call backward() manually, doing something like
for p in self.parameters():
    p.data = myFunction.forward(p.data)
loss.backward()
for p in self.parameters():
    p.grad.data = myFunction.backward(p.grad.data)
I am not sure this would work though. What do you think?
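(For reference, here is a tiny self-contained sketch, my own, of why the .data route above is problematic: operations done through .data are invisible to autograd, so no gradient ever reaches the transformation's parameters.)

```python
import torch
import torch.nn as nn

# hypothetical learnable parameter of the weight-modifying function
scale = torch.ones(1, requires_grad=True)

p = nn.Parameter(torch.randn(3))
p.data = p.data * scale.detach()  # .data writes bypass the autograd graph

loss = p.sum()
loss.backward()
print(scale.grad)  # None: autograd never saw the scaling
```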
Suppose you have your trained model m with a convolution layer m.conv, which contains a weight and a bias.
You want to perform some function on the weight and bias, and be able to backprop through it.
Consider the following snippet in the forward of your model:
class Model(nn.Module):
    # define __init__ etc.
    def forward(self, input):
        new_weight = self.net_weight(self.conv.weight, input)
        new_bias = self.net_bias(self.conv.bias, input)
        # use the transformed tensors, not the original parameters
        result = F.conv2d(input, new_weight, new_bias, stride)
        return result
Note that instead of using result = self.conv(input), I used the functional interface and passed the transformed weight and bias.
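A fuller, runnable version of this idea might look like the following. Here WeightNet is a hypothetical transformation with its own learnable parameter, and for brevity I drop the extra input argument from the snippet above; any differentiable function of the weight would do.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightNet(nn.Module):
    """Hypothetical weight transformation with a learnable scale."""
    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(1))

    def forward(self, w):
        return w * self.scale

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 4, 3, padding=1)
        self.net_weight = WeightNet()
        self.net_bias = WeightNet()

    def forward(self, x):
        new_weight = self.net_weight(self.conv.weight)
        new_bias = self.net_bias(self.conv.bias)
        # functional call instead of self.conv(x), so the transformed
        # tensors are part of the autograd graph
        return F.conv2d(x, new_weight, new_bias, stride=1, padding=1)

m = Model()
loss = m(torch.randn(2, 1, 8, 8)).sum()
loss.backward()
print(m.net_weight.scale.grad is not None)  # True
```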
Thank you for your answer! I was looking for something a little more general, though. In your example, if I want to modify the weights of a given model (and call backward() on them), I need to rewrite the whole model using the functional interface.
If all I take as input is the model itself and treat it as a black box, I can’t do that; and since I can’t assign variables to the model parameters, I will have to do the forward() and backward() manually.
But how will that help? I still can’t assign variables to networks weights, so I can’t backprop through it unless I modify the network to support a functional interface.
I can create a custom module class for my operation, but then what?
You don’t want to assign Variables to the network weights. You want the network weights to be Parameters, and you can perform any operation you want on them, using those generated Variables as weights if you want.
Without further information, it’s difficult to understand why the proposed approach is not sufficient.
" using those generated Variables as weights if you want"
exactly, I want to use them as weights in the same network. Without rewriting the network using the functional interface.
The proposed approach works if I can use the transformed parameters to do my computations, but I can’t, because the computations I want to do on them are exactly the ones the network itself would do, and there is no way to do that without rewriting the network in functional form.
Hi, @antspy. Have you found a solution to this problem? I’m facing the exact same issue now. It would be really helpful if you could share the solution.
Thanks.
When I first encountered this thread (over a year ago) I was doing some meta-learning, and this (quite recent) library https://github.com/facebookresearch/higher does exactly what I was trying to achieve; maybe it will help you as well!