Adding new parameters

pytorcher · February 10, 2018, 6:10pm

I’d like to add a new Parameter to my network. I have successfully created one, incorporated it into forward(), and have a grad calculated in backward(). However, when I call optimizer.step() the grad is not applied. Searching through here I have seen the register_parameter() function. This adds the parameter to my network’s _parameters, but not to its named_parameters, which seems to be what actually gets accessed by the optimizer. I am cautious about brute-force adding it to named_parameters in case that leaves it out anywhere else that it should be. Is there a different function I should be calling?

yoelshoshan · February 10, 2018, 7:38pm

A usually simpler option is to add this parameter to your model, typically in the __init__ method.
Doing something like:
self.my_param = ...
will register it automatically, as long as the assigned value is an nn.Parameter.
And then when you do something like
optim.SGD(model.parameters(),...)
it should work well.
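A minimal sketch of that automatic registration, assuming a hypothetical module called Scaler (the class and attribute names are illustrative, not from this thread). It shows that an nn.Parameter assigned as an attribute is picked up by model.parameters(), while a plain tensor is not:

```python
import torch
from torch import nn, optim


class Scaler(nn.Module):  # hypothetical module, for illustration only
    def __init__(self, n):
        super().__init__()
        self.my_param = nn.Parameter(torch.ones(n))  # registered automatically
        self.plain = torch.ones(n)                   # plain tensor: not registered

    def forward(self, x):
        return x * self.my_param


model = Scaler(3)
names = [name for name, _ in model.named_parameters()]
optimizer = optim.SGD(model.parameters(), lr=0.1)  # only sees registered parameters
print(names)
```

Only the nn.Parameter attribute ends up in the optimizer's parameter group here.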

Do notice that if you want to accumulate lists of parameters, you need to use ModuleList()
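For bare Parameter objects (rather than Modules), the matching container is nn.ParameterList. The sketch below, using a hypothetical module named Bag, contrasts it with a plain Python list, which the module does not track:

```python
import torch
from torch import nn


class Bag(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        # A plain Python list is just an ordinary attribute: its contents are not registered.
        self.hidden = [nn.Parameter(torch.zeros(2))]
        # nn.ParameterList registers every Parameter it holds.
        self.visible = nn.ParameterList([nn.Parameter(torch.zeros(2))])


bag = Bag()
bag.visible.append(nn.Parameter(torch.zeros(2)))  # appending after __init__ also registers
names = [name for name, _ in bag.named_parameters()]
print(names)
```

Note that an optimizer built before the append would still not know about the appended parameter.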

pytorcher · February 10, 2018, 9:26pm

Thanks for the reply! I do in fact need to accumulate a list after init. ModuleList appears to only allow Modules and not Parameters. I tried making a new Module that only has my Parameter in it, but once again the Parameter has a grad calculated but not applied by the optimizer. Is the only way to do this by using premade nn.Modules? I am just trying to do a simple dot product of my parameter vector and a layer's output vector. Maybe I could try to extend an nn.Linear so that it is not fully connected?

pytorcher · February 10, 2018, 9:48pm

That also led me to ParameterList, but that seems to do the same thing as register_parameter from my original post when I add the parameter to a parameter list in my new module. Am I using ParameterList wrong, i.e., do I have to call something to move things from the _parameters field to the named_parameters field?

Edit: actually, calling named_parameters().next() does seem to return my vector, so it must be in the graph somewhere. And .grad is getting filled in, but the optimizer does not change the values.

SimonW · February 10, 2018, 11:08pm

When you add a new parameter:

1. Assign the parameter to an attribute of the module.
2. Add it to the optimizer via optim.add_param_group({"params": my_new_param}).
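The two steps above can be sketched as follows. This is only an illustration under assumptions: the nn.Linear model, the attribute name scale, and the fixed weights and inputs are made up so that the result is reproducible.

```python
import torch
from torch import nn, optim

model = nn.Linear(4, 2)
nn.init.constant_(model.weight, 0.5)  # fixed values so the update is reproducible
nn.init.zeros_(model.bias)
optimizer = optim.SGD(model.parameters(), lr=0.1)  # built before the new parameter exists

# Step 1: assign the new parameter to an attribute of the module.
model.scale = nn.Parameter(torch.ones(2))  # "scale" is a hypothetical name
# Step 2: tell the existing optimizer about it.
optimizer.add_param_group({"params": [model.scale]})

loss = (model(torch.ones(3, 4)) * model.scale).sum()
optimizer.zero_grad()
loss.backward()
optimizer.step()  # now updates model.scale as well
print(model.scale)
```

Without the add_param_group call, model.scale would still receive a .grad but step() would leave its values unchanged, which matches the symptom described in the original question.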

pytorcher · February 10, 2018, 11:57pm

Ahhh, I did not realize the optimizer made its own list rather than just pointing to the network’s parameters. For some reason my optim.SGD doesn’t seem to have an add_param_group function. But the code I’m starting with instantiates a new optimizer before each training epoch. It was being created before I made these parameters; creating it afterwards looks like it works! It works with both a ParameterList and the register_parameter() function I was using originally.

Thank you so much!

SimonW · February 11, 2018, 3:06am

@pytorcher You are probably using an older version of PyTorch then. The method was added in 0.3. I highly suggest upgrading, as 0.3 includes many new features, bug fixes, and speed increases. Also, creating a new optimizer at each epoch will be problematic if you want to apply momentum, as it basically throws away the momentum information.
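A small sketch of why re-creating the optimizer loses momentum, assuming SGD keeps its per-parameter buffer under the key "momentum_buffer" in optimizer.state (this is how current PyTorch releases store it; the toy parameter and loss are made up for illustration):

```python
import torch
from torch import nn, optim

param = nn.Parameter(torch.ones(3))
optimizer = optim.SGD([param], lr=0.1, momentum=0.9)

(param ** 2).sum().backward()
optimizer.step()
kept = "momentum_buffer" in optimizer.state[param]  # buffer exists after one step

# Re-creating the optimizer starts again with empty per-parameter state.
fresh = optim.SGD([param], lr=0.1, momentum=0.9)
fresh_entries = len(fresh.state)
print(kept, fresh_entries)
```

Keeping one optimizer for the whole run and extending it with add_param_group preserves that state.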

Rohit_Keshari · May 9, 2018, 12:37pm

@pytorcher, @SimonW, @yoelshoshan
For one of my applications, I am trying to analyse the values and gradients of my m1, m2, and m3 parameters. My network architecture is as follows:

import torch
import torch.nn.functional as F

class Net(torch.nn.Module):
    def __init__(self, n_feature, n_hidden1, n_hidden2, n_hidden3, n_output):
        super(Net, self).__init__()
        self.hidden1 = torch.nn.Linear(n_feature, n_hidden1)   # hidden layer
        self.m1      = torch.nn.Parameter(torch.randn(n_hidden1))  # learnable per-unit scale
        self.hidden2 = torch.nn.Linear(n_hidden1, n_hidden2)   # hidden layer
        self.m2      = torch.nn.Parameter(torch.randn(n_hidden2))
        self.hidden3 = torch.nn.Linear(n_hidden2, n_hidden3)   # hidden layer
        self.m3      = torch.nn.Parameter(torch.randn(n_hidden3))
        self.predict = torch.nn.Linear(n_hidden3, n_output)    # output layer

    def forward(self, x):
        x = F.relu(self.hidden1(x))
        x = torch.mul(self.m1, x)                # element-wise scaling by m1
        print('My new parameter', self.m1)
        print('gradient', self.m1.grad)          # None until backward() has run

        x = F.relu(self.hidden2(x))
        x = torch.mul(self.m2, x)

        x = F.relu(self.hidden3(x))
        x = torch.mul(self.m3, x)
        x = self.predict(x)                      # linear output
        return x

However, when I print my parameters using print('parameters',net.parameters), it gives me

<bound method Net.parameters of Net(
  (hidden1): Linear(in_features=784, out_features=10, bias=True)
  (hidden2): Linear(in_features=10, out_features=10, bias=True)
  (hidden3): Linear(in_features=10, out_features=10, bias=True)
  (predict): Linear(in_features=10, out_features=10, bias=True)
)>

The m1, m2, and m3 parameters are not reflected here. However, their values are getting updated. I am trying to understand how my optimizer is getting the m1, ... parameters when they do not appear in this list. I have defined my optimizer as follows:

optimizer = optim.SGD(net.parameters(), lr=args.lr, momentum=0.95, weight_decay=5e-4)

SimonW · May 9, 2018, 3:08pm

.parameters is a method.

yoelshoshan · May 24, 2018, 8:08am

As @SimonW said, you need to use net.parameters() and not net.parameters.

Right now what you are seeing is the printed representation of the bound method itself (which shows the module and its submodules), not the result of calling it.
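To make the difference concrete, here is a short sketch using a hypothetical, cut-down stand-in for the Net above (the class name Tiny and the layer sizes are assumptions for illustration):

```python
import torch
from torch import nn


class Tiny(nn.Module):  # hypothetical stand-in for the Net above
    def __init__(self):
        super().__init__()
        self.hidden1 = nn.Linear(4, 2)
        self.m1 = nn.Parameter(torch.randn(2))


net = Tiny()
method_repr = repr(net.parameters)                   # the bound method, not its result
shapes = [tuple(p.shape) for p in net.parameters()]  # calling it yields the tensors
print(method_repr)
print(shapes)
```

The first print shows the module description, which lists submodules only; the second includes the shape of m1 alongside the Linear layer's weight and bias.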

emmanuelrouxfr · April 17, 2019, 11:43am

You can list and verify that the parameters m1, m2 and m3 are present in your model as follows:

for name, param in net.named_parameters():
    print(name, type(param.data), param.size())

The code is from this discussion:

Cheers !