I stumbled across a strange phenomenon.
I define 3 Sequentials in a model, and when I call model.parameters() I expect 6 parameters: one weight and one bias for each Sequential. Instead I only get the first Sequential's.
Isn't this problematic? The optimizer will later get all of model.parameters(), but if the parameters of 2 Sequentials are missing, they will not get trained as a consequence, right?
import torch
import torch.nn as nn

class model(nn.Module):
    def __init__(self):
        super(model, self).__init__()
        sequence_generated_var = [nn.Linear(5, 1)]
        self.gen_energy = nn.Sequential(*sequence_generated_var)
        self.gen_theta = nn.Sequential(*sequence_generated_var)
        self.gen_phi = nn.Sequential(*sequence_generated_var)

    def forward(self, x):
        return

m = model()
print([x[0] for x in m.named_parameters()])  # outputs ['gen_energy.0.weight', 'gen_energy.0.bias']
I guess there is a good explanation for this, and I hope someone can enlighten me.
Hmm, the modules should be registered in model._modules, so you can check the keys with model._modules.keys().
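For example, with the class posted above, a minimal check would be:

m = model()
# All three Sequential attributes should show up as registered submodules.
print(list(m._modules.keys()))
# expected: ['gen_energy', 'gen_theta', 'gen_phi']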
Can you post a reproducible script?
I think you may be facing a silly error.
Hmm, I don't know why, but for some reason building all three Sequentials from the same layer instance causes this. If you create one instance per nn.Sequential it does work, as in the sketch below.
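A minimal sketch of that fix, keeping your class but constructing a fresh nn.Linear for each head:

import torch.nn as nn

class model(nn.Module):
    def __init__(self):
        super(model, self).__init__()
        # A separate nn.Linear per Sequential, so no parameters are shared.
        self.gen_energy = nn.Sequential(nn.Linear(5, 1))
        self.gen_theta = nn.Sequential(nn.Linear(5, 1))
        self.gen_phi = nn.Sequential(nn.Linear(5, 1))

    def forward(self, x):
        return

m = model()
print([name for name, _ in m.named_parameters()])
# now lists all six parameters: gen_energy.0.*, gen_theta.0.*, gen_phi.0.*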
Well, I don't know at which point in the code this happens, but it makes sense, since gradients are tracked at the tensor level. So when you iterate over the parameters, each shared tensor is returned only once (you don't want several backprop updates for the same tensor).
If you look at code that builds several layers from an already-instantiated one, copy.deepcopy is always used.
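A small sketch of that pattern, assuming you want to start from one configured layer (the template name here is just for illustration):

import copy
import torch.nn as nn

template = nn.Linear(5, 1)  # configured once

# Each deepcopy owns its own weight and bias tensors, so every copy
# appears in parameters() and receives its own gradient updates.
gen_energy = nn.Sequential(copy.deepcopy(template))
gen_theta = nn.Sequential(copy.deepcopy(template))
gen_phi = nn.Sequential(copy.deepcopy(template))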