I’ve just found that the variable names (keys) in model.state_dict() and model.named_parameters() can differ if you don’t initialize your model correctly.
I created a custom model that uses recursion to add a submodule at each level. Here is a minimized version of the code:
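Wrong version:

    import torch.nn as nn

    class My_Module(nn.Module):
        def __init__(self, depth):
            super().__init__()
            # A single instance is created once (my_submodule is defined elsewhere)
            self.my_submodule = my_submodule()
            self._generate_network(depth)

        def _generate_network(self, level):
            if level > 1:
                self._generate_network(level - 1)
            else:
                pass  # Do other things
            # The same instance is registered again under a new name at every level
            self.add_module('submodule_' + str(level), self.my_submodule)
            # Do other things
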
Right version:

    class My_Module(nn.Module):
        def __init__(self, depth):
            super().__init__()
            self._generate_network(depth)

        def _generate_network(self, level):
            if level > 1:
                self._generate_network(level - 1)
            else:
                pass  # Do other things
            # A fresh instance is created and registered at every level
            self.add_module('submodule_' + str(level), my_submodule())
            # Do other things

With the wrong version, model.state_dict() contains tensors such as module.submodule_0.weight, module.submodule_1.weight, etc., but model.named_parameters() contains only a single variable, named module.my_submodule.weight.
With the right version, module.submodule_0.weight, module.submodule_1.weight, etc. appear in both model.named_parameters() and model.state_dict().
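The difference can be reproduced with a small self-contained sketch. It assumes nn.Linear as a stand-in for my_submodule and uses a loop instead of the recursion; SharedModule and SeparateModules are hypothetical names, not part of my original code.

```python
import torch.nn as nn


class SharedModule(nn.Module):
    """Mirrors the wrong version: one instance registered under several names."""

    def __init__(self, depth):
        super().__init__()
        self.my_submodule = nn.Linear(4, 4)  # assumption: stand-in for my_submodule()
        for level in range(1, depth + 1):
            self.add_module('submodule_' + str(level), self.my_submodule)


class SeparateModules(nn.Module):
    """Mirrors the right version: a fresh instance per level."""

    def __init__(self, depth):
        super().__init__()
        for level in range(1, depth + 1):
            self.add_module('submodule_' + str(level), nn.Linear(4, 4))


shared = SharedModule(depth=2)
separate = SeparateModules(depth=2)

# named_parameters() skips modules it has already visited,
# so only the first registered name shows up.
shared_param_names = [name for name, _ in shared.named_parameters()]
# state_dict() walks every registered name, duplicates included.
shared_state_keys = list(shared.state_dict().keys())

separate_param_names = [name for name, _ in separate.named_parameters()]
separate_state_keys = list(separate.state_dict().keys())

print(shared_param_names)   # ['my_submodule.weight', 'my_submodule.bias']
print(shared_state_keys)    # also lists submodule_1.* and submodule_2.*
print(separate_param_names == separate_state_keys)  # True
```

In this sketch the two listings disagree only when one module object is registered under more than one name.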
I have several questions concerning the mechanism of state_dict and named_parameters:
- The wrong version does run, but has a separate submodule actually been added to the graph at each level, or is it the same variable that gets added at every level?
- Would it be reasonable to emit a warning or raise an error when the two sets of names do not match?
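One way to probe both questions by hand is sketched below. This is only an illustration, not an existing PyTorch feature: find_aliased_keys and AliasedLevels are hypothetical names, and nn.Linear again stands in for my_submodule.

```python
import warnings

import torch.nn as nn


def find_aliased_keys(model):
    """Hypothetical helper: state_dict keys that neither named_parameters()
    nor named_buffers() lists under the same name."""
    listed = {name for name, _ in model.named_parameters()}
    listed |= {name for name, _ in model.named_buffers()}
    return [key for key in model.state_dict() if key not in listed]


class AliasedLevels(nn.Module):
    """Same pattern as the wrong version, written as a loop."""

    def __init__(self, depth):
        super().__init__()
        self.my_submodule = nn.Linear(4, 4)  # assumption: stand-in for my_submodule()
        for level in range(1, depth + 1):
            self.add_module('submodule_' + str(level), self.my_submodule)


model = AliasedLevels(depth=3)

# Question 1: every level refers to the same object, so the weights are shared.
same_object = model.submodule_1 is model.submodule_3
same_storage = (model.submodule_1.weight.data_ptr()
                == model.my_submodule.weight.data_ptr())

# Question 2: a user-side check that warns when the two listings disagree.
aliased = find_aliased_keys(model)
if aliased:
    warnings.warn('state_dict keys that alias an already listed tensor: '
                  + ', '.join(aliased))
```

Sharing a module on purpose (weight tying, for example) produces the same mismatch, so a check like this probably fits better as an opt-in warning than as a hard error.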