Different keys found in model.state_dict() and model.named_parameters()

Yozey · October 5, 2017, 4:44pm

I’ve just found that the variable names (keys) can differ between model.state_dict() and model.named_parameters() if you don’t initialize your model correctly.

I built a custom model recursively so that a submodule is added at each level. Here is a minimized version of the code:

Wrong version:

import torch.nn as nn

class My_Module(nn.Module):
    def __init__(self, depth):
        super().__init__()
        # One instance, created once and reused at every level
        self.my_submodule = my_submodule()
        self._generate_network(depth)

    def _generate_network(self, level):
        if level > 1:
            self._generate_network(level - 1)
        else:
            pass  # Do other things
        # Registers the same instance again under a new name
        self.add_module('submodule_' + str(level), self.my_submodule)
        # Do other things

Right version:

import torch.nn as nn

class My_Module(nn.Module):
    def __init__(self, depth):
        super().__init__()
        self._generate_network(depth)

    def _generate_network(self, level):
        if level > 1:
            self._generate_network(level - 1)
        else:
            pass  # Do other things
        # Registers a fresh instance at every level
        self.add_module('submodule_' + str(level), my_submodule())
        # Do other things

If you use the wrong version, model.state_dict() contains tensors such as module.submodule_0.weight, module.submodule_1.weight, etc., but model.named_parameters() contains only one variable, named module.my_submodule.weight.

If you use the right version, module.submodule_0.weight, module.submodule_1.weight, etc. appear in both model.named_parameters() and model.state_dict().
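The mismatch described above can be reproduced with a small sketch. It assumes a recent PyTorch release (the 2017 behavior may have differed in detail), uses nn.Linear(2, 2) as a hypothetical stand-in for my_submodule, and replaces the recursion with a plain loop; the class name SharedModule is also made up for illustration.

```python
import torch.nn as nn

class SharedModule(nn.Module):
    def __init__(self, depth):
        super().__init__()
        # Hypothetical stand-in for my_submodule(): a single instance.
        self.my_submodule = nn.Linear(2, 2)
        for level in range(1, depth + 1):
            # Registers another name for the SAME Linear instance.
            self.add_module('submodule_' + str(level), self.my_submodule)

model = SharedModule(3)

# state_dict() walks every registered name, including the aliases.
state_keys = set(model.state_dict().keys())

# named_parameters() skips parameters it has already yielded, so each
# shared tensor is reported once, under the first registered name.
param_keys = {name for name, _ in model.named_parameters()}

print(sorted(state_keys))
print(sorted(param_keys))
```

In this sketch the keys reported by named_parameters() are a strict subset of the state_dict() keys, because every alias points at the same two tensors.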

I have a couple of questions about how state_dict and named_parameters work:

The wrong version does run, but have the submodules at every level actually been added to the graph, or is it always the same variable that gets added at each level?
Would it be reasonable to emit a warning or raise an error when the two sets of keys do not match?

smth · October 11, 2017, 5:43am

In the “wrong version”, you are not specifying that self.my_submodule has to be deep copied. You are just adding additional references to the same module via self.add_module, so it is doing what you asked for.

Maybe in the wrong version what you want is:
self.add_module('submodule_' + str(level), copy.deepcopy(self.my_submodule))  # needs "import copy"; nn.Module has no clone() method
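A deep copy of a module can be made with the standard library's copy.deepcopy. The sketch below is only an illustration under assumptions: nn.Linear(2, 2) stands in for my_submodule, the class name IndependentModule is made up, and a loop replaces the recursion. It checks that each level then gets its own parameters and that both key sets agree.

```python
import copy
import torch.nn as nn

class IndependentModule(nn.Module):
    def __init__(self, depth):
        super().__init__()
        # Hypothetical stand-in for my_submodule(); kept as a local
        # variable so the template itself is not registered.
        template = nn.Linear(2, 2)
        for level in range(1, depth + 1):
            # Each level receives its own copy with its own tensors.
            self.add_module('submodule_' + str(level), copy.deepcopy(template))

model = IndependentModule(3)

state_keys = set(model.state_dict().keys())
param_keys = {name for name, _ in model.named_parameters()}

# The copies are separate Parameter objects, not aliases.
distinct = model.submodule_1.weight is not model.submodule_2.weight

print(state_keys == param_keys, distinct)
```

Note that the copies start with identical values, since they are copied from the same template; constructing a new module per level, as in the right version, gives independently initialized weights instead.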

Yozey · October 11, 2017, 1:00pm

Thank you very much for your kind answer and comment.
My question is really closer to this: is there any situation in which model.state_dict() and model.named_parameters() have different variable keys?

I found that they can differ if the model is not initialized the right way, and I’m wondering why that happens.

Thank you :)