What numbering convention is used in the keys of a state dictionary?

I’ve seen that in many models there are layer parameters named like 'features.0.weight'/'features.0.bias' and 'features.3.weight'/'features.3.bias', skipping numbers 1 and 2. Why are they skipped?

Maybe because layers 1 and 2 are nn.ReLU() and nn.MaxPool2d. In an nn.Sequential each submodule is named by its index, but parameter-free modules such as nn.ReLU and nn.MaxPool2d contribute nothing to the state_dict, so their indices simply never show up in the keys.
It's not easy to tell without the model definition, but this is likely the case.
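
As a quick check, here is a minimal sketch (the layer sizes are made up) that reproduces the skipped indices:

import torch.nn as nn

features = nn.Sequential(
    nn.Conv2d(3, 6, 3),   # index 0 -> features.0.weight / features.0.bias
    nn.ReLU(),            # index 1 -> no parameters, so no state_dict entry
    nn.MaxPool2d(2),      # index 2 -> no parameters, so no state_dict entry
    nn.Conv2d(6, 12, 3),  # index 3 -> features.3.weight / features.3.bias
)

model = nn.Sequential()
model.add_module('features', features)

print(list(model.state_dict().keys()))
# ['features.0.weight', 'features.0.bias', 'features.3.weight', 'features.3.bias']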

It makes sense, but it complicates things when someone wants to iterate through the state dictionary.

What is your use case?
Could you post an example of a complicated situation?

I need to retrieve 2D-convolutional weights (and biases) from a state_dict.

The easiest approach that comes to my mind is iterating over the dictionary, looking for 4D-shaped tensors whose key ends with .weight, and then also retrieving their .bias counterpart.

This approach seems error-prone to me, since I have no guarantee that the .weight/.bias pair I retrieve comes from a convolutional layer; the only guarantee I have is that the tensor is 4D.
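
For concreteness, a minimal sketch of that heuristic with a dummy model (the layer sizes are arbitrary):

import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 6, 3, 1, 1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Linear(10, 10),
)
state_dict = model.state_dict()

# Heuristic: assume every 4D .weight tensor belongs to a Conv2d layer.
conv_params = {}
for key, tensor in state_dict.items():
    if key.endswith('.weight') and tensor.dim() == 4:
        conv_params[key] = tensor
        bias_key = key[:-len('.weight')] + '.bias'
        if bias_key in state_dict:  # the bias may be disabled via bias=False
            conv_params[bias_key] = state_dict[bias_key]

print(list(conv_params.keys()))  # ['0.weight', '0.bias']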

Wouldn’t it be more practical to store keys like features.conv2d.0.weight, features.conv2d.1.weight, and so on?

Here is a dummy example to get the conv parameters:

import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 6, 3, 1, 1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Linear(10, 10)
)

# Check the module type directly instead of guessing from tensor shapes.
for child in model.children():
    if isinstance(child, nn.Conv2d):
        print(child.weight)
        print(child.bias)

You could also call .named_children() to get the name of the current layer, if you need it.
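
A minimal sketch of that variant, assuming the same model as above:

# .named_children() yields (name, module) pairs; for an nn.Sequential
# the name is the submodule's index as a string.
for name, child in model.named_children():
    if isinstance(child, nn.Conv2d):
        print(name, child.weight.shape, child.bias.shape)
# prints: 0 torch.Size([6, 3, 3, 3]) torch.Size([6])

For nested models, .named_modules() would recurse into submodules as well.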