What numbering convention is used in the keys of a state dictionary?

I’ve seen that in many models there are layer parameters named like 'features.0.weight'/'features.0.bias' and 'features.3.weight'/'features.3.bias', skipping numbers 1 and 2. Why are they skipped?

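Probably because layers 1 and 2 are modules without parameters, such as nn.ReLU() and nn.MaxPool2d. The number in each key is the layer's index inside its container, and only layers that hold parameters or buffers show up in the state_dict.
It's not easy to tell without the model definition, but this is likely the case.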
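
To illustrate the point, here is a minimal sketch with a made-up model (TinyNet and its layer sizes are assumptions for the example, not taken from any real architecture). It only shows that the parameter-free layers keep their index but contribute no keys:

```python
import torch.nn as nn

class TinyNet(nn.Module):  # hypothetical model, for illustration only
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 6, 3, 1, 1),   # index 0: has weight and bias
            nn.ReLU(),                  # index 1: no parameters
            nn.MaxPool2d(2),            # index 2: no parameters
            nn.Conv2d(6, 12, 3, 1, 1),  # index 3: has weight and bias
        )

keys = list(TinyNet().state_dict().keys())
print(keys)
# ['features.0.weight', 'features.0.bias', 'features.3.weight', 'features.3.bias']
```

So the gaps in the numbering just mirror the positions of the layers that have nothing to store.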

It makes sense, but it complicates things when someone wants to iterate through the state dictionary.

What is your use case?
Could you post an example of a complicated situation?

I need to retrieve 2D-convolutional weights (and biases) from a state_dict.

The easiest thing that comes to mind is iterating over the dictionary keys, looking for 4D tensors whose name ends with .weight, and then retrieving their .bias counterparts as well.

This approach seems error-prone to me, since I have no guarantee that the .weight/.bias I retrieve come from a convolutional layer. The only guarantee I have is that the tensor is 4D.
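
A small sketch of that concern, using a toy model chosen only for this example: filtering the state_dict by "4D and ends with .weight" also picks up a transposed convolution, whose weight is 4D as well.

```python
import torch.nn as nn

# Toy model, assumed for illustration: one Conv2d and one ConvTranspose2d
model = nn.Sequential(
    nn.Conv2d(3, 6, 3),
    nn.ConvTranspose2d(6, 3, 3),
)

# Shape-based heuristic: keep every 4D tensor whose key ends with ".weight"
four_d = [
    key for key, tensor in model.state_dict().items()
    if key.endswith(".weight") and tensor.dim() == 4
]
print(four_d)  # both layers match, not only the Conv2d
```

The state_dict alone carries names and tensors, not layer types, so the shape check cannot tell these two layers apart.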

Wouldn’t it be more practical to store keys like features.conv2d.0.weight, features.conv2d.1.weight, and so on?

Here is a dummy example to get the conv parameters:

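import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 6, 3, 1, 1),
    nn.Linear(10, 10),
)
for child in model.children():
    if isinstance(child, nn.Conv2d):
        # Conv layer found: read its parameters directly from the module
        print(child.weight.shape, child.bias.shape)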

You could also call .named_children() to get the name of the current layer, if you need it.
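
If the layers are nested, one possible variation is to use .named_modules(), which walks the whole module tree and yields dotted names matching the state_dict prefixes. The sketch below assumes a toy nested model and a hypothetical conv_params dictionary; it is one way to do it, not the only one:

```python
import torch.nn as nn

# Toy nested model, assumed for illustration
model = nn.Sequential(
    nn.Conv2d(3, 6, 3, 1, 1),
    nn.ReLU(),
    nn.Sequential(nn.Conv2d(6, 6, 3, 1, 1)),
)

conv_params = {}  # hypothetical container: state_dict-style key -> parameter
for name, module in model.named_modules():
    if isinstance(module, nn.Conv2d):
        conv_params[name + ".weight"] = module.weight
        if module.bias is not None:  # bias is None when bias=False
            conv_params[name + ".bias"] = module.bias

print(list(conv_params))
# ['0.weight', '0.bias', '2.0.weight', '2.0.bias']
```

Because the check is on the module type rather than on the tensor shape, the collected keys are guaranteed to belong to nn.Conv2d layers, and they can be used to index the state_dict directly.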