Hi Everyone.
I’m confused by the ordering of the parameters returned by `model.parameters()` and `model.named_parameters()`.
Here is a portion of my model summary (using pytorch-summary) where I’ve added in the divisions between layers/blocks:
```
# ----------------- conv2dtranspose
ConvTranspose2d-32 [-1, 128, 114, 114] 131,200
# ----------------- conv-block
Conv2d-33 [-1, 128, 112, 112] 295,040
ReLU-34 [-1, 128, 112, 112] 0
Conv2d-35 [-1, 128, 110, 110] 147,584
ReLU-36 [-1, 128, 110, 110] 0
# ----------------- squeeze and excitation block
AdaptiveAvgPool2d-37 [-1, 128, 1, 1] 0
Linear-38 [-1, 8] 1,032
ReLU-39 [-1, 8] 0
Linear-40 [-1, 128] 1,152
Sigmoid-41 [-1, 128] 0
SqueezeExcitation-42 [-1, 128, 110, 110] 0
```
The corresponding parameters (layer index, name, shape) look like this:
```
# ----------------- conv2dtranspose
20 up_blocks.0.up.weight torch.Size([256, 128, 2, 2])
21 up_blocks.0.up.bias torch.Size([128])
# ----------------- squeeze and excitation block
22 up_blocks.0.conv_block.se.fc.0.weight torch.Size([8, 128])
23 up_blocks.0.conv_block.se.fc.0.bias torch.Size([8])
24 up_blocks.0.conv_block.se.fc.2.weight torch.Size([128, 8])
25 up_blocks.0.conv_block.se.fc.2.bias torch.Size([128])
# ----------------- conv-block
26 up_blocks.0.conv_block.conv_layers.0.weight torch.Size([128, 256, 3, 3])
27 up_blocks.0.conv_block.conv_layers.0.bias torch.Size([128])
28 up_blocks.0.conv_block.conv_layers.2.weight torch.Size([128, 128, 3, 3])
29 up_blocks.0.conv_block.conv_layers.2.bias torch.Size([128])
```
The issue is that `model.parameters()` reverses the order of the conv block and the squeeze-and-excitation block relative to the forward pass. Is this the expected behavior? Is there a way to ensure `model.parameters()` returns the weights in the order in which they are executed in the forward pass?
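For what it’s worth, here is a toy module (made up just for illustration) that seems to reproduce what I’m seeing — the parameters appear to come out in the order the submodules are assigned in `__init__`, not the order they are called in `forward`:

```python
import torch.nn as nn

class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        # registered in this order: se first, then conv
        self.se = nn.Linear(4, 4)
        self.conv = nn.Conv2d(1, 1, 1)

    def forward(self, x):
        # ...but executed in the opposite order: conv first, then se
        x = self.conv(x)
        return self.se(x.flatten(1))

for name, p in Toy().named_parameters():
    print(name, tuple(p.shape))
# se.weight (4, 4)
# se.bias (4,)
# conv.weight (1, 1, 1, 1)
# conv.bias (1,)
```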
CONTEXT: I was looking to transfer weights from a Keras model to PyTorch. My approach was simple: I converted the Keras weights to a list of NumPy arrays, using `np.swapaxes` to change from bands-last to bands-first. I was then planning on doing something like this:
```python
import torch

def update_weights(model, keras_weights):
    # keras_weights: list of numpy arrays, already permuted to bands-first
    for i, ((tn, tp), kp) in enumerate(zip(model.named_parameters(), keras_weights)):
        kp = torch.as_tensor(kp)
        if tp.shape == kp.shape:
            with torch.no_grad():
                tp.copy_(kp)  # copy values in place, without tracking gradients
        else:
            print("SHAPE MISMATCH:", i, tn, tp.shape, kp.shape)
    return model
```
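For completeness, the permutation I had in mind looks roughly like this (a sketch — the helper name is made up, and it assumes Keras `Conv2D` kernels are stored as `(kh, kw, in, out)` and `Dense` kernels as `(in, out)`, versus PyTorch’s `(out, in, kh, kw)` and `(out, in)`):

```python
import numpy as np

def keras_to_torch_array(w):
    # hypothetical helper, just to show the intended permutation
    if w.ndim == 4:
        # Conv2D kernel: (kh, kw, in, out) -> (out, in, kh, kw)
        # (I believe the same permutation happens to work for Conv2DTranspose,
        # since Keras stores those as (kh, kw, out, in) and PyTorch's
        # ConvTranspose2d expects (in, out, kh, kw))
        return np.transpose(w, (3, 2, 0, 1))
    if w.ndim == 2:
        # Dense kernel: (in, out) -> (out, in)
        return w.T
    return w  # biases / 1-D params need no permutation

keras_weights = [keras_to_torch_array(w) for w in keras_weights]
```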