Hi Everyone.
I’m confused by the ordering of the parameters returned by `model.parameters()` and `model.named_parameters()`.
Here is a portion of my model summary (using pytorch-summary) where I’ve added in the divisions between layers/blocks:
```
# ----------------- conv2dtranspose
ConvTranspose2d-32 [-1, 128, 114, 114] 131,200
# ----------------- conv-block
Conv2d-33 [-1, 128, 112, 112] 295,040
ReLU-34 [-1, 128, 112, 112] 0
Conv2d-35 [-1, 128, 110, 110] 147,584
ReLU-36 [-1, 128, 110, 110] 0
# ----------------- squeeze and excitation block
AdaptiveAvgPool2d-37 [-1, 128, 1, 1] 0
Linear-38 [-1, 8] 1,032
ReLU-39 [-1, 8] 0
Linear-40 [-1, 128] 1,152
Sigmoid-41 [-1, 128] 0
SqueezeExcitation-42 [-1, 128, 110, 110] 0
```
The corresponding parameters (layer index, name, shape) look like this:
```
# ----------------- conv2dtranspose
20 up_blocks.0.up.weight torch.Size([256, 128, 2, 2])
21 up_blocks.0.up.bias torch.Size([128])
# ----------------- squeeze and excitation block
22 up_blocks.0.conv_block.se.fc.0.weight torch.Size([8, 128])
23 up_blocks.0.conv_block.se.fc.0.bias torch.Size([8])
24 up_blocks.0.conv_block.se.fc.2.weight torch.Size([128, 8])
25 up_blocks.0.conv_block.se.fc.2.bias torch.Size([128])
# ----------------- conv-block
26 up_blocks.0.conv_block.conv_layers.0.weight torch.Size([128, 256, 3, 3])
27 up_blocks.0.conv_block.conv_layers.0.bias torch.Size([128])
28 up_blocks.0.conv_block.conv_layers.2.weight torch.Size([128, 128, 3, 3])
29 up_blocks.0.conv_block.conv_layers.2.bias torch.Size([128])
```
The issue is that `model.parameters()` reverses the order of the conv block and the squeeze-and-excitation block relative to the forward pass. Is this the expected behavior? Is there a way to ensure `model.parameters()` returns the weights in the order in which they are executed in the forward pass?
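For what it’s worth, here is a toy module (made up just for illustration) that seems to reproduce what I’m seeing — the parameters appear to come out in the order the submodules are assigned in `__init__`, not the order they are called in `forward`:

```python
import torch.nn as nn

class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        # registered in this order: se first, then conv
        self.se = nn.Linear(4, 4)
        self.conv = nn.Conv2d(1, 1, 1)

    def forward(self, x):
        # ...but executed in the opposite order: conv first, then se
        x = self.conv(x)
        return self.se(x.flatten(1))

for name, p in Toy().named_parameters():
    print(name, tuple(p.shape))
# se.weight (4, 4)
# se.bias (4,)
# conv.weight (1, 1, 1, 1)
# conv.bias (1,)
```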
CONTEXT: I was looking to transfer weights from a Keras model to PyTorch. My approach was simple: I converted the Keras weights to a list of NumPy arrays, using `np.swapaxes` to change from bands-last to bands-first. I was then planning on doing something like this:
```python
import torch

def update_weights(model, keras_weights):
    # keras_weights: list of numpy arrays, already permuted to bands-first
    for i, ((tn, tp), kp) in enumerate(zip(model.named_parameters(), keras_weights)):
        kp = torch.as_tensor(kp)
        if tp.shape == kp.shape:
            with torch.no_grad():
                tp.copy_(kp)  # copy values in place, without tracking gradients
        else:
            print("SHAPE MISMATCH:", i, tn, tp.shape, kp.shape)
    return model
```
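For completeness, the permutation I had in mind looks roughly like this (a sketch — the helper name is made up, and it assumes Keras `Conv2D` kernels are stored as `(kh, kw, in, out)` and `Dense` kernels as `(in, out)`, versus PyTorch’s `(out, in, kh, kw)` and `(out, in)`):

```python
import numpy as np

def keras_to_torch_array(w):
    # hypothetical helper, just to show the intended permutation
    if w.ndim == 4:
        # Conv2D kernel: (kh, kw, in, out) -> (out, in, kh, kw)
        # (I believe the same permutation happens to work for Conv2DTranspose,
        # since Keras stores those as (kh, kw, out, in) and PyTorch's
        # ConvTranspose2d expects (in, out, kh, kw))
        return np.transpose(w, (3, 2, 0, 1))
    if w.ndim == 2:
        # Dense kernel: (in, out) -> (out, in)
        return w.T
    return w  # biases / 1-D params need no permutation

keras_weights = [keras_to_torch_array(w) for w in keras_weights]
```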