Copy weights of some layers

I have two models built with the same structure, except that the second one has an extra layer in the middle which is not present in the first, like this:

Model1:
layer1
layer2
layer3

Model2:
layer1
CustomLayer
layer2
layer3
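
For concreteness, here is a minimal sketch of the two architectures. The layer types, sizes and the nn.Identity stand-in for CustomLayer are made up for illustration; only the attribute names matter for the weight copying discussed below.

import torch.nn as nn

class Model1(nn.Module):
    def __init__(self):
        super().__init__()
        # placeholder layers, the real types/sizes don't matter here
        self.layer1 = nn.Linear(16, 32)
        self.layer2 = nn.Linear(32, 32)
        self.layer3 = nn.Linear(32, 8)

    def forward(self, x):
        return self.layer3(self.layer2(self.layer1(x)))

class Model2(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(16, 32)
        self.custom_layer = nn.Identity()   # stand-in for the extra CustomLayer
        self.layer2 = nn.Linear(32, 32)
        self.layer3 = nn.Linear(32, 8)

    def forward(self, x):
        return self.layer3(self.layer2(self.custom_layer(self.layer1(x))))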

I trained the first model and stored its weights in a file. I would now like to initialise the second model with these weights and run some predictions with it.
To do so I instantiate the second model and pass the parameters required by the extra layer, but then I don't know how to manually copy the weights from the first model into the second one.
I don't think I can use the usual load_state_dict function, since the two models have different structures, but I would like to achieve the same result while ignoring the single different layer.

Does anyone know if there is a way to manually copy the weights? In Keras I would do something like:

for layer in model_1.layers:
    model_2.get_layer(layer.name).set_weights(layer.get_weights())

Does PyTorch have something similar?

EDIT:
It was easier than expected; the solution seems to be the following:

# copy the trained parameters in place, without tracking gradients
with torch.no_grad():
    model2.layer_name.weight.copy_(model1.layer_name.weight)
    model2.layer_name.bias.copy_(model1.layer_name.bias)

This process must be applied to each layer.
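
If there are many layers, writing one copy_ call per parameter gets tedious. Assuming the shared layers have the same attribute names in both models (as in the sketch above), one way to automate it is to filter the state dict and copy only the entries that exist in both; this is just a sketch, not the only option.

# Using the sketched classes above; in practice model1 would already hold the
# trained weights loaded from the file.
model1, model2 = Model1(), Model2()

# Copy every state_dict entry from model1 whose key and shape also exist in
# model2; the extra layer in model2 simply keeps its own initialisation.
src = model1.state_dict()
dst = model2.state_dict()
dst.update({k: v for k, v in src.items() if k in dst and v.shape == dst[k].shape})
model2.load_state_dict(dst)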


Your approach looks correct, but note that you could also call .load_state_dict() on each submodule, e.g. via:

model2.layer_name.load_state_dict(model1.layer_name.state_dict())

in case you would prefer this approach.
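
As a related variant (assuming the shared submodules have identical attribute names in both models, which the thread doesn't confirm), a single call with strict=False also works and reports which keys were skipped:

# Loads all matching keys; missing_keys lists the parameters of the extra
# CustomLayer (present only in model2), unexpected_keys should be empty here.
incompatible = model2.load_state_dict(model1.state_dict(), strict=False)
print(incompatible.missing_keys)
print(incompatible.unexpected_keys)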
