How to initialize weights for new layers when loading a pretrained model

I added more conv layers but still want to use the pretrained model.
Loading the state_dict raises this error:

Missing key(s) in state_dict: "layer1.0.ae_module.fc.0.weight", "layer1.0.ae_module.fc.0.bias"

Would it be possible to load the pretrained state_dict before adding the new layers?
This would avoid this kind of error.
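
A minimal sketch of that idea, using a toy model in place of the real network. `Base`, `Extended`, and `extra_conv` are hypothetical names, and `pretrained_state` stands in for a real checkpoint. Option A loads the weights first and adds the new layer afterwards. Option B is an alternative that was not mentioned above: build the extended model and pass `strict=False`, so the missing keys are reported instead of raising.

```python
import torch
import torch.nn as nn


class Base(nn.Module):
    """Toy stand-in for the original (pretrained) architecture."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.fc = nn.Linear(8, 2)

    def forward(self, x):
        x = torch.relu(self.conv(x))
        return self.fc(x.mean(dim=(2, 3)))  # global average pool, then classify


class Extended(Base):
    """Same model with one extra conv layer (hypothetical name: extra_conv)."""

    def __init__(self):
        super().__init__()
        self.extra_conv = nn.Conv2d(8, 8, kernel_size=3, padding=1)

    def forward(self, x):
        x = torch.relu(self.conv(x))
        x = torch.relu(self.extra_conv(x))
        return self.fc(x.mean(dim=(2, 3)))


# Stand-in for a checkpoint, e.g. the result of torch.load(...) on a real file.
pretrained_state = Base().state_dict()

# Option A: load into the original architecture first, then add the new layer.
model_a = Base()
model_a.load_state_dict(pretrained_state)  # strict load, all keys match
new_conv = nn.Conv2d(8, 8, kernel_size=3, padding=1)
nn.init.kaiming_normal_(new_conv.weight, nonlinearity="relu")  # explicit init
nn.init.zeros_(new_conv.bias)
model_a.conv = nn.Sequential(model_a.conv, nn.ReLU(), new_conv)

# Option B: build the extended model and load non-strictly.
model_b = Extended()
result = model_b.load_state_dict(pretrained_state, strict=False)
print("missing:", result.missing_keys)        # keys of the new layer
print("unexpected:", result.unexpected_keys)  # should be empty here
```

With Option B the new layer keeps PyTorch's default initialization unless you re-initialize it yourself, as Option A does with `nn.init`. It is worth checking `missing_keys` so that only the layers you added show up there.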

Hi ptrblck,

I’m trying to add a dropout layer after each of the conv2d layers in the pretrained mobilenet_V2, but I’m not sure how to code this. Could you please help me out here? Thank you so much!

The clean way would be to derive a custom class from the desired base class, add the dropout layer in its __init__, and change the forward method.
The hacky way is described in this post.
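
For illustration, here is one in-place way to put a dropout layer after every `nn.Conv2d` without touching the forward method. This is only a sketch and not necessarily what the linked post does: the helper `add_dropout_after_convs` is a hypothetical name, and the toy `net` stands in for a pretrained model. It wraps each conv in `nn.Sequential(conv, nn.Dropout2d(p))` and reuses the existing conv modules, so no copy of the model is made.

```python
import torch
import torch.nn as nn


def add_dropout_after_convs(module: nn.Module, p: float = 0.1) -> nn.Module:
    """Wrap every nn.Conv2d child in Sequential(conv, Dropout2d), in place."""
    # list(...) so the children are snapshotted before we replace any of them
    for name, child in list(module.named_children()):
        if isinstance(child, nn.Conv2d):
            setattr(module, name, nn.Sequential(child, nn.Dropout2d(p)))
        else:
            add_dropout_after_convs(child, p)  # recurse into containers/blocks
    return module


# Toy network standing in for a pretrained model.
net = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.ReLU(),
    nn.Sequential(nn.Conv2d(8, 8, 3, padding=1), nn.BatchNorm2d(8)),
)
original_weight = net[0].weight  # the same Parameter is reused after wrapping

add_dropout_after_convs(net, p=0.1)
print(net)
```

Note that the wrapping changes the state_dict keys (here `0.weight` becomes `0.0.weight`), so the pretrained weights should be loaded before calling the helper.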


Thank you so much for the reply! I tried the method from that other post, and it worked in terms of adding dropout layers after all conv2d layers. But when I tried to send the model to the GPU I got ‘cuda out of memory’. Could that be because of the deepcopy?

Also, I’m not sure I understand the first solution you mentioned. Could you please provide some code examples?

Thank you!

The OOM might come from the deepcopy.
How close to the memory limit are you before copying the modules?
Are you able to train the original model? The forward and backward passes also use memory, and that should be more than what copying some modules needs.
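
To check how close you are to the limit, you could print the memory stats right before the copy and again before training. A small sketch, where `cuda_memory_report` is a hypothetical helper name and device index 0 is an assumption:

```python
import torch


def cuda_memory_report(device: int = 0):
    """Return current CUDA memory usage in MiB, or None without a GPU."""
    if not torch.cuda.is_available():
        return None
    mib = 1024 ** 2
    total = torch.cuda.get_device_properties(device).total_memory
    return {
        # memory currently occupied by tensors
        "allocated_mib": torch.cuda.memory_allocated(device) / mib,
        # memory held by PyTorch's caching allocator (>= allocated)
        "reserved_mib": torch.cuda.memory_reserved(device) / mib,
        "total_mib": total / mib,
    }


print(cuda_memory_report())
```

These numbers only cover memory managed by PyTorch in the current process. Other processes on the same GPU will not show up here, so `nvidia-smi` is still useful as a cross-check.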

Have a look at this post for an example of how to derive a custom class from a torchvision model.