Hi! I found several similar topics, but not exactly what I was looking for.
Let’s assume I have a pre-trained EfficientNetB0. I want to create a new model and tweak the architecture a little, then load the weights from the trained model into every unaltered layer and randomly initialize the weights of the new layers.
import torch
import torch.nn as nn
import torchvision.models as models


def build_model(num_classes=5):
    model = models.efficientnet_b0(
        weights=models.EfficientNet_B0_Weights.DEFAULT,
    )
    model.classifier[1] = nn.Linear(in_features=1280, out_features=num_classes)
    return model


model_path = 'path/model.pt'
new_model = build_model(num_classes=3)
# load the old weights into the new model: copy every entry of the old
# state_dict whose key also exists in the new model's state_dict
old_model_state_dict = torch.load(model_path)
new_model_state_dict = new_model.state_dict()
for k, v in old_model_state_dict.items():
    if k in new_model_state_dict:
        new_model_state_dict[k] = v
new_model.load_state_dict(new_model_state_dict)
For example, in this case I’ll get an error because I’ve changed the number of output features: my small check only (maybe) works when a layer is new, not when an existing layer has changed shape.
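The only workaround I can think of is to also compare tensor shapes before copying, so any changed layer keeps its fresh random init. A rough, untested sketch (the skipped list is just there to log which entries were not copied):

# untested sketch: copy a tensor only if the key exists in the new model
# AND the shapes match; otherwise keep the new model's random init
old_model_state_dict = torch.load(model_path)
new_model_state_dict = new_model.state_dict()
skipped = []
for k, v in old_model_state_dict.items():
    if k in new_model_state_dict and new_model_state_dict[k].shape == v.shape:
        new_model_state_dict[k] = v
    else:
        skipped.append(k)
new_model.load_state_dict(new_model_state_dict)
print('not copied (deleted or changed):', skipped)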
So I want to be able to add layers, delete layers, and change things inside a layer, and still be able to use the old weights for every other unchanged layer.
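The shape comparison above might cover most of that, but I also thought about going module by module, like the comment in my snippet hints at. Another untested sketch of that idea (the try/except falls back to the fresh init when a layer was altered):

# untested: load per leaf module instead of per tensor
old_sd = torch.load(model_path)
for name, module in new_model.named_modules():
    if list(module.children()):
        continue  # only leaf modules hold their own parameters
    prefix = name + '.'
    sub_sd = {k[len(prefix):]: v for k, v in old_sd.items() if k.startswith(prefix)}
    if not sub_sd:
        continue  # layer doesn't exist in the old model, keep random init
    try:
        module.load_state_dict(sub_sd)
    except RuntimeError:
        pass  # layer was changed (e.g. shape mismatch), keep random init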
I’d be happy to get some recommendations on the best way to do that!