Missing_keys and unexpected_keys in state_dict while loading pre-trained model

This is what the pre-trained model architecture looks like:

Sequential(
  (0): Sequential(
    (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
    (1): ReLU()
    (2): Conv2d(16, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
    (3): ReLU()
    (4): Conv2d(32, 64, kernel_size=(7, 7), stride=(1, 1))
  )
  (classifier): Sequential(
    (conv1): Conv2d(64, 128, kernel_size=(5, 5), stride=(1, 1))
    (relu1): ReLU()
    (pool1): MaxPool2d(kernel_size=3, stride=1, padding=0, dilation=1, ceil_mode=False)
    (conv2): Conv2d(128, 192, kernel_size=(5, 5), stride=(1, 1))
    (relu2): ReLU()
    (pool2): MaxPool2d(kernel_size=5, stride=2, padding=0, dilation=1, ceil_mode=False)
    (conv3): Conv2d(192, 200, kernel_size=(5, 5), stride=(1, 1))
    (relu3): ReLU()
    (pool3): MaxPool2d(kernel_size=4, stride=2, padding=0, dilation=1, ceil_mode=False)
    (dp1): Dropout(p=0.3, inplace=False)
    (flatten): Flatten(start_dim=1, end_dim=-1)
    (fc1): Linear(in_features=7200, out_features=4096, bias=True)
    (relu4): ReLU()
    (dp2): Dropout(p=0.3, inplace=False)
    (fc2): Linear(in_features=4096, out_features=256, bias=True)
    (relu5): ReLU()
    (dp3): Dropout(p=0.3, inplace=False)
    (fc3): Linear(in_features=256, out_features=2, bias=True)
  )
)

This is what my model for loading looks like:

# Autoencoder class

import collections
import torch.nn as nn

class Autoencoder_cnnclassifier(nn.Module):
  def __init__(self):
    super(Autoencoder_cnnclassifier, self).__init__()

    self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, 7),    # Conv2d(in_channels, out_channels, kernel_size)
        )
    self.classifier = nn.Sequential(collections.OrderedDict([
          ('conv1', nn.Conv2d(64, 128, 5, stride=1)),
          ('relu1', nn.ReLU()),
          ('pool1', nn.MaxPool2d(kernel_size=3, stride=1)),
          ('conv2', nn.Conv2d(128, 192, 5, stride=1)),
          ('relu2', nn.ReLU()),
          ('pool2', nn.MaxPool2d(kernel_size=5, stride=2)),
          ('conv3', nn.Conv2d(192, 200, 5, stride=1)),
          ('relu3', nn.ReLU()),
          ('pool3', nn.MaxPool2d(kernel_size=4, stride=2)),
          ('dp1', nn.Dropout(0.3)),
          ('flatten', nn.Flatten()),
          ('fc1', nn.Linear(7200, 4096)),
          ('relu4', nn.ReLU()),
          ('dp2', nn.Dropout(0.3)),
          ('fc2', nn.Linear(4096, 256)),
          ('relu5', nn.ReLU()),
          ('dp3', nn.Dropout(0.3)),
          ('fc3', nn.Linear(256, 2)),
    ]))

  def forward(self, x):
    out = self.encoder(x)
    out = self.classifier(out)
    return out

The error looks something like this:

_IncompatibleKeys(missing_keys=['encoder.0.weight', 'encoder.0.bias', 'encoder.2.weight', 'encoder.2.bias', 'encoder.4.weight', 'encoder.4.bias'], unexpected_keys=['0.0.weight', '0.0.bias', '0.2.weight', '0.2.bias', '0.4.weight', '0.4.bias'])

I believe the error occurs because the pre-trained model names its first sequential block ‘0’, while my defined model names it ‘encoder’.
How do I load the pre-trained model without any weight mismatch and without any keys getting left out?
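
The key names on both sides can be compared directly to confirm this (a minimal sketch, assuming the checkpoint state_dict has already been loaded into a variable called pretrained):

model = Autoencoder_cnnclassifier()
print(sorted(pretrained.keys()))          # '0.0.bias', '0.0.weight', ..., 'classifier.conv1.bias', ...
print(sorted(model.state_dict().keys()))  # 'classifier.conv1.bias', ..., 'encoder.0.bias', ...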

I don’t know how you saved your model, but:

new_dict = {}
for k, v in pretrained.items():
    # Keys saved as '0.0.weight' need to become 'encoder.0.weight'
    if k.startswith('0.'):
        k = 'encoder' + k[1:]

    new_dict[k] = v

model.load_state_dict(new_dict)

This assumes your pretrained is the state_dict dictionary of the saved model.
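
For completeness, here is how it fits together end to end (a minimal sketch; the checkpoint path is a hypothetical placeholder). load_state_dict returns a named tuple, so the missing_keys and unexpected_keys lists can be checked directly; both should be empty after the rename:

import torch

# Hypothetical path; the checkpoint is assumed to hold the raw state_dict.
pretrained = torch.load('pretrained_model.pth', map_location='cpu')

# Same rename as above, written as a dict comprehension.
new_dict = {('encoder' + k[1:] if k.startswith('0.') else k): v
            for k, v in pretrained.items()}

model = Autoencoder_cnnclassifier()
result = model.load_state_dict(new_dict)
print(result.missing_keys)     # should be []
print(result.unexpected_keys)  # should be []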


Thank you.
Also, to copy the exact weights, I used:

`model.load_state_dict(copy.deepcopy(new_dict))`

Is there anything more you’d like to add?

deepcopy is not required. Any reason to do that?
If your model is considerably large, it can cause memory trouble, since the entire state_dict is duplicated in memory.

Without copy.deepcopy, the pre-trained and custom models don’t seem to have equal parameters (weights and biases).
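
One way to check that directly is to compare every tensor after loading; load_state_dict copies the values into the model’s own parameters, so the comparison should pass even without copy.deepcopy. A minimal sketch, reusing model and new_dict from the snippets above:

import torch

# Compare each entry in the model's state_dict against the renamed checkpoint.
for name, tensor in model.state_dict().items():
    if not torch.equal(tensor, new_dict[name]):
        print(f'mismatch in {name}')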