Problem loading model from state dict

Hi everyone, I am getting an error when trying to load the model from the saved checkpoint file.

Model:

import torch.nn as nn
import torch.nn.functional as F

class RecNet2(nn.Module):
    def __init__(self, in_shape, num_classes=12):
        super(RecNet2, self).__init__()

        self.layer1 = nn.GRU(in_shape[-1], 256)
        self.layer2 = nn.LSTM(256, 512)
        self.fc1 = nn.Linear(512, 256)
        self.fc2 = nn.Linear(40 * 256, num_classes)

    def forward(self, x):
        x, h_out = self.layer1(x)
        # pass training=self.training so dropout is disabled in eval mode
        x = F.dropout(x, p=0.5, training=self.training)
        x, h_out2 = self.layer2(x)
        x = F.dropout(x, p=0.3, training=self.training)
        x = self.fc1(x)
        x = self.fc2(x.view(-1, 40 * 256))

        return x  # logits

Error message:

>>> model.load_state_dict(fle)
Traceback (most recent call last):
  File "/home/user/miniconda3/envs/torch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 482, in load_state_dict
    own_state[name].copy_(param)
RuntimeError: invalid argument 2: sizes do not match at /opt/conda/conda-bld/pytorch_1512387374934/work/torch/lib/THC/generic/THCTensorCopy.c:101

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/miniconda3/envs/torch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 487, in load_state_dict
    .format(name, own_state[name].size(), param.size()))
RuntimeError: While copying the parameter named layer1.weight_ih_l0, whose dimensions in the model are torch.Size([768]) and whose dimensions in the checkpoint are torch.Size([768, 101]).

The .pth file was created while training the model above. Any suggestions on how to investigate this, and why the tensor shapes do not match?

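The error says that a layer in your current model definition has a different shape from the one stored in the checkpoint. In your case it is the nn.GRU (layer1): the checkpoint holds layer1.weight_ih_l0 with size [768, 101], i.e. 3 * 256 rows and an input size of 101, so check that the in_shape you pass when constructing the model ends in 101.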
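One way to investigate this kind of error is to list every parameter whose shape differs between the model and the checkpoint before calling load_state_dict. The sketch below is only an illustration: diff_shapes is a hypothetical helper (not a PyTorch function), it works on plain name-to-shape mappings, and the two shapes are copied by hand from the error message in the question. The commented lines showing how to build the mappings assume the .pth file holds a flat state dict.

```python
# Hypothetical helper (not part of PyTorch): compare two name -> shape mappings.
def diff_shapes(model_shapes, ckpt_shapes):
    """Return (mismatched, missing_in_ckpt, unexpected_in_ckpt)."""
    mismatched = {
        name: (model_shapes[name], ckpt_shapes[name])
        for name in model_shapes
        if name in ckpt_shapes
        and tuple(model_shapes[name]) != tuple(ckpt_shapes[name])
    }
    missing = sorted(set(model_shapes) - set(ckpt_shapes))
    unexpected = sorted(set(ckpt_shapes) - set(model_shapes))
    return mismatched, missing, unexpected


# With PyTorch, the two mappings would be built roughly like this
# (assuming the file contains a plain state dict):
#   model_shapes = {k: tuple(v.shape) for k, v in model.state_dict().items()}
#   ckpt = torch.load(path, map_location="cpu")
#   ckpt_shapes = {k: tuple(v.shape) for k, v in ckpt.items()}
# Here the shapes from the error message are written out by hand.
model_shapes = {"layer1.weight_ih_l0": (768,)}
ckpt_shapes = {"layer1.weight_ih_l0": (768, 101)}

mismatched, missing, unexpected = diff_shapes(model_shapes, ckpt_shapes)
for name, (in_model, in_ckpt) in mismatched.items():
    print(f"{name}: model {in_model} vs checkpoint {in_ckpt}")
```

Any name that shows up in the mismatched mapping points at a constructor argument that differs from the one used during training.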

Thanks Soumith. Appreciate it!

RuntimeError Traceback (most recent call last)
in ()
1 model = CaptionModel_B(2048, 50, 160, vocab_size, num_layers=1)
----> 2 model.load_state_dict(torch.load('im_caption_35.727_0.316_epoch_20.pth.tar', map_location='cpu'))
3 solver = NetSolver(data, model)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
843 if len(error_msgs) > 0:
844 raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
--> 845 self.__class__.__name__, "\n\t".join(error_msgs)))
846 return _IncompatibleKeys(missing_keys, unexpected_keys)
847

RuntimeError: Error(s) in loading state_dict for CaptionModel_B:
size mismatch for rnn.embed.weight: copying a param with shape torch.Size([9080, 50]) from checkpoint, the shape in current model is torch.Size([8947, 50]).
size mismatch for rnn.linear.weight: copying a param with shape torch.Size([9080, 160]) from checkpoint, the shape in current model is torch.Size([8947, 160]).
size mismatch for rnn.linear.bias: copying a param with shape torch.Size([9080]) from checkpoint, the shape in current model is torch.Size([8947]).

Can you help me to solve this error?
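All three mismatches above have 9080 on the checkpoint side and 8947 on the model side, which suggests the vocab_size passed to CaptionModel_B differs from the one used when the checkpoint was saved. A minimal sketch of recovering the checkpoint's value, assuming rnn.embed is an nn.Embedding (one row per vocabulary entry); infer_vocab_size is a hypothetical helper and the shapes are copied by hand from the error message:

```python
# Checkpoint-side shapes copied from the size-mismatch messages above.
ckpt_shapes = {
    "rnn.embed.weight": (9080, 50),
    "rnn.linear.weight": (9080, 160),
    "rnn.linear.bias": (9080,),
}


def infer_vocab_size(shapes, embed_key="rnn.embed.weight"):
    """Hypothetical helper: assumes the embedding has one row per vocabulary entry."""
    return shapes[embed_key][0]


vocab_size = infer_vocab_size(ckpt_shapes)

# Every vocab-sized parameter in the checkpoint should agree on that number.
assert all(shape[0] == vocab_size for shape in ckpt_shapes.values())
print(vocab_size)  # prints 9080

# With a real checkpoint the shape could be read directly, for example:
#   state = torch.load(path, map_location="cpu")
#   vocab_size = state["rnn.embed.weight"].shape[0]
# and the model built as in the post:
#   model = CaptionModel_B(2048, 50, 160, vocab_size, num_layers=1)
```

Building the model with this value should only make the shapes line up. The word-to-index mapping also has to be the one used during training, otherwise the loaded embeddings will not correspond to the right words.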