Size mismatch for decoder.rule_logits weight, bias, and rule_embedding weight

Hi everybody.

Recently I git-cloned and trained a natural-language-to-SQL model (GitHub - microsoft/rat-sql: A relation-aware semantic parsing model from English to SQL). However, training on my machine was taking very long. To work around that, I went to the issues page, where a pre-trained model (config.json + model_checkpoint) was provided, and tried to load it; however, I was greeted with the error below :frowning: . I successfully loaded checkpoints a few days ago, but those were from models I had trained myself.

I am wondering whether there is any fix that would let me run inference with the pre-trained model. Thanks a lot.

  File "/app/ratsql/utils/saver.py", line 42, in load_checkpoint
    item_dict[item_name].load_state_dict(checkpoint[item_name])
  File "/root/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 839, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for EncDecModel:
        size mismatch for decoder.rule_logits.2.weight: copying a param with shape torch.Size([94, 128]) from checkpoint, the shape in current model is torch.Size([97, 128]).
        size mismatch for decoder.rule_logits.2.bias: copying a param with shape torch.Size([94]) from checkpoint, the shape in current model is torch.Size([97]).
        size mismatch for decoder.rule_embedding.weight: copying a param with shape torch.Size([94, 128]) from checkpoint, the shape in current model is torch.Size([97, 128]).```

You would have to change the model architecture so that all parameter shapes match those in the pretrained state_dict. The mismatch here (94 vs. 97 rows in the decoder's rule tensors) suggests the checkpoint was created with a different number of grammar rules than your current model uses.
If you are loading this state_dict from a repository, I assume the matching model definition (and config) might also be there.
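
If it helps, here is a rough sketch of how you could list the mismatching tensors and, as a stopgap, load every other parameter. The checkpoint layout (a "model" key holding the state_dict, based on the `checkpoint[item_name]` call in your traceback) and the file path are assumptions, so adjust them to your setup. Note that `load_state_dict(..., strict=False)` would not help here, since it only ignores missing/unexpected keys, not size mismatches.

```python
import torch

# Assumption: RAT-SQL's saver stores one state_dict per item, so
# checkpoint["model"] holds the model weights. Adjust the path and key
# to your setup; `model` is your already-instantiated EncDecModel.
checkpoint = torch.load("logdir/model_checkpoint", map_location="cpu")
pretrained_state = checkpoint["model"]
model_state = model.state_dict()

# 1) Diagnose: print every parameter whose shape differs between the two.
for name, tensor in pretrained_state.items():
    if name in model_state and tensor.shape != model_state[name].shape:
        print(f"{name}: checkpoint {tuple(tensor.shape)} "
              f"vs. model {tuple(model_state[name].shape)}")

# 2) Stopgap: load only the tensors whose shapes match. The skipped decoder
#    rule tensors stay randomly initialized, so expect degraded predictions
#    unless the model is rebuilt with the grammar the checkpoint expects.
filtered = {name: tensor for name, tensor in pretrained_state.items()
            if name in model_state and tensor.shape == model_state[name].shape}
model_state.update(filtered)
model.load_state_dict(model_state)
```

The proper fix is still the one above: instantiate the model with the same config and preprocessed data the checkpoint was trained with (i.e. the grammar with 94 rules), so that no filtering is needed.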