But when I later load the model, I get this error:
File "/.../bert.py", line 329, in main
model = BertForSequenceClassification.from_pretrained(args.output_dir)
File "/.../transformers/modeling_utils.py", line 486, in from_pretrained
model.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for BertForSequenceClassification:
While copying the parameter named "bert.encoder.layer.0.attention.self.query.weight", whose dimensions in the model are torch.Size([768, 768]) and whose dimensions in the checkpoint are torch.Size([768, 768]).
While copying the parameter named "bert.encoder.layer.0.attention.self.key.weight", whose dimensions in the model are torch.Size([768, 768]) and whose dimensions in the checkpoint are torch.Size([768, 768]).
...
Note that the reported dimensions match in every case, so the message itself doesn't explain what actually failed to copy. Would appreciate any guidance on how to fix this.
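For context, here is a minimal sketch of the save/load pattern I believe is involved (the `Tiny` model and `quantized.pt` path are hypothetical stand-ins, not my actual code). My understanding is that a dynamically quantized checkpoint contains packed int8 parameters, so it can only be loaded back into a model that has already been quantized to the same structure, not into a plain float model:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the real model; the pattern is what matters.
class Tiny(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 2)

    def forward(self, x):
        return self.fc(x)

# Dynamically quantize the Linear layers, then save the quantized weights.
float_model = Tiny()
quant_model = torch.quantization.quantize_dynamic(
    float_model, {nn.Linear}, dtype=torch.qint8
)
torch.save(quant_model.state_dict(), "quantized.pt")  # hypothetical path

# Loading that state_dict into a plain float Tiny() would fail, because the
# checkpoint holds packed int8 params. Rebuilding the quantized structure
# first, then loading, works:
reloaded = torch.quantization.quantize_dynamic(
    Tiny(), {nn.Linear}, dtype=torch.qint8
)
reloaded.load_state_dict(torch.load("quantized.pt"))
```

If that reading is right, calling `from_pretrained()` (which builds a float model) on a quantized checkpoint would be the source of the error above.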
I tried again with Torch 1.4.0. The previous error is gone, but quantization now breaks the model: accuracy drops from 68.6% to 3.8% on my task.
I’ll dig deeper to see if I can figure out why, but any suggestions would be greatly appreciated.
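One check I'm planning (a minimal sketch with a toy `nn.Sequential` model, not my actual BERT setup) is to compare the float and quantized models' outputs on the same input, to separate error introduced by quantization itself from error introduced by the save/load round trip:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in model; with BERT the same comparison would run on real inputs.
model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 4))
qmodel = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(4, 16)
with torch.no_grad():
    # Maximum absolute deviation between float and quantized outputs.
    diff = (model(x) - qmodel(x)).abs().max().item()

print(f"max abs deviation: {diff:.4f}")
```

If the deviation is small for the in-memory quantized model, a drop to 3.8% would point at the serialization path rather than quantization itself.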