Save a huggingface BERT model

Hi all, I am trying to save a pretrained BERT model from Hugging Face.
I want to save a fine-tuned model and load it again with load_state_dict.
When I use the tokenizer as is, saved checkpoints can be resumed with loss values consistent with those at the time of saving.
However, when I call add_tokens and resize_token_embeddings, saved checkpoints resume with quite different loss values.

Is there an appropriate way to save the BERT model when I add new tokens?

Thank you.

  • Use save_pretrained and from_pretrained to ensure consistency.
  • Resize token embeddings after loading the model and before saving.
  • Save and load both model and tokenizer together.
  • Consider using a checkpoint manager if you keep multiple checkpoints.
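A minimal sketch of the first three steps above, assuming the transformers library is installed; the tiny config and hand-written vocab file are made up for illustration so that nothing needs to be downloaded:

```python
import os
import tempfile

from transformers import BertConfig, BertModel, BertTokenizer

# Build a tiny vocab file so the example needs no download.
vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]", "hello", "world"]
tmp = tempfile.mkdtemp()
vocab_path = os.path.join(tmp, "vocab.txt")
with open(vocab_path, "w") as f:
    f.write("\n".join(vocab))

tokenizer = BertTokenizer(vocab_path)
config = BertConfig(vocab_size=len(vocab), hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64)
model = BertModel(config)

# Add new tokens, then resize the embeddings BEFORE saving.
num_added = tokenizer.add_tokens(["newtok1", "newtok2"])
model.resize_token_embeddings(len(tokenizer))

# Save model and tokenizer together with save_pretrained.
save_dir = os.path.join(tmp, "ckpt")
model.save_pretrained(save_dir)
tokenizer.save_pretrained(save_dir)

# Reload both from the same directory; vocab sizes stay consistent.
model2 = BertModel.from_pretrained(save_dir)
tokenizer2 = BertTokenizer.from_pretrained(save_dir)
assert len(tokenizer2) == model2.get_input_embeddings().num_embeddings
```

The key point is that the added tokens live in the tokenizer files and the enlarged embedding matrix lives in the model weights, so saving and reloading them from the same directory keeps the two in sync.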

Thank you, Sravani, for your kind reply.
I resize the token embeddings after loading and before saving, and I save and load the model and tokenizer together.
So the problem seems to be that I saved and loaded with state_dict; I now understand I should have used save_pretrained.

I would appreciate any help with another question.
When I wrap several models in a single class and want to save them all together, should I call save_pretrained separately for the BERT model and save the others another way?
Is there a clean way to save the wrapped model as a whole?
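To make the question concrete, here is a minimal sketch of the kind of wrapper I mean (the class and all names are made up for illustration, using a tiny config so nothing is downloaded):

```python
import torch.nn as nn
from transformers import BertConfig, BertModel


class Wrapper(nn.Module):
    """BERT plus a task-specific head; I want to checkpoint both together."""

    def __init__(self, bert: BertModel, num_labels: int):
        super().__init__()
        self.bert = bert
        self.head = nn.Linear(bert.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask=None):
        out = self.bert(input_ids, attention_mask=attention_mask)
        return self.head(out.last_hidden_state[:, 0])


# Tiny config so the example runs without downloading weights.
config = BertConfig(vocab_size=10, hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64)
wrapper = Wrapper(BertModel(config), num_labels=3)

# One state_dict covers both submodules (keys prefixed "bert." and "head.")
# -- but is saving this whole state_dict enough, or should self.bert
# still be saved separately with save_pretrained?
keys = wrapper.state_dict().keys()
assert any(k.startswith("bert.") for k in keys)
assert any(k.startswith("head.") for k in keys)
```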

Thank you.