I am trying to add a row for a new set of tokens I want to use with this model, but I'm getting an error. It seems I can get the weights I need from state_dict and then do as below, but load_state_dict then complains that the old and new dimensions differ. That is true, but is there a way to make this work? Basically, I need to resize the Embedding layer …
I know Hugging Face has resize_token_embeddings, but what about doing it as below?
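import torch
import torch.nn as nn

# Append one zero row to the embedding matrix for the new token.
state_dict = MODEL.state_dict()
state_dict['tokens_embed.weight'] = nn.Parameter(
    torch.cat(
        (state_dict['tokens_embed.weight'], torch.zeros(1, MODEL.config.n_embd))
    ),
)
# Fails: the model's own tokens_embed layer still has the old number of rows.
MODEL.load_state_dict(state_dict)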
This is the error:
RuntimeError: Error(s) in loading state_dict for OpenAIGPTModel:
size mismatch for tokens_embed.weight: copying a param with shape torch.Size([40479, 768]) from checkpoint, the shape in current model is torch.Size([40478, 768]).
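For what it's worth, the mismatch arises because load_state_dict copies each tensor into the parameter that already exists in the model, and that parameter still has 40478 rows. One way around it is to replace the embedding module itself with a larger one and copy the old rows across. Below is a minimal sketch of that idea, using a small stand-in nn.Embedding instead of the real model; the helper name grow_embedding is hypothetical, and zero-initialising the new row simply mirrors the torch.zeros call in the attempt above.

```python
import torch
import torch.nn as nn


def grow_embedding(old: nn.Embedding, extra_rows: int = 1) -> nn.Embedding:
    """Hypothetical helper: return a larger embedding that keeps the old rows
    and appends `extra_rows` zero-initialised rows."""
    old_rows, dim = old.weight.shape
    new = nn.Embedding(old_rows + extra_rows, dim)
    # Match the device and dtype of the original weights.
    new = new.to(device=old.weight.device, dtype=old.weight.dtype)
    with torch.no_grad():
        new.weight[:old_rows] = old.weight  # keep the trained rows
        new.weight[old_rows:] = 0.0         # new token rows start at zero
    return new


# Small stand-in for MODEL.tokens_embed (the real one is 40478 x 768).
tokens_embed = nn.Embedding(10, 4)
resized = grow_embedding(tokens_embed, extra_rows=1)
print(resized.weight.shape)
```

On the real model this would presumably look like MODEL.tokens_embed = grow_embedding(MODEL.tokens_embed), followed by updating MODEL.config.vocab_size to match. That is an assumption about a bare OpenAIGPTModel; a model with a tied LM head would also need its output layer resized, which is the bookkeeping resize_token_embeddings is meant to handle.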