I am trying to train a transformer model after extending its vocabulary. I want to keep the original weights frozen and train only the weights associated with the newly added tokens. I was thinking of doing something like this:
processor = load_processor()  # load the processor (placeholder)
model = load_model()          # load the pretrained model (placeholder)

# Freeze all existing parameters
for param in model.parameters():
    param.requires_grad = False

# Add the new tokens and resize the embedding matrix to match
processor.tokenizer.add_tokens(new_tokens)
model.resize_token_embeddings(len(processor.tokenizer))
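One thing I am unsure about: resize_token_embeddings replaces the embedding matrix with a new, trainable one, so the freeze loop above probably would not apply to it, and requires_grad works per tensor, not per row, so I cannot freeze only the original rows that way. As an alternative I considered zeroing out the gradients of the original rows with a hook. Here is a rough sketch of what I mean, assuming a standard PyTorch/Hugging Face-style model (original_vocab_size is a value I would record before adding the tokens):

import torch

original_vocab_size = len(processor.tokenizer)  # record this BEFORE add_tokens / resize

# ... add tokens and resize as above ...

embeddings = model.get_input_embeddings()
embeddings.weight.requires_grad = True  # keep the (resized) embedding trainable

def zero_grad_for_old_rows(grad):
    # Zero the gradient for the original vocabulary rows so that
    # only the newly added rows receive updates
    mask = torch.zeros_like(grad)
    mask[original_vocab_size:] = 1.0
    return grad * mask

embeddings.weight.register_hook(zero_grad_for_old_rows)

Even with the hook, I suspect an optimizer with weight decay or momentum could still move the old rows, so maybe I would also need to pass only the embedding parameters to the optimizer.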
Is either of these approaches valid? If not, what other options do I have?