Hi all,
I have a Transformer model that is trained on different sets of data. Some of them may contain new tokens, which increases the vocab_size and therefore the embedding/positional layers… That part is under control. The problem arises when I load a previously saved optimizer, because the vocab_size may have grown since the optimizer was saved.
I know that in PyTorch this can be done by copying the old optimizer state into the new one and then continuing training, but I don't know how to do it in LibTorch (C++).
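(For reference, the model-side resize is essentially "build a bigger embedding and copy the trained rows over". A simplified sketch of what IncreaseVocab does, with illustrative class and member names, not my exact code:)

void TransformerImpl::IncreaseVocab(int64_t new_vocab_size) {
  torch::NoGradGuard no_grad;
  auto old_weight = token_embedding_->weight;             // [old_vocab, d_model]
  torch::nn::Embedding bigger(new_vocab_size, old_weight.size(1));
  // keep the trained rows, leave the new rows at their fresh initialisation
  bigger->weight.slice(0, 0, old_weight.size(0)).copy_(old_weight);
  replace_module("token_embedding", bigger);              // swap the registered submodule
  token_embedding_ = bigger;
}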
The pseudo-code would be:

torch::load(optimizer, ".\\IA\\" + sEmpresa + "_optimizer.pt");
model->IncreaseVocab(new_vocab_size);
torch::optim::Adam new_optimizer(model->parameters(), torch::optim::AdamOptions(0.001));
Then I need new_optimizer to contain all of the previous optimizer's state, extended to the new size (conceptually, new_optimizer = old optimizer state + extra rows for the new tokens).
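What I imagine is something like the sketch below: snapshot the parameter list before the resize, build the new optimizer, and then copy each AdamParamState across, zero-padding the exp_avg / exp_avg_sq rows where the embedding grew. CopyAdamState is just a name I made up, and the key format of Optimizer::state() is an assumption: here I use c10::guts::to_string(param.unsafeGetTensorImpl()), which is what the bundled Adam implementation uses in the LibTorch versions I looked at; newer releases may key the state map differently, so please check your version. Does this look like the right direction, or is there a cleaner way?

#include <torch/torch.h>

// Sketch only: copy Adam state from old_opt (built for the old vocab) into
// new_opt (built after IncreaseVocab), zero-padding the moment tensors of any
// parameter whose first dimension grew (the embedding).
// Assumes: (1) the state map is keyed by
//     c10::guts::to_string(param.unsafeGetTensorImpl())
//     -- version-dependent, check your LibTorch;
// (2) old_params and new_params line up index-by-index (same module order).
void CopyAdamState(torch::optim::Adam& old_opt,
                   torch::optim::Adam& new_opt,
                   const std::vector<torch::Tensor>& old_params,
                   const std::vector<torch::Tensor>& new_params) {
  TORCH_CHECK(old_params.size() == new_params.size());
  for (size_t i = 0; i < old_params.size(); ++i) {
    auto old_key = c10::guts::to_string(old_params[i].unsafeGetTensorImpl());
    auto it = old_opt.state().find(old_key);
    if (it == old_opt.state().end()) {
      continue;  // this parameter had no state yet (optimizer never stepped on it)
    }
    auto& old_state = static_cast<torch::optim::AdamParamState&>(*it->second);

    auto new_state = std::make_unique<torch::optim::AdamParamState>();
    new_state->step(old_state.step());

    // Grow the first dimension (vocab rows) with zeros if the shape changed.
    auto pad = [&](const torch::Tensor& m) {
      if (m.size(0) == new_params[i].size(0)) {
        return m.clone();
      }
      auto padded = torch::zeros_like(new_params[i]);
      padded.slice(0, 0, m.size(0)).copy_(m);
      return padded;
    };
    new_state->exp_avg(pad(old_state.exp_avg()));
    new_state->exp_avg_sq(pad(old_state.exp_avg_sq()));
    if (old_state.max_exp_avg_sq().defined()) {  // only populated with amsgrad
      new_state->max_exp_avg_sq(pad(old_state.max_exp_avg_sq()));
    }

    auto new_key = c10::guts::to_string(new_params[i].unsafeGetTensorImpl());
    new_opt.state()[new_key] = std::move(new_state);
  }
}

I would call it right after constructing new_optimizer, with the parameter list captured before the resize:

std::vector<torch::Tensor> old_params = model->parameters();  // before IncreaseVocab
// ... load old optimizer, IncreaseVocab, build new_optimizer ...
CopyAdamState(optimizer, new_optimizer, old_params, model->parameters());

(Note that the AdamOptions themselves, lr, betas, weight_decay, are not copied; they come from whatever new_optimizer was constructed with.)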
Any help would be appreciated.