I have two tasks whose outputs are not the same.
The first one is a classifier.
The second one is NER.
I would like to fine-tune RoBERTa on a dataset for task 1.
Then I want to save the model's weights, but only the weights that the original RoBERTa backbone has, not the task-specific head.
After that, I would use those weights, add the downstream layers for the second model, and fine-tune it.
Can model.save_state do that?
Or should I load all the layers and then cut off the layers I don't need?
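In other words, I would like to end up with something like this (a rough sketch of my goal, assuming the Hugging Face Transformers classes where the encoder sits under the .roberta attribute; the label counts are placeholders, and I am not sure the save/load step in the middle is the right call, which is what I'm asking):

from transformers import RobertaForSequenceClassification, RobertaForTokenClassification

# Task 1: fine-tune a classifier
clf_model = RobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=2)
# ... training loop for task 1 ...

# Save only the RoBERTa encoder weights, without the classification head
clf_model.roberta.save_pretrained("roberta-after-task1")

# Task 2: build the NER model on top of those fine-tuned encoder weights
ner_model = RobertaForTokenClassification.from_pretrained("roberta-after-task1", num_labels=9)
# ... training loop for task 2 ...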
Hi, I know this function, but I just want to load certain layers instead of all of them.
Say I use BERT-base and connect 2 layers on top for the downstream task.
BERT has 12 layers, plus the 2 layers I connect, so 14 in total.
And I just want the original 12 layers, which have already been fine-tuned.
You can define a model with only the first 12 layers, then load the state_dict into it with strict=False. That should work (assuming the layers have kept their original names). Otherwise you could iterate over the model parameters like so:
for name, param in model.named_parameters():
    ...  # do something with each parameter (e.g. keep only the ones belonging to the 12 original layers)
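For example, here is a minimal sketch of that approach, assuming the task-1 model was a Hugging Face BertForSequenceClassification (so its encoder weights live under the "bert." prefix); the checkpoint path and num_labels are placeholders:

import torch
from transformers import BertForTokenClassification

# state_dict saved from the fine-tuned task-1 model (12 encoder layers plus the 2 task layers)
state_dict = torch.load("task1_model.pt", map_location="cpu")

# keep only the original encoder/embedding weights, dropping the task-1 layers added on top
encoder_weights = {name: param for name, param in state_dict.items() if name.startswith("bert.")}

# fresh downstream model for NER
ner_model = BertForTokenClassification.from_pretrained("bert-base-uncased", num_labels=9)

# strict=False lets the newly initialized NER head stay unmatched
missing, unexpected = ner_model.load_state_dict(encoder_weights, strict=False)
print("missing keys:", missing)        # should list only the new NER head parameters
print("unexpected keys:", unexpected)  # should be empty after the filtering above

Here the dict comprehension over the saved state_dict plays the same role as iterating over named_parameters(): it keeps only the 12 original layers and lets strict=False take care of the new head.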