Hi,
I am trying to fine-tune a previously trained model. I loaded the model using load_state_dict and then lowered the initial learning rate. What confuses me is that the initial training loss during fine-tuning is quite high, and quite different from the final loss I got in the previous training run. If the two models have essentially the same weights, how can they produce very different losses? Or did I do something wrong?
More details: it is a fully connected MLP, and the loading is done as follows:
if model[tag].fineTuneSwitch:
    model[tag].load_state_dict(torch.load(model[tag].annModelPath))
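One sanity check I thought of (the small MLP and random data below are just hypothetical stand-ins for my actual model and dataset): compute the loss right before saving, then load the state dict into a fresh model and compute the loss again on the same batch, in eval() mode and with no optimizer step. If load_state_dict restored the weights correctly, the two losses should match exactly:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the real MLP and training batch
mlp = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
x, y = torch.randn(32, 4), torch.randn(32, 1)
loss_fn = nn.MSELoss()

# Loss at the checkpoint, before saving
mlp.eval()
with torch.no_grad():
    loss_before = loss_fn(mlp(x), y).item()
torch.save(mlp.state_dict(), "checkpoint.pt")

# Fresh model of the same architecture, loaded as in the snippet above
mlp2 = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
mlp2.load_state_dict(torch.load("checkpoint.pt"))
mlp2.eval()
with torch.no_grad():
    loss_after = loss_fn(mlp2(x), y).item()

# If loading worked, the difference should be (numerically) zero
print(abs(loss_before - loss_after) < 1e-8)
```

If this check passes but the first fine-tuning loss is still much higher, the gap would have to come from something other than the weights, e.g. a different data batch, shuffling, or train-mode layers like dropout or batch norm.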
The loss records are attached.
The previous training case (initial learning rate: 1e-3):
The fine-tuning case (initial learning rate: 4e-6):