I ran into a weird situation when resuming training. Say I have 10 epochs. From the 1st epoch, the training loss decreases steadily while the validation loss floats around 1.8-2.0. At the 5th epoch, the training loss is 0.95 and the validation loss is 1.7. At this point I save a checkpoint containing both the model state dict and the optimizer state dict. On the first epoch after resuming, the training loss rises a bit to 1.05, while the validation loss drops to 1.1. After that, the training loss keeps dropping, but the validation loss steadily rises back to 1.6.
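For context, here is a minimal sketch of the save/resume pattern I'm describing (the toy model, file name, and variable names here are illustrative, not my actual code):

```python
import torch
import torch.nn as nn

# Hypothetical tiny setup: a linear model, Adam, and cross-entropy loss
model = nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One training step so Adam accumulates internal state (moment estimates)
x = torch.randn(8, 4)
y = torch.randint(0, 2, (8,))
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()

# Save both state dicts at the end of the epoch, as described above
checkpoint = {
    "epoch": 5,
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
}
torch.save(checkpoint, "checkpoint.pt")

# Resume: rebuild the model and optimizer, then restore both state dicts
resumed_model = nn.Linear(4, 2)
resumed_optimizer = torch.optim.Adam(resumed_model.parameters(), lr=1e-3)
ckpt = torch.load("checkpoint.pt")
resumed_model.load_state_dict(ckpt["model_state_dict"])
resumed_optimizer.load_state_dict(ckpt["optimizer_state_dict"])
resumed_model.train()  # switch back to training mode before continuing
```

After loading, the resumed model's parameters are bit-identical to the saved ones, so in principle training should pick up exactly where it left off.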
My question is: since I saved the state dicts of both the model and the optimizer, why doesn't the validation loss continue its previous trend when I resume training? I expected it to keep dropping from 1.7, but instead it drops to 1.1 and then rises back up to 1.6.
I use the Adam optimizer with cross-entropy loss. What issues could be causing this?