Loss increases after loading model from checkpoint

I am seeing some odd behavior while training a video captioning model. The training loss decreases steadily over each of the first 8 epochs. But when I load the model from the last checkpoint and resume training for another 4 epochs, the loss starts out higher than the last recorded value and does not drop in the subsequent epochs.


I load my model like this (a rough sketch of my loading code; the model class, checkpoint path, key names, and optimizer settings below are placeholders rather than my exact code):
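```python
import torch
import torch.nn as nn

# Placeholder stand-in for my actual video-captioning model
model = nn.LSTM(input_size=512, hidden_size=512, batch_first=True)

# Restore the weights saved after epoch 8 (placeholder path and key name)
checkpoint = torch.load("checkpoint_epoch_8.pth")
model.load_state_dict(checkpoint["model_state_dict"])

# Only the model weights are restored; the optimizer is created
# fresh here before resuming training
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

model.train()
```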

The loss on retraining looks like this:

Is this common? What causes it, and how can I avoid it?
Thank you.