Loss is much higher after resuming training

Good day,
I am trying to resume the training. When the training was performed on one gpu, the training continues - the loss is nearly the same as it was before, as well as the evaluation score. When I am trying to continue the training of the model that was trained on multiple gpus, the loss is much higher than it was (0.99 in comparison to 0.34).

Could you please help me: what is the correct way to load the weights of a model that was trained on multiple GPUs?

I use pytorch-unet. Could the error be in these lines?

# Load the checkpoint on CPU and restore the model weights.
state = torch.load(checkpoint_path, map_location='cpu')
model.load_state_dict(state['model_state_dict'])
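
One thing I am unsure about: if the multi-GPU run wraps the model in torch.nn.DataParallel, the keys in the saved state_dict would carry a 'module.' prefix. Below is a minimal sketch of how I imagine handling that; the prefix-stripping part is my assumption, I have not confirmed that this is how the checkpoint was actually saved:

import torch
from collections import OrderedDict

state = torch.load(checkpoint_path, map_location='cpu')
state_dict = state['model_state_dict']

# If the checkpoint was saved from a DataParallel-wrapped model, the keys look like
# 'module.conv1.weight'; strip the prefix so they match the plain (unwrapped) model.
if any(key.startswith('module.') for key in state_dict):
    state_dict = OrderedDict(
        (key[len('module.'):], value) for key, value in state_dict.items()
    )

model.load_state_dict(state_dict)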

The problem shouldn't be in the optimizer, since its state is also loaded from the checkpoint.
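
For completeness, the optimizer is restored roughly like this (a sketch; the 'optimizer_state_dict' key name is my assumption about the checkpoint format):

# Restore the optimizer state alongside the model weights,
# using the same optimizer instance that was created before resuming.
optimizer.load_state_dict(state['optimizer_state_dict'])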