Saved model has higher loss

I am trying to save my model during training so that I can resume it later, but why does my saved model always have a higher loss after resuming than training that was never interrupted?
I'm following this thread to save my models. I save my encoder and decoder models, and I also save my Adam optimizer.

def save_checkpoint(state):
    # epoch, i and model_path are read from the enclosing (global) scope
    torch.save(state, os.path.join(model_path, 'checkpoint-{}-{}.pth'.format(epoch+1, i+1)))

for epoch in range(300):
    for i, (images, captions, lengths) in enumerate(data_loader):
        # Set mini-batch dataset
        images =
        captions =
        targets = pack_padded_sequence(captions, lengths, batch_first=True)[0]
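        # Forward, backward and optimize
        features = encoder(images)
        outputs = decoder(features, captions, lengths)
        loss = criterion(outputs, targets)
        # Assumed: the backward/optimize lines are missing from the post as shown
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()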
        if (i+1) % 50 == 0:
            print('Epoch [{}/{}], Step [{}/{}], Loss:{:.4f}, Perplexity: {:5.4f}'
                  .format(epoch, 300, i+1, total_step, loss.item(), np.exp(loss.item())))
        # Save the model checkpoints
        if (epoch+1) % 100 == 0 and (i+1) % total_step == 0:
            save_checkpoint({
                'epoch': epoch + 1,
                'encoder': encoder.state_dict(),
                'decoder': decoder.state_dict(),
                'optimizer': optimizer.state_dict(),
            })

You can check the whole code here; the notebook also has the training data available for download.

Here is the code I use to load the model.

I run this code right before the training code block:

#LOAD model
checkpoint = torch.load('drive/My Drive/model/checkpoint-200-350.pth')
start_epoch = checkpoint['epoch']
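
For reference, a full save and resume round trip might look roughly like the sketch below. It is only an illustration under stated assumptions: the two `nn.Linear` layers are hypothetical stand-ins for the real encoder and decoder, the file name `checkpoint-demo.pth` is made up, and the optimizer is assumed to be Adam over both models' parameters. The point it shows is that loading the file with `torch.load` is not enough on its own; each saved state has to be passed to the matching `load_state_dict` call before training continues.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical stand-ins for the real encoder and decoder.
encoder = nn.Linear(4, 3)
decoder = nn.Linear(3, 2)
params = list(encoder.parameters()) + list(decoder.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)

# One training step so that Adam has moment estimates worth saving.
loss = decoder(encoder(torch.randn(8, 4))).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Save everything needed to resume (demo path, not the one from the post).
path = os.path.join(tempfile.mkdtemp(), 'checkpoint-demo.pth')
torch.save({
    'epoch': 200,
    'encoder': encoder.state_dict(),
    'decoder': decoder.state_dict(),
    'optimizer': optimizer.state_dict(),
}, path)

# Resuming: rebuild the objects the same way, then restore every state.
new_encoder = nn.Linear(4, 3)
new_decoder = nn.Linear(3, 2)
new_params = list(new_encoder.parameters()) + list(new_decoder.parameters())
new_optimizer = torch.optim.Adam(new_params, lr=1e-3)

checkpoint = torch.load(path)
new_encoder.load_state_dict(checkpoint['encoder'])
new_decoder.load_state_dict(checkpoint['decoder'])
new_optimizer.load_state_dict(checkpoint['optimizer'])
start_epoch = checkpoint['epoch']

# Put the models back in training mode before continuing.
new_encoder.train()
new_decoder.train()
```

When resuming, the epoch loop would then start from `start_epoch` instead of 0, for example `for epoch in range(start_epoch, 300):`.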

Edit 1:
Hello all, after further investigation I think the problem is in my optimizer's state_dict,

because every time I reset my environment, my saved optimizer has a different state_dict.

Hi, I have the same problem when saving the Adam optimizer. As you mentioned, we get a different state_dict, but how can we prevent it?
How did you fix your code, please?

Hi, what do you mean by “saved optimizer always have different state_dict”,
given that you saved the optimizer's state_dict and reloaded it too?
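
One way to make the question above concrete is to compare two optimizer state_dicts by value rather than by printing them. The sketch below is only an illustration: `optimizer_states_equal` is a hypothetical helper, not a PyTorch function, and the tiny `nn.Linear` model stands in for the real networks. It assumes the interesting part is the per-parameter state (Adam's step count and moment estimates). A state_dict that is saved and reloaded should compare equal, while one taken after further training steps should not.

```python
import copy

import torch
import torch.nn as nn


def optimizer_states_equal(sd_a, sd_b):
    """Compare the per-parameter state of two optimizer state_dicts by value.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    if sd_a['state'].keys() != sd_b['state'].keys():
        return False
    for key in sd_a['state']:
        a, b = sd_a['state'][key], sd_b['state'][key]
        if a.keys() != b.keys():
            return False
        for name in a:
            if torch.is_tensor(a[name]) and torch.is_tensor(b[name]):
                if not torch.equal(a[name], b[name]):
                    return False
            elif a[name] != b[name]:
                return False
    return True


model = nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)


def train_step():
    loss = model(torch.randn(8, 4)).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()


train_step()
# Deep copy: state_dict() shares tensors with the live optimizer,
# and later steps update those tensors in place.
saved = copy.deepcopy(optimizer.state_dict())

# Reloading the saved state into a fresh optimizer reproduces it.
restored = torch.optim.Adam(model.parameters(), lr=1e-3)
restored.load_state_dict(saved)
same_after_reload = optimizer_states_equal(saved, restored.state_dict())

# After more training the live optimizer state has moved on.
train_step()
same_after_more_training = optimizer_states_equal(saved, optimizer.state_dict())

print(same_after_reload, same_after_more_training)
```

If the reloaded state compares equal here but the loss still jumps after resuming, the optimizer state is probably not the cause, and it is worth checking that the encoder and decoder weights are also restored and that both models are in the same train/eval mode as before.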