Loaded checkpoint giving constant output during validation

I am loading a model checkpoint to evaluate it on the MNIST dataset. The checkpoint loads correctly and produces the expected outputs, loss, etc., when I resume training.

In another script I load the checkpoint the same way to evaluate on the same data. However, the loaded model produces the same constant prediction for every model(input) call.

Am I missing something that is automatically called during training? I can't figure out what I'm missing here.

I load the checkpoint like this in both the scripts:

# Restore the model and optimizer state from the checkpoint.
checkpoint = torch.load('file_name', weights_only=True)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
epoch = checkpoint['epoch']
loss = checkpoint['loss']
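
For reference, the checkpoint is saved with matching keys, roughly along these lines (a minimal runnable sketch; the nn.Sequential model, SGD optimizer, epoch, and loss below are placeholders, not my actual training objects):

import torch
import torch.nn as nn

# Placeholder stand-ins for the real training objects.
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
epoch, loss = 5, 0.123

# Save a checkpoint with the same keys the loading code expects.
torch.save({
    'epoch': epoch,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'loss': loss,
}, 'file_name')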

In the evaluation script I don't load the optimizer state, epoch, or loss.
I call model.eval() before evaluation.
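
The evaluation loop has roughly this shape (a minimal, self-contained sketch with a placeholder model and a standard MNIST loader, not my exact code):

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Placeholder model and loader; the real ones come from my scripts.
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
test_loader = DataLoader(
    datasets.MNIST('data', train=False, download=True,
                   transform=transforms.ToTensor()),
    batch_size=64)

model.eval()           # disable dropout, use batch-norm running statistics
with torch.no_grad():  # gradients are not needed during evaluation
    for inputs, targets in test_loader:
        outputs = model(inputs)
        preds = outputs.argmax(dim=1)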
In the resume-training script I make these additional calls, which I believe are the only difference between the two scripts:

best_model = model.state_dict()  # keep a reference to the current weights
model.train(True)                # put the model back in training mode

I checked the layers' weights after loading the checkpoint and they are the same in both scripts.
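
The comparison was along these lines (a sketch, not my exact code; model and checkpoint are the variables from the loading snippet above), and it reports no mismatches in either script:

# Compare the loaded model's parameters and buffers against the checkpoint.
state = checkpoint['model_state_dict']
for name, tensor in model.state_dict().items():
    if not torch.equal(tensor, state[name]):
        print(f'mismatch in {name}')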

Same issue here; I can't understand what's going on.