Testing the model's performance after saving vs. without saving the model

I am observing a difference in the model’s performance when I save my model and load it separately for testing on the test data. The accuracy is very bad when I load the model from the saved path, whereas it is very good when the testing is performed right after training and validation, in the same run.

I wonder what causes such a difference?

Currently, the model I am talking about is an autoencoder with a classifier attached at the end, and both the autoencoder and the classifier are trained simultaneously.
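
The exact saving code isn’t shown here; the snippet below is only an assumed sketch of a typical state_dict save/load workflow, with 'model.pth' as a placeholder path.

import torch

# assumed sketch, not the exact code from this post
torch.save(model.state_dict(), 'model.pth')                      # after training

# in the separate test script
model = Model()                                                  # same architecture
model.load_state_dict(torch.load('model.pth', map_location='cpu'))
model.eval()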

I assume you are already calling model.eval() in both cases.
If that’s the case, you could pass a constant tensor through both models and compare the output (e.g. use a torch.ones(size) tensor).

If the results are equal, your data loading or processing might differ in some ways.
On the other hand, if the results are different, you could check the intermediate activations of each layer via forward hooks.
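
A minimal sketch of the constant-input check might look like this (the input shape is just an example and needs to match your model):

import torch

model.eval()
device = next(model.parameters()).device
x = torch.ones(1, 3, 224, 224, device=device)   # example shape, adjust as needed

with torch.no_grad():
    out = model(x)

# save the output right after training and compare it against the output of the
# reloaded model, e.g. via torch.allclose
torch.save(out, 'reference_output.pt')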


@ptrblck Thank you very much for the explanation.
Yes, I am calling model.eval() in both cases.

So I have run the test with torch.ones in three different cases (I am training on Google Colab):

  1. I passed torch.ones just after training finished (on Colab, using the GPU).
  2. I loaded the saved model on Colab (using the GPU).
  3. I also loaded the saved model on my local PC (using the CPU).

Results 2 and 3 match, but 1 is different. So this falls under the second case you mentioned, about forward hooks. I actually did not understand how forward hooks work.
But let me mention the following two things:

First, I calculate accuracy in the following way:

_, predicted = torch.max(pred.data, 1)        # index of the highest logit per sample
total += ys.size(0)                           # number of samples in this batch
correct += (predicted == ys).sum().item()     # number of correct predictions
train_acc = (100 * correct / total)
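
(For context, this snippet runs inside an evaluation loop; the sketch below is only a rough, simplified version of it, with test_loader as a placeholder name.)

model.eval()                                   # eval mode, as mentioned above
correct, total = 0, 0
with torch.no_grad():                          # no gradients needed for accuracy
    for xs, ys in test_loader:                 # placeholder DataLoader name
        pred = model(xs)                       # or model(xs)['pred'] if a dict is returned
        _, predicted = torch.max(pred.data, 1)
        total += ys.size(0)
        correct += (predicted == ys).sum().item()
accuracy = 100 * correct / total               # same formula for train/test accuracy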

Secondly, my model is designed in the following way:

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        ....
        ....

    def forward(self, x):
        en = self.encoder(x)
        out = self.decoder(en)
        ## Copy-pasted ResNet model which takes the decoder's output as its input ##
        x = self.Conv1(out)
        ....
        return pred

I am not sure if this will help in understanding the problem better, but it would be great if you could explain forward hooks in terms of my model.

If I understand the comparison correctly, case 1 would be the “working case” with a high test accuracy and cases 2 and 3 are where the accuracy drops.

To compare the intermediate activations you could use this code to register the forward hooks for each layer. After you’ve passed the torch.ones tensor through the model, you could store the activations for both runs (e.g. case 1 and 2) and compare them afterwards in another script.

This would narrow down where the different output is coming from.
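
In essence, such hooks can be set up along these lines (a simplified sketch; registering a hook on every submodule and the input shape are just example choices):

import torch

activation = {}

def get_activation(name):
    def hook(module, inp, out):
        # skip non-tensor outputs (e.g. if a module returns a tuple or dict)
        if torch.is_tensor(out):
            activation[name] = out.detach().cpu()
    return hook

# register a hook for each submodule, using the names from named_modules()
for name, module in model.named_modules():
    module.register_forward_hook(get_activation(name))

with torch.no_grad():
    model(torch.ones(1, 3, 224, 224, device=next(model.parameters()).device))

torch.save(activation, 'activations_case1.pt')   # repeat for case 2 and compare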

Also, how large is the current difference of your output?


@ptrblck Yes, that’s right: case 1 is the working case with high train and test accuracy, and cases 2 and 3 are where the accuracy drops.
I’ll check out the code. Thanks!

Differences:
The output is a dictionary of a prediction and a reconstructed image.

Case 1:

Pred: tensor([[-0.7676,  0.0572, -0.2913,  0.2349,  0.5005, -1.1794,  0.4590,  0.2181,
         -2.1164, -1.3106]], grad_fn=<AddmmBackward>)

image: tensor([[[[ 0.5258,  0.4485,  0.7366,  ...,  0.5316,  0.3521,  0.2883],
          [ 0.6817,  0.7242,  0.1581,  ...,  0.6326,  0.4853,  0.2883],
          [-0.0272, -0.1409, -0.1605,  ...,  1.1503,  0.4435,  0.2883],
          ...,
          [ 0.4976,  0.5694,  0.9019,  ...,  0.6768,  0.6060,  0.2883],
          [ 0.4517,  0.0776,  0.3625,  ...,  0.4241,  0.5714,  0.2883],
          [ 0.2883,  0.2883,  0.2883,  ...,  0.2883,  0.2883,  0.2883]],

         [[ 0.6070,  0.4169,  0.8111,  ...,  0.7332,  0.5797,  0.2265],
          [-0.0116,  0.6817,  0.0530,  ...,  0.9291,  0.5413,  0.2265],
          [ 1.1952,  1.0150,  0.8058,  ...,  0.5320,  1.0475,  0.2265],
          ...,
          [ 0.5384,  0.6242,  0.5577,  ...,  0.6715,  0.8561,  0.2265],
          [ 0.4094,  0.6437,  0.2078,  ...,  0.8093,  0.5136,  0.2265],
          [ 0.2265,  0.2265,  0.2265,  ...,  0.2265,  0.2265,  0.2265]],

         [[ 0.8264,  0.8105,  0.3731,  ...,  0.6286,  0.7814,  0.2504],
          [ 0.6968,  0.9409,  1.2026,  ...,  0.8817,  0.0785,  0.2504],
          [-0.1297,  0.3147, -0.0102,  ...,  0.6539, -0.0164,  0.2504],
          ...,
          [ 0.7424,  0.9211,  0.6916,  ...,  0.6464,  0.1591,  0.2504],
          [-0.0205,  0.3855,  0.0819,  ...,  0.4896,  0.4849,  0.2504],
          [ 0.2504,  0.2504,  0.2504,  ...,  0.2504,  0.2504,  0.2504]]]

Case 2:

Pred: tensor([[ 2.2933, -0.8836,  2.3208, -1.1150,  0.1603, -0.7463, -0.6391, -0.2642,
         -3.1370, -0.2606]], grad_fn=<AddmmBackward>)

image: tensor([[[[ 0.2977,  0.4950,  0.5564,  ...,  0.5029,  0.4113,  0.2883],
          [ 0.3410,  0.5465,  0.5328,  ...,  0.4874,  0.4242,  0.2883],
          [ 0.3324,  0.5292,  0.3376,  ...,  0.5825,  0.4509,  0.2883],
          ...,
          [ 0.3655,  0.5581,  0.4323,  ...,  0.5095,  0.5081,  0.2883],
          [ 0.2985,  0.3943,  0.5151,  ...,  0.3950,  0.4275,  0.2883],
          [ 0.2883,  0.2883,  0.2883,  ...,  0.2883,  0.2883,  0.2883]],

         [[ 0.2727,  0.3019,  0.4901,  ...,  0.3821,  0.4321,  0.2265],
          [ 0.3167,  0.5233,  0.3820,  ...,  0.5307,  0.4235,  0.2265],
          [ 0.7326,  0.5277,  0.6648,  ...,  0.6426,  0.5306,  0.2265],
          ...,
          [ 0.3303,  0.5173,  0.4646,  ...,  0.5538,  0.3874,  0.2265],
          [ 0.5248,  0.4554,  0.6289,  ...,  0.5178,  0.3928,  0.2265],
          [ 0.2265,  0.2265,  0.2265,  ...,  0.2265,  0.2265,  0.2265]],

         [[ 0.5768,  0.4982,  0.4561,  ...,  0.4663,  0.4978,  0.2504],
          [ 0.5807,  0.6176,  0.7808,  ...,  0.5769,  0.3198,  0.2504],
          [ 0.2692,  0.6964,  0.5180,  ...,  0.6125,  0.4428,  0.2504],
          ...,
          [ 0.6028,  0.6727,  0.6887,  ...,  0.6235,  0.3214,  0.2504],
          [ 0.1296,  0.3230, -0.0013,  ...,  0.3186,  0.2721,  0.2504],
          [ 0.2504,  0.2504,  0.2504,  ...,  0.2504,  0.2504,  0.2504]]]
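
(If it helps, the numeric gap between the two cases could be quantified with something like the following, assuming both output dicts were saved to disk; the file names are placeholders.)

import torch

out1 = torch.load('output_case1.pt')    # placeholder file names
out2 = torch.load('output_case2.pt')

for key in out1:                        # e.g. the prediction and the reconstructed image
    a, b = out1[key].detach().cpu(), out2[key].detach().cpu()
    print(key, 'max abs diff:', (a - b).abs().max().item(),
          'allclose:', torch.allclose(a, b))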