Loading model, test accuracy drops

Julius · August 10, 2018, 8:07am

I have a PyTorch model with a test accuracy of about 97%. I save it using torch.save(my_model.state_dict(), PATH), but whenever I reload it using my_model.load_state_dict(torch.load(PATH)) and test it on the same data using test_fn(my_model), my test accuracy drops to about 0.06%. The same thing happens if I use test_fn(my_model.eval()). Is there an extra step I need to take?

ptrblck · August 10, 2018, 8:19am

Could you post a small executable code snippet, so that we can have a look?

Julius · August 10, 2018, 1:22pm

my_model = GraphConv(w2i, p2i, l2i, r2i, s2i, words, pos, lems, 512, 512, 3) ## Initialise model & params
my_model.cuda()
loss_function = nn.NLLLoss()
optimizer = optim.Adam(my_model.parameters(), lr=0.001)

for epoch in range(15):
    ...   ### Apply training steps
    print(test_fn(my_model))  ### Will be over 95%
    torch.save(my_model.state_dict(), PATH)

...
my_model2 = GraphConv(w2i, p2i, l2i, r2i, s2i, words, pos, lems, 512, 512, 3) ## Initialise new model
my_model2.load_state_dict(torch.load(PATH))  ## Load from the same PATH used in torch.save above
print(test_fn(my_model2))  ### Is about 0.06%

Jelee · March 17, 2019, 10:46am

@ptrblck
Hello
Please help me
I have a similar problem, but I can't find what's wrong.
Thank you

avinash_m · March 17, 2019, 11:39am

Hi,
I faced a similar issue when saving with a .pth extension. When I saved as .pt, it worked fine.
Try this:

Saving the model using

torch.save(my_model.state_dict(), 'somename.pt')

Load the model using

my_model2 = GraphConv(w2i, p2i, l2i, r2i, s2i, words, pos, lems, 512, 512, 3)
my_model2.load_state_dict(torch.load('somename.pt'))
print(test_fn(my_model2))
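
One way to check whether the save/load step itself is at fault is a round-trip test. The sketch below is only an illustration: make_model is a hypothetical stand-in for GraphConv, and the file is written to a temporary directory. If the restored weights and the eval-mode outputs match, the difference in accuracy probably comes from somewhere outside the checkpoint, for example how the model is constructed or how the inputs are prepared.

```python
import os
import tempfile

import torch
import torch.nn as nn


def make_model():
    # Hypothetical stand-in for GraphConv; any nn.Module is saved and loaded the same way.
    return nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Dropout(0.5), nn.Linear(16, 4))


torch.manual_seed(0)
model = make_model()

# Save only the parameters and buffers, as in the thread.
path = os.path.join(tempfile.mkdtemp(), "somename.pt")
torch.save(model.state_dict(), path)

# Build a fresh model (different random weights) and load the checkpoint into it.
restored = make_model()
restored.load_state_dict(torch.load(path))

# Every tensor in the restored state dict should equal the saved one.
restored_state = restored.state_dict()
weights_match = all(
    torch.equal(value, restored_state[key]) for key, value in model.state_dict().items()
)

# In eval mode, both models should produce the same outputs for the same input.
model.eval()
restored.eval()
x = torch.randn(5, 8)
with torch.no_grad():
    outputs_match = torch.allclose(model(x), restored(x))

print("weights match:", weights_match)
print("outputs match:", outputs_match)
```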

Jelee · March 17, 2019, 12:01pm

Thank you! @avinash_m

I have one more question regarding this problem.
Maybe it is overfitting: my training loss drops steadily, but my validation loss stays at the same value.
Do you have any advice for me?
Thanks.

avinash_m · March 17, 2019, 12:16pm

Try adding dropout. If the layers are convolutional, add batch normalization.

Jelee · March 17, 2019, 12:19pm

@avinash_m
I already use dropout and data augmentation.
Do you mean using dropout several times?

My dropout layer looks like this:

self.dropout = torch.nn.Dropout3d(dropout_prob) # 0.3

I'm not sure about using BN because my batch size is 1. Even so, I'm currently using it after every conv layer (my model is i3d()).

Due to my GPU's limits, I can only use a batch size of 1.
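
To illustrate the batch-size concern, here is a small sketch, not taken from the i3d model. The layer size and input shape are made up for the example. With a single sample, BatchNorm3d in training mode normalizes with that sample's own per-channel statistics, while in eval mode it uses the running estimates, so the same input can produce quite different outputs in the two modes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical layer and input: 2 channels, one sample of shape 4x4x4.
bn = nn.BatchNorm3d(2)
x = torch.randn(1, 2, 4, 4, 4) * 3 + 5  # deliberately far from zero mean and unit variance

# Training mode: statistics come from this single sample.
bn.train()
y_train = bn(x)

# Eval mode: statistics come from the running estimates, which have
# only been updated once by the training-mode call above.
bn.eval()
with torch.no_grad():
    y_eval = bn(x)

# Largest element-wise difference between the two modes.
gap = (y_train - y_eval).abs().max().item()
print("max difference between train and eval outputs:", gap)
```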

Deeply · March 17, 2019, 12:24pm

After loading your model, you need to run model.eval() prior to testing to set dropout and batch normalization layers to evaluation mode. See this for more details.
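
A minimal sketch of what model.eval() changes, using a hypothetical stand-in model with a single dropout layer: in training mode two forward passes on the same input generally differ because dropout samples a new mask each time, while in eval mode dropout is disabled and the output is repeatable.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in model; the thread's models are not reproduced here.
model = nn.Sequential(nn.Linear(8, 8), nn.Dropout(0.5))
x = torch.ones(4, 8)

# Training mode: dropout is active, so each call samples a new mask.
model.train()
train_a = model(x)
train_b = model(x)

# Evaluation mode: dropout is a no-op, so repeated calls agree.
model.eval()
with torch.no_grad():
    eval_a = model(x)
    eval_b = model(x)

train_outputs_differ = not torch.equal(train_a, train_b)
eval_outputs_match = torch.equal(eval_a, eval_b)

print("train outputs differ:", train_outputs_differ)
print("eval outputs match:", eval_outputs_match)
```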

Deeply · March 17, 2019, 12:26pm

See my answer above.

avinash_m · March 17, 2019, 12:29pm

If your dataset is small, try using a pretrained model suited to your problem. For a small dataset, stacking up many layers doesn't help much.

Jelee · March 17, 2019, 12:53pm

@Deeply
Thanks, but I already call model.eval() for validation and testing.

@avinash_m
My data is 3D imaging, and I think I have enough data.
I'm trying to solve a 3D detection problem without using R-CNN, YOLO, or another common detection model.
I'm not sure whether using only a CNN will work for a detection problem, but that's what I'm doing.

Could you give me some advice?
Thanks