[resolved] Different outputs for the same inputs with loaded model

I’ve adapted this PyTorch tutorial to experiment with Named Entity Recognition (NER). It trains just fine and saves the model to a file. Now I want to copy this model file to a different machine and load it there. In principle, no problems as far as I can see.

My problem now is that I often get different results for the same input: (a) every time I load the model again, and (b) even when the model is loaded only once and I feed it the identical input. (b) in particular worries me.

I’ve searched online for this problem. Here are some things I’ve already tried:

  • I don’t have any dropout layers, so I don’t see where any randomness could come from
  • After loading the model, I call model.eval(), just to be sure (see the loading sketch after this list)
  • I’ve set torch.manual_seed(0)
  • Right now, I use a CPU machine only, so there shouldn’t be any GPU/CPU issues
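For context, the loading code on the second machine looks roughly like the sketch below. The class LSTMTagger, the hyperparameters, and the file name model.pt are placeholders standing in for my actual code, not the tutorial’s exact names:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fix the RNG, even though nothing here should be stochastic

# Minimal stand-in for the tagger architecture (assumed, not my real class)
class LSTMTagger(nn.Module):
    def __init__(self, vocab_size, tagset_size, embedding_dim=32, hidden_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, tagset_size)

    def forward(self, sentence_ids):
        emb = self.embed(sentence_ids)             # (seq_len, embedding_dim)
        lstm_out, _ = self.lstm(emb.unsqueeze(1))  # (seq_len, 1, hidden_dim)
        return self.out(lstm_out.squeeze(1))       # (seq_len, tagset_size)

model = LSTMTagger(vocab_size=5000, tagset_size=9)
model.load_state_dict(torch.load("model.pt", map_location="cpu"))
model.eval()  # inference mode: disables dropout / batch-norm updates (none used here)

with torch.no_grad():
    sentence = torch.tensor([12, 7, 33, 5])   # word indices of one test sentence
    scores_a = model(sentence)
    scores_b = model(sentence)
    print(torch.equal(scores_a, scores_b))    # same loaded model, same input
```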

Another thing I’ve tried: when, directly after the training routine has finished, I (1) give the network some test data, (2) save the network, (3) re-initialize the network, (4) load the saved model, and (5) give the network the same test data, then everything is consistent. Only when I load the model “outside” the training script do I get these different results for the same input.
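Expressed as code, that consistency check looks roughly like this (reusing the hypothetical LSTMTagger class from the sketch above; test_sentence is likewise a placeholder):

```python
import torch

# ... training loop has just finished; `model` and `test_sentence` are still in scope ...

model.eval()
with torch.no_grad():
    scores_before = model(test_sentence)              # (1) predictions right after training

torch.save(model.state_dict(), "model.pt")            # (2) save the network

model2 = LSTMTagger(vocab_size=5000, tagset_size=9)   # (3) re-initialize the network
model2.load_state_dict(torch.load("model.pt"))        # (4) load the saved model
model2.eval()

with torch.no_grad():
    scores_after = model2(test_sentence)              # (5) same test data again

print(torch.allclose(scores_before, scores_after))    # consistent in this setting
```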

At the moment I’m testing with rather small datasets, so the results are not very accurate. Still, I would expect the same output for the same input. What am I missing here?

Found the problem. Short answer: My stupidity.

Long answer: I did have randomness, just not in the network but in the data preparation step. After loading the data into an array, I first shuffled it before splitting it into training and test data. Since the vocabulary is built from the shuffled data, the word-to-index mapping is different on every run. So before loading the saved state, I was initializing the model each time with a different word-to-index mapping.
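To make the effect concrete, here is a minimal reproduction, independent of my actual code: building the vocabulary from already-shuffled data assigns different indices per run, so a given index no longer points at the embedding row that was trained for that word.

```python
import random

sentences = ["the cat sat", "a dog barked", "the dog ran"]

def build_vocab(sents):
    word_to_ix = {}
    for sent in sents:
        for word in sent.split():
            if word not in word_to_ix:
                word_to_ix[word] = len(word_to_ix)  # index = order of first appearance
    return word_to_ix

random.shuffle(sentences)        # shuffling BEFORE building the vocab ...
print(build_vocab(sentences))    # ... yields a different word -> index mapping per run
```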

I now use a pre-shuffled dataset file and skip the shuffling during the data preparation step. Now everything is consistent.
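An alternative (and probably more robust) fix would be to save the word-to-index mapping together with the weights, so the loading side never has to rebuild it. A sketch of that idea, again using the hypothetical LSTMTagger and assuming word_to_ix / tag_to_ix dicts exist at save time:

```python
import torch

# At save time: bundle the vocabulary and the weights in one checkpoint
torch.save({
    "state_dict": model.state_dict(),
    "word_to_ix": word_to_ix,
    "tag_to_ix": tag_to_ix,
}, "checkpoint.pt")

# At load time: restore the exact same mapping before encoding any input
checkpoint = torch.load("checkpoint.pt", map_location="cpu")
word_to_ix = checkpoint["word_to_ix"]
tag_to_ix = checkpoint["tag_to_ix"]

model = LSTMTagger(vocab_size=len(word_to_ix), tagset_size=len(tag_to_ix))
model.load_state_dict(checkpoint["state_dict"])
model.eval()
```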
