How to keep prediction results the same every time the model runs

I’ve run into a problem: after saving and then reloading my model, the test accuracy changes a lot. It turns out some parameters from submodules weren’t saved.

To debug this, I need the test predictions to be identical on every run. I know that some layers are non-deterministic, so I use this code:

        torch.backends.cudnn.deterministic = True

Now, after loading the model, the test accuracy is the same on every run, but it’s still very different from that of the training model (by “training model” I mean the model before saving and reloading). I have already switched the model into eval mode with model.eval().
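For completeness, here is everything I’m doing to pin down the randomness before each run (a minimal sketch; the seed value 0 is arbitrary, and the CUDA call is skipped when no GPU is present):

```python
import random

import numpy as np
import torch

def seed_everything(seed=0):
    # Seed every RNG that can influence the run
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)
    # Ask cuDNN for deterministic kernels and disable auto-tuning
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```

With this in place, re-seeding and drawing random numbers twice gives identical values, so repeated evaluation runs match.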


Here are some questions:

  1. I put the line at the position shown below. Does it actually take effect there, or must it go at the very start of the whole project? My model has dropout layers.

    # train the model
    torch.backends.cudnn.deterministic = True
    # test the model
  2. After evaluating on the eval_data, I want to make the model non-deterministic again, since I think that is better (faster) for training. What should I do?

  3. My model performs differently on the test data each time, partly because of the non-determinism and partly because of the broken saving and reloading. I guess some submodules weren’t registered, so their parameters are not in model.state_dict(). Could you please tell me what kinds of submodules are registered in model.state_dict()?
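For question 2, this is roughly what I have in mind, though I’m not sure flipping the flag between phases is the right approach (evaluate and data_iter are my own hypothetical names):

```python
import torch

def evaluate(model, data_iter):
    # Pin down cuDNN while evaluating so repeated test runs match
    torch.backends.cudnn.deterministic = True
    model.eval()
    outputs = [model(x) for x in data_iter]
    # Switch back to the (faster) non-deterministic kernels for training
    torch.backends.cudnn.deterministic = False
    model.train()
    return outputs
```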

I use PyTorch 0.3.1.

Thanks a lot!

Could you post the definition of the model whose submodules weren’t registered?
Plain tensor attributes are not tracked by the module; you should register tensors as buffers or as nn.Parameters directly:

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.a = nn.Parameter(torch.randn(1))      # registered: shows up in state_dict
        self.b = torch.randn(1)                    # plain tensor: NOT registered
        self.register_buffer('c', torch.randn(1))  # registered as a buffer

    def forward(self, x):
        x = x + 1
        return x

model = MyModel()
print(model.state_dict())
> OrderedDict([('a', tensor([-0.0560])), ('c', tensor([-1.2713]))])
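To connect this back to your save/reload problem: only entries in state_dict() survive the round trip, so a plain tensor attribute like b above comes back re-randomised after reloading. A quick sketch, saving to an in-memory buffer just for the demonstration:

```python
import io

import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.a = nn.Parameter(torch.randn(1))      # registered parameter
        self.b = torch.randn(1)                    # plain tensor: NOT registered
        self.register_buffer('c', torch.randn(1))  # registered buffer

model = MyModel()
buf = io.BytesIO()
torch.save(model.state_dict(), buf)  # only 'a' and 'c' are written
buf.seek(0)

fresh = MyModel()                    # new random init for a, b, and c
fresh.load_state_dict(torch.load(buf))
# fresh.a and fresh.c now match the saved model; fresh.b is still random
```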