Hi,
I just noticed that if I save a model and reload it to continue training, dropout is the problem.
I saved everything I could in order to resume training exactly.
I saved the following:
"net": self.net.state_dict(),
"optimizer": self.optimizer.state_dict(),
"scheduler": self.scheduler.state_dict(),
'train_data_provider': self.train_data_provider.get_state(),
'test_data_provider': self.test_data_provider.get_state(),
'torch_random': torch.get_rng_state(),
'torch_cuda_state': torch.cuda.get_rng_state()}
But I found I still cannot reload the net from, say, epoch 10 and reproduce the same results. When I looked into my network, I found that the dropout layer is the fundamental problem: if I remove it, the training procedure is exactly the same after loading the pre-trained weights.
So I hope that PyTorch could be modified so that the state of dropout's random generator is saved and restored when we use net.state_dict().
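In the meantime, restoring the global (and, on GPU, the CUDA) RNG state right before resuming should reproduce the dropout masks, since dropout draws from those generators. A minimal sketch of what I mean; the helper names get_rng_states/set_rng_states are my own, not PyTorch API:

```python
import torch
import torch.nn as nn

def get_rng_states():
    """Snapshot the RNG states that dropout depends on."""
    states = {"torch_random": torch.get_rng_state()}
    if torch.cuda.is_available():
        # one state per visible GPU
        states["torch_cuda_state"] = torch.cuda.get_rng_state_all()
    return states

def set_rng_states(states):
    """Restore the RNG states captured by get_rng_states."""
    torch.set_rng_state(states["torch_random"])
    if torch.cuda.is_available() and "torch_cuda_state" in states:
        torch.cuda.set_rng_state_all(states["torch_cuda_state"])

# Demo: after restoring the state, dropout produces the same mask.
drop = nn.Dropout(p=0.5)  # modules are in training mode by default
x = torch.ones(8)
saved = get_rng_states()
a = drop(x)
set_rng_states(saved)
b = drop(x)
assert torch.equal(a, b)  # identical dropout masks
```

This works because torch.get_rng_state/torch.set_rng_state cover the CPU generator, while torch.cuda.get_rng_state_all/set_rng_state_all cover every GPU's generator, so both CPU and CUDA dropout become reproducible across a checkpoint.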
Best regards