Why do I have to load the optimizer state dict for pytorch in order to make a good prediction, which is not for training?

Could you describe the issue a bit more or post a dummy code snippet that illustrates it?
Based on your description, it seems that something like this:

import torch

# 1. restore the model weights
model = MyModel()
model.load_state_dict(torch.load(...))
# 2. restore the optimizer state (normally only needed to resume training)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
optimizer.load_state_dict(torch.load(...))
# 3. switch to eval mode and predict
model.eval()
output = model(input)

works better when #2 is skipped?
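For pure inference the optimizer state should not be needed at all. A minimal sketch of what I would expect to work (MyModel, input, and the file name "model.pth" are placeholders):

import torch

# Load only the model weights; no optimizer is created at all.
model = MyModel()
model.load_state_dict(torch.load("model.pth"))
model.eval()  # put dropout/batchnorm layers into eval behavior

with torch.no_grad():  # disable gradient tracking for prediction
    output = model(input)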


Sorry, I should have replied here: Did any one know how to load the pth pre-trained model from fastai to pytorch?

In that thread, Jeremy said the reason learn.model(input_image) does not work is an opt (optimizer) problem, even though I set requires_grad to False and called learn.model.eval(). I was under the impression that the optimizer is only used for training, and I just want to do prediction. Why do I need the optimizer?

Does skipping #2 normally work better?
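As an aside, a common checkpoint pattern keeps the two state dicts separate, so inference never has to touch the optimizer at all. A minimal sketch (the file name "checkpoint.pth" and the model/optimizer objects are assumed to exist):

import torch

# Saving: store both state dicts so training can be resumed later.
torch.save({
    "model": model.state_dict(),
    "optimizer": optimizer.state_dict(),
}, "checkpoint.pth")

# Inference: only the model weights are needed.
checkpoint = torch.load("checkpoint.pth")
model.load_state_dict(checkpoint["model"])
model.eval()

# Resuming training: also restore the optimizer, so buffers such as
# SGD momentum or Adam moments continue where they left off.
optimizer.load_state_dict(checkpoint["optimizer"])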