Optimizer params not in the optimizer state dict

Hello,

Following up on my previous post, I found out that I can add new params to the optimizer, but I somehow expected they would also show up in the optimizer's state dict.

Here is the full code:

import torch
import torch.optim as optim

model = torch.nn.Linear(5, 2)

# Initialize optimizer
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
new = torch.randn(5, 5)  # a plain tensor (no requires_grad), not an nn.Parameter
print(new)
# append a new param group directly to the optimizer's internal list
# (optimizer.add_param_group({'params': new}) is the supported way to do this)
optimizer.param_groups.append({'params': new})

print("---")
print(optimizer.param_groups[1])
print("---")
print(optimizer.state_dict())

And here is the output:

tensor([[ 0.7939, -0.3140, -0.3523,  0.8560, -0.4632],
        [-1.0833, -0.3396, -0.3500, -1.1630,  1.1092],
        [ 1.4704,  0.1906, -2.1621,  0.4114,  0.6820],
        [ 0.2702, -0.4855,  0.3020, -1.2136,  0.0537],
        [ 0.1583, -0.4217, -0.3520, -0.5498, -0.9010]])
---
{'params': tensor([[ 0.7939, -0.3140, -0.3523,  0.8560, -0.4632],
        [-1.0833, -0.3396, -0.3500, -1.1630,  1.1092],
        [ 1.4704,  0.1906, -2.1621,  0.4114,  0.6820],
        [ 0.2702, -0.4855,  0.3020, -1.2136,  0.0537],
        [ 0.1583, -0.4217, -0.3520, -0.5498, -0.9010]])}
---
{'state': {}, 'param_groups': [{'lr': 0.001, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0, 'nesterov': False, 'params': [140147531102680, 140147531102752]}, {'params': [140147531031176, 140147531102896, 140147531031176, 140147531102896, 140147531031176]}]}

The problem is the output: the state dict doesn't contain the actual params. Instead it has numbers like 140147531102680. Can someone comment on why?


These numbers represent the ids of the parameters you’ve passed to the optimizer.
E.g. if your model has a linear layer called fc1, you can check the id using:

print(id(model.fc1.weight))

and you should find this id in the optimizer's state_dict as well.
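
For example, here is a quick consistency check (just a sketch; it assumes your PyTorch version stores the raw Python id()s in the state dict, as in the output above, while newer releases store integer indices instead):

import torch
import torch.optim as optim

model = torch.nn.Linear(5, 2)
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# map each parameter's id to its name
param_ids = {id(p): name for name, p in model.named_parameters()}

# entries recorded for the first param group in the state dict
saved = optimizer.state_dict()['param_groups'][0]['params']

for entry in saved:
    print(entry, '->', param_ids.get(entry, '<not found, probably an index>'))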


Aha, I had a vague hint that this might be the case. Great.
But then if I

torch.save(optimizer.state_dict(),...)

that would not save the parameters themselves. That's a pity; I had assumed the state dict would carry everything about the optimizer. Any comments?

The optimizer without the actual model would be useless, wouldn’t it? If you save both, you could continue the training after loading them.

Yes, but when I save the optimizer's state dict, I haven't saved the parameters, so I need to save them separately.
For the model, on the other hand, saving the model's state_dict saves everything.

state = {
    'epoch': epoch,
    'state_dict': model.state_dict(),
    'optimizer': optimizer.state_dict(),
    # add params in here also...
}
torch.save(state, filepath)

That would be a decent way to save everything.
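
For completeness, the loading side could then look roughly like this (a minimal sketch assuming the checkpoint dict above; filepath is whatever path you saved to):

checkpoint = torch.load(filepath)
model.load_state_dict(checkpoint['state_dict'])
optimizer.load_state_dict(checkpoint['optimizer'])
epoch = checkpoint['epoch']
# any extra tensors stored in the checkpoint would have to be read back
# and re-attached to the optimizer manually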

Hi,

Is there a way of passing in the id and getting the layer back? In other words: the reverse operation?

Thank you.
Arun

You could probably iterate over all parameters, get their ids, and compare them to the given id:

from torchvision import models

model = models.resnet18()
my_id = id(model.fc.weight)

# walk all named parameters and match on the id
for name, param in model.named_parameters():
    if my_id == id(param):
        print(name, 'found')
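
If you need this lookup more than once, you could also build a reverse map in one pass (same idea, just cached in a dict):

from torchvision import models

model = models.resnet18()

# map parameter id -> parameter name
name_by_id = {id(p): name for name, p in model.named_parameters()}

print(name_by_id[id(model.fc.weight)])  # 'fc.weight'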