Optimizer params not in the optimizer state dict

Hello,

Following up on my previous post, I found out that I can add new params to the optimizer, but I somehow expected they would also show up in the optimizer's state dict.

Here is the full code:

import torch
import torch.optim as optim

model = torch.nn.Linear(5, 2)

# Initialize optimizer
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
new = torch.randn(5, 5)  # a plain tensor (no requires_grad), not an nn.Parameter
print(new)
# append a new param group directly to the optimizer's internal list
# (optimizer.add_param_group({'params': new}) is the supported way to do this)
optimizer.param_groups.append({'params': new})

print("---")
print(optimizer.param_groups[1])
print("---")
print(optimizer.state_dict())

And here is the output:

tensor([[ 0.7939, -0.3140, -0.3523,  0.8560, -0.4632],
        [-1.0833, -0.3396, -0.3500, -1.1630,  1.1092],
        [ 1.4704,  0.1906, -2.1621,  0.4114,  0.6820],
        [ 0.2702, -0.4855,  0.3020, -1.2136,  0.0537],
        [ 0.1583, -0.4217, -0.3520, -0.5498, -0.9010]])
---
{'params': tensor([[ 0.7939, -0.3140, -0.3523,  0.8560, -0.4632],
        [-1.0833, -0.3396, -0.3500, -1.1630,  1.1092],
        [ 1.4704,  0.1906, -2.1621,  0.4114,  0.6820],
        [ 0.2702, -0.4855,  0.3020, -1.2136,  0.0537],
        [ 0.1583, -0.4217, -0.3520, -0.5498, -0.9010]])}
---
{'state': {}, 'param_groups': [{'lr': 0.001, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0, 'nesterov': False, 'params': [140147531102680, 140147531102752]}, {'params': [140147531031176, 140147531102896, 140147531031176, 140147531102896, 140147531031176]}]}

The problem is the output: the state dict doesn't contain the actual params. Instead it has numbers like 140147531102680. Can someone comment on why?


These numbers represent the ids of the parameters you’ve passed to the optimizer.
E.g. if your model has a linear layer called fc1, you can check the id using:

print(id(model.fc1.weight))

and you should find this id in the optimizer's state_dict as well.
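
For example, here is a quick consistency check (just a sketch; it assumes your PyTorch version stores the raw Python id()s in the state dict, as in the output above, while newer releases store integer indices instead):

import torch
import torch.optim as optim

model = torch.nn.Linear(5, 2)
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# map each parameter's id to its name
param_ids = {id(p): name for name, p in model.named_parameters()}

# entries recorded for the first param group in the state dict
saved = optimizer.state_dict()['param_groups'][0]['params']

for entry in saved:
    print(entry, '->', param_ids.get(entry, '<not found, probably an index>'))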


Aha, I had a vague hint that this might be the case. Great.
But then if I

torch.save(optimizer.state_dict(),...)

that would not save the parameters themselves. That's a pity; I had assumed the state dict would carry everything about the optimizer. Any comments?

The optimizer without the actual model would be useless, wouldn’t it? If you save both, you could continue the training after loading them.

Yes, but when I save the optimizer's state dict, I haven't saved the parameters, so I need to save them separately.
For the model, on the other hand, saving the model's state_dict saves everything.

state = {
    'epoch': epoch,
    'state_dict': model.state_dict(),
    'optimizer': optimizer.state_dict(),
    # add params in here also...
}
torch.save(state, filepath)

That would be a decent way to save everything.
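
For completeness, the loading side could then look roughly like this (a minimal sketch assuming the checkpoint dict above; filepath is whatever path you saved to):

checkpoint = torch.load(filepath)
model.load_state_dict(checkpoint['state_dict'])
optimizer.load_state_dict(checkpoint['optimizer'])
epoch = checkpoint['epoch']
# any extra tensors stored in the checkpoint would have to be read back
# and re-attached to the optimizer manually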

Hi,

Is there a way of passing in the id and getting the layer back? In other words: the reverse operation?

Thank you.
Arun

You could probably iterate over all parameters, get their ids, and compare them to the given id:

from torchvision import models

model = models.resnet18()
my_id = id(model.fc.weight)

# walk all named parameters and match on the id
for name, param in model.named_parameters():
    if my_id == id(param):
        print(name, 'found')
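
If you need this lookup more than once, you could also build a reverse map in one pass (same idea, just cached in a dict):

from torchvision import models

model = models.resnet18()

# map parameter id -> parameter name
name_by_id = {id(p): name for name, p in model.named_parameters()}

print(name_by_id[id(model.fc.weight)])  # 'fc.weight'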