Optimizer params not looking good

When I print new

new = torch.randn(5, 5)
print(new)
optimizer.param_groups.append({'params': new})

I get


tensor([[-0.6726, -1.2241, -1.2249, -0.3685, -0.7413],
        [ 0.6767, -0.5882, -1.0260, -0.4722, -1.9021],
        [-0.9020, -0.8121,  1.3669, -1.5776,  0.3921],
        [ 1.2827, -0.5978, -1.2384, -1.2176,  0.5792],
        [-0.0673, -0.5546, -1.8438,  0.4854, -0.6051]])

but then when I try to print the optimizer

print("Optimizer's state_dict:")
print(optimizer.state_dict)

I get:

Optimizer's state_dict:
<bound method Optimizer.state_dict of SGD (
Parameter Group 0
    dampening: 0
    lr: 0.001
    momentum: 0.9
    nesterov: False
    weight_decay: 0
)>

I cannot find my 5x5 params in the optimizer. Any idea?
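
Side note: optimizer.state_dict is printed without parentheses here, so Python shows the bound method object instead of its return value. Calling the method returns a plain dict; a minimal sketch, assuming the same optimizer as above:

print("Optimizer's state_dict:")
# calling state_dict() returns a dict with 'state' and 'param_groups' keys
print(optimizer.state_dict())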

Could you try to print optimizer.param_groups[0]?


It returns

{'params': [Parameter containing:
tensor([[ 0.0788, -0.3615,  0.4415,  0.2667, -0.3098],
        [-0.3995,  0.0706,  0.4255, -0.3650,  0.3453]], requires_grad=True), Parameter containing:
tensor([ 0.1224, -0.2431], requires_grad=True)], 'lr': 0.001, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0, 'nesterov': False}

Here is the full code:


import torch
import torch.optim as optim

model = torch.nn.Linear(5, 2)

# Initialize optimizer
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
new = torch.randn(5, 5)
print(new)
optimizer.param_groups.append({'params': new})

print("---")
print(optimizer.param_groups[0])

I would like to see new tensor weights somehow in the optimizer.
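
For reference, a quick way to see everything the optimizer is tracking is to walk over all of its parameter groups (a sketch that assumes the snippet above has just run):

# list every parameter group and the tensors it holds
print(len(optimizer.param_groups))   # 2 after the append above
for i, group in enumerate(optimizer.param_groups):
    print(i, group['params'])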

I think you can try optimizer.add_param_group(): https://pytorch.org/docs/stable/optim.html#torch.optim.Optimizer.add_param_group

If you look at the source code of this function, there is more going on under the hood.
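
Roughly, the extra work it does is normalizing the new group and copying in the optimizer's defaults before appending, which a bare param_groups.append skips. A simplified sketch of the idea (not the actual source; it assumes the optimizer and new from the snippets above, and it leaves out the validation checks):

# add_param_group, in spirit: normalize 'params' to a list and fill in the
# missing hyperparameters from optimizer.defaults (lr, momentum, dampening, ...)
new_group = {'params': [new]}
for key, default in optimizer.defaults.items():
    new_group.setdefault(key, default)
optimizer.param_groups.append(new_group)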


I am afraid the result is completely the same as with param_groups.append in my case.

import torch
import torch.optim as optim

model = torch.nn.Linear(5, 2)

# Initialize optimizer
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
new = torch.randn(5, 5)
print(new)

optimizer.add_param_group({"params": new})
#optimizer.param_groups.append({'params': new})

print("---")
print(optimizer.param_groups[0])

Hello,

I tried your code snippet on my machine, and it works fine.

[{'nesterov': False, 'weight_decay': 0, 'lr': 0.001, 'dampening': 0, 'params': [Parameter containing:
tensor([[ 0.2869, -0.1661,  0.1697,  0.4048, -0.3160],
        [-0.1766,  0.2777, -0.0006, -0.0400,  0.1492]], requires_grad=True), Parameter containing:
tensor([-0.2891,  0.1473], requires_grad=True)], 'momentum': 0.9}, {'nesterov': False, 'weight_decay': 0, 'lr': 0.001, 'dampening': 0, 'params': [tensor([[ 1.1468, -0.0929,  0.4642,  0.1088, -0.5372],
        [ 1.3683, -0.6719,  0.6465, -0.5709,  0.1006],
        [-0.9102,  0.1764,  0.4354,  1.3183, -0.1849],
        [-0.0164, -0.8461, -0.5933,  0.8613,  2.3292],
        [ 0.6723,  0.7489,  0.4820,  1.0486,  0.4104]])], 'momentum': 0.9}]

:sweat_smile: Why does this error occur?

>>> new.require_grad = True
>>> optimizer.add_param_group({"params": new})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/wangshen/.local/lib/python3.5/site-packages/torch/optim/optimizer.py", line 193, in add_param_group
    raise ValueError("optimizing a parameter that doesn't require gradients")
ValueError: optimizing a parameter that doesn't require gradients
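
The attribute is spelled requires_grad (with a trailing s), so new.require_grad = True only attaches an unrelated Python attribute and the tensor still does not require gradients, which is what this PyTorch version's add_param_group complains about. A minimal sketch of the fix, assuming a fresh setup:

import torch
import torch.optim as optim

model = torch.nn.Linear(5, 2)
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

new = torch.randn(5, 5)
new.requires_grad_(True)                    # in-place; or torch.randn(5, 5, requires_grad=True)
optimizer.add_param_group({"params": new})  # no ValueError now
print(optimizer.param_groups[1]['params'][0].requires_grad)  # True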

OK @MariosOreo, it was in the second parameter group.

print(optimizer.param_groups[1]) returned:

{'params': [tensor([[-0.7205,  0.7753,  0.3941,  0.2800,  1.1731],
        [-0.4623,  0.6911,  0.1038,  1.3109,  0.0888],
        [-2.0659,  0.5808,  1.0869, -0.4624,  1.9842],
        [-0.5490,  0.6185, -0.3175, -0.4815,  1.5397],
        [-0.2848,  0.3884, -0.4705, -0.1319,  0.3818]])], 'lr': 0.001, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0, 'nesterov': False}

where ‘params’ is exactly what is in the new tensor.

I realized that after printing the optimizer object:

SGD (
Parameter Group 0
    dampening: 0
    lr: 0.001
    momentum: 0.9
    nesterov: False
    weight_decay: 0

Parameter Group 1
    dampening: 0
    lr: 0.001
    momentum: 0.9
    nesterov: False
    weight_decay: 0
)

That gave me the idea to look at the other parameter group.

I found out @Anuvabh that

optimizer.add_param_group({"params": new})
optimizer.param_groups.append({'params': new})

have exactly the same effect in my case.
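
They print the same way here, but they are not interchangeable: add_param_group wraps the tensor in a list and fills in the group's hyperparameters from the optimizer defaults, while a plain append stores the dict exactly as passed, so optimizer.step() would typically fail later (e.g. with a KeyError on group['lr']) for the appended group. A quick way to see the difference, assuming a fresh setup:

import torch
import torch.optim as optim

model = torch.nn.Linear(5, 2)
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

optimizer.add_param_group({"params": torch.randn(5, 5, requires_grad=True)})
print(optimizer.param_groups[1].keys())   # 'params' plus the SGD defaults (lr, momentum, ...)

optimizer.param_groups.append({"params": torch.randn(5, 5, requires_grad=True)})
print(optimizer.param_groups[2].keys())   # only 'params' - nothing for step() to read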