Under The Hood: Optimizers and Parameter Lists

I am trying to understand what is happening under the hood of any generic optimizer when it is passed a list of parameters. (Specifically, I am playing with CycleGANs and am trying to emulate the more TensorFlow-style approach of passing parameters to special-purpose optimizers, rather than the PyTorch style of selective detachment. This may be the only TensorFlow-style idiom I prefer over PyTorch…)

My confusion stems from the following test code:

device       = torch.device("cuda:1") 
netD_A, netD_B = models.Discriminator1(), models.Discriminator1()
netD_A, netD_B = netD_A.to(device), netD_B.to(device)

print(" ********** Net D_A **********")
print(len(list(netD_A.parameters())))
for name, param in netD_A.named_parameters():
    print(id(name), '\t\t', name)

print()
print(" ********** Opt D_A **********")
opt_D_A = torch.optim.Adam(netD_A.parameters(), lr=disc_lr, betas=opt_betas)        
print(len(opt_D_A.state_dict()['param_groups'][0]['params']))
pprint(opt_D_A.state_dict())
pprint(opt_D_A.state_dict()['param_groups'][0]['params'])

Which produces the following output:

********** Net D_A **********
30
140598512227440 conv2d_1.weight
140598512227824 conv2d_1.bias
140598517391840 conv2d_1G.weight
140598512228080 conv2d_1G.bias
140598512207528 ds1.conv2d.weight
140598512228336 ds1.conv2d.bias
140598512207600 ds1.conv2dG.weight
140598512207672 ds1.conv2dG.bias
140598512207744 ds1.inst2d.weight
140598512228656 ds1.inst2d.bias
140598512207816 ds1.inst2dG.weight
140598512207888 ds1.inst2dG.bias
140598512207960 ds2.conv2d.weight
140598512228976 ds2.conv2d.bias
140598512208032 ds2.conv2dG.weight
140598512208104 ds2.conv2dG.bias
140598512208176 ds2.inst2d.weight
140598512229296 ds2.inst2d.bias
140598512208248 ds2.inst2dG.weight
140598512208320 ds2.inst2dG.bias
140598512208392 ds3.conv2d.weight
140598512229616 ds3.conv2d.bias
140598512208464 ds3.conv2dG.weight
140598512208536 ds3.conv2dG.bias
140598512208608 ds3.inst2d.weight
140598512229936 ds3.inst2d.bias
140598512208680 ds3.inst2dG.weight
140598512208752 ds3.inst2dG.bias
140598512230192 fc.weight
140598512688296 fc.bias

********** Opt D_A **********
30
{'param_groups': [{'amsgrad': False,
'betas': (0.5, 0.999),
'eps': 1e-08,
'lr': 0.0001,
'params': [140598512670112,
140598512670184,
140598512670256,
140598512670328,
140598512670400,
140598512670472,
140598512670544,
140598512670616,
140598512670688,
140598512670760,
140598512670832,
140598512670904,
140598512670976,
140598512671048,
140598512671120,
140598512671192,
140598512671264,
140598512671336,
140598512671408,
140598512671480,
140598512671552,
140598512671624,
140598512671696,
140598512671768,
140598512671840,
140598512671912,
140598512671984,
140598512672056,
140598512672128,
140598512672200],
'weight_decay': 0}],
'state': {}}
[140598512670112,
140598512670184,
140598512670256,
140598512670328,
140598512670400,
140598512670472,
140598512670544,
140598512670616,
140598512670688,
140598512670760,
140598512670832,
140598512670904,
140598512670976,
140598512671048,
140598512671120,
140598512671192,
140598512671264,
140598512671336,
140598512671408,
140598512671480,
140598512671552,
140598512671624,
140598512671696,
140598512671768,
140598512671840,
140598512671912,
140598512671984,
140598512672056,
140598512672128,
140598512672200]

This almost makes sense to me. The first block of output, corresponding to the model, is a list of object identifiers and names of the various parameters. The names make sense, the number of parameters makes sense, etc.

The second block, corresponding to the optimizer, is where I have pried out a similar list of object identifiers (and a glance at the source code for the optimizer base class indicates that is what they are). The quantity is correct, but the IDs don't match those of the model's parameters.

There is probably some layer of indirection or container that I haven't grasped, but I am very curious: what is actually happening here?

If you print `id(param)` in the first print loop, you'll get the same IDs. :wink:
Currently you are printing the id of the `name` string, not of the parameter itself.
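To spell that out: `named_parameters()` yields `(name, param)` tuples, so `id(name)` is the identity of the name *string*. The optimizer does not copy anything; `opt.param_groups[0]['params']` holds the very same `Parameter` objects as the model. A minimal sketch (using a stand-in `nn.Linear` rather than the `Discriminator1` model from this thread) that demonstrates this:

```python
import torch

net = torch.nn.Linear(4, 2)  # stand-in for any nn.Module
opt = torch.optim.Adam(net.parameters(), lr=1e-4)

# id() of the Parameter objects themselves, not of their name strings
model_ids = [id(p) for p in net.parameters()]
optim_ids = [id(p) for p in opt.param_groups[0]['params']]

print(model_ids == optim_ids)  # True: the optimizer holds the same objects
```

Note that comparing against the live `param_groups` attribute is more robust than comparing against `state_dict()`, since in recent PyTorch versions `state_dict()['param_groups'][0]['params']` contains integer indices rather than raw `id()` values.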

D’ohhh!

Yes, yes of course. The kind of error I could have stared at for days and probably not found.

Thank you.
