Gradient changes slightly even when not added to the optimizer

Hi, I encountered a strange effect in PyTorch. I create two models but pass only model_1's parameters to the optimizer:

import torch.optim as optim

model_1 = Net()
model_2 = Net()

# only model_1's parameters should be updated by the optimizer
optimizer = optim.Adam([{'params': model_1.parameters(), 'lr': 0.001},
                        ])

However, I found that the gradients of model_2 also change slightly between iterations:

model_1.conv1.weight.grad =  tensor([0.1468, 0.1595, 0.1519, 0.1456, 0.1540], device='cuda:1')
model_2.conv1.weight.grad =  tensor([ 0.0480,  0.0073, -0.0298, -0.0482, -0.0545], device='cuda:1')

model_1.conv1.weight.grad =  tensor([0.0545, 0.0608, 0.0546, 0.0486, 0.0517], device='cuda:1')
model_2.conv1.weight.grad =  tensor([ 0.0503,  0.0087, -0.0272, -0.0466, -0.0535], device='cuda:1')
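
To rule out a printing artifact, the gradients can also be cloned and compared after the next iteration. A minimal sketch (the grads_changed helper and the surrounding loop are only illustrative, not my original code):

import torch

def grads_changed(model, saved_grads):
    # True if any parameter gradient differs from the saved copies
    return any(not torch.equal(s, p.grad)
               for s, p in zip(saved_grads, model.parameters()))

# inside the training loop, where model_2 is *not* registered with the optimizer:
# saved = [p.grad.clone() for p in model_2.parameters()]
# ... next forward / backward / optimizer.step() ...
# print(grads_changed(model_2, saved))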

Could you post a minimal, executable code snippet reproducing this issue, please?
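
Something along these lines would already be enough to debug it. This is only a sketch of the expected structure; the Net definition, the input shape, and the loss below are placeholders I made up, not your code:

import torch
import torch.nn as nn
import torch.optim as optim

class Net(nn.Module):  # placeholder stand-in for your Net
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 5, kernel_size=3)

    def forward(self, x):
        return self.conv1(x)

model_1 = Net()
model_2 = Net()

# only model_1's parameters are registered with the optimizer
optimizer = optim.Adam([{'params': model_1.parameters(), 'lr': 0.001}])

x = torch.randn(1, 3, 8, 8)
for step in range(2):
    optimizer.zero_grad()
    # assumed setup: the loss depends on both models, so both receive gradients
    loss = (model_1(x) + model_2(x)).sum()
    loss.backward()
    optimizer.step()
    print('model_1.conv1.weight.grad =', model_1.conv1.weight.grad.flatten()[:5])
    print('model_2.conv1.weight.grad =', model_2.conv1.weight.grad.flatten()[:5])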