Hi, I encountered a strange effect in PyTorch.
import torch.optim as optim

model_1 = Net()
model_2 = Net()
optimizer = optim.Adam([
    {'params': model_1.parameters(), 'lr': 0.001},  # only update model_1
])
However, I found that the gradients of model_2 also change slightly, even though model_2 is not passed to the optimizer.
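The values below are printed with something like this (a sketch; flattening and slicing to five values is just how I sample the conv1 weight gradient for display):

print('model_1.conv1.weight.grad =', model_1.conv1.weight.grad.flatten()[:5])
print('model_2.conv1.weight.grad =', model_2.conv1.weight.grad.flatten()[:5])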
model_1.conv1.weight.grad = tensor([0.1468, 0.1595, 0.1519, 0.1456, 0.1540], device='cuda:1')
model_2.conv1.weight.grad = tensor([ 0.0480, 0.0073, -0.0298, -0.0482, -0.0545], device='cuda:1')
model_1.conv1.weight.grad = tensor([0.0545, 0.0608, 0.0546, 0.0486, 0.0517], device='cuda:1')
model_2.conv1.weight.grad = tensor([ 0.0503, 0.0087, -0.0272, -0.0466, -0.0535], device='cuda:1')