Hi,
I have a problem with optimizer.step(): it never seems to update my weights. Here is my training code:
previous = model.head[12].weight
for i, (image, target) in enumerate(bar):
    image, target = image.to(device), target.to(device)
    b_size = image.size()[0]
    output = model(image)
    loss = criterion(output, target)
    now = model.head[12].weight
    print('=' * 10)
    print(previous, now)
    print(model.head[12].weight.grad)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
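For context, the rest of the setup looks roughly like this (bar in the loop is the batch iterator, e.g. a tqdm-wrapped DataLoader). The backbone, the head layers before index 12, the loss, and the optimizer below are simplified stand-ins, not my exact model:

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Placeholder backbone; the real one is larger.
        self.backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 32))
        # 12 placeholder modules (indices 0-11) followed by the final
        # nn.Linear(32, 1), so model.head[12] is that last layer.
        layers = []
        for _ in range(6):
            layers += [nn.Linear(32, 32), nn.ReLU()]
        layers.append(nn.Linear(32, 1))
        self.head = nn.Sequential(*layers)

    def forward(self, x):
        return self.head(self.backbone(x))

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
model = Net().to(device)
criterion = nn.BCEWithLogitsLoss()                          # stand-in loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # stand-in optimizer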
The layer at index 12, model.head[12], is nn.Linear(32, 1).
The previous and now variables always print exactly the same values:
tensor([[ 0.0034, -0.2354, -0.1383, -0.2392, -0.5482, 0.4603, 0.1331, 0.3651,
-0.1942, -0.1649, 0.2016, 0.3514, 0.2886, -0.0520, 0.1155, -0.3433,
-0.3107, 0.1798, 0.0551, -0.0161, 0.5021, 0.1313, 0.3340, 0.2885,
0.3380, -0.1972, 0.0067, 0.4465, -0.0814, -0.3551, 0.1053, 0.2180]],
device='cuda:0', requires_grad=True)
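It is not just a rounding coincidence in the printout; adding a check like this at the end of the loop body (hypothetical snippet, not in the code above) prints True on every iteration:

# Added after optimizer.step(); True every iteration, so the two
# tensors really do hold identical values.
print(torch.equal(previous, now))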
This happens even though that layer does get gradients:
# Iter 1: None
# Iter 2:
tensor([[-0.4360, -0.2952, -0.6197, -0.0216, -1.0340, 0.7521, 0.2088, 0.7050,
-0.6158, 0.4012, 1.0144, -0.5273, 0.6514, 0.2796, -0.6339, -0.8657,
-0.2789, 0.2030, -0.0475, 0.4096, 0.9676, 0.3229, 0.5224, 0.1033,
0.5484, -0.3745, -0.1743, 0.3179, -0.5385, -0.2508, 0.3502, 0.7884]],
device='cuda:0')
# Iter 3:
tensor([[-0.5874, -0.7407, -0.6531, 0.2646, -1.6728, 0.4518, 0.1702, 1.0924,
-1.2420, 0.1817, 0.7247, 0.9656, 0.0686, 0.1332, 0.3749, -1.8020,
-2.0433, 0.4756, -0.8883, -0.4920, 1.8353, -0.5856, -0.2931, 0.8012,
1.2388, -0.1963, 0.3465, 1.1411, -0.7416, -0.5985, -0.1804, 0.0322]],
device='cuda:0')
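One check I have not tried yet is confirming that model.head[12].weight is among the tensors the optimizer actually updates. A sketch of what I have in mind:

# Sanity check: is the layer's weight one of the parameters the
# optimizer steps over?
w = model.head[12].weight
registered = any(w is p for group in optimizer.param_groups
                 for p in group['params'])
print(registered)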
Is this a bug, or did I do something wrong?