I am trying to modify my gradients by multiplying them by a constant. As a test, I multiply the gradients by zero, so I expect the loss to stay constant.
However, it looks like the model still updates the weights of the network even though I have forced the gradients to be zero. Why are the weights still being updated when the gradients are zero?
import numpy as np
import torch

GRADIENT_MULTIPLIER = 0.

model.train()
for epoch in range(100):
    for ind, (input_data, labels) in enumerate(train_iterator):
        optimizer.zero_grad()
        logits = model(input_data, labels)
        loss = model.loss(logits, labels)
        loss.backward()
        # Scale every parameter gradient in place (here: zero them out).
        for p in model.parameters():
            p.grad *= GRADIENT_MULTIPLIER
        # Global L2 norm over all gradients; should be 0.0 after scaling.
        global_norm = np.sqrt(sum(torch.sum(p.grad ** 2).item() for p in model.parameters()))
        print(f"global_norm : {global_norm}")
        optimizer.step()
        print(f"loss : {loss.item()}")
The output is:
global_norm : 0.0
loss : 0.6851892471313477
global_norm : 0.0
loss : 0.6985365748405457
global_norm : 0.0
loss : 0.6622101664543152
global_norm : 0.0
loss : 0.45273280143737793
global_norm : 0.0
loss : 0.8967741131782532
global_norm : 0.0
loss : 0.28941503167152405
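To verify whether the weights really do move, one check (a minimal sketch reusing the `model` and `optimizer` from above; the drift measurement is my own addition, not part of the training loop) is to snapshot the parameters before optimizer.step() and measure how far they drift:

import torch

# Snapshot every parameter before the update.
before = [p.detach().clone() for p in model.parameters()]
optimizer.step()
# Total L2 distance the parameters moved during this single step.
with torch.no_grad():
    drift = sum((p - b).norm().item() for p, b in zip(model.parameters(), before))
print(f"parameter drift after step : {drift}")

If this drift were nonzero despite zero gradients, it would point at optimizer-side terms such as weight_decay, which can move the weights independently of the gradient that was just zeroed.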