Hi. I am modifying the gradients manually to explore how a modified gradient changes the final performance.
I am wondering why there is no difference in performance at all, even when every gradient is multiplied by 0.000001:
optimizer.zero_grad()
loss_batch.backward()
# Scale every gradient by a tiny constant before the optimizer step
for param in model_head.parameters():
    if param.grad is not None:
        param.grad.mul_(0.000001)
optimizer.step()
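For context, here is a minimal self-contained version of what I am doing. The linear layer, random data, and learning rate are only toy stand-ins for my actual model_head, loss_batch, and settings:

import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-ins for my actual model_head and loss_batch
model_head = nn.Linear(10, 1)
optimizer = torch.optim.Adam(model_head.parameters(), lr=1e-3)

x = torch.randn(32, 10)
y = torch.randn(32, 1)
params_before = [p.detach().clone() for p in model_head.parameters()]

optimizer.zero_grad()
loss_batch = nn.functional.mse_loss(model_head(x), y)
loss_batch.backward()

# Scale every gradient by a tiny constant before the optimizer step
for param in model_head.parameters():
    if param.grad is not None:
        param.grad.mul_(0.000001)

optimizer.step()

# How far did the parameters actually move after one step?
for before, after in zip(params_before, model_head.parameters()):
    print((after.detach() - before).abs().max().item())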
Could this be because I am using the Adam optimizer?
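This is the kind of check I have in mind to test that hypothesis: take one Adam step and one plain SGD step on identically scaled gradients and compare how far the parameters move. Again a toy setup, with arbitrary hyperparameters, so please correct me if this is not a fair comparison:

import torch
import torch.nn as nn

def max_update_after_one_step(optimizer_cls, scale, **opt_kwargs):
    """Take one optimizer step with all gradients multiplied by `scale`
    and return the largest resulting parameter change."""
    torch.manual_seed(0)
    model = nn.Linear(10, 1)
    optimizer = optimizer_cls(model.parameters(), **opt_kwargs)
    x = torch.randn(32, 10)
    y = torch.randn(32, 1)
    before = [p.detach().clone() for p in model.parameters()]

    optimizer.zero_grad()
    nn.functional.mse_loss(model(x), y).backward()
    for p in model.parameters():
        if p.grad is not None:
            p.grad.mul_(scale)
    optimizer.step()
    return max((p.detach() - b).abs().max().item()
               for p, b in zip(model.parameters(), before))

for scale in (1.0, 0.000001):
    adam = max_update_after_one_step(torch.optim.Adam, scale, lr=1e-3)
    sgd = max_update_after_one_step(torch.optim.SGD, scale, lr=1e-3)
    print(f"scale={scale}: Adam max update {adam:.2e}, SGD max update {sgd:.2e}")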
Thank you.
Best Regards