The .grad is zero, but the value changes

I see a very strange scenario for one parameter: its .grad is zero, yet its value still changes. What's wrong? Thanks.


There are a few threads in this forum about this.
Things like L2 regularization (weight decay) or the momentum terms in Adam will change the parameters even when the gradients are 0.
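A minimal sketch of this effect, using plain SGD with momentum (the same applies to Adam's running statistics and to weight decay): after one step with a real gradient, a second step with an exactly-zero gradient still moves the parameter, because the momentum buffer is nonzero.

```python
import torch

# A single parameter whose gradient we control by hand.
p = torch.nn.Parameter(torch.ones(1))
opt = torch.optim.SGD([p], lr=0.1, momentum=0.9)

# First step with a real gradient builds up the momentum buffer.
p.grad = torch.ones(1)
opt.step()
after_first = p.item()   # 1 - 0.1 * 1 = 0.9

# Second step: the gradient is exactly zero, but the momentum
# buffer (0.9 * 1) still moves the parameter.
p.grad = torch.zeros(1)
opt.step()
after_second = p.item()  # 0.9 - 0.1 * 0.9 = 0.81

print(after_first, after_second)
```

So a zero `.grad` does not guarantee a zero update whenever the optimizer carries internal state or applies weight decay.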


Thanks. I found those threads.

Hi albanD, is there any way to avoid changing the parameters and the relevant statistics in such cases, when the gradient is zero?

That depends on your optimizer, really. Changing these parameters and statistics is the "right thing to do" if you really want to apply the optimizer.
If you want to freeze some weights manually, the right way to do it is to not give them to the optimizer, or you can try setting weight.grad = None.
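Both options above can be sketched like this (a small `nn.Linear` is used here just for illustration). PyTorch optimizers skip any parameter whose `.grad` is `None`, so neither its value nor its internal statistics are touched on that step:

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(2, 1)
frozen_weight = model.weight.detach().clone()
frozen_bias = model.bias.detach().clone()

# Option 1: only pass the parameters you want trained to the
# optimizer. The bias is left out, so Adam never updates it and
# never builds running statistics for it.
opt = torch.optim.Adam([model.weight], lr=1e-2)

loss = model(torch.randn(4, 2)).sum()
loss.backward()

# Option 2: clear the gradient before step(); the optimizer then
# skips this parameter entirely (no update, no statistics change).
model.weight.grad = None

opt.step()

# Neither parameter moved: the bias was never given to the
# optimizer, and the weight's grad was cleared before the step.
print(torch.equal(model.weight, frozen_weight))  # True
print(torch.equal(model.bias, frozen_bias))      # True
```

Of the two, excluding the parameter from the optimizer is the cleaner long-term choice; setting `.grad = None` is handy when you only want to freeze a weight for some steps.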