Question about assigning parameters to the optimizer

Hi, I have a quick question about parameters passed into the optimizer.
I know that only the parameters passed to the optimizer are updated during training. However, after defining the optimizer and starting training, I set requires_grad=True on some parameters that were not passed to the optimizer, and this caused a large change in performance.
In this situation, are the parameters not assigned to the optimizer updated? Also, do those parameters affect the assigned parameters when they are updated?
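For concreteness, a rough sketch of what I mean (the model and layer names are just placeholders, not my actual code):

```python
import torch
import torch.nn as nn

# Hypothetical two-layer model, only for illustration.
model = nn.Sequential(nn.Linear(10, 10), nn.Linear(10, 1))

# Freeze the second layer and give only the first layer to the optimizer.
for p in model[1].parameters():
    p.requires_grad = False
optimizer = torch.optim.SGD(model[0].parameters(), lr=0.1)

# ... later, after training has started, I turn gradients back on for the
# frozen layer, even though it was never added to the optimizer:
for p in model[1].parameters():
    p.requires_grad = True
```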

The “run backward” and “optimizer step” parts of a training step are independent in PyTorch: the backward pass computes gradients for everything that has requires_grad=True and knows nothing about which parameters you pass to the optimizer step. So you would keep allocating and accumulating gradients for the unused parameters, but never use (nor zero) them.
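A minimal sketch of that behaviour (the shapes and layer names are made up, not from your code):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 1))
# Only the first layer is handed to the optimizer; the second is left out.
optimizer = torch.optim.SGD(model[0].parameters(), lr=0.1)

before = model[1].weight.clone()
for _ in range(3):
    optimizer.zero_grad()                    # zeroes grads of model[0] only
    loss = model(torch.randn(8, 4)).sum()
    loss.backward()                          # fills .grad for every requires_grad=True parameter
    optimizer.step()                         # updates model[0] only

print(torch.equal(before, model[1].weight))  # True: the unassigned layer is never updated
print(model[1].weight.grad.abs().sum())      # nonzero: its grads accumulate and are never zeroed
```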

Best regards

Thomas
