Updating Parameters without using optimizer.step()

In general it is better to update parameters in place, because other components that hold a reference to your parameters keep valid references. If you replace the Tensor instead, you need to update every other place that held a reference to the old Tensor. You also need to make sure the parameter remains a proper leaf, so that the following backward pass populates its .grad field.
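For illustration, here is a minimal sketch of a manual in-place SGD step, assuming a toy model and a hypothetical learning rate `lr` (neither is from the original post):

```python
import torch

# Toy model and hypothetical learning rate, for illustration only.
model = torch.nn.Linear(4, 2)
lr = 0.1

x = torch.randn(8, 4)
loss = model(x).pow(2).mean()
loss.backward()

with torch.no_grad():
    for p in model.parameters():
        # In-place update: `p` stays the same Tensor object, so anything
        # holding a reference to it (optimizers, hooks) stays valid, and
        # `p` remains a leaf with requires_grad=True.
        p -= lr * p.grad
        p.grad = None  # or p.grad.zero_()
```

By contrast, assigning `p = p - lr * p.grad` would create a new non-leaf Tensor and break any existing references to the parameter.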


Hey @albanD,

Does this method also work in DistributedDataParallel training?
I am modifying the model weights to apply an orthogonality constraint to them, and it works fine on a single GPU.
But when I use DDP, the weights become unstable and cause an assertion error in my code!

Hi,

DDP does a lot of extra work to keep the weights in sync across processes, and it holds references to the original parameter Tensors. So in that case you definitely want to modify the weights in place.
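A minimal sketch of what that could look like, assuming the process group is already initialized (e.g. via torchrun) and using a square weight and a QR-based orthogonalization purely as an illustrative stand-in for the poster's constraint:

```python
import torch
from torch.nn.parallel import DistributedDataParallel as DDP

# Assumes torch.distributed is initialized and one GPU per process.
model = torch.nn.Linear(16, 16).cuda()
ddp_model = DDP(model)

with torch.no_grad():
    w = ddp_model.module.weight
    # Illustrative orthogonalization of a square weight matrix.
    q, _ = torch.linalg.qr(w)
    # In-place copy: DDP's internal references (gradient buckets, hooks)
    # still point at the same parameter Tensor, so syncing keeps working.
    w.copy_(q)
```

Since every rank starts from identical weights and applies the same deterministic update, the replicas stay consistent; replacing the Tensor instead would leave DDP's buckets pointing at stale storage.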
