10/11/2022 17:28:42 - Despite set_to_none is set to False, opacus will set p.grad_sample and p.summed_grad to None due to non-trivial gradient accumulation behaviour
I ran two nearly identical pieces of code, but this warning appears in only one of them. Under what circumstances is this warning emitted, and does it have a significant impact?
It concerns the behavior of optimizer.zero_grad(): the set_to_none argument only affects p.grad, while p.grad_sample and p.summed_grad are never zeroed in place and are always set to None.
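To make the behavior described above concrete, here is a minimal, pure-Python sketch (no torch or opacus dependency) of the zero_grad logic: set_to_none controls only p.grad, while the per-sample buffers p.grad_sample and p.summed_grad are unconditionally dropped, since Opacus manages its own gradient accumulation and stale per-sample buffers would corrupt the next step. The class and function names here are illustrative stand-ins, not the actual Opacus implementation.

```python
class Param:
    """Stand-in for a torch parameter carrying Opacus-style buffers."""
    def __init__(self):
        self.grad = [0.1, 0.2]             # stand-in for the gradient tensor
        self.grad_sample = [[0.1], [0.2]]  # stand-in per-sample gradients
        self.summed_grad = [0.3, 0.4]      # stand-in accumulated gradient

def dp_zero_grad(params, set_to_none=False):
    for p in params:
        # set_to_none only affects p.grad, mirroring torch.optim.Optimizer
        p.grad = None if set_to_none else [0.0] * len(p.grad)
        # per-sample buffers are always reset to None, regardless of the flag
        p.grad_sample = None
        p.summed_grad = None

p = Param()
dp_zero_grad([p], set_to_none=False)
print(p.grad, p.grad_sample, p.summed_grad)  # [0.0, 0.0] None None
```

This is why the warning fires even with set_to_none=False: the flag's contract from the base optimizer cannot be honored for the per-sample buffers.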
We switched the log level of this message from warning to debug, which likely explains why it appears in one codebase but not the other.