I have a few layers in my model whose .grad exists but .grad_sample is None. How is .grad_sample different from .grad, and how can one be None and the other not?
Thanks for reaching out! Can you share a minimal reproducing example? grad_sample holds the per-sample gradients and is populated by Opacus during the backward pass. During optimizer.step(), it is aggregated, Gaussian noise is added, and the result is written into grad.
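To make the relationship concrete, here is a rough, dependency-free sketch of how a set of per-sample gradients could be turned into a single gradient. This is not Opacus code: the function name aggregate_grad_sample and its parameters are made up for illustration, and the per-sample clipping step is my assumption about the usual DP-SGD recipe rather than something stated above.

```python
import math
import random


def aggregate_grad_sample(grad_sample, max_grad_norm, noise_multiplier, rng=None):
    """Conceptual sketch: clip each per-sample gradient, sum, add noise, average.

    grad_sample: list of per-sample gradients, each a list of floats
                 (stands in for a parameter's .grad_sample).
    Returns one gradient as a list of floats (stands in for .grad).
    """
    rng = rng or random.Random()
    dim = len(grad_sample[0])
    summed = [0.0] * dim
    for g in grad_sample:
        norm = math.sqrt(sum(x * x for x in g))
        # Scale down any sample whose L2 norm exceeds max_grad_norm.
        scale = min(1.0, max_grad_norm / (norm + 1e-6))
        for i, x in enumerate(g):
            summed[i] += x * scale
    # Gaussian noise calibrated to the clipping bound (assumed convention).
    std = noise_multiplier * max_grad_norm
    noisy = [s + rng.gauss(0.0, std) for s in summed]
    # Average over the batch to get the final gradient.
    return [v / len(grad_sample) for v in noisy]


# Two samples, two parameters; noise disabled so the result is deterministic.
per_sample = [[3.0, 4.0], [0.3, 0.4]]
grad = aggregate_grad_sample(per_sample, max_grad_norm=1.0, noise_multiplier=0.0)
print(grad)
```

The point of the sketch is only that grad_sample has one entry per example in the batch, while grad is a single tensor-shaped value derived from them.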