Why are closures not supported in GradScaler?

In the step function of GradScaler, a RuntimeError is raised if a closure is passed as a member of kwargs.
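As a toy illustration of the kind of guard involved (this is a hypothetical simplification, not the real torch.cuda.amp.GradScaler source):

```python
class GradScalerSketch:
    """Toy stand-in illustrating GradScaler.step's closure guard.

    Not the real torch.cuda.amp.GradScaler; a minimal sketch only.
    """

    def step(self, optimizer, *args, **kwargs):
        # The scaler rejects closures up front, because it may need to
        # skip optimizer.step() entirely when any gradients are inf/nan.
        if "closure" in kwargs:
            raise RuntimeError(
                "Closure use is not currently supported if GradScaler is enabled."
            )
        return optimizer.step(*args, **kwargs)
```

Calling `step` with a `closure` keyword argument on such a guard raises immediately, before the optimizer ever runs.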

The error message says “not currently supported”. Are there any plans to support this feature? Or could you tell me why closures aren’t supported? If possible, I’d like to try writing a patch to solve this issue.

My motivation is to make SAM work easily with native AMP.

Making closures work with dynamic gradient scaling (specifically, the fact that dynamic gradient scaling occasionally skips optimizer.step() if any grads were inf/nan) is tricky, and we haven’t heard any use cases that absolutely needed it (LBFGS is the only one I’m aware of and no one’s asked for that).
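To make the difficulty concrete, here is a toy sketch (plain Python scalars, not PyTorch internals, and the names are invented): with dynamic scaling, the skip decision depends on the unscaled gradients, but a closure produces its gradients *inside* step, so the check cannot happen up front — and an optimizer like LBFGS may invoke the closure several times per step.

```python
import math

def scaled_step(grads, scale, update):
    """Dynamic-scaling step sketch: unscale grads, skip the update
    entirely if any unscaled gradient is inf/nan."""
    unscaled = [g / scale for g in grads]
    if any(math.isinf(g) or math.isnan(g) for g in unscaled):
        return False  # skip this step; the scale would be lowered next time
    update(unscaled)
    return True

def closure_step(closure, scale, update):
    """With a closure, forward + backward run *inside* step, so the
    inf/nan check (and hence the skip decision) can only happen after
    calling it; every closure call's grads would need unscaling."""
    grads = closure()  # gradients only exist after this call
    return scaled_step(grads, scale, update)
```

In the closure-free case the scaler can inspect the gradients before deciding anything; in the closure case it cannot.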

Can you implement SAM without a closure like this?

Thank you for your reply!
As you pointed out, I use davda54’s SAM implementation without a closure,
and it works well.
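For reference, the closure-free pattern boils down to two gradient evaluations per step. Here is a toy scalar sketch of that idea (my own simplification loosely mirroring davda54’s two-phase API; the function and parameter names are illustrative, not the repo’s actual code):

```python
import math

def sam_step(params, grad_fn, rho=0.05, lr=0.1):
    """Closure-free SAM sketch on plain Python floats.

    grad_fn(params) -> list of gradients; it is called twice per step,
    which is why a framework loop needs extra code for this optimizer.
    """
    # Phase 1: ascend to the worst-case nearby point w + e(w).
    g = grad_fn(params)
    norm = math.sqrt(sum(gi * gi for gi in g)) + 1e-12
    e = [rho * gi / norm for gi in g]
    perturbed = [p + ei for p, ei in zip(params, e)]
    # Phase 2: gradient at the perturbed point, applied to the originals.
    g2 = grad_fn(perturbed)
    return [p - lr * gi for p, gi in zip(params, g2)]
```

The two `grad_fn` calls correspond to the two forward/backward passes; with a closure they would both happen inside optimizer.step, which is exactly what GradScaler disallows.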

But I use PyTorch Lightning, and in that environment I have to write some extra code to compute the gradients twice. My motivation is to avoid that: I want to use SAM the same way as other optimizers.

I now understand that closure support is tricky, so I’ve decided to use SAM without a closure. Thanks!!