Optimizer.step(closure)

When should I use ‘optimizer.step(closure)’ instead of ‘optimizer.step()’?
I have read the PyTorch docs, but I still don’t understand this description:

"
Some optimization algorithms such as Conjugate Gradient and LBFGS need to reevaluate the function multiple times, so you have to pass in a closure that allows them to recompute your model. The closure should clear the gradients, compute the loss, and return it.
"

What does ‘reevaluate the function multiple times’ mean? What does the ‘function’ refer to?


The ‘function’ is a callable that returns a differentiable loss, i.e. a forward pass through your main nn.Module plus the attached loss function. These algorithms run an inner optimization loop inside optimizer.step(), so the ‘closure’ is mostly boilerplate that turns that loop inside out: a delegate that lets the optimizer run nested forward + backward + parameter-update iterations.
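
For example, here is a minimal sketch of how such a closure is typically written; the model, loss, data, and hyperparameters are placeholders I made up, not anything from the docs:

```python
import torch
import torch.nn as nn

# Placeholder model, loss, and full-batch data for illustration.
model = nn.Linear(10, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.LBFGS(model.parameters(), lr=0.1)

inputs = torch.randn(64, 10)
targets = torch.randn(64, 1)

def closure():
    # Clear stale gradients, recompute the loss, backprop, and return the loss,
    # exactly as the docs describe.
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    return loss

# LBFGS may call closure() several times inside this single step()
# (e.g. during its line search) -- that is the 'reevaluation'.
optimizer.step(closure)
```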

But why do the PyTorch docs emphasize ‘Some optimization algorithms such as Conjugate Gradient and LBFGS…’? When I create an SGD or Adam optimizer, I just call ‘optimizer.step()’.

BFGS & co are batch (whole dataset) optimizers, they do multiple steps on same inputs. Though docs illustrate them with an outer loop (mini-batches), that’s a bit unusual use, I think. Anyway, the inner loop enabled by ‘closure’ does parameter search with inputs fixed, it is not a stochastic gradient loop you do with SGD or Adam.

Thanks for your help, I get it now!