When should I use ‘optimizer.step(closure)’ instead of ‘optimizer.step()’?
I have read the PyTorch docs, but I don’t fully understand this part of the description:

Some optimization algorithms such as Conjugate Gradient and LBFGS need to reevaluate the function multiple times, so you have to pass in a closure that allows them to recompute your model. The closure should clear the gradients, compute the loss, and return it.

What does ‘reevaluate the function multiple times’ mean? What does the ‘function’ refer to?


The ‘function’ is a callable that returns a differentiable loss (i.e. your main nn.Module with an attached loss function). These algorithms have an [inner] training loop inside optimizer.step, so this ‘closure’ is mostly boilerplate to turn that loop inside out (i.e. a delegate that allows nested forward + backward + change-params iterations).
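A minimal sketch of such a closure with torch.optim.LBFGS (the model, criterion, and data here are placeholders, made up just to keep the sketch self-contained):

```python
import torch
import torch.nn as nn

# Placeholder model and data, just to make the sketch runnable.
model = nn.Linear(10, 1)
criterion = nn.MSELoss()
inputs = torch.randn(64, 10)
targets = torch.randn(64, 1)

optimizer = torch.optim.LBFGS(model.parameters(), lr=0.1)

def closure():
    # Exactly what the docs ask for: clear the gradients, compute the
    # loss, backprop, and return the loss so the optimizer can reuse it.
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    return loss

# step() may invoke `closure` several times internally.
optimizer.step(closure)
```

That repeated invocation inside a single step() is the ‘reevaluate the function multiple times’ the docs mention.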

But why do the PyTorch docs emphasize that ‘some optimization algorithms such as Conjugate Gradient and LBFGS…’ need this? When I create an SGD or Adam optimizer, I just call ‘optimizer.step()’.

BFGS & co are batch (whole dataset) optimizers, they do multiple steps on same inputs. Though docs illustrate them with an outer loop (mini-batches), that’s a bit unusual use, I think. Anyway, the inner loop enabled by ‘closure’ does parameter search with inputs fixed, it is not a stochastic gradient loop you do with SGD or Adam.

Thanks for your help, I get it now!