Max_iter in LBFGS: optimal choice in NN

Desi20 · April 18, 2018, 10:37am

Hey all,
I don’t understand what exactly max_iter does in the LBFGS algorithm when optimizing the parameters of a NN. Isn’t max_iter the maximum number of data points, so that it depends on the mini-batch size when training a neural network like this:

import torch

optimizer = torch.optim.LBFGS(model.parameters(), lr=0.1, max_iter=10)
for epoch in range(epochs):
    for i, (images, labels) in enumerate(train_loader):
        ...

So how should max_iter be chosen for a neural network that is trained this way? Does an optimal choice exist?
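
For reference, here is a minimal sketch of how I understand LBFGS is meant to be called. As far as I can tell from the PyTorch docs, max_iter is the maximal number of iterations per optimization step, i.e. per call to optimizer.step(closure), and not a number of data points. The toy data, the linear model and the call counter below are my own stand-ins (assumptions) for train_loader and model, just to see how often the closure runs on one batch:

```python
import torch

torch.manual_seed(0)

# Hypothetical toy batch and model, standing in for train_loader / model.
inputs = torch.randn(64, 3)
targets = inputs @ torch.tensor([[1.0], [-2.0], [0.5]])
model = torch.nn.Linear(3, 1)
loss_fn = torch.nn.MSELoss()

optimizer = torch.optim.LBFGS(model.parameters(), lr=0.1, max_iter=10)

calls = {"n": 0}  # counts how often LBFGS evaluates the closure

def closure():
    # LBFGS may re-evaluate the loss several times within one step,
    # so it needs a function that recomputes loss and gradients.
    calls["n"] += 1
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    return loss

with torch.no_grad():
    initial_loss = loss_fn(model(inputs), targets).item()

# A single step on this one batch; it can run several inner iterations.
optimizer.step(closure)

with torch.no_grad():
    final_loss = loss_fn(model(inputs), targets).item()

closure_calls = calls["n"]
print("closure evaluations in one step:", closure_calls)
print("loss before / after:", initial_loss, final_loss)
```

If that reading is right, then in the mini-batch loop above every batch would get up to max_iter LBFGS iterations before the next batch is loaded, which is why I am unsure how to pick it.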