Storing loss each iteration when using a closure (e.g. LBFGS)

When you optimize with LBFGS, you have to wrap the forward pass and loss computation in a closure, because the optimizer may need to re-evaluate the loss several times per step. So each pass through has something like the following:

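def closure():
    optimizer.zero_grad()  # clear gradients left over from the previous evaluation
    prediction = model(data)
    loss = criterion(prediction, target)
    loss.backward()  # LBFGS needs fresh gradients every time it calls the closure
    return loss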


Because the loss only exists inside the closure, it is awkward to stash the current loss at each iteration (i.e. you can’t just do losses += [loss.detach().item()] in the training loop).

One option is to re-evaluate the model and criterion outside of the closure after each step (under torch.no_grad()), but this wastes compute on an extra forward pass every iteration.
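
For concreteness, here is a minimal sketch of that re-evaluation workaround. The toy data, the linear model, the learning rate, and the number of steps are all made up for illustration; only the pattern of calling the model again after optimizer.step(closure) matters.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy regression problem, used only to make the sketch runnable
data = torch.randn(32, 3)
target = data @ torch.tensor([[1.0], [-2.0], [0.5]])
model = torch.nn.Linear(3, 1)
criterion = torch.nn.MSELoss()
optimizer = torch.optim.LBFGS(model.parameters(), lr=0.1)

losses = []

def closure():
    optimizer.zero_grad()
    prediction = model(data)
    loss = criterion(prediction, target)
    loss.backward()
    return loss

for _ in range(5):
    optimizer.step(closure)
    # Extra forward pass outside the closure, with gradient tracking disabled.
    # This is the redundant compute the question is trying to avoid.
    with torch.no_grad():
        losses.append(criterion(model(data), target).item())
```

Each entry in losses here is the loss after the step has finished, at the cost of one additional forward pass per iteration.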

Is there a better way?