My understanding of the LBFGS algorithm is that it updates the parameters using a line search, as opposed to the fixed-size update given by the learning rate that most other optimizers use. However, LBFGS has an lr parameter, like other optimizers. Does anyone know precisely what role lr plays in the LBFGS optimizer?
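
For concreteness, the contrast I have in mind can be sketched in plain Python. This is only an illustration on a toy 1-D quadratic using the plain gradient direction, not the LBFGS implementation; `fixed_step`, `backtracking_step` and their parameters (`t0`, `shrink`, `c`) are hypothetical names chosen for this example.

```python
def f(x):
    # Toy objective with its minimum at x = 3.
    return (x - 3.0) ** 2


def grad(x):
    return 2.0 * (x - 3.0)


def fixed_step(x, lr):
    # Fixed-size update: the step length is set by lr alone.
    return x - lr * grad(x)


def backtracking_step(x, t0=1.0, shrink=0.5, c=1e-4):
    # Line search: start from a trial step t0 and shrink it until the
    # Armijo sufficient-decrease condition is satisfied.
    g = grad(x)
    t = t0
    while f(x - t * g) > f(x) - c * t * g * g:
        t *= shrink
    return x - t * g, t


x0 = 0.0
x_fixed = fixed_step(x0, lr=0.1)
x_ls, t_used = backtracking_step(x0)
print(x_fixed, x_ls, t_used)
```

In the first function the step length is dictated by lr; in the second it is chosen by the search, and the only role left for a user-supplied number is the initial trial step `t0`. My question is which of these roles (if either) lr has in LBFGS.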