rerere_L
(rerere L.)
January 29, 2020, 8:35pm
#1
I have a nonlinear regression NN. Is there a Levenberg-Marquardt-like optimizer that I can use in my case?

```
N, D_in, H, D_out = x.shape[0], x.shape[1], 6, y.shape[1]
model = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(D_in, H)),
    # ('Sig', nn.Sigmoid()),
    ('ISRU', ISRU()),  # Add ISRU
    ('fc2', nn.Linear(H, D_out)),
]))
# Error -----
loss_fn = torch.nn.L1Loss(reduction='mean')
# Train -----
optimizer = *****
```

crowsonkb
(Katherine Crowson)
January 29, 2020, 9:29pm
#2
I tried Googling it and couldn't find any implementations of that optimizer for PyTorch. Most of what exists are variations on first-order gradient descent. If your gradients are not stochastic (for example, with full-batch training), you might try torch.optim's implementation of the quasi-Newton (approximate second-order) optimizer L-BFGS. Be sure to set `line_search_fn='strong_wolfe'`, or you risk the optimizer 'blowing up' by accepting a step that increases the loss.

rerere_L
(rerere L.)
January 29, 2020, 10:07pm
#3
Could you give an example of a training loop with this optimizer?

crowsonkb
(Katherine Crowson)
January 30, 2020, 1:47am
#4
Here’s an example of minimizing the Rosenbrock function with L-BFGS:

```
import torch
from torch import optim


def rosenbrock(x):
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2


def print_iter(i, a, loss):
    print(f'{i} [{a[0]:.6f}, {a[1]:.6f}], loss: {loss:.6f}')


# Random starting point near the minimum at (1, 1).
a = torch.tensor([1., 1.]) + torch.rand(2)
a.requires_grad_()

opt = optim.LBFGS([a], line_search_fn='strong_wolfe')


def closure():
    # L-BFGS evaluates the loss several times per step, so the closure
    # must clear old gradients, recompute the loss, and backpropagate.
    opt.zero_grad()
    loss = rosenbrock(a)
    loss.backward()
    return loss


print_iter(0, a, rosenbrock(a))
for i in range(200):
    opt.step(closure)
    print_iter(i + 1, a, rosenbrock(a))
```

rerere_L
(rerere L.)
January 30, 2020, 8:22am
#5
If I want to use nn.MSELoss, how should I change this example?
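
One possible adaptation, as a minimal sketch rather than a definitive answer: the closure passed to `opt.step` computes `nn.MSELoss` on the model output instead of calling `rosenbrock`. The data, the network shape, and the `Tanh` activation below are placeholders invented for illustration (they are not from this thread), and the whole dataset is used as a single batch so that the gradients are not stochastic.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)

# Hypothetical synthetic data: fit y = sin(x) on one fixed full batch.
x = torch.linspace(-3, 3, 64).unsqueeze(1)
y = torch.sin(x)

# Placeholder network; substitute your own model here.
model = nn.Sequential(nn.Linear(1, 6), nn.Tanh(), nn.Linear(6, 1))
loss_fn = nn.MSELoss()
opt = optim.LBFGS(model.parameters(), line_search_fn='strong_wolfe')


def closure():
    # Called several times per step by L-BFGS: clear gradients,
    # recompute the loss on the full batch, and backpropagate.
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    return loss


initial_loss = loss_fn(model(x), y).item()
for i in range(20):
    # step() returns the loss from the first closure call of this step.
    loss = opt.step(closure)
    print(f'{i + 1} loss: {float(loss):.6f}')
final_loss = loss_fn(model(x), y).item()
```

Switching to another criterion such as `nn.L1Loss` should only require changing `loss_fn`; the closure and the loop stay the same.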