Is there a Levenberg-Marquardt-like optimizer for PyTorch?

I have a nonlinear regression NN, and I wanted to know if there is a Levenberg-Marquardt-like optimizer that I can use in my case.

from collections import OrderedDict

import torch
from torch import nn

N, D_in, H, D_out = x.shape[0], x.shape[1], 6, y.shape[1]

model = nn.Sequential(OrderedDict([('fc1', nn.Linear(D_in, H)),
                                   # ('Sig', nn.Sigmoid()),
                                   ('ISRU', ISRU()),  # custom ISRU activation module
                                   ('fc2', nn.Linear(H, D_out))]))

# Error -----
loss_fn = torch.nn.L1Loss(reduction='mean')

# Train -----
optimizer = *****

I tried Googling it and couldn't find any implementation of that optimizer for PyTorch; most of what exists is variations on first-order gradient descent. If your gradients are not stochastic, you might try torch.optim's implementation of the second-order optimizer L-BFGS (be sure to set line_search_fn='strong_wolfe', or you risk the optimizer 'blowing up' by accepting a step that increases the loss).
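
For the model in your question it would be plugged in along these lines (just a sketch; I'm only setting line_search_fn and leaving the other LBFGS arguments at their defaults):

# Train -----
optimizer = torch.optim.LBFGS(model.parameters(),
                              line_search_fn='strong_wolfe')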

Could you give an example of a training loop with this optimizer?

Here’s an example of minimizing the Rosenbrock function with L-BFGS:

from functools import partial

import torch
from torch import optim


def rosenbrock(x):
    return (1 - x[0])**2 + 100*(x[1] - x[0]**2)**2


a = torch.tensor([1., 1.]) + torch.rand(2)
a.requires_grad_()
loss = partial(rosenbrock, a)


def print_iter(i, a, loss):
    print(f'{i} [{a[0]:.6f}, {a[1]:.6f}], loss: {loss:.6f}')


opt = optim.LBFGS([a], line_search_fn='strong_wolfe')

def closure():
    # L-BFGS re-evaluates the loss and gradient several times per step,
    # so zero_grad() and backward() belong inside the closure it is given.
    opt.zero_grad()
    value = loss()
    value.backward()
    return value

print_iter(0, a, loss())
for i in range(200):
    opt.step(closure)
    print_iter(i + 1, a, loss())

In the case where I want to use nn.MSELoss, how would I change this example?
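
It's basically the same loop, with your model and loss plugged into the closure. A minimal sketch, assuming x and y are your training tensors, model is the one you defined above, and ISRU is defined as in your first post:

import torch
from torch import nn, optim

loss_fn = nn.MSELoss(reduction='mean')
optimizer = optim.LBFGS(model.parameters(), line_search_fn='strong_wolfe')

def closure():
    # Re-evaluate the model and the loss; L-BFGS calls this several times per step.
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    return loss

for epoch in range(100):  # the number of epochs here is arbitrary
    loss = optimizer.step(closure)  # step() returns the closure's loss
    print(f'epoch {epoch}, loss: {loss.item():.6f}')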