Just wanted to ask whether more optimization algorithms, such as full Newton or Levenberg-Marquardt, will be implemented in the future?
I’m not very familiar with these algorithms, but we are always open & willing to implement the widely used / standard / useful ones. Feel free to open an issue on GitHub to start a discussion!
I’m also searching for this kind of optimization.
Have you opened this issue?
FYI, MATLAB can train NNs with Levenberg-Marquardt. There’s also a 90s software package called neuralLab which has some interesting optimizers for NNs. I just realized that for engineers it’s beneficial to take a look at some older tools. Frameworks like PyTorch/TensorFlow are more tailored to modern DNN applications. They might not be suitable for engineers (I mean, non-CS engineers). The only second-order optimizer in PyTorch is LBFGS. The reason they don’t implement algorithms like Levenberg-Marquardt is that they have DNN applications in mind, which these algorithms are not suitable for. But it could be a different story in some other fields.
Yes, I am now doing basic fitting with an MLP, and the Levenberg-Marquardt algorithm in MATLAB outperforms the PyTorch algorithms by many orders of magnitude (yes, orders, like 1e3 … 1e5) for the same data and about the same accuracy. This is expected for any least-squares problem (second-order vs. first-order algorithms), and MATLAB has been doing a good job at that for a long time. It would be nice to have that algorithm in PyTorch. For now, I have been pushed to write my own converter from MATLAB DL models to PyTorch models in order to stay productive.
There are some other options.
A small package called pyrenn, written in pure Python and NumPy, can train MLPs and RNNs with LM. It reaches similar accuracy to MATLAB.
I wrote my own NN with JAX and use the SciPy optimizers, including LM and many others, to train it. You can also evaluate the Jacobian and error in PyTorch and pass them to SciPy. I just find this more convenient in JAX or TensorFlow.
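For anyone wanting to try the SciPy route mentioned above, here is a rough sketch, assuming a one-hidden-layer tanh MLP and a toy sine target (both made up for illustration). The Jacobian is derived by hand in NumPy purely as a stand-in for what JAX or PyTorch would supply; `scipy.optimize.least_squares` with `method="lm"` is SciPy's MINPACK-based LM solver. The helper names (`unpack`, `hidden`, `residuals`, `jacobian`) and the hidden size are assumptions, not taken from any post here.

```python
import numpy as np
from scipy.optimize import least_squares

H = 5  # hidden units (arbitrary choice for this sketch)
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)   # toy inputs
y = np.sin(np.pi * x)            # toy targets

def unpack(p):
    # Flat parameter vector -> (w1, b1, w2, b2)
    return p[:H], p[H:2 * H], p[2 * H:3 * H], p[3 * H]

def hidden(p):
    w1, b1, _, _ = unpack(p)
    return np.tanh(np.outer(x, w1) + b1)   # shape (N, H)

def residuals(p):
    # One residual per sample: prediction minus target
    _, _, w2, b2 = unpack(p)
    return hidden(p) @ w2 + b2 - y

def jacobian(p):
    # Hand-derived d(residual_i)/d(param_j), shape (N, 3H + 1).
    # An autodiff framework could supply this matrix instead.
    _, _, w2, _ = unpack(p)
    h = hidden(p)
    dh = (1.0 - h ** 2) * w2               # derivative w.r.t. pre-activation
    return np.hstack([dh * x[:, None], dh, h, np.ones((x.size, 1))])

p0 = 0.5 * rng.standard_normal(3 * H + 1)
# method="lm" needs at least as many residuals as parameters (50 >= 16 here)
result = least_squares(residuals, p0, jac=jacobian, method="lm")
print(result.cost, result.nfev)
```

Leaving out `jac=` makes SciPy fall back to finite differences, which is fine for small nets like this one but gets slow as the parameter count grows.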
Thanks for the references. Pyrenn inverts the Jacobian directly, which can cause problems for rank-deficient matrices. The SciPy optimizers are more mature and use trust-region methods and an LM implementation as in MINPACK, with proper factorizations. So using PyTorch to evaluate the Jacobian and errors, together with scipy.optimize, can be an option. Still, it is not as convenient as a hypothetical direct implementation in PyTorch.
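To make the rank-deficiency point concrete, here is a minimal NumPy sketch of an LM loop that never forms an explicit inverse: each damped step is solved as a stacked least-squares problem. The toy model (two redundant parameters, so the Jacobian has rank 1), the multiply/divide-by-10 damping schedule, and the names `lm_step` and `levenberg_marquardt` are all assumptions for illustration. It is not how pyrenn or MINPACK actually implement things.

```python
import numpy as np

def lm_step(J, r, lam):
    """Minimize ||J d + r||^2 + lam * ||d||^2 via least squares (no explicit inverse)."""
    n = J.shape[1]
    A = np.vstack([J, np.sqrt(lam) * np.eye(n)])   # stacked system is full rank for lam > 0
    b = np.concatenate([-r, np.zeros(n)])
    d, *_ = np.linalg.lstsq(A, b, rcond=None)
    return d

def levenberg_marquardt(residual, jacobian, p, lam=1e-2, iters=50):
    cost = 0.5 * np.sum(residual(p) ** 2)
    for _ in range(iters):
        d = lm_step(jacobian(p), residual(p), lam)
        new_cost = 0.5 * np.sum(residual(p + d) ** 2)
        if new_cost < cost:
            # Step accepted: keep it and trust the Gauss-Newton direction more
            p, cost = p + d, new_cost
            lam = max(lam / 10.0, 1e-12)
        else:
            # Step rejected: increase damping (shorter, more gradient-like step)
            lam *= 10.0
    return p, cost

# Toy problem: y = (a + b) * x, so a and b are redundant and J has rank 1
x = np.linspace(0.0, 1.0, 20)
y = 3.0 * x

def residual(p):
    return (p[0] + p[1]) * x - y

def jacobian(p):
    return np.stack([x, x], axis=1)   # two identical columns

p_fit, final_cost = levenberg_marquardt(residual, jacobian, np.zeros(2))
print(p_fit, final_cost)
```

Only the sum `a + b` is identifiable here, yet the loop still converges because the damping term keeps every step well defined.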
Actually, PyTorch might not work well since it uses reverse-mode automatic differentiation (backpropagation). Backprop is super inefficient when the number of outputs is much larger than the number of inputs, which is likely to be the case when calculating the Jacobian (num_samples vs. num_parameters).
The easiest option for now is JAX if you don’t want to write everything by hand; it only took me 30 lines to define an MLP class.
Hi Oleg, I had the same issue trying to port some old work from MATLAB to PyTorch. Using Levenberg-Marquardt in MATLAB, the training was quite fast and gave better results. Could you share how you solved this?
Hello everyone, I hope you are having a great time.
Could you please tell me whether Levenberg-Marquardt can be implemented in PyTorch?
I appreciate you sharing your knowledge on this issue.
Check out Theseus; they have a working version. I haven’t dug into it yet.
It’s certainly possible. I’ve built a general-purpose Gauss-Newton algo with PyTorch Jacobians and linalg.pinv, and am now looking into Levenberg-Marquardt.
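A bare-bones version of that idea might look like the sketch below. It is a guess at the general shape, not the poster's actual code: it uses `torch.autograd.functional.jacobian` for the Jacobian and `torch.linalg.pinv` for the update, with no damping or line search. The exponential toy model, the starting point, and the name `gauss_newton` are assumptions for illustration.

```python
import torch

def gauss_newton(residual_fn, p, iters=20):
    """Plain Gauss-Newton: p <- p - pinv(J) @ r, with no damping or line search."""
    for _ in range(iters):
        r = residual_fn(p)                                        # shape (N,)
        J = torch.autograd.functional.jacobian(residual_fn, p)    # shape (N, P)
        p = p - torch.linalg.pinv(J) @ r
    return p

# Toy problem (hypothetical): fit y = a * exp(b * x) with true a = 2.0, b = -1.5
x = torch.linspace(0.0, 1.0, 20, dtype=torch.float64)
y = 2.0 * torch.exp(-1.5 * x)

def residual_fn(p):
    return p[0] * torch.exp(p[1] * x) - y

p_fit = gauss_newton(residual_fn, torch.tensor([1.0, -1.0], dtype=torch.float64))
print(p_fit)
```

Going from this to Levenberg-Marquardt mostly means adding a damping term to the step and accepting a step only when the cost drops.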