Adaptive L2 regularization experiment

derby · March 16, 2024, 4:58am

hi all, i'm an amateur getting caught up with PyTorch. i started tinkering around and had a little idea: make the L2 regularization strength adapt to the loss at each epoch. take these results with a grain of salt.

there may not be enough data/runs to draw any conclusions. this was mainly a learning exercise for me, but who knows, maybe it's useful to someone else.
here is a paper that confirmed my suspicions and gave me some more empirical backing (though i admit my implementation is much simpler)…
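
for anyone skimming, a rough sketch of the general idea in plain Python. this is not the code from the repo: the update rule (scale the L2 coefficient by the ratio of the current epoch's loss to the previous one), the clamp bounds, and every name below are assumptions made up for illustration.

```python
def adapt_weight_decay(weight_decay, prev_loss, curr_loss,
                       min_wd=1e-6, max_wd=1e-1):
    """Return an updated L2 coefficient for the next epoch.

    Assumed rule (illustrative only, not the original implementation):
    multiply the coefficient by curr_loss / prev_loss, so it grows when
    the loss went up and shrinks when the loss went down. The result is
    clamped to [min_wd, max_wd] so it cannot collapse or blow up.
    """
    if prev_loss is None or prev_loss <= 0:
        return weight_decay  # first epoch: nothing to compare against
    scaled = weight_decay * (curr_loss / prev_loss)
    return min(max(scaled, min_wd), max_wd)


# Toy per-epoch losses standing in for a real training loop.
epoch_losses = [2.0, 1.0, 1.5]

wd = 1e-3          # hypothetical starting coefficient
prev = None
history = []
for loss in epoch_losses:
    wd = adapt_weight_decay(wd, prev, loss)
    history.append(wd)
    prev = loss
```

in a PyTorch training loop, the returned value could be written into `optimizer.param_groups[i]["weight_decay"]` between epochs, for optimizers that accept a `weight_decay` argument such as `torch.optim.SGD`.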

any feedback would be appreciated, thanks

ptrblck · March 16, 2024, 1:49pm

I haven’t read the paper, but I'd be interested to know how many runs were performed to create these plots. Does each line represent a single run?

derby · March 16, 2024, 6:35pm

yeah, each line is a single run. it should be very easy to reproduce (copy, paste, and run the GitHub code if you have all the dependencies), though results may vary (hopefully not by a lot lol). this was with the MNIST dataset, just on CPU.
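
one way to go beyond a single run would be to repeat the experiment over a few seeds and report the mean and spread. a minimal sketch, assuming a hypothetical `run_experiment(seed)` that does one full training run and returns a final metric. the body below is a placeholder that returns made-up numbers, not real results.

```python
import random
import statistics


def run_experiment(seed):
    """Hypothetical stand-in for one full training run.

    A real version would seed the framework, train the model, and return
    the final test accuracy. This placeholder only returns a fake,
    seed-dependent number so the aggregation code can be run.
    """
    rng = random.Random(seed)
    return 0.97 + rng.uniform(-0.01, 0.01)  # placeholder, NOT a real result


def summarize(seeds):
    """Run once per seed and return (mean, sample standard deviation)."""
    scores = [run_experiment(s) for s in seeds]
    return statistics.mean(scores), statistics.stdev(scores)


mean_acc, std_acc = summarize(range(5))
print(f"mean={mean_acc:.4f} std={std_acc:.4f} over 5 seeds")
```

in a real PyTorch version, calling `torch.manual_seed(seed)` at the start of each run is the usual way to make the runs differ in a controlled, repeatable way.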