Introducing L1 regularization in optimizer in line with weight-decay?

bapi · March 18, 2019, 2:37pm

As this post https://bbabenko.github.io/weight-decay/ suggests about the equivalence of weight-decay and L2 regularizer, can introducing a parameter as below, say weight_decay_one, in an optimizer like SGD, give an equivalent L1 regularizer?:
In the definition of the method step:

if weight_decay_one != 0:
              d_p.add_(weight_decay_one)

The intuition here is that the differentiation of an L1 regularizer gives a constant.