Hi, can someone explain to me in newbie terms (I'm new to the deep learning world) what the weight_decay parameter does in torch.optim.Adam? And what is the impact of changing it from 1e-2 to 0?
Thank you.
Hi
@julioeu99 In simple terms, weight decay shrinks the weights a little on every update, scaled by a constant (here 1e-2). This prevents the weights from growing very large, which can otherwise lead to early overfitting.
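In torch.optim.Adam specifically, weight_decay is applied as an L2 penalty: the term `weight_decay * w` is added to each weight's gradient before the usual Adam update. A minimal sketch of that effect on a single toy weight (the values here are just for illustration):

```python
import torch

w = torch.tensor([3.0], requires_grad=True)  # one toy weight
loss = (w * 2.0).sum()                       # toy loss, its gradient w.r.t. w is 2.0
loss.backward()

weight_decay = 1e-2
# Simplified view of what Adam does when weight_decay > 0:
# an extra term proportional to the weight itself is added to the gradient,
# which pulls the weight toward zero a little on every step.
effective_grad = w.grad + weight_decay * w.detach()
print(w.grad)          # tensor([2.])
print(effective_grad)  # tensor([2.0300])
```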
Weight decay can sometimes make the model converge more slowly.
By default, PyTorch uses weight_decay=0, so setting it to 0 simply disables weight decay.
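For example (a toy model just to show the parameter; the learning rate here is arbitrary):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)  # toy model for illustration

# weight_decay=1e-2: each step also shrinks the weights toward zero
opt_decay = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-2)

# weight_decay=0 (the default): plain Adam, no shrinkage
opt_plain = torch.optim.Adam(model.parameters(), lr=1e-3)
```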
Some useful discussions on the same topic: