I know PyTorch optimizers have a ‘weight_decay’ parameter, but how do I choose a suitable value? I always end up with a value that is either too large or too small.
In my experience, values of the form n * 1e-4 work, with n between 1 and 9. I usually choose n = 5, i.e. weight_decay = 5e-4.
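A minimal sketch of that rule of thumb, in case it helps to see it as code. The helper names `candidate_weight_decays` and `make_optimizer` are made up for illustration, and the choice of SGD with `lr=0.01` is only a placeholder assumption, not something from the thread; `torch.optim.SGD` and its `weight_decay` argument are the real PyTorch API.

```python
def candidate_weight_decays():
    """Return the grid n * 1e-4 for n = 1..9 from the rule of thumb above."""
    # n / 10_000 avoids the floating-point noise that n * 1e-4 can produce
    return [n / 10_000 for n in range(1, 10)]


def make_optimizer(params, weight_decay=5e-4, lr=0.01):
    """Build an SGD optimizer (hypothetical helper; lr=0.01 is a placeholder)."""
    import torch  # imported here so the grid above can be used without PyTorch
    return torch.optim.SGD(params, lr=lr, weight_decay=weight_decay)


candidates = candidate_weight_decays()
default_wd = candidates[4]  # n = 5, the value the reply usually picks
```

To use it, something like `make_optimizer(model.parameters(), weight_decay=wd)` could be called once per `wd` in `candidates`, training briefly each time and keeping the value with the best validation loss. Whether 5e-4 is actually a good default likely depends on the model and dataset.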