The part that I circled doesn’t seem right to me:
In L2 regularization, you modify the cost as follows
The weight update should then be
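The formulas for these two steps are not reproduced in the text, so here is a hedged reconstruction. The symbols are my own assumptions: $L$ is the unregularized loss, $\lambda$ the regularization strength, $\eta$ the learning rate. With the penalty written without a factor of one half, the gradient picks up a factor of 2:

```latex
J(w) = L(w) + \lambda \lVert w \rVert_2^2
\qquad\Longrightarrow\qquad
\nabla_w J = \nabla_w L + 2\lambda w

w \leftarrow w - \eta \left( \nabla_w L + 2\lambda w \right)
```

If the penalty is instead written as $\frac{\lambda}{2}\lVert w \rVert_2^2$, or the 2 is absorbed into $\lambda$, the update becomes $w \leftarrow w - \eta\,(\nabla_w L + \lambda w)$.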
The way PyTorch applies weight decay seems correct to me (you can drop the factor of 2, since it can be absorbed into the decay coefficient)
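As a quick sanity check, here is a minimal pure-Python sketch (no PyTorch needed) comparing a gradient step on the L2-penalized cost with a step that adds `lam * w` to the plain gradient, which is how I understand the `weight_decay` argument of `torch.optim.SGD` to behave. The toy loss, the data, and all names (`loss`, `cost`, `numeric_grad`, `lam`) are made up for illustration, and the penalty uses the `lam / 2` convention so the factor of 2 disappears.

```python
def loss(w, x, y):
    # Toy unregularized loss: squared error of a linear model.
    return (sum(wi * xi for wi, xi in zip(w, x)) - y) ** 2


def cost(w, x, y, lam):
    # L2-regularized cost, with the penalty written as (lam / 2) * ||w||^2.
    return loss(w, x, y) + 0.5 * lam * sum(wi * wi for wi in w)


def numeric_grad(f, w, eps=1e-6):
    # Central finite differences; good enough for this quadratic toy problem.
    grads = []
    for i in range(len(w)):
        up = list(w)
        dn = list(w)
        up[i] += eps
        dn[i] -= eps
        grads.append((f(up) - f(dn)) / (2 * eps))
    return grads


# Hypothetical values, chosen only for the demonstration.
w = [0.5, -1.0, 2.0]
x = [1.0, 2.0, -0.5]
y = 0.3
lr, lam = 0.1, 0.01

# Step 1: plain gradient descent on the regularized cost.
g_cost = numeric_grad(lambda v: cost(v, x, y, lam), w)
w_from_cost = [wi - lr * gi for wi, gi in zip(w, g_cost)]

# Step 2: gradient of the unregularized loss plus the decay term lam * w.
g_loss = numeric_grad(lambda v: loss(v, x, y), w)
w_decay = [wi - lr * (gi + lam * wi) for wi, gi in zip(w, g_loss)]

max_diff = max(abs(a - b) for a, b in zip(w_from_cost, w_decay))
print(max_diff < 1e-6)
```

The two updates agree up to finite-difference noise, which is the sense in which the decay-style update and L2 regularization coincide for plain SGD.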