The Hugging Face AdamW has a correct_bias parameter, and I want to know which parameter it corresponds to in PyTorch's optim.AdamW. Is it amsgrad? Thanks!
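From a quick look at the paper that introduced AMSGrad, the answer is no. amsgrad switches on the AMSGrad variant, which normalizes by the running maximum of the second-moment estimate, and that is unrelated to bias correction. As far as I can tell, torch.optim.AdamW always applies bias correction and has no parameter to turn it off. I am not an expert, though, so this may be wrong.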
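To make the difference concrete, here is a minimal pure-Python sketch of an Adam-style update on a single scalar. It is an illustration under simplifying assumptions (no weight decay, a hypothetical helper named adam_updates), not the actual code of either library, and the two libraries differ in small details such as where eps is added.

```python
import math


def adam_updates(grads, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8,
                 correct_bias=True, amsgrad=False):
    """Return the update applied to a scalar parameter at each step.

    Hypothetical helper for illustration only; weight decay is omitted.
    """
    m = 0.0      # running mean of gradients (first moment)
    v = 0.0      # running mean of squared gradients (second moment)
    v_max = 0.0  # largest second-moment estimate so far (used by AMSGrad)
    updates = []
    for t, g in enumerate(grads, start=1):
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g

        # amsgrad: normalize by the running maximum of v instead of v.
        if amsgrad:
            v_max = max(v_max, v)
            second_moment = v_max
        else:
            second_moment = v

        # correct_bias: rescale the step to undo the bias toward zero
        # caused by initializing m and v at 0.
        step_size = lr
        if correct_bias:
            step_size = lr * math.sqrt(1 - beta2 ** t) / (1 - beta1 ** t)

        updates.append(-step_size * m / (math.sqrt(second_moment) + eps))
    return updates


with_correction = adam_updates([1.0], correct_bias=True)
without_correction = adam_updates([1.0], correct_bias=False)
plain = adam_updates([1.0, 0.0], amsgrad=False)
ams = adam_updates([1.0, 0.0], amsgrad=True)
```

In this sketch, correct_bias only changes the step-size scaling, while amsgrad only changes which second-moment value goes into the denominator, so the two flags are independent of each other.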