Hi guys,
I want to know whether torch.amp can cause slower convergence. After enabling torch.amp, my custom model seems to converge more slowly than when I train it without torch.amp.
Any answers and guidance will be appreciated!
Hi @ptrblck, I noticed that at the beginning of the curve, the loss with FP32 seems to converge a little faster than with mixed precision, while they reach similar accuracy eventually.
That sounds reasonable, assuming you are not seeing a large divergence. Using amp will not create bitwise-identical results to the float32 run, so the loss curves will show a bit of jitter during training and will not map onto each other perfectly.
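For anyone comparing the two runs, here is a minimal sketch of a typical torch.amp training loop. The toy model, data, and hyperparameters are placeholders, and the torch.amp.autocast / torch.amp.GradScaler calls assume a reasonably recent PyTorch with a CUDA device available:

```python
import torch
import torch.nn as nn

device = "cuda"
model = nn.Linear(128, 10).to(device)  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# GradScaler rescales the loss to avoid float16 gradient underflow
scaler = torch.amp.GradScaler(device)

for step in range(100):
    # synthetic placeholder data
    inputs = torch.randn(32, 128, device=device)
    targets = torch.randint(0, 10, (32,), device=device)

    optimizer.zero_grad()
    # autocast runs eligible ops in reduced precision (e.g. float16)
    with torch.amp.autocast(device_type=device):
        loss = nn.functional.cross_entropy(model(inputs), targets)

    scaler.scale(loss).backward()  # backward on the scaled loss
    scaler.step(optimizer)         # unscales grads, skips step on inf/nan
    scaler.update()                # adjusts the scale factor for next step
```

With a loop like this, logging `loss.item()` per step for both the amp and float32 runs makes it easy to see whether the gap is just jitter or a real divergence.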
Thank you sincerely.