Hello, I am wondering what the following code does in a training loop, since I see it used frequently.
In particular, what do update and scale do?
optimizer.zero_grad()
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
Also, in what situations do we need to use torch.amp.autocast()?
Thank you