Hi, I'm new to mixed precision training.
In my case, I have two optimizers and perform two updates per epoch.
Is a single GradScaler enough for mixed precision training here?
(And how should scale(), step(), and update() be called in this case?)
Here is my pseudo-code.
(The training process is quite complex, so the code is summarized in an abbreviated version.)
import torch

scaler = torch.cuda.amp.GradScaler()  # one scaler shared by both updates

for epoch in epochs:
    ...
    # first update: loss2 drives optimizer2
    optimizer.zero_grad()
    optimizer2.zero_grad()
    scaler.scale(loss2).backward()
    scaler.step(optimizer2)
    scaler.update()
    ...
    # second update: loss1 drives optimizer
    optimizer.zero_grad()
    optimizer2.zero_grad()
    scaler.scale(loss1).backward()
    scaler.step(optimizer)
    scaler.update()
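
For reference, my understanding of the pattern from the PyTorch AMP docs ("Working with multiple models, losses, and optimizers") is to share one scaler, call scaler.step() once per optimizer, and call scaler.update() only once per iteration. A minimal sketch of that pattern, assuming both losses are computed in the same iteration (model0, model1, loss_fn, and data are placeholders, not my actual code):

import torch
from torch.cuda.amp import autocast, GradScaler

scaler = GradScaler()  # single scaler shared across both optimizers

for epoch in epochs:
    for input, target in data:
        optimizer0.zero_grad()
        optimizer1.zero_grad()
        with autocast():
            output0 = model0(input)
            output1 = model1(input)
            loss0 = loss_fn(output0, target)
            loss1 = loss_fn(output1, target)
        # scale each loss with the same scaler before backward()
        scaler.scale(loss0).backward()
        scaler.scale(loss1).backward()
        # step each optimizer; step() skips the update if grads contain inf/NaN
        scaler.step(optimizer0)
        scaler.step(optimizer1)
        # update() is called only once per iteration
        scaler.update()

My situation differs in that the two updates happen at different points in the epoch, each with its own backward pass, so I'm unsure whether calling update() after each step() (as in my pseudo-code above) is correct, or whether it will adjust the scale twice per epoch in an unintended way.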