What are the downsides of using both mixed-precision packages (native torch.cuda.amp and NVIDIA apex) together? Currently, my script does something like this to choose one or the other:
with autocast(enabled=device.type == 'cuda' and apex is None):
    logps = model(images)
    loss = criterion(logps, labels)
if apex is None:
    # Native AMP path: GradScaler handles the loss scaling.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
else:
    # apex path: amp.scale_loss handles the loss scaling instead.
    with amp.scale_loss(loss, optimizer) as scaled_loss:  # type: torch.FloatTensor
        scaled_loss.backward()
    optimizer.step()
Is this conditional split necessary? Is there a specific reason not to mix the two, or is it relatively benign? Removing the apex is None checks would make my code cleaner, but I'm not an expert on mixed precision, so I don't know what issues this might cause. I am considering something like this:
with autocast(enabled=device.type == 'cuda'):
    logps = model(images)
    loss = criterion(logps, labels)
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaler.scale(scaled_loss).backward()
scaler.step(optimizer)
scaler.update()
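For context, here is a minimal self-contained version of my current conditional setup; the toy model, synthetic data, and opt_level='O1' are just placeholders for illustration, not my real training code:

import torch
from torch import nn
from torch.cuda.amp import autocast, GradScaler

try:
    import apex
    from apex import amp
except ImportError:
    apex = None

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
if apex is not None and device.type != 'cuda':
    apex = None  # apex needs CUDA; fall back to native AMP / FP32

# Toy model, loss, and optimizer standing in for the real training objects.
model = nn.Linear(8, 2).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

# GradScaler is only active on the native-AMP path.
scaler = GradScaler(enabled=device.type == 'cuda' and apex is None)

if apex is not None:
    # apex inserts its own FP16 casts, so autocast stays disabled below.
    model, optimizer = amp.initialize(model, optimizer, opt_level='O1')

images = torch.randn(4, 8, device=device)
labels = torch.randint(0, 2, (4,), device=device)

optimizer.zero_grad()
with autocast(enabled=device.type == 'cuda' and apex is None):
    logps = model(images)
    loss = criterion(logps, labels)

if apex is None:
    # Native AMP: GradScaler scales the loss and unscales before step().
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
else:
    # apex: scale_loss scales the loss; gradients are unscaled on exit.
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()
    optimizer.step()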