I measured the performance following the post below. Inside the training loop I time the forward pass, backward pass, optimizer step, and zeroing of the parameter gradients, and then print the time each iteration takes.
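For reference, this is roughly how I take the measurement (a minimal sketch; `model`, `optimizer`, `criterion`, and `train_loader` are placeholders for my actual setup):

```python
import time
import torch

for inputs, targets in train_loader:
    inputs, targets = inputs.cuda(), targets.cuda()

    # CUDA kernels run asynchronously, so synchronize before reading the clock
    torch.cuda.synchronize()
    start = time.time()

    optimizer.zero_grad()                # zero the parameter gradients
    outputs = model(inputs)              # forward
    loss = criterion(outputs, targets)
    loss.backward()                      # backward
    optimizer.step()                     # optimizer step

    torch.cuda.synchronize()
    print(f"iteration time: {time.time() - start:.4f} s")
```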
I can confirm that the training has not run into any problems so far, apart from being slow.
Do you think the CUDA version might be causing this? Is there any verbose logging I can enable for debugging, or can it print something the way Apex does when there is a gradient overflow and it adjusts the loss scale?
Thanks for the help.