autocyz
(chenyongzhi)
August 7, 2019, 8:21am
1
When training, if the loss is very small, I want to skip the backward pass to save training time, using this code:
total_loss = criterion(anchor_f, pos_f, neg_f)
if total_loss.item() > 1e-6:
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()
else:
    pass
This raises a CUDA out-of-memory error:
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 10.76 GiB total capacity; 9.93 GiB already allocated; 22.19 MiB free; 39.84 MiB cached)
Is my code incorrect?
Since you don’t actually call backward, your graph doesn’t get destroyed. I think it would be better to always call backward and, depending on the value of your loss, either call an additional zero_grad afterwards or skip the optimizer step:
total_loss = criterion(anchor_f, pos_f, neg_f)
optimizer.zero_grad()
total_loss.backward()
if total_loss.item() > 1e-6:
    optimizer.step()
else:
    optimizer.zero_grad()
autocyz
(chenyongzhi)
August 8, 2019, 2:16am
3
Is there a way to skip backward?
You could put total_loss.backward() inside the condition in @justusschock’s code.
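For reference, a minimal sketch of that suggestion, assuming the same criterion, optimizer, and feature tensors as above. The del line is my addition, not part of the original answer: per the diagnosis earlier in this thread, skipping backward() leaves the computation graph attached to total_loss alive, so releasing the reference explicitly lets PyTorch free that memory:

total_loss = criterion(anchor_f, pos_f, neg_f)
optimizer.zero_grad()
if total_loss.item() > 1e-6:
    total_loss.backward()  # frees the graph's intermediate buffers
    optimizer.step()
else:
    # backward() is skipped, so drop the loss (and its graph)
    # instead of keeping it alive until the next iteration
    del total_loss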