autocyz
(chenyongzhi)
August 7, 2019, 8:21am
1
When training, if the loss is very small, I want to skip the backward pass to save training time, using this code:
total_loss = criterion(anchor_f, pos_f, neg_f)
if total_loss.item() > 1e-6:
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()
else:
    pass
This raises a CUDA out-of-memory error:
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 10.76 GiB total capacity; 9.93 GiB already allocated; 22.19 MiB free; 39.84 MiB cached)
Is my code incorrect?
Since you don’t actually call backward, your graph doesn’t get destroyed. I think it would be better to always call backward and, depending on the value of your loss, either call an additional zero_grad afterwards or skip the optimizer step:
total_loss = criterion(anchor_f, pos_f, neg_f)
optimizer.zero_grad()
total_loss.backward()
if total_loss.item() > 1e-6:
    optimizer.step()
else:
    optimizer.zero_grad()
autocyz
(chenyongzhi)
August 8, 2019, 2:16am
3
Is there a way to skip backward?
You could put total_loss.backward() inside the condition in @justusschock’s code.
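For reference, a minimal sketch of that suggestion, assuming the same criterion, optimizer, and feature tensors as above. The del line is my addition, not part of the original answer: per the diagnosis earlier in this thread, skipping backward() leaves the computation graph attached to total_loss alive, so releasing the reference explicitly lets PyTorch free that memory:

total_loss = criterion(anchor_f, pos_f, neg_f)
optimizer.zero_grad()
if total_loss.item() > 1e-6:
    total_loss.backward()  # frees the graph's intermediate buffers
    optimizer.step()
else:
    # backward() is skipped, so drop the loss (and its graph)
    # instead of keeping it alive until the next iteration
    del total_loss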