Hi, my code runs extremely slowly, so I was checking each step to find the cause. However, there seems to be something wrong with the timing in PyTorch:
t1 = torch.cuda.Event(enable_timing=True)
t2 = torch.cuda.Event(enable_timing=True)
t3 = torch.cuda.Event(enable_timing=True)
torch.cuda.synchronize()
t1.record()
mask_loss_2 = nn.functional.binary_cross_entropy(out_mask, target_mask)
torch.cuda.synchronize()
t2.record()
# self.mask_loss = nn.BCELoss(reduction='mean')
mask_loss = self.mask_loss(out_mask, target_mask)
torch.cuda.synchronize()
t3.record()
if show_time:
    torch.cuda.synchronize()
    print(" Loss time ", t1.elapsed_time(t2), t2.elapsed_time(t3))
and the results are
Loss time 594.671630859375 4.7564802169799805
But then I switched the order of the two steps:
torch.cuda.synchronize()
t1.record()
mask_loss = self.mask_loss(out_mask, target_mask)
torch.cuda.synchronize()
t2.record()
mask_loss_2 = nn.functional.binary_cross_entropy(out_mask, target_mask)
torch.cuda.synchronize()
t3.record()
and the results are
Loss time 580.7462158203125 4.7769598960876465
So it seems the first BCE loss in each run took much longer than the second.
I also feel this timing isn't accurate, because I'm pretty sure the major time consumption happens in some complex operations after this step. Do you know why this is happening?
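For reference, here is the event-timing pattern I was trying to follow, as a minimal self-contained sketch. `time_op` is a hypothetical helper name (not from my codebase), and the CPU branch falls back to wall-clock timing just so the sketch runs anywhere:

```python
import time
import torch

def time_op(fn, *args):
    """Time a single op in milliseconds.

    Uses CUDA events on GPU; falls back to wall-clock time on CPU.
    (Hypothetical helper, for illustration only.)
    """
    if torch.cuda.is_available():
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        torch.cuda.synchronize()      # drain any previously queued kernels
        start.record()
        out = fn(*args)
        end.record()
        end.synchronize()             # wait for the timed work itself to finish
        ms = start.elapsed_time(end)  # elapsed GPU time between the two events
    else:
        t0 = time.perf_counter()
        out = fn(*args)
        ms = (time.perf_counter() - t0) * 1000.0
    return out, ms

# Example: timing BCE on small random tensors (values in [0, 1))
out_mask = torch.rand(4, 64, 64)
target_mask = torch.rand(4, 64, 64)
loss, ms = time_op(
    torch.nn.functional.binary_cross_entropy, out_mask, target_mask
)
```

The synchronize-before-the-first-record and synchronize-before-reading-`elapsed_time` steps are what I understood to be required, since CUDA kernels launch asynchronously.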
Thanks!