Time used per iteration

Hi,

I’m training a network. After plotting the time used per iteration, I found that it grows linearly with the number of iterations. There is no accumulation in the architecture. What could be possible reasons? Thank you for sharing your experience.

Are you training on a GPU?

If yes, you would have to call torch.cuda.synchronize() before you start timing and once before each time.time() call, since CUDA calls are asynchronous. Without the synchronization, time.time() is captured while the GPU may still be working through queued kernels, so the measured times don’t reflect the actual execution time and can drift over the run.
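
In case a concrete example helps, here is a minimal sketch of that timing pattern (the Linear model and tensor shapes are just placeholders, not your setup):

```python
import time

import torch

# Hypothetical toy model and batch, just to illustrate the timing pattern.
device = torch.device("cuda")
model = torch.nn.Linear(1024, 1024).to(device)
x = torch.randn(64, 1024, device=device)

torch.cuda.synchronize()  # wait for any previously queued GPU work
start = time.time()

out = model(x)
out.sum().backward()

torch.cuda.synchronize()  # wait for this iteration's kernels to finish
end = time.time()
print(f"iteration time: {end - start:.4f} s")
```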

I’m training on a GPU. Thank you! I’ll try it.