The answer comes from this thread: "Why the training slow down with time if training continuously? And Gpu utilization begins to jitter dramatically?"
I called torch.cuda.empty_cache() at the end of every loop iteration.
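
Roughly, the placement looks like the sketch below. It assumes an ordinary PyTorch training loop; the toy model, the random data, and the `train` function are hypothetical stand-ins, not code from the linked thread. Note that torch.cuda.empty_cache() only releases cached memory that no tensor is using anymore, so it will not free anything that is still referenced.

```python
import torch
from torch import nn


def train(num_steps=3):
    # Fall back to CPU so the sketch also runs on machines without a GPU.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    torch.manual_seed(0)

    # Hypothetical toy model and optimizer, for illustration only.
    model = nn.Linear(8, 1).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.MSELoss()

    losses = []
    for _ in range(num_steps):
        # Random stand-in batch; a real loop would pull from a DataLoader.
        inputs = torch.randn(16, 8, device=device)
        targets = torch.randn(16, 1, device=device)

        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()

        # Keep a plain Python float so the loss tensor is not kept alive.
        losses.append(loss.item())

        # End of the iteration: release unused cached GPU memory.
        if torch.cuda.is_available():
            torch.cuda.empty_cache()

    return losses


if __name__ == "__main__":
    print(train())
```

Storing `loss.item()` rather than the loss tensor itself matters here too, since holding on to tensors across iterations keeps their memory allocated and the cache call cannot reclaim it.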