Hello, I am training a U-Net as a mel-spectrogram denoiser on four 2080 Ti GPUs. Everything works fine for the first ~800 iterations of the first epoch (out of 1200), at approximately 7-8 iterations per second. After those ~800 iterations something happens (I can't tell what), and throughput drops to 1-2 iterations per second. My settings are: BATCH_SIZE = 32, pin_memory=True, DataLoader num_workers = 4, non_blocking = True.
This is what I see when I do some profiling: it seems to take the CPU some time to process the next batches. But I can't understand why everything works fine at first, with a high number of iterations per second.
There is a gap of 1.5 seconds between batch 755 and batch 756.
This behavior resets after each epoch.
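To check whether a gap like the one between batches 755 and 756 is spent waiting on the data loader rather than in the training step, the two can be timed separately. The following is only a minimal sketch, not the actual training loop: `time_iterations`, `slowest_waits`, `step_fn` and `sync` are hypothetical names introduced here for illustration, and `loader` can be any iterable of batches, such as a DataLoader.

```python
import time


def time_iterations(loader, step_fn, sync=None):
    """Return a list of (wait_seconds, step_seconds), one pair per batch.

    loader  : any iterable of batches (for example a torch DataLoader)
    step_fn : hypothetical callable that runs one training step on a batch
    sync    : optional callable that blocks until queued GPU work is done
              (for example torch.cuda.synchronize); without it, the step
              time on a GPU mostly reflects kernel launch time
    """
    timings = []
    it = iter(loader)  # with worker processes, the first wait includes startup
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent here is time waiting for data
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)
        if sync is not None:
            sync()
        t2 = time.perf_counter()
        timings.append((t1 - t0, t2 - t1))
    return timings


def slowest_waits(timings, top=5):
    """Return (batch_index, wait_seconds) for the `top` longest loader waits."""
    waits = enumerate(wait for wait, _ in timings)
    return sorted(waits, key=lambda pair: pair[1], reverse=True)[:top]
```

If the wait times grow after a few hundred batches while the step times stay flat, the workers are probably not keeping up with the GPUs. If the step times grow instead, the slowdown is more likely inside the training step.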