Unusual CPU Stalls and Significantly Unstable Training Speed During First Epoch

I also found that this might be related: CPU thread slow to enqueue GPU and communication kernels, but I’m not sure.