LeeDoYup
(Doyup Lee)
Thanks Tom.
I checked both time.perf_counter() and time.process_time() with torch.cuda.synchronize(), and got results similar to those from time.time():
iv) use time.perf_counter() w/ torch.cuda.synchronize()
- shuffle time: 0.0650 s
- inf time: 0.0587 s
v) use time.process_time() w/ torch.cuda.synchronize()
- shuffle time: 0.0879 s
- inf time: 0.0584 s
Comparing all the results, the inference time is consistent,
but the shuffle time varies with the profiling method.
The shuffle time here is for ShuffleBN in MoCo,
which gathers all samples, shuffles the indices, and redistributes the mini-batch to each GPU.
I cannot see why only the recorded time of this shuffle operation varies with the measurement method.
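One possible explanation (my assumption, not something the measurements above confirm): time.perf_counter() measures wall-clock time, while time.process_time() counts only the CPU time of the current process, so any phase that spends time blocked waiting (such as the cross-GPU gather/scatter in ShuffleBN) is accounted for differently by the two clocks, whereas a purely compute-bound phase like inference looks the same under both. A minimal stdlib sketch of the difference, with time.sleep standing in for a blocking wait:

```python
import time

def timed(clock):
    """Time a mixed blocking + compute workload with the given clock."""
    start = clock()
    time.sleep(0.2)          # stand-in for blocking on communication/GPU
    for _ in range(10**6):   # stand-in for actual CPU work
        pass
    return clock() - start

wall = timed(time.perf_counter)   # wall clock: includes the sleep
cpu = timed(time.process_time)    # CPU time: excludes the sleep
print(f"perf_counter: {wall:.3f} s, process_time: {cpu:.3f} s")
```

If ShuffleBN behaves like this (partly blocked, partly spinning on a busy-wait inside the communication backend), the two clocks would report different shuffle times while agreeing on the compute-only inference time.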