Profiling PyTorch on GPU: synchronization issue

I was wondering how I can enforce synchronization for all CUDA operations when profiling on GPU, so that I can find the operations/function calls that are slow. Thanks!

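I think you can call this before and after the code you are timing; it blocks until all kernels queued on the current device have finished: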

torch.cuda.synchronize()
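
In case it helps, here is a minimal sketch of how that could be used to time a single call. The `timed` helper is a made-up name for illustration (it is not part of PyTorch), and the matmul is just a stand-in workload. The synchronize calls are skipped when no GPU is available, so the snippet also runs on CPU.

```python
import time

import torch


def timed(fn, *args, **kwargs):
    """Hypothetical helper: return (result, elapsed_seconds) for one call to fn."""
    use_cuda = torch.cuda.is_available()
    if use_cuda:
        # Wait for previously queued kernels so they are not counted here.
        torch.cuda.synchronize()
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    if use_cuda:
        # CUDA launches are asynchronous; wait until fn's kernels have finished.
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start
    return result, elapsed


device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.randn(256, 256, device=device)
b = torch.randn(256, 256, device=device)

out, seconds = timed(torch.matmul, a, b)
print(f"matmul took {seconds * 1e3:.3f} ms on {device}")
```

Without the second synchronize, the timer would mostly measure the kernel launch rather than the kernel itself. If you would rather not wrap every call, setting the environment variable CUDA_LAUNCH_BLOCKING=1 before starting the process should also make kernel launches synchronous, though it slows everything down, so it is probably best kept for debugging and profiling runs.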