Is there an analog of torch.cuda.synchronize() in the C++ API?


(Nikita Petrenko) #1

I’ve been trying to record batch processing time, but ran into the asynchronous execution of torch operations. I know that I can sync with the forward pass by calling loss.item(), for example, but that looks ugly.

Is there an analog of torch.cuda.synchronize() in the C++ API?


(Alban D) #2

Hi,

You can call the cudaDeviceSynchronize() function directly, as is done by the Python code here.
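
For timing a batch, something like the following might work. It is only a minimal sketch: `timed_ms` is a hypothetical helper, not part of libtorch, and the synchronization call is passed in as a callable so the snippet compiles without CUDA. In a real build you would pass a lambda that calls cudaDeviceSynchronize() from `<cuda_runtime.h>`; the `model` and `batch` names in the trailing comment are placeholders.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper (not part of libtorch): returns the wall-clock time of
// `work` in milliseconds. `sync` is called before the clock starts and again
// before it stops, so asynchronously queued GPU work is fully counted.
inline double timed_ms(const std::function<void()>& work,
                       const std::function<void()>& sync) {
    assert(work && sync);  // both callables must be set
    sync();  // drain anything queued before the measured region
    const auto start = std::chrono::steady_clock::now();
    work();  // e.g. forward pass, backward pass, optimizer step
    sync();  // wait until the device has actually finished
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Intended use in a CUDA build (assumes #include <cuda_runtime.h>;
// `model` and `batch` are placeholder names):
//   double ms = timed_ms([&] { auto out = model->forward(batch); },
//                        [] { cudaDeviceSynchronize(); });
```

Synchronizing before starting the clock matters as much as synchronizing at the end; otherwise kernels still queued from the previous batch get charged to the current one.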


(Nikita Petrenko) #3

That’s nice, thank you!

Do you know if CPU code is also executed asynchronously?


(Alban D) #4

No, all CPU code is synchronous!